Fairsterdam

Please click here to watch our introductory video about our product, Fairsterdam.

1. Introduction

1.1 What’s the use case?

Amsterdam is one of the earliest and largest markets for Airbnb. Although tourism and short-term holiday rentals helped Amsterdam renovate and revitalize its historical district, their unregulated expansion has brought serious problems to local communities. By pushing out local businesses, driving up prices for both short-term and long-term rentals, and bringing too many strangers and impolite tourists into residential neighborhoods, Airbnb is affecting the city, and especially its communities, in a negative way. With concerns about Airbnb rising, some cities see the current public health situation as an opportunity to make a turning point in Airbnb’s expansion.

We are developing an exploratory machine learning model to predict the annual revenue of a new Airbnb listing. The dependent variable, annual revenue, is calculated by aggregating price and occupied nights over the whole year.
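
As a minimal sketch of this aggregation, the snippet below multiplies a monthly price by the number of booked nights in that month and sums the result per listing. The toy values are made up for illustration, and the column names `each_month_price` and `monthly_occupancy` are borrowed from the panels built in section 3.2; the exact revenue formula used later in the report may differ.

```r
# Toy monthly panel: one row per listing and month (values are illustrative only)
panel_toy <- data.frame(
  listing_id        = c("a", "a", "b", "b"),
  month             = c(1, 2, 1, 2),
  each_month_price  = c(100, 120, 80, 90),  # mean nightly price in the month
  monthly_occupancy = c(10, 5, 15, 0)       # booked nights in the month
)

# Monthly revenue = nightly price x booked nights
panel_toy$monthly_revenue <-
  panel_toy$each_month_price * panel_toy$monthly_occupancy

# Annual revenue = sum of monthly revenue for each listing
revenue_by_listing <- aggregate(monthly_revenue ~ listing_id,
                                data = panel_toy, FUN = sum)
names(revenue_by_listing)[2] <- "annual_revenue"
print(revenue_by_listing)
```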

Our larger goal is to develop an integrated system that provides not only the predicted tax revenue but also feedback and opinions about the new listing from community members. Our predictions of tax revenue and monthly occupancy will be sent to community members to inform them about the activities and amenities that the economic gains can support. Being fully informed about the pros and cons, they will report their opinions back to the government. Ideally, their opinions will be taken into consideration in the decision-making process. But since we don’t have actual survey data from the community, this report focuses only on the algorithm for annual revenue prediction.

1.2 Why would someone replicate this?

This report consists of two parts. The first is an exploratory analysis of the features that influence Airbnb revenue, which is directly determined by price and occupancy. The second is a machine learning model that uses the correlated features to predict annual revenue; this model is tested and validated in several ways.

As Airbnb is a large business that affects cities globally, this research method and algorithm can be used to explore Airbnb in other cities and regions. Because they rely heavily on tourism, Airbnb markets in different cities have a lot in common. But as they are also shaped by local culture, regulations, and spatial and business traditions, the analysis will need to be adapted to each context.

1.3 Why this approach?

In the real estate market, to price a new lease and predict its vacancy rate, the operator or consultant searches for comparable properties, which are generally located near the new project and have similar features and target markets. But this kind of market analysis is usually limited to a small data set. Also, to weigh the features that determine comparability, analysts use common parameters across different properties. These parameters are often given as a range, and the exact value is chosen subjectively. Since we have a large data set, we can estimate the coefficients, and thus the prediction, more precisely.

Our model is still based on the hedonic model, but more features are included through data exploration. Most of the Airbnb prediction studies published online predict only price and include a limited set of features. (Click here to see an example) Since we are predicting revenue, which is the actual benefit, our model takes both price and occupancy into consideration. Though our prediction has larger errors than predictions of price alone (since occupancy is more volatile and less closely tied to physical features), we believe it is still valuable. We also include more independent variables. Apart from basic features, we include 1) public amenities and tourist attractions, 2) the spatial lag, calculated as the average revenue of the nearest five listings, and 3) complementary features extracted from names and descriptions.
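
The spatial lag described in item 2 can be sketched in base R as follows. This is only an illustration under stated assumptions: `spatial_lag` is a hypothetical helper, the coordinates and revenues are toy values, and `k = 2` is used so the result is easy to check by hand (the report uses the nearest five listings).

```r
# Hypothetical helper: mean value of the k nearest OTHER points
spatial_lag <- function(coords, value, k) {
  d <- as.matrix(dist(coords))  # pairwise Euclidean distances
  diag(d) <- Inf                # a listing is not its own neighbor
  apply(d, 1, function(row) mean(value[order(row)[seq_len(k)]]))
}

# Toy listings on a line, in projected units such as meters
coords  <- cbind(x = c(0, 1, 2, 10, 11), y = c(0, 0, 0, 0, 0))
revenue <- c(100, 200, 300, 1000, 2000)

lag_revenue <- unname(spatial_lag(coords, revenue, k = 2))
print(lag_revenue)
```

Excluding each listing from its own neighbor set keeps the lag from simply echoing the value being predicted.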

Our model is far from perfect, but as a public algorithm based on public data, we believe it will help the public understand more about the benefits and costs of Airbnb.

1.4 Used data

To predict annual revenue, our model is based on the three aspects listed below. The hedonic model covers internal and external features: the apartment’s physical condition and its spatial relationship with amenities and POIs. The time factor accounts for the seasonality of tourism. Text is used to see how hosts advertise their properties and which descriptions might add value.

  • The Hedonic Model

Physical Characteristics: basic features such as room number, room type, amenities
Spatial Processing: Neighborhood Effect, Spatial Lag
Spatial Features: Distance to transit, supermarkets, tourists’ attractions, city center

  • Time Factor

Seasonality: monthly differences of price and occupancy

  • Text

Names: Key features advertised
Descriptions: Features not formally listed that are adding value to the listing
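
To make the structure of a hedonic model concrete, here is a minimal sketch that regresses a simulated revenue on one physical feature, one spatial feature and one time factor. Everything in it is an assumption for illustration: the variable names (`beds`, `dist_metro`, `month`), the simulated coefficients and the use of plain `lm()`; it is not the model fitted in this report.

```r
set.seed(1)
n <- 200

# Simulated listings (illustrative values only)
toy <- data.frame(
  beds       = sample(1:4, n, replace = TRUE),            # physical characteristic
  dist_metro = runif(n, 50, 2000),                        # spatial feature, meters
  month      = factor(sample(c("01", "07"), n, replace = TRUE))  # time factor
)

# Simulated revenue: more beds and July raise it, distance lowers it
toy$revenue <- 500 + 300 * toy$beds - 0.2 * toy$dist_metro +
  150 * (toy$month == "07") + rnorm(n, sd = 50)

# Hedonic regression: revenue explained by the bundle of characteristics
fit <- lm(revenue ~ beds + dist_metro + month, data = toy)
print(round(coef(fit), 2))
```

Each coefficient can be read as the implicit value of one characteristic with the others held fixed, which is the idea the hedonic approach relies on.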

Our data sources include:

  1. Airbnb Data of Amsterdam
    This data set provides information about Airbnb listings, including price, physical features of the apartment, and location. It also has a calendar data frame, which includes up-to-date data on price and occupancy status.

  2. Amsterdam Open Data (Maps)
    This is the open data provided by the Amsterdam government, with features such as neighborhood boundaries, the UNESCO zone and public amenities.

  3. OpenStreetMap Data downloaded with package “osmdata”
    We use point data from OpenStreetMap on convenience stores, shopping malls, supermarkets and other amenities that tourists care about.

  4. Tourist Attractions and POIs
    This is a public data set provided by Tourpedia. We used its data on POIs and tourist attractions in Amsterdam.

Since the data is quite large, we suggest downloading all of it in advance. All the data can be downloaded HERE.

2. Setup

2.1 Load R packages

library(tidyverse)
library(sf)
library(RSocrata)
library(viridis)
library(spatstat)
library(ggplot2)
library(raster)
library(spdep)
library(FNN)
library(mapview)
library(grid)
library(gridExtra)
library(knitr)
library(stringr)
library(kableExtra)
library(tidycensus)
library(lubridate)
library(stargazer)


library(scales)
library(RColorBrewer)
library(ggthemes)
library(readr)
library(ggcorrplot)
library(caret)

library(sjPlot)
library(sjmisc)
library(sjlabelled)
library(osmdata)

#text mining
library(tm)
library(wordcloud2)
library(SnowballC)
options(scipen=999)

2.2 Standardize the formatting

palette5 <- c("#E46B45","#BD665C","#966174","#6E5C8B","#4757A2")
palette4 <- c("#E46B45","#BD665C","#6E5C8B","#4757A2")
palette2 <- c("#e46b45","#4757a2")

# Legend labels: the 1st, 20th, 40th, 60th and 80th percentiles of a variable
qBr <- function(df, variable, rnd) {
  if (missing(rnd)) {
    as.character(quantile(round(df[[variable]],0),
                          c(.01,.2,.4,.6,.8), na.rm=T))
  } else if (rnd == FALSE) {
    # Unrounded input; labels are formatted to 3 significant digits
    as.character(formatC(quantile(df[[variable]],
                                  c(.01,.2,.4,.6,.8), na.rm=T), digits = 3))
  }
}

# Same as qBr, but the default branch also rounds the quantiles themselves
qBr2 <- function(df, variable, rnd) {
  if (missing(rnd)) {
    as.character(round(quantile(round(df[[variable]],0),
                          c(.01,.2,.4,.6,.8), na.rm=T)))
  } else if (rnd == FALSE) {
    as.character(formatC(quantile(df[[variable]],
                                  c(.01,.2,.4,.6,.8), na.rm=T), digits = 3))
  }
}

q5 <- function(variable) {as.factor(ntile(variable, 5))}
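
As a quick illustration of what the legend helpers return, the sketch below computes the same five percentiles on a toy vector in base R. The vector `x` is made up, and the snippet only mirrors the default (rounded) branch of the helpers.

```r
# Toy variable standing in for a column such as price per bed
x <- 1:100

# The five cut points used as legend labels for the q5() classes
breaks <- quantile(x, c(.01, .2, .4, .6, .8), na.rm = TRUE)
labels <- as.character(round(breaks))
print(labels)
```

Each label marks the lower bound of one of the five classes, so a map colored by `q5()` shows these values next to its five colors.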

plotTheme <- function(base_size = 12) {
  theme(
    text = element_text( color = "black"),
    plot.title = element_text(size = 14,colour = "black"),
    plot.subtitle = element_text(face="italic"),
    plot.caption = element_text(hjust=0),
    axis.ticks = element_blank(),
    panel.background = element_blank(),
    panel.grid.major = element_line("grey80", size = 0.1),
    panel.grid.minor = element_blank(),
    panel.border = element_rect(colour = "black", fill=NA, size=2),
    strip.background = element_rect(fill = "grey80", color = "white"),
    strip.text = element_text(size=12),
    axis.title = element_text(size=12),
    axis.text = element_text(size=10),
    plot.background = element_blank(),
    legend.background = element_blank(),
    legend.title = element_text(colour = "black", face = "italic"),
    legend.text = element_text(colour = "black", face = "italic"),
    strip.text.x = element_text(size = 14)
  )
}

mapTheme <- theme(plot.title =element_text(size=12),
                  plot.subtitle = element_text(size=8),
                  plot.caption = element_text(size = 8),
                  axis.line=element_blank(),
                  axis.text.x=element_blank(),
                  axis.text.y=element_blank(),
                  axis.ticks=element_blank(),
                  axis.title.x=element_blank(),
                  axis.title.y=element_blank(),
                  panel.background=element_blank(),
                  panel.border = element_rect(colour = "black", fill=NA, size=2),
                  panel.grid.major=element_line(colour = 'grey92'),
                  panel.grid.minor=element_blank(),
                  legend.direction = "vertical", 
                  legend.position = "right",
                  plot.margin = margin(1, 1, 1, 1, 'cm'),
                  legend.key.height = unit(1, "cm"), legend.key.width = unit(0.2, "cm"))

2.3 Function

# Mean distance from each point in measureFrom to its k nearest points in measureTo
nn_function <- function(measureFrom,measureTo,k) {
  measureFrom_Matrix <- as.matrix(measureFrom)
  measureTo_Matrix <- as.matrix(measureTo)
  # get.knnx(data, query, k): distances from each query row to its k nearest data rows
  nn <-   
    get.knnx(measureTo_Matrix, measureFrom_Matrix, k)$nn.dist
  output <-
    as.data.frame(nn) %>%
    rownames_to_column(var = "thisPoint") %>%
    gather(points, point_distance, V1:ncol(.)) %>%
    arrange(as.numeric(thisPoint)) %>%
    group_by(thisPoint) %>%
    summarize(pointDistance = mean(point_distance)) %>%
    arrange(as.numeric(thisPoint)) %>% 
    dplyr::select(-thisPoint) %>%
    pull()
  
  return(output)  
}
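
To check what `nn_function` computes, the base R sketch below reproduces the same quantity on a toy example without `FNN`: for every origin point, the mean Euclidean distance to its `k` nearest target points. The helper name `mean_knn_dist` and the coordinates are hypothetical, and the brute-force search is only suitable for small inputs.

```r
# Hypothetical base R equivalent: mean distance to the k nearest targets
mean_knn_dist <- function(from, to, k) {
  apply(from, 1, function(p) {
    d <- sqrt((to[, 1] - p[1])^2 + (to[, 2] - p[2])^2)  # distances to all targets
    mean(sort(d)[seq_len(k)])                           # average of the k smallest
  })
}

# Toy coordinates in a projected CRS (units such as meters)
from <- rbind(c(0, 0), c(10, 0))
to   <- rbind(c(3, 4), c(0, 1), c(10, 5))

nearest_1 <- mean_knn_dist(from, to, k = 1)
nearest_2 <- mean_knn_dist(from, to, k = 2)
print(nearest_1)
print(nearest_2)
```

Both inputs must be in the same projected coordinate system; otherwise the result is not a distance in any meaningful unit.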


rquery.wordcloud <- function(x, type=c("text", "url", "file"), 
                          lang="english", excludeWords=NULL, 
                          textStemming=FALSE,  colorPalette="Dark2",
                          min.freq=3, max.words=200)
{ 
  library("tm")
  library("SnowballC")
  library("wordcloud")
  library("RColorBrewer") 
  
  if(type[1]=="file") text <- readLines(x)
  else if(type[1]=="url") text <- html_to_text(x)
  else if(type[1]=="text") text <- x
  
  # Load the text as a corpus
  docs <- Corpus(VectorSource(text))
  # Convert the text to lower case
  docs <- tm_map(docs, content_transformer(tolower))
  # Remove numbers
  docs <- tm_map(docs, removeNumbers)
  # Remove stopwords for the language 
  docs <- tm_map(docs, removeWords, stopwords(lang))
  # Remove punctuations
  docs <- tm_map(docs, removePunctuation)
  # Eliminate extra white spaces
  docs <- tm_map(docs, stripWhitespace)
  # Remove your own stopwords
  if(!is.null(excludeWords)) 
    docs <- tm_map(docs, removeWords, excludeWords) 
  # Text stemming
  if(textStemming) docs <- tm_map(docs, stemDocument)
  # Create term-document matrix
  tdm <- TermDocumentMatrix(docs)
  m <- as.matrix(tdm)
  v <- sort(rowSums(m),decreasing=TRUE)
  d <- data.frame(word = names(v),freq=v)
  # check the color palette name 
  if(!colorPalette %in% rownames(brewer.pal.info)) colors = colorPalette
  else colors = brewer.pal(8, colorPalette) 
  # Plot the word cloud
  set.seed(1234)
  wordcloud(d$word,d$freq, min.freq=min.freq, max.words=max.words,
            random.order=FALSE, rot.per=0.35, 
            use.r.layout=FALSE, colors=colors)
  
  invisible(list(tdm=tdm, freqTable = d))
}
#++++++++++++++++++++++
# Helper function
#++++++++++++++++++++++
# Download and parse webpage
html_to_text<-function(url){
  library(RCurl)
  library(XML)
  # download html
  html.doc <- getURL(url)  
  #convert to plain text
  doc = htmlParse(html.doc, asText=TRUE)
 # "//text()" returns all text outside of HTML tags.
 # We also don’t want text such as style and script codes
  text <- xpathSApply(doc, "//text()[not(ancestor::script)][not(ancestor::style)][not(ancestor::noscript)][not(ancestor::form)]", xmlValue)
  # Format text vector into one character string
  return(paste(text, collapse = " "))
}
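
The core of the word cloud helper is a word frequency table. As a rough sketch of that step only, the base R snippet below lowercases the text, strips punctuation and digits, removes stop words and counts what is left. The function `word_freq` and the sample listing names are hypothetical; the helper above uses `tm` and also supports stemming, which this sketch omits.

```r
# Hypothetical helper: word counts after basic cleaning
word_freq <- function(text, stop_words = character(0)) {
  words <- tolower(text)
  words <- gsub("[[:punct:][:digit:]]+", " ", words)  # drop punctuation and numbers
  words <- unlist(strsplit(words, "\\s+"))            # split on whitespace
  words <- words[nchar(words) > 0 & !words %in% stop_words]
  sort(table(words), decreasing = TRUE)
}

# Made-up listing names
names_toy <- c("Cozy canal apartment", "Canal view studio",
               "Cozy studio near canal!")
freq <- word_freq(names_toy, stop_words = c("near"))
print(freq)
```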

3. Data Wrangling

3.1 Load data

setwd("D:/MUSA 508/Final Project")
#setwd("D:/Rdata/Final_Airbnb/Data")
listings <- st_read("listings.csv")
## Reading layer `listings' from data source `D:\MUSA 508\Final Project\listings.csv' using driver `CSV'
details <- st_read("listings_details.csv")
## Reading layer `listings_details' from data source `D:\MUSA 508\Final Project\listings_details.csv' using driver `CSV'
calendar <- read.csv("calendar.csv")



#large scale neighborhood
neighborhood <- st_read('neighbourhoods.geojson')
## Reading layer `neighbourhoods' from data source `D:\MUSA 508\Final Project\neighbourhoods.geojson' using driver `GeoJSON'
## Simple feature collection with 22 features and 2 fields
## geometry type:  MULTIPOLYGON
## dimension:      XYZ
## bbox:           xmin: 4.754837 ymin: 52.27817 xmax: 5.079162 ymax: 52.43068
## z_range:        zmin: 42.88058 zmax: 43.14972
## geographic CRS: WGS 84
#small scale neighborhood (used to define community?)
neighbor2 <- st_read('neighbor2.json') %>% 
  st_transform(st_crs(neighborhood))
## Reading layer `neighbor2' from data source `D:\MUSA 508\Final Project\neighbor2.json' using driver `GeoJSON'
## Simple feature collection with 481 features and 5 fields
## geometry type:  POLYGON
## dimension:      XY
## bbox:           xmin: 4.728773 ymin: 52.27816 xmax: 5.079169 ymax: 52.43105
## geographic CRS: WGS 84
developing_area <- st_read('developing_area.json') %>% 
  st_transform(st_crs(neighborhood))
## Reading layer `developing_area' from data source `D:\MUSA 508\Final Project\developing_area.json' using driver `GeoJSON'
## Simple feature collection with 19 features and 5 fields
## geometry type:  POLYGON
## dimension:      XY
## bbox:           xmin: 4.776982 ymin: 52.28436 xmax: 4.993388 ymax: 52.42163
## geographic CRS: WGS 84
crowdsensor <- st_read('crowdsensor.json') %>% 
  st_transform(st_crs(neighborhood))
## Reading layer `crowdsensor' from data source `D:\MUSA 508\Final Project\crowdsensor.json' using driver `GeoJSON'
## Simple feature collection with 107 features and 6 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: 4.855762 ymin: 52.31185 xmax: 4.972232 ymax: 52.39214
## geographic CRS: WGS 84
metro <- st_read('tram_metro_stops.json') %>% 
  st_transform(st_crs(neighborhood))
## Reading layer `tram_metro_stops' from data source `D:\MUSA 508\Final Project\tram_metro_stops.json' using driver `GeoJSON'
## Simple feature collection with 224 features and 6 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: 4.77478 ymin: 52.29561 xmax: 5.004306 ymax: 52.40187
## geographic CRS: WGS 84
buildingyear <- st_read('buildingyearblock.json') %>% 
  st_transform(st_crs(neighborhood))
## Reading layer `buildingyearblock' from data source `D:\MUSA 508\Final Project\buildingyearblock.json' using driver `GeoJSON'
## Simple feature collection with 22072 features and 1 field
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 4.734726 ymin: 52.27854 xmax: 5.062233 ymax: 52.43045
## geographic CRS: WGS 84
zipcode6 <- st_read('zipcode6.json') %>% 
  st_transform(st_crs(neighborhood))
## Reading layer `zipcode6' from data source `D:\MUSA 508\Final Project\zipcode6.json' using driver `GeoJSON'
## Simple feature collection with 18280 features and 2 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 4.735438 ymin: 52.27845 xmax: 5.062233 ymax: 52.43045
## geographic CRS: WGS 84
zipcode4 <- st_read('zipcode4.json') %>% 
  st_transform(st_crs(neighborhood))
## Reading layer `zipcode4' from data source `D:\MUSA 508\Final Project\zipcode4.json' using driver `GeoJSON'
## Simple feature collection with 81 features and 2 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 4.728773 ymin: 52.27816 xmax: 5.079169 ymax: 52.43105
## geographic CRS: WGS 84
details.sf <- st_as_sf(details,coords = c('longitude','latitude'),crs = 4326) %>% 
  st_transform(st_crs(neighborhood))

details.sf$price <- parse_number(details.sf$price)
details.sf$weekly_price <- parse_number(details.sf$weekly_price)
details.sf$monthly_price <- parse_number(details.sf$monthly_price)
details.sf$cleaning_fee <- parse_number(details.sf$cleaning_fee)
details.sf$extra_people <- parse_number(details.sf$extra_people)
details.sf$security_deposit <- parse_number(details.sf$security_deposit)
details.sf$beds <- as.numeric(details.sf$beds)
details.sf$minimum_nights <- as.numeric(details.sf$minimum_nights)
details.sf$maximum_nights <- as.numeric(details.sf$maximum_nights)
details.sf$number_of_reviews <- as.numeric(details.sf$number_of_reviews)
details.sf$review_scores_rating <- as.numeric(details.sf$review_scores_rating)
details.sf$review_scores_accuracy <- as.numeric(details.sf$review_scores_accuracy)
details.sf$review_scores_cleanliness<- as.numeric(details.sf$review_scores_cleanliness)
details.sf$review_scores_value <- as.numeric(details.sf$review_scores_value)
details.sf$reviews_per_month <- as.numeric(details.sf$reviews_per_month)

details.sf.raw <- details.sf

3.2 Deal with data

3.2.1 Price panel

# Available Calendar
available_calendar <- calendar %>%
  filter(available =="t")

available_calendar$listing_id <- as.character(available_calendar$listing_id)

# Price change
available_calendar <- available_calendar%>%
  mutate(price2 = gsub("^.","",price))%>%
  mutate(price2 = gsub(",","",price2))

available_calendar$price3 <- as.numeric(available_calendar$price2)

# Mean nightly price per month (available nights only)
available_calendar2 <- available_calendar%>%
  mutate(date2 = ymd(date))%>%
  mutate(month = month(date2))%>%
  group_by(listing_id, month) %>%
  summarize(month_price = mean(price3))
length(unique(calendar$listing_id))
## [1] 20030
length(unique(calendar$listing_id))*12
## [1] 240360
study.panel <- 
  expand.grid(listing_id = unique(calendar$listing_id),
              month=unique(available_calendar2$month))

study.panel$listing_id <- as.character(study.panel$listing_id)

listing_panel <- study.panel %>%
  left_join(available_calendar2, by = c("listing_id", "month")) %>%
  arrange(listing_id, month) %>%
  group_by(listing_id) %>%
  # Fill months without a price from the previous month, then from the
  # following month, never crossing into another listing
  mutate(each_month_price = month_price) %>%
  fill(each_month_price, .direction = "downup") %>%
  ungroup() %>%
  # Listings with no price in any month get 0 and are dropped below
  mutate(each_month_price = replace_na(each_month_price, 0)) %>%
  as.data.frame()

listing_0price <- listing_panel %>%
  filter(each_month_price == 0)

no_price <- unique(listing_0price$listing_id)
no_price
##  [1] "10002942" "10003068" "10003546" "10003576" "10003943" "10004383"
##  [7] "10004452" "10004732" "10004773" "10004838" "10004961" "10005851"

Drop listings that have no price

listing_panel <- listing_panel %>%
  filter(!listing_id %in% no_price)
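
The gap-filling rule in the price panel above, as we read it, is: within each listing, a month without a price takes the most recent earlier price, or otherwise the next later one, and listings with no price at all are dropped. The base R sketch below applies that rule to a toy panel. The helper `fill_down_up` and the values are hypothetical, and missing prices are kept as `NA` here rather than 0.

```r
# Hypothetical helper: fill NAs forward, then backward
fill_down_up <- function(x) {
  for (i in seq_along(x)[-1]) if (is.na(x[i])) x[i] <- x[i - 1]       # forward
  for (i in rev(seq_along(x))[-1]) if (is.na(x[i])) x[i] <- x[i + 1]  # backward
  x
}

# Toy panel: listing "c" has no price in any month
panel <- data.frame(
  listing_id  = rep(c("a", "b", "c"), each = 4),
  month       = rep(1:4, times = 3),
  month_price = c(NA, 100, NA, 120,  90, NA, NA, NA,  NA, NA, NA, NA)
)

# Apply the fill separately within each listing
panel$each_month_price <- ave(panel$month_price, panel$listing_id,
                              FUN = fill_down_up)

# Listings that still have no price after filling
no_price_toy <- unique(panel$listing_id[is.na(panel$each_month_price)])
print(panel)
print(no_price_toy)
```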

3.2.2 Occupancy panel

index <- function(x, flag = '0') {
  digit <- floor(log10(length(x))) + 1
  paste(flag, formatC(x, width = digit, flag = '0'), sep = '')
}

occupancy <- calendar%>%
  mutate(date2 = ymd(date))%>%
  mutate(month = month(date2),
         count = ifelse(available == "f", 1, 0),
         listing_id = as.character(listing_id))%>%
  filter(!listing_id %in% no_price) %>%
  group_by(listing_id, month) %>%
  summarize(monthly_occupancy = sum(count))

monthly_occupancy <- occupancy %>%
  mutate(month = str_pad(as.character(month), width = 2, pad = "0")) %>%
  group_by(month) %>%
  summarise(mean_monthly_occupancy = mean(monthly_occupancy))

ggplot(monthly_occupancy, 
       aes(x=month, y=mean_monthly_occupancy, group =1)) +
  geom_line(size=1, color = "#e46b45")+
  plotTheme()

From the figure above, we see that occupancy peaks in summer, especially in July. Occupancy in February looks much lower than in other months, but that is probably because February has fewer days. Overall, occupancy is somewhat higher in summer and somewhat lower in winter.
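
One way to remove the effect of month length is to divide the booked nights by the number of days in each month. The sketch below does this for three months of made-up values; the column `mean_monthly_occupancy` follows the code above, while `days_in_month` and `occupancy_rate` are names introduced here for illustration.

```r
# Made-up mean booked nights per listing for January to March
monthly_toy <- data.frame(
  month                  = sprintf("%02d", 1:3),
  mean_monthly_occupancy = c(15.5, 14, 15.5)
)

# Days per month in a non-leap year
days_in_month <- c("01" = 31, "02" = 28, "03" = 31)

# Share of nights booked, comparable across months of different length
monthly_toy$occupancy_rate <- unname(
  monthly_toy$mean_monthly_occupancy / days_in_month[monthly_toy$month]
)
print(monthly_toy)
```

In this toy case February has fewer booked nights than January but the same occupancy rate, which is the situation the paragraph above describes.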

Month_price <- calendar%>%
  mutate(date2 = ymd(date))%>%
  mutate(month = month(date2),
         listing_id = as.character(listing_id),
         price = parse_number(price))%>%
  mutate(month = str_pad(as.character(month), width = 2, pad = "0")) %>%
  drop_na(price) %>%
  group_by(month) %>%
  summarize(mean_monthly_price = mean(price))


ggplot(Month_price, 
       aes(x=month, y=mean_monthly_price, group =1)) +
  geom_line(size=1, color = "#e46b45")+
  plotTheme()

From the figure above, we see that the price peaks in April. The average price in February is lower than in other months. Overall, the fluctuation of prices throughout the year shows the same trend as occupancy, which indicates that these two variables might be correlated.

3.3 Basic Visualization

Plot the listings’ prices as points on the map

details.sf <- 
  details.sf %>% 
  mutate(priceBed=price/beds) %>% 
  filter(priceBed <=1000)


ggplot()+
  geom_sf(data = neighborhood,fill='grey90',color = 'white')+
  geom_sf(data = details.sf, aes(colour=q5(priceBed)),size=.5)+
  scale_color_manual(values = palette5,
                     labels = qBr(details.sf,'priceBed'))+
  labs(title = "Price per bed",
       subtitle = 'Amsterdam Airbnb, price on 2018-12-6')+
  mapTheme

From the figure above, we see that Airbnb listings cluster in the city center. High-priced listings also cluster in the center, while lower-priced ones are dispersed on the outskirts.

Plot the number of Airbnb listings by neighborhood (neighbor2)

listings.sf<- listings %>% 
  st_as_sf(coords = c( "longitude","latitude"), crs = 4326, agr = "constant")

neighbor2 <- neighbor2 %>% 
  st_transform(st_crs(listings.sf))

listing.sf.neighbor2 <- st_intersection(listings.sf,neighbor2) %>% 
  dplyr::select(id, Buurt,Buurt_code) %>% 
  mutate(count=1) %>% 
  st_drop_geometry()

listings.sf <- left_join(listings.sf, listing.sf.neighbor2,by='id')

# Drop the geometry before summarising so point geometries are not unioned per group
neighbo2.count <- listings.sf %>% 
  st_drop_geometry() %>% 
  group_by(Buurt) %>% 
  summarise(airbnb.number = sum(count)) %>% 
  dplyr::select(Buurt, airbnb.number)

neighbor2 <- left_join(neighbor2,neighbo2.count,by="Buurt")

neighbor2 %>% ggplot() + 
      geom_sf(aes(fill = airbnb.number), color = 'white') +
      scale_fill_gradient(low = '#f3c226', high = palette5[5],
                          name = "Airbnb Number") +
      labs(title = "Airbnb Number by neighborhood") +
      mapTheme

ggplot()+
  geom_sf(data = neighbor2, aes(fill=q5(airbnb.number)),color='transparent')+
  scale_fill_manual(values = palette5,
                     labels = qBr(neighbor2,'airbnb.number'))+
  labs(title = "Airbnb Number by neighborhood")+
  mapTheme

The figures above also show that Airbnb listings cluster in the city center.

4. Features Engineering

4.1 Physical Features

4.1.1 Basic Features

Basic features such as beds, bedrooms, bathrooms are the most relevant features to price and revenue. Here is a summary of these features:

table.basic <- details.sf %>% 
  st_drop_geometry() %>% 
  dplyr::select(price ,beds, bedrooms, bathrooms, accommodates)

stargazer(as.data.frame(table.basic),
          type = "text",
          title ="Table 1. Summary of Basic Features",
          single.row = TRUE,
          out.header = TRUE)
## 
## Table 1. Summary of Basic Features
## =============================================================
## Statistic   N     Mean   St. Dev. Min Pctl(25) Pctl(75)  Max 
## -------------------------------------------------------------
## price     20,011 150.260 102.648   0     96      175    3,900
## beds      20,011  1.850   1.390    1     1        2      32  
## -------------------------------------------------------------

4.1.2 Amenities

Apart from common amenities, some amenities add more value to the property. As shown in the plot below, properties with pools, fireplaces, parking and kitchens generally have higher prices. We also counted the number of amenities listed by the host; we’ll later use a correlation matrix to test whether it influences the price.

#pool
details.sf <- details.sf %>%
  mutate(pool = ifelse(str_detect(amenities, "Pool"), "Pool", "No Pool"))

#Paid parking off premises
details.sf <- details.sf %>%
  mutate(parking = ifelse(str_detect(amenities, "Paid parking off premises"), 
                          "Parking", "No Parking"))

#Indoor fireplace
details.sf <- details.sf %>%
  mutate(fireplace = ifelse(str_detect(amenities, "Indoor fireplace"), 
                          "Fireplace", "No Fireplace"))

#Waterfront
details.sf <- details.sf %>%
  mutate(waterfront = ifelse(str_detect(amenities, "Waterfront"), 
                          "waterfront", "No waterfront"))

#Kitchen
details.sf <- details.sf %>%
  mutate(kitchen = ifelse(str_detect(amenities, "Kitchen"), 
                          "kitchen", "No kitchen"))

#Air conditioning
details.sf <- details.sf %>%
  mutate(AC = ifelse(str_detect(amenities, "Air conditioning"), 
                          "AC", "No AC"))
amenitie_vars <- c('pool','parking','fireplace','waterfront','kitchen','AC')
plotList <- list()

for (i in amenitie_vars){
plotList[[i]] <- 
  details.sf %>%st_drop_geometry() %>% 
  dplyr::select(price, all_of(i)) %>%
  filter(price<500) %>% 
  gather(Variable, value, -price) %>%
    ggplot(aes(value, price, fill=value)) + 
      #geom_bar(position = "dodge", stat = "summary", fun.y = "mean") + 
      geom_boxplot()+
      scale_fill_manual(values = palette2) +
      labs(x="value", y="price", 
           title = i) +
      theme(legend.position = "none")+
  plotTheme()
}

do.call(grid.arrange,c(plotList, ncol = 3, top = "Amenities' influence on Price"))

Number of amenities listed

details.sf <- details.sf %>% 
  mutate(amenities.number = str_count(amenities,",")+1)
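
Counting commas and adding one gives 1 for a listing with no amenities at all. A sketch of a count that handles this edge case is shown below. It assumes the `amenities` field is a brace-delimited, comma-separated string such as `{TV,Wifi}`, which is not confirmed in this report; `count_amenities` and the sample strings are hypothetical.

```r
# Hypothetical helper: number of amenities in a brace-delimited string
count_amenities <- function(amenities) {
  inner <- gsub("^\\{|\\}$", "", amenities)  # strip the surrounding braces
  # An empty string splits into zero pieces, so "{}" counts as 0
  lengths(strsplit(inner, ",", fixed = TRUE))
}

amenities_toy <- c('{TV,Wifi,"Air conditioning"}', '{Kitchen}', '{}')
amenity_counts <- count_amenities(amenities_toy)
print(amenity_counts)
```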

4.2 Neighborhood Effect

Because nearby listings share a location and similar spatial patterns, the prediction model has to account for a spatial effect. We used two geographies to account for the neighborhood effect and tested which one has the larger influence: the four-digit postcode areas (zipcode4, 81 areas) and the small-scale neighborhoods (neighbor2, 481 areas identified by Buurt).

details.sf.neighbor <- st_intersection(details.sf,zipcode4) %>% 
  dplyr::select(id, Postcode4) %>% 
  st_drop_geometry()

details.sf <- left_join(details.sf,details.sf.neighbor,by='id')
neighbor2 <- neighbor2 %>% 
  st_transform(st_crs(listings.sf))

listing.sf.neighbor2 <- st_intersection(listings.sf,neighbor2) %>% 
  dplyr::select(id, Buurt,Buurt_code) %>% 
  mutate(count=1) %>% st_drop_geometry()

detail.sf <- left_join(details.sf,listing.sf.neighbor2, by = "id")

4.3 External Features

Distances to public transportation (the metro stations) and public amenities (parks, beaches, supermarkets …) are calculated in this part. Some of them are converted from numeric to categorical features.

4.3.1 Distance to metro

listings$longitude = as.numeric(listings$longitude)
listings$latitude = as.numeric(listings$latitude)
listings.sf<- listings %>% 
  st_as_sf(coords = c( "longitude","latitude"), crs = 4326, agr = "constant")

st_crs(metro) <- st_crs(listings.sf)

st_c <- st_coordinates

details.sf.c <- st_centroid(details.sf)%>%
  st_transform('ESRI:102013')
metro.c <- st_centroid(metro)%>%
  st_transform('ESRI:102013')

details.sf <- details.sf%>%
  mutate(dist.metro =nn_function(st_coordinates(details.sf.c),st_coordinates(metro.c),1))

metro <- metro%>%
  st_transform('ESRI:102013')
listings.sf <- listings.sf%>%
  st_transform('ESRI:102013')

listings <- listings%>%
  mutate(distance_to_metro =nn_function(st_c(listings.sf), st_c(metro),1))

ggplot()+
  geom_sf(data = neighbor2, fill = "grey32", color= "grey40")+
  geom_point(data = listings,
             aes(x= longitude, y = latitude, color = distance_to_metro), 
             fill = "transparent", size = 0.86, alpha = 0.6) +
  scale_colour_viridis(direction = -1,
                       discrete = FALSE, 
                       option = "plasma")+
  geom_sf(data = metro, fill="red")+
  ylim(min(listings$latitude), max(listings$latitude))+
  xlim(min(listings$longitude), max(listings$longitude))+
  labs(title="Distance to Nearest Tram or Metro Stop",
       caption = "Figure xxx")+
  mapTheme

4.3.2 Outside amenities and attractions

Load data

# a polygon
unesco <- 
  st_read('UNESCO/UnescoWerelderfgoed_region.shp') %>%
  st_transform(st_crs(neighborhood))
## Reading layer `UnescoWerelderfgoed_region' from data source `D:\MUSA 508\Final Project\UNESCO\UnescoWerelderfgoed_region.shp' using driver `ESRI Shapefile'
## Simple feature collection with 2 features and 6 fields
## geometry type:  POLYGON
## dimension:      XYZ
## bbox:           xmin: 120080.1 ymin: 485673.3 xmax: 123661.3 ymax: 489070.2
## z_range:        zmin: 0 zmax: 0
## projected CRS:  Amersfoort / RD New
parks <- st_read('parks.json') %>% 
  st_transform(st_crs(neighborhood))
## Reading layer `parks' from data source `D:\MUSA 508\Final Project\parks.json' using driver `GeoJSON'
## Simple feature collection with 122 features and 4 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 4.755128 ymin: 52.27963 xmax: 5.019766 ymax: 52.43052
## geographic CRS: WGS 84
attraction <- st_read('amsterdam-attraction.csv') %>% 
  filter(lat != 'attraction')
## Reading layer `amsterdam-attraction' from data source `D:\MUSA 508\Final Project\amsterdam-attraction.csv' using driver `CSV'
attraction <- st_as_sf(attraction,coords = c('lng','lat'),crs = 4326) %>% 
  st_transform(st_crs(neighborhood))

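# Project to ESRI:102013, the CRS of details.sf.c, so nn_function compares
# coordinates in the same units
parks <- attraction %>% 
  filter(subCategory == 'Park') %>% 
  st_transform('ESRI:102013')

museum <- attraction %>% 
  filter(subCategory == 'Museum') %>% 
  st_transform('ESRI:102013')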
#supermarkets
supermarkets <- getbb('Amsterdam') %>% 
  opq() %>% 
  add_osm_feature('shop','supermarket') %>% 
  osmdata_sf() 
supermarkets <- supermarkets$osm_points %>% 
  dplyr::select(geometry)%>% 
  st_transform('ESRI:102013')%>% 
  dplyr::select(geometry) %>% 
  mutate(Legend = "supermarkets")

convenienceshop <- getbb('Amsterdam') %>% 
  opq() %>% 
  add_osm_feature('shop','convenience') %>% 
  osmdata_sf() 
convenienceshop <- convenienceshop$osm_points %>% 
  dplyr::select(geometry)%>% 
  st_transform('ESRI:102013')%>% 
  dplyr::select(geometry) %>% 
  mutate(Legend = "convenienceshop")

mall <- getbb('Amsterdam') %>% 
  opq() %>% 
  add_osm_feature('shop','mall') %>% 
  osmdata_sf() 
mall<- mall$osm_points %>% 
  dplyr::select(geometry)%>% 
  st_transform('ESRI:102013') %>% 
  dplyr::select(geometry) %>% 
  mutate(Legend = "mall")


# attraction is already an sf object, so it only needs to be reprojected
attraction <- attraction %>% 
  st_transform('ESRI:102013')


plaza <- attraction %>% 
  filter(subCategory == 'Plaza') 


beach <- attraction %>% 
  filter(subCategory == 'Beach') 


nightclub <- attraction %>% 
  filter(subCategory == 'Nightclub') 

Calculate distance to amenities and attractions

details.sf <- details.sf%>%
  mutate(dist.mall =nn_function(st_coordinates(details.sf.c),st_coordinates(mall),1))

details.sf <- details.sf%>%
  mutate(dist.supermarkets =nn_function(st_coordinates(details.sf.c),
                                        st_coordinates(supermarkets),1))

details.sf <- details.sf%>%
  mutate(dist.convenienceshop =nn_function(st_coordinates(details.sf.c),
                                        st_coordinates(convenienceshop),1))

details.sf <- details.sf%>%
  mutate(dist.museum =nn_function(st_coordinates(details.sf.c),
                                        st_coordinates(museum),1))

details.sf <- details.sf%>%
  mutate(dist.plaza =nn_function(st_coordinates(details.sf.c),
                                        st_coordinates(plaza),1))

details.sf <- details.sf%>%
  mutate(dist.nightclub =nn_function(st_coordinates(details.sf.c),
                                        st_coordinates(nightclub),1))

details.sf <- details.sf%>%
  mutate(dist.beach =nn_function(st_coordinates(details.sf.c),
                                        st_coordinates(beach),1))

details.sf <- details.sf%>%
  mutate(dist.parks =nn_function(st_coordinates(details.sf.c),
                                        st_coordinates(parks),1))
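
The distance features above rely on `nn_function`, whose definition is not shown in this section. As a rough illustration of the idea only, the base R sketch below computes the mean Euclidean distance from each point to its k nearest targets. The helper `knn_mean_dist` and the objects `listings_xy` and `malls_xy` are hypothetical names with made-up coordinates; they are not part of the original analysis.

```r
# Hypothetical helper: mean Euclidean distance from each row of `from`
# to its k nearest rows of `to`. Both inputs are two-column coordinate
# matrices that must share the same projected CRS.
knn_mean_dist <- function(from, to, k = 1) {
  from <- as.matrix(from)
  to <- as.matrix(to)
  apply(from, 1, function(p) {
    d <- sqrt((to[, 1] - p[1])^2 + (to[, 2] - p[2])^2)
    mean(sort(d)[seq_len(k)])
  })
}

# Toy coordinates (made up for illustration)
listings_xy <- cbind(x = c(0, 10), y = c(0, 0))
malls_xy <- cbind(x = c(3, 10, 0), y = c(4, 2, 20))

dist_mall <- knn_mean_dist(listings_xy, malls_xy, k = 1)
print(dist_mall)
```

Because the distance is computed directly on the coordinate values, the result is only meaningful when both layers use the same coordinate reference system.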

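This is a summary of the distance features. Note that dist.museum is about 2,500,000 for every listing, several orders of magnitude larger than the other distances. This pattern would be consistent with the museum layer being in a different coordinate reference system from the listings when the distance was computed, so this feature should be interpreted with caution.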

table.distance <- details.sf %>% 
  st_drop_geometry() %>% 
  dplyr::select(c("dist.museum","dist.plaza","dist.nightclub","dist.beach",
                  "dist.metro","dist.supermarkets"))

stargazer(as.data.frame(table.distance),
          type = "text",
          title ="Table 2. Summary of all Distance Features",
          single.row = TRUE,
          out.header = TRUE)
## 
## Table 2. Summary of all Distance Features
## ========================================================================================================
## Statistic           N        Mean      St. Dev.       Min        Pctl(25)      Pctl(75)         Max     
## --------------------------------------------------------------------------------------------------------
## dist.museum       20,011 2,499,865.000 1,896.420 2,490,188.000 2,498,696.000 2,501,137.000 2,506,428.000
## dist.plaza        20,011    558.536     799.228      0.453        218.848       579.526      7,412.628  
## dist.nightclub    20,011   1,004.675    759.914      0.622        530.634      1,305.400     7,560.146  
## dist.beach        20,011   1,978.854    747.704      3.899       1,494.927     2,475.238     6,862.358  
## dist.metro        20,011    302.557     375.550      1.175        141.607       315.447      6,472.553  
## dist.supermarkets 20,011    255.136     210.254      1.963        129.179       314.397      5,427.940  
## --------------------------------------------------------------------------------------------------------
# within UNESCO buffer or not
unesco_buffer <- st_union(unesco)
unesco_buffer <- st_as_sf(unesco_buffer)

details.sf.unesco <- st_intersection(details.sf,unesco_buffer) %>% 
  dplyr::select(id) %>% 
  mutate(Unesco = 'within') %>% 
  st_drop_geometry()

details.sf <- left_join(details.sf,details.sf.unesco,by='id')
details.sf$Unesco <- tidyr::replace_na(details.sf$Unesco,'outside')

4.3.3 Airbnb Hotspots

We used two ways to identify Airbnb hotspots in Amsterdam: one counts the number of Airbnb listings in each fishnet cell; the other sets a threshold on Local Moran’s I. This feature partly overlaps with “distance to city center”, but might give a more nuanced view of the clustering of Airbnbs.
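
To make the second approach more concrete, here is a minimal base R sketch of the local Moran statistic for a toy row of four cells. It assumes the usual definition (each cell's deviation from the mean multiplied by the weighted mean deviation of its neighbours, scaled by the variance) with row-standardised weights. The helper `local_moran_i` and the toy data are hypothetical; the report itself uses `localmoran` from spdep, which also returns expectations, variances and p-values.

```r
# Hypothetical helper: local Moran's I for values `x` and a
# row-standardised spatial weights matrix `W` (each row sums to 1).
local_moran_i <- function(x, W) {
  z <- x - mean(x)                 # deviations from the mean
  m2 <- sum(z^2) / length(x)       # variance with denominator n
  as.vector(z * (W %*% z)) / m2    # z_i times the weighted mean of neighbours' z
}

# Toy example: four cells in a row, each neighbouring the adjacent cells
counts <- c(10, 12, 1, 0)
W <- matrix(0, 4, 4)
W[1, 2] <- 1
W[2, c(1, 3)] <- 0.5
W[3, c(2, 4)] <- 0.5
W[4, 3] <- 1

ii <- local_moran_i(counts, W)
print(round(ii, 3))
```

Cells whose neighbours resemble them (high next to high, or low next to low) get positive values, while cells on the boundary between high and low counts get negative values.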

#create fishnet

amsterdam.boundary <- st_union(neighborhood) %>% st_transform('ESRI:102013')

fishnet <- 
  st_make_grid(amsterdam.boundary, cellsize = 300) %>%
  st_sf() %>%
  mutate(uniqueID = rownames(.))

fishnet.count <- st_intersection(listings.sf,fishnet) %>% 
  mutate(count = 1) %>% 
  st_drop_geometry() %>% 
  dplyr::select(uniqueID, count) %>% 
  group_by(uniqueID) %>% 
  summarise(countairbnb = sum(count))

airbnb_net <- 
  dplyr::select(listings.sf) %>% 
  mutate(countairbnb = 1) %>% 
  aggregate(., fishnet, sum)

airbnb_net <- airbnb_net %>% 
  mutate(countairbnb = tidyr::replace_na(countairbnb,0),
         uniqueID = row.names(.))


ggplot() +
  geom_sf(data = airbnb_net, aes(fill = countairbnb), color = NA) +
  scale_fill_viridis() +
  labs(title = "Count of airbnb for the fishnet") +
  mapTheme

#Visualize local spatial process of airbnb

final_net <- airbnb_net

final_net.nb <- poly2nb(as_Spatial(airbnb_net), queen=TRUE)
final_net.weights <- nb2listw(final_net.nb, style="W", zero.policy=TRUE)

#Compute and visualize local Moran's I for the airbnb counts
final_net.localMorans <- 
  cbind(
    as.data.frame(localmoran(final_net$countairbnb, final_net.weights)),
    as.data.frame(final_net)) %>% 
    st_sf() %>%
      dplyr::select(airbnb_Count = countairbnb, 
                    Local_Morans_I = Ii, 
                    P_Value = `Pr(z > 0)`) %>%
      mutate(Significant_Hotspots = ifelse(P_Value <= 0.0001, 1, 0)) %>%
      gather(Variable, Value, -geometry)
  
vars <- unique(final_net.localMorans$Variable)
varList <- list()

for(i in vars){
  varList[[i]] <- 
    ggplot() +
      geom_sf(data = filter(final_net.localMorans, Variable == i), 
              aes(fill = Value), colour=NA) +
      scale_fill_viridis(name="") +
      labs(title=i) +
      mapTheme + theme(legend.position="right")}

do.call(grid.arrange,c(varList, ncol = 2, top = "Local Morans I statistics, Amsterdam Airbnb"))

#hotspots by count
hotspot.count <- airbnb_net %>% 
  filter(countairbnb>150)

#hotspot by moran's I
hotspot.moran <- final_net %>% 
  mutate(isSig = 
           ifelse(localmoran(final_net$countairbnb, 
                             final_net.weights)[,5] <= 0.000001, 1, 0)) %>% 
  filter(isSig == 1)

#distance to hotspots
details.sf <- details.sf %>% 
  mutate(dist.hotspot.count = nn_function(st_coordinates(details.sf),
                                          st_coordinates(st_centroid(hotspot.count)), 1),
         dist.hotspot.moran = nn_function(st_coordinates(details.sf),
                                          st_coordinates(st_centroid(hotspot.moran)), 1))

4.4 Descriptions

Some features, such as decoration and property area, are not given directly in the dataset. By analysing the descriptions and names, we hope to use words like “spacious” and “luxurious” to partly represent housing quality.
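
As a compact illustration of this kind of keyword flag, the base R sketch below matches several keywords at once and ignores case, so upper-case, lower-case and capitalised variants do not need to be listed separately. The helper `has_keyword` and the listing names are made up for this example. The spelling "jordaan" is an assumption about the intended neighborhood name: the Amsterdam neighborhood is usually spelled "Jordaan", which a pattern such as "jordan" does not match.

```r
# Hypothetical helper: TRUE where a name contains any of the keywords.
# The keywords are combined into one regular expression with "|" and
# matched without regard to case.
has_keyword <- function(x, keywords) {
  grepl(paste(keywords, collapse = "|"), x, ignore.case = TRUE)
}

# Made-up listing names for illustration
listing_names <- c("Bright loft in the city centre", "SPACIOUS canal house",
                   "Cosy room in the Jordaan", "Quiet studio")

name_bright <- ifelse(has_keyword(listing_names, c("bright", "luminous")),
                      "bright", "not bright")
name_center <- ifelse(has_keyword(listing_names,
                                  c("center", "centre", "central", "jordaan")),
                      "Center", "No Center")

print(data.frame(listing_names, name_bright, name_center))
```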

details.sf$name <- as.character(details.sf$name) 



# city center
details.sf <- details.sf %>%
  mutate(name.center = ifelse(str_detect(name, "center")|
                              str_detect(name, "centre")|
                              str_detect(name, "central")|
                              str_detect(name, "jordan")|
                              str_detect(name, "Center")|
                              str_detect(name, "Centre")|
                              str_detect(name, "Central")|
                              str_detect(name, "Jordan")|
                              str_detect(name, "CENTER")|
                              str_detect(name, "CENTRE")|
                              str_detect(name, "CENTRAL")|
                              str_detect(name, "JORDAN"),
                              "Center", "No Center")) 

#bright
details.sf <- details.sf %>%
  mutate(name.bright = ifelse(str_detect(name, "bright")|
                              str_detect(name, "luminous")|
                              str_detect(name, "Bright")|
                              str_detect(name, "Luminous")|
                              str_detect(name, "BRIGHT")|
                              str_detect(name, "LUMINOUS"),
                              "bright", "not bright")) 

#spacious
details.sf <- details.sf %>%
  mutate(name.spacious = ifelse(str_detect(name, "spacious")|
                                str_detect(name, "large")|
                                str_detect(name, "Spacious")|
                                str_detect(name, "Large")|
                                str_detect(name, "SPACIOUS")|
                                str_detect(name, "LARGE"),
                              "spacious", "not spacious")) 


#luxurious
details.sf <- details.sf %>%
  mutate(name.luxury = ifelse(str_detect(name, "luxury")|
                              str_detect(name, "luxurious")|
                              str_detect(name, "Luxury")|
                              str_detect(name, "Luxurious")|
                              str_detect(name, "LUXURY")|
                              str_detect(name, "LUXURIOUS"),
                              "luxury", "not luxury")) 

4.5 Correlation analysis

4.5.1 Price

Plot monthly prices against numeric features. Each line represents the correlation between a numeric feature and the average price in a month. The coefficient differs slightly from month to month, but the differences are small. A closer look at the data shows that prices do change between months, but only within a relatively small range of around 10 euros.

listing_panel2 <- listing_panel %>%
  dplyr::rename(id = listing_id) %>%
  dplyr::select(-month_price) 
  
listing_panel2 <-  
  left_join(listing_panel2, st_drop_geometry(details.sf), by = "id") %>%
  dplyr::select(id, month, each_month_price, bathrooms, bedrooms, beds, review_scores_rating, reviews_per_month, dist.metro) %>%
  gather(-id,-month, -each_month_price, key = "variable", value = "value") %>%
  mutate(value=as.numeric(value))

ggplot()+
  geom_point(data=listing_panel2 %>%
         filter(month ==1 & each_month_price <= 2500),aes(x = value, y = each_month_price), color = "#6f1b17", alpha = 0.26)+
  geom_smooth(data=listing_panel2 %>%
         filter(month ==1 & each_month_price <= 2500),aes(x = value, y = each_month_price), method = "lm", se= FALSE, color = "#ea483d")+
  geom_point(data=listing_panel2 %>%
         filter(month ==2& each_month_price <= 2500),aes(x = value, y = each_month_price), color = "#610f26", alpha = 0.26)+
  geom_smooth(data=listing_panel2 %>%
         filter(month ==2& each_month_price <= 2500),aes(x = value, y = each_month_price), method = "lm", se= FALSE, color = "#df2866")+
  geom_point(data=listing_panel2 %>%
         filter(month ==3& each_month_price <= 2500),aes(x = value, y = each_month_price), color = "#380f42", alpha = 0.26)+
  geom_smooth(data=listing_panel2 %>%
         filter(month ==3& each_month_price <= 2500),aes(x = value, y = each_month_price), method = "lm", se= FALSE, color = "#972aaf")+
  geom_point(data=listing_panel2 %>%
         filter(month ==4& each_month_price <= 2500),aes(x = value, y = each_month_price), color = "#261646", alpha = 0.26)+
  geom_smooth(data=listing_panel2 %>%
         filter(month ==4& each_month_price <= 2500),aes(x = value, y = each_month_price), method = "lm", se= FALSE, color = "#663bb6")+
  geom_point(data=listing_panel2 %>%
         filter(month ==5& each_month_price <= 2500),aes(x = value, y = each_month_price), color = "#191e46", alpha = 0.26)+
  geom_smooth(data=listing_panel2 %>%
         filter(month ==5& each_month_price <= 2500), aes(x = value, y = each_month_price), method = "lm", se= FALSE, color = "#4551b4")+
  geom_point(data=listing_panel2 %>%
         filter(month ==6& each_month_price <= 2500),aes(x = value, y = each_month_price), color = "#19387d", alpha = 0.26)+
  geom_smooth(data=listing_panel2 %>%
         filter(month ==6& each_month_price <= 2500),aes(x = value, y = each_month_price), method = "lm", se= FALSE, color = "#4295f2")+
  geom_point(data=listing_panel2 %>%
         filter(month ==7& each_month_price <= 2500),aes(x = value, y = each_month_price), color = "#173f7f", alpha = 0.26)+
  geom_smooth(data=listing_panel2 %>%
         filter(month ==7& each_month_price <= 2500),aes(x = value, y = each_month_price), method = "lm", se= FALSE, color = "#3ea8f3")+
  geom_point(data=listing_panel2 %>%
         filter(month ==8& each_month_price <= 2500),aes(x = value, y = each_month_price), color = "#184957", alpha = 0.26)+
  geom_smooth(data=listing_panel2 %>%
         filter(month ==8& each_month_price <= 2500),aes(x = value, y = each_month_price), method = "lm", se= FALSE, color = "#41bbd3")+
  geom_point(data=listing_panel2 %>%
         filter(month ==9& each_month_price <= 2500),aes(x = value, y = each_month_price), color = "#123832", alpha = 0.26)+
  geom_smooth(data=listing_panel2 %>%
         filter(month ==9& each_month_price <= 2500),aes(x = value, y = each_month_price), method = "lm", se= FALSE, color = "#309587")+
  geom_point(data=listing_panel2 %>%
         filter(month ==10& each_month_price <= 2500),aes(x = value, y = each_month_price), color = "#22421e", alpha = 0.26)+
  geom_smooth(data=listing_panel2 %>%
         filter(month ==10& each_month_price <= 2500),aes(x = value, y = each_month_price), method = "lm", se= FALSE, color = "#5aae51")+
  geom_point(data=listing_panel2 %>%
         filter(month ==11& each_month_price <= 2500),aes(x = value, y = each_month_price), color = "#364c1d", alpha = 0.26)+
  geom_smooth(data=listing_panel2 %>%
         filter(month ==11& each_month_price <= 2500),aes(x = value, y = each_month_price), method = "lm", se= FALSE, color = "#90c24c")+
  geom_point(data=listing_panel2 %>%
         filter(month ==12& each_month_price <= 2500),aes(x = value, y = each_month_price), color = "#535f18", alpha = 0.26)+
  geom_smooth(data=listing_panel2 %>%
         filter(month ==12& each_month_price <= 2500),aes(x = value, y = each_month_price), method = "lm", se= FALSE, color = "#cddc3f")+
  facet_wrap(~variable, scales = "free")+
  labs(title="Price as a function of numeric variable",
       y="Mean Price each month",
      caption = "Scatterplots of price and numeric variable")+
  plotTheme()

We plot the relationship between the average price in each month and the numeric features. The figure above shows that none of these features has a clear linear relationship with price, although price is related to some of them, such as the number of bedrooms, distance to metro and the number of reviews per month. There are 12 lines in each scatterplot; each line represents the correlation between a numeric feature and the average price in one month. It can be seen that, for each numeric feature, the variance of the correlation is low, which means that the average prices in different months have similar relationships with these features.
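
One way to put a number on how much the twelve fitted lines differ is to fit the same simple regression separately for each month and compare the slopes. The base R sketch below does this on a simulated panel; the data, the single predictor and the coefficients used to generate it are made up for illustration and are not the report's results.

```r
# Made-up long-format panel: one row per listing and month
set.seed(1)
panel <- expand.grid(id = 1:50, month = 1:12)
panel$bedrooms <- rep(sample(1:4, 50, replace = TRUE), times = 12)
panel$each_month_price <- 80 + 30 * panel$bedrooms + rnorm(nrow(panel), sd = 15)

# Slope of price on bedrooms, fitted separately for each month
slopes <- sapply(split(panel, panel$month), function(d) {
  unname(coef(lm(each_month_price ~ bedrooms, data = d))["bedrooms"])
})

print(round(slopes, 1))
print(sd(slopes))  # a small spread means similar relationships across months
```

A small standard deviation of the slopes corresponds to lines that nearly overlap in the scatterplot; a large one corresponds to lines that fan out.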

4.5.2 Occupancy

Plot monthly occupancy against numeric features. Each line represents the correlation between a numeric feature and the average occupancy in a month. Occupancy is much more influenced by tourism seasonality, and the coefficient varies hugely among different months.

occupancy2 <- occupancy %>%
  dplyr::rename(id = listing_id) 
  
occupancy2 <-  
  left_join(occupancy2, st_drop_geometry(details.sf), by = "id") %>%
  dplyr::select(id, month, monthly_occupancy, bathrooms, bedrooms, beds, review_scores_rating, reviews_per_month, dist.metro) %>%
  gather(-id,-month, -monthly_occupancy, key = "variable", value = "value")%>%
  mutate(value=as.numeric(value))

ggplot()+
  geom_point(data=occupancy2 %>%
         filter(month ==1),aes(x = value, y = monthly_occupancy), color = "#6f1b17", alpha = 0.26)+
  geom_smooth(data=occupancy2 %>%
         filter(month ==1),aes(x = value, y = monthly_occupancy), method = "lm", se= FALSE, color = "#ea483d")+
  geom_point(data=occupancy2 %>%
         filter(month ==2),aes(x = value, y = monthly_occupancy), color = "#610f26", alpha = 0.26)+
  geom_smooth(data=occupancy2 %>%
         filter(month ==2),aes(x = value, y = monthly_occupancy), method = "lm", se= FALSE, color = "#df2866")+
  geom_point(data=occupancy2 %>%
         filter(month ==3),aes(x = value, y = monthly_occupancy), color = "#380f42", alpha = 0.26)+
  geom_smooth(data=occupancy2 %>%
         filter(month ==3),aes(x = value, y = monthly_occupancy), method = "lm", se= FALSE, color = "#972aaf")+
  geom_point(data=occupancy2 %>%
         filter(month ==4),aes(x = value, y = monthly_occupancy), color = "#261646", alpha = 0.26)+
  geom_smooth(data=occupancy2 %>%
         filter(month ==4),aes(x = value, y = monthly_occupancy), method = "lm", se= FALSE, color = "#663bb6")+
  geom_point(data=occupancy2 %>%
         filter(month ==5),aes(x = value, y = monthly_occupancy), color = "#191e46", alpha = 0.26)+
  geom_smooth(data=occupancy2 %>%
         filter(month ==5),aes(x = value, y = monthly_occupancy), method = "lm", se= FALSE, color = "#4551b4")+
  geom_point(data=occupancy2 %>%
         filter(month ==6),aes(x = value, y = monthly_occupancy), color = "#19387d", alpha = 0.26)+
  geom_smooth(data=occupancy2 %>%
         filter(month ==6),aes(x = value, y = monthly_occupancy), method = "lm", se= FALSE, color = "#4295f2")+
  geom_point(data=occupancy2 %>%
         filter(month ==7),aes(x = value, y = monthly_occupancy), color = "#173f7f", alpha = 0.26)+
  geom_smooth(data=occupancy2 %>%
         filter(month ==7),aes(x = value, y = monthly_occupancy), method = "lm", se= FALSE, color = "#3ea8f3")+
  geom_point(data=occupancy2 %>%
         filter(month ==8),aes(x = value, y = monthly_occupancy), color = "#184957", alpha = 0.26)+
  geom_smooth(data=occupancy2 %>%
         filter(month ==8),aes(x = value, y = monthly_occupancy), method = "lm", se= FALSE, color = "#41bbd3")+
  geom_point(data=occupancy2 %>%
         filter(month ==9),aes(x = value, y = monthly_occupancy), color = "#123832", alpha = 0.26)+
  geom_smooth(data=occupancy2 %>%
         filter(month ==9),aes(x = value, y = monthly_occupancy), method = "lm", se= FALSE, color = "#309587")+
  geom_point(data=occupancy2 %>%
         filter(month ==10),aes(x = value, y = monthly_occupancy), color = "#22421e", alpha = 0.26)+
  geom_smooth(data=occupancy2 %>%
         filter(month ==10),aes(x = value, y = monthly_occupancy), method = "lm", se= FALSE, color = "#5aae51")+
  geom_point(data=occupancy2 %>%
         filter(month ==11),aes(x = value, y = monthly_occupancy), color = "#364c1d", alpha = 0.26)+
  geom_smooth(data=occupancy2 %>%
         filter(month ==11),aes(x = value, y = monthly_occupancy), method = "lm", se= FALSE, color = "#90c24c")+
  geom_point(data=occupancy2 %>%
         filter(month ==12),aes(x = value, y = monthly_occupancy), color = "#535f18", alpha = 0.26)+
  geom_smooth(data=occupancy2 %>%
         filter(month ==12),aes(x = value, y = monthly_occupancy), method = "lm", se= FALSE, color = "#cddc3f")+
  ylim(0,31)+
  facet_wrap(~variable, scales = "free")+
  labs(title="Occupancy as a function of numeric variable",
       y="Mean Occupancy each month",
      caption = "Figure 18. Scatterplots of occupancy and numeric variable")+
  plotTheme()

We plot the relationship between the average occupancy in each month and the numeric features. The figure above shows that none of these features has a linear relationship with occupancy, and occupancy is barely related to them. There are 12 lines in each scatterplot; each line represents the correlation between a numeric feature and the average occupancy in one month. It can be seen that, for each numeric feature, the variance of the correlation is high, which means that the average occupancies in different months have different relationships with these features. This indicates that we should fit the model separately for each month.

4.5.3 Correlation matrix

The correlation plot of the numeric features shows that basic features such as the number of beds and the number of bedrooms are the ones most strongly associated with price. Distances to public amenities are not as important as we expected.

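We calculate the monthly revenue of each listing as its average price in that month multiplied by its monthly occupancy, and sum the monthly revenues to obtain the annual revenue.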

occupancy3 <- occupancy %>%
  dplyr::rename(id = listing_id) 
  
occupancy3 <-  
  left_join(occupancy3, st_drop_geometry(details.sf), by = "id")
listing_panel <- listing_panel%>%
   dplyr::rename(id = listing_id)

revenue_panel <- merge(occupancy3,listing_panel[c("id","month","each_month_price")],by=c("id","month")) %>%
  mutate(revenue = each_month_price*monthly_occupancy)
annualrevenue <- revenue_panel %>%
  group_by(id) %>%
  summarise(annual_revenue = sum(revenue))


annualrevenue<- left_join(details.sf, annualrevenue,by="id")%>%
    filter(!id %in% no_price)%>%
    mutate(bathrooms = as.numeric(bathrooms),
         bedrooms = as.numeric(bedrooms))
numericVars <- 
  select_if(st_drop_geometry(annualrevenue), is.numeric) %>% na.omit() %>% 
  dplyr::select(annual_revenue,price,beds, bedrooms, bathrooms, 
                              minimum_nights,dist.museum,dist.supermarkets,
                              dist.metro,dist.plaza, dist.nightclub,
                              dist.beach, dist.parks,
                              amenities.number)


ggcorrplot(
  round(cor(numericVars), 1), 
  p.mat = cor_pmat(numericVars),
  colors = c("#4757a2", "white", "#E46B45"),
  type="lower",
  insig = "blank") +  
    labs(title = "Correlation across numeric variables") 

Overall, there is no multicollinearity among our predictors except for the correlation between distance to museum and distance to parks.
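
A correlation plot can be complemented by listing the variable pairs whose correlation exceeds a chosen cutoff. The base R sketch below is one way to do that; the helper `high_cor_pairs`, the 0.8 cutoff and the toy data are assumptions made for illustration, not values taken from this report.

```r
# Hypothetical helper: list variable pairs whose absolute Pearson
# correlation exceeds `cutoff` (0.8 is an assumed threshold).
high_cor_pairs <- function(df, cutoff = 0.8) {
  cm <- cor(df)
  idx <- which(abs(cm) > cutoff & upper.tri(cm), arr.ind = TRUE)
  data.frame(var1 = rownames(cm)[idx[, "row"]],
             var2 = colnames(cm)[idx[, "col"]],
             r = cm[idx],
             stringsAsFactors = FALSE)
}

# Made-up data: dist_b is almost a copy of dist_a, beds is unrelated
set.seed(1)
toy <- data.frame(dist_a = rnorm(100))
toy$dist_b <- toy$dist_a + rnorm(100, sd = 0.05)
toy$beds <- rnorm(100)

pairs_found <- high_cor_pairs(toy)
print(pairs_found)
```

Applied to a table of numeric predictors, each returned row names a pair that may be worth reducing to a single feature before fitting a linear model.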

5. Modeling

Modeling approach

Our goal is to predict the annual revenue for a new listing in Amsterdam. We consider two approaches. The first is to predict monthly price and occupancy separately and then calculate the annual revenue, which includes the following steps:

  1. Create a price panel and an occupancy panel. Each lists the listings’ values (price or occupancy) in long form.
  2. Separate the dataset into a training set and a test set.
  3. Fit a regression on the training set, trying different groups of features.
  4. Predict on the test set.
  5. Calculate the annual revenue for the test set based on the predictions.
  6. Calculate the absolute error and absolute percentage error.

The second is to predict annual revenue directly, which includes the following steps:

  1. Use the panel above to calculate annual revenue for each listing.
  2. Separate the dataset into a training set and a test set.
  3. Fit a regression on the training set, trying different groups of features.
  4. Predict on the test set.
  5. Calculate the absolute error and absolute percentage error.
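
The final step of both approaches, computing the absolute error and the absolute percentage error, can be sketched in base R as follows. The observed and predicted revenues are made-up numbers used only to show the arithmetic.

```r
# Made-up observed and predicted annual revenues for a small test set
observed <- c(12000, 8000, 20000, 5000)
predicted <- c(10000, 9000, 18000, 6000)

abs_error <- abs(observed - predicted)   # absolute error per listing
abs_pct_error <- abs_error / observed    # absolute percentage error per listing

mae <- mean(abs_error)                   # mean absolute error
mape <- mean(abs_pct_error)              # mean absolute percentage error

print(c(MAE = mae, MAPE = mape))
```

The percentage error divides by the observed value, so listings with an observed revenue of zero would need to be handled separately.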

The features we use are mainly the hosts’ inputs and the listings’ exposure to amenities, attractions, etc. Because a new listing has no previous price, we cannot add a time lag; instead, we take spatial effects into consideration by adding neighborhood effects and a spatial lag as features to improve both accuracy and generalizability.

We show both approaches in our report and compare their performances in prediction.

5.1 Predict Price & Occupancy

5.1.1 Calculate price spatial lag

To take spatial effects into consideration, we calculate the mean price of the 5 nearest listings for each listing and name it the lag price.
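
As a minimal illustration of what such a lag is, the base R sketch below averages a value over each point's k nearest neighbours, excluding the point itself. The helper `knn_lag` and the toy coordinates and prices are hypothetical; the report itself builds the neighbour list and weights with spdep (`knearneigh`, `knn2nb`, `nb2listw`, `lag.listw`), and with row-standardised weights the lag is the mean over the neighbours.

```r
# Hypothetical helper: for each point, the mean of `value` over its k nearest
# neighbours (excluding itself), i.e. a row-standardised k-nearest-neighbour lag.
knn_lag <- function(coords, value, k = 5) {
  d <- as.matrix(dist(coords))   # pairwise Euclidean distances
  diag(d) <- Inf                 # a point is not its own neighbour
  apply(d, 1, function(row) mean(value[order(row)[seq_len(k)]]))
}

# Made-up example: six listings on a line, lag over the 2 nearest neighbours
coords <- cbind(x = c(0, 1, 2, 10, 11, 12), y = 0)
price <- c(100, 110, 120, 200, 210, 220)

lag_price <- knn_lag(coords, price, k = 2)
print(lag_price)
```

In the toy data the two clusters of listings get clearly different lag values, which is the kind of local price level the feature is meant to capture.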

revenue_panel <- merge(revenue_panel, details[c("id","longitude","latitude")], by=c("id"))

revenue_panel.sf <- 
  st_as_sf(revenue_panel,coords = c('longitude','latitude'),crs = 4326) %>% 
  st_transform(st_crs(neighborhood))

#Calculate for each month-------------------------------------------------------
#January
Jan_price <- revenue_panel.sf %>%
  filter(month == 1)

coords <- st_coordinates(Jan_price) 

neighborList <- knn2nb(knearneigh(coords, 5))

spatialWeights <- nb2listw(neighborList, style="W")

Jan_price$lagPrice <- lag.listw(spatialWeights, Jan_price$each_month_price)


#February
Feb_price <- revenue_panel.sf %>%
  filter(month == 2)

coords <- st_coordinates(Feb_price) 

neighborList <- knn2nb(knearneigh(coords, 5))

spatialWeights <- nb2listw(neighborList, style="W")

Feb_price$lagPrice <- lag.listw(spatialWeights, Feb_price$each_month_price)


#March
Mar_price <- revenue_panel.sf %>%
  filter(month == 3)

coords <- st_coordinates(Mar_price) 

neighborList <- knn2nb(knearneigh(coords, 5))

spatialWeights <- nb2listw(neighborList, style="W")

Mar_price$lagPrice <- lag.listw(spatialWeights, Mar_price$each_month_price)


#April
Apr_price <- revenue_panel.sf %>%
  filter(month == 4)

coords <- st_coordinates(Apr_price) 

neighborList <- knn2nb(knearneigh(coords, 5))

spatialWeights <- nb2listw(neighborList, style="W")

Apr_price$lagPrice <- lag.listw(spatialWeights, Apr_price$each_month_price)


#May
May_price <- revenue_panel.sf %>%
  filter(month == 5)

coords <- st_coordinates(May_price) 

neighborList <- knn2nb(knearneigh(coords, 5))

spatialWeights <- nb2listw(neighborList, style="W")

May_price$lagPrice <- lag.listw(spatialWeights, May_price$each_month_price)


#June
Jun_price <- revenue_panel.sf %>%
  filter(month == 6)

coords <- st_coordinates(Jun_price) 

neighborList <- knn2nb(knearneigh(coords, 5))

spatialWeights <- nb2listw(neighborList, style="W")

Jun_price$lagPrice <- lag.listw(spatialWeights, Jun_price$each_month_price)


#July
Jul_price <- revenue_panel.sf %>%
  filter(month == 7)

coords <- st_coordinates(Jul_price) 

neighborList <- knn2nb(knearneigh(coords, 5))

spatialWeights <- nb2listw(neighborList, style="W")

Jul_price$lagPrice <- lag.listw(spatialWeights, Jul_price$each_month_price)


#August
Aug_price <- revenue_panel.sf %>%
  filter(month == 8)

coords <- st_coordinates(Aug_price) 

neighborList <- knn2nb(knearneigh(coords, 5))

spatialWeights <- nb2listw(neighborList, style="W")

Aug_price$lagPrice <- lag.listw(spatialWeights, Aug_price$each_month_price)


#September
Sep_price <- revenue_panel.sf %>%
  filter(month == 9)

coords <- st_coordinates(Sep_price) 

neighborList <- knn2nb(knearneigh(coords, 5))

spatialWeights <- nb2listw(neighborList, style="W")

Sep_price$lagPrice <- lag.listw(spatialWeights, Sep_price$each_month_price)


#October
Oct_price <- revenue_panel.sf %>%
  filter(month == 10)

coords <- st_coordinates(Oct_price) 

neighborList <- knn2nb(knearneigh(coords, 5))

spatialWeights <- nb2listw(neighborList, style="W")

Oct_price$lagPrice <- lag.listw(spatialWeights, Oct_price$each_month_price)


#November
Nov_price <- revenue_panel.sf %>%
  filter(month == 11)

coords <- st_coordinates(Nov_price) 

neighborList <- knn2nb(knearneigh(coords, 5))

spatialWeights <- nb2listw(neighborList, style="W")

Nov_price$lagPrice <- lag.listw(spatialWeights, Nov_price$each_month_price)


#December
Dec_price <- revenue_panel.sf %>%
  filter(month == 12)

coords <- st_coordinates(Dec_price) 

neighborList <- knn2nb(knearneigh(coords, 5))

spatialWeights <- nb2listw(neighborList, style="W")

Dec_price$lagPrice <- lag.listw(spatialWeights, Dec_price$each_month_price)

#----------------------------------------------------------------------
price_panel_lag <- rbind(Jan_price, Feb_price, Mar_price, Apr_price, May_price, Jun_price, Jul_price, Aug_price, Sep_price, Oct_price, Nov_price, Dec_price)
ggplot(price_panel_lag )+
  geom_point(aes(x = lagPrice, y = each_month_price), alpha = 0.26)+
  geom_smooth(aes(x = lagPrice, y =each_month_price), method = "lm", se= FALSE, color = "orange")+
  labs(title="Price as a function of lagPrice",
      caption = "Figure xx. Scatterplots of Price and lagPrice")+
  plotTheme()

The figure above shows that price is correlated with lag price, but the correlation is not very strong. Some listings with high prices are surrounded by listings with much lower prices, and vice versa. For these listings (both high-price and low-price), lag price might be a misleading predictor. Setting these cases aside, most listings’ prices are similar to those of nearby listings.

5.1.2 Price Regression

set.seed(1234)

month.var <- c(1:12)

Price.monthList <- list()
ams.train <- list()
ams.test <- list()
ams.test.prediction <- list()
ams.test.table <- list()

price_panel_lag <- merge(price_panel_lag,listing.sf.neighbor2[c("id", "Buurt")], by = "id")

Jan_price <- st_drop_geometry(price_panel_lag)%>%
  filter(month == 1)

inTrain <- createDataPartition(
              y = paste(Jan_price$pool,Jan_price$Buurt,Jan_price$property_type,
                        Jan_price$host_is_superhost), 
              p = .60, list = FALSE)

for (i in month.var){
  # rows of the panel for month i
  Price.monthList[[i]] <- 
    st_drop_geometry(price_panel_lag) %>% 
    filter(month == i) 

  # inTrain was built from the January rows, so this assumes every month
  # holds the same listings in the same order
  ams.train[[i]] <- Price.monthList[[i]][inTrain,] 
  ams.test[[i]]  <- Price.monthList[[i]][-inTrain,]

  # fit the price regression for month i on the training set
  reg.price <- lm(each_month_price ~ ., 
                  data = ams.train[[i]] %>% 
                  dplyr::select(each_month_price, beds, bedrooms, bathrooms, accommodates,
                                pool, parking, kitchen, AC, fireplace,
                                Buurt,host_is_superhost,
                                room_type,property_type,bed_type,
                                minimum_nights,dist.museum,dist.supermarkets,
                                Unesco,dist.metro,dist.plaza, dist.nightclub,
                                dist.beach, dist.parks,
                                name.bright, name.spacious,name.luxury,
                                amenities.number,lagPrice))

  # predict on the test set and compute the absolute error
  ams.test.prediction[[i]] <-
    ams.test[[i]] %>%
    mutate(price.Predict =  predict(reg.price, ams.test[[i]]),
           price.AbsError = abs(each_month_price - price.Predict))

  # stack the monthly test predictions into one table
  if(i ==1){
    ams.test.table.all <- ams.test.prediction[[i]]
  }else{
    ams.test.table[[i]] <- ams.test.prediction[[i]]
    ams.test.table.all <- rbind(ams.test.table.all,ams.test.table[[i]])
  }
}
# reg.price now holds the model fitted in the last iteration (month 12)
stargazer(reg.price, type = "text" ,single.row = FALSE, digits = 3,no.space = FALSE)
## 
## =======================================================================
##                                                 Dependent variable:    
##                                             ---------------------------
##                                                  each_month_price      
## -----------------------------------------------------------------------
## beds                                                  -4.282*          
##                                                       (2.290)          
##                                                                        
## bedrooms0                                             -7.799           
##                                                      (97.931)          
##                                                                        
## bedrooms1                                             -3.058           
##                                                      (97.729)          
##                                                                        
## bedrooms10                                           -361.143*         
##                                                      (214.845)         
##                                                                        
## bedrooms11                                          -465.513**         
##                                                      (204.740)         
##                                                                        
## bedrooms12                                          -979.836***        
##                                                      (179.288)         
##                                                                        
## bedrooms2                                             24.116           
##                                                      (97.804)          
##                                                                        
## bedrooms3                                             38.124           
##                                                      (97.982)          
##                                                                        
## bedrooms4                                             82.060           
##                                                      (98.509)          
##                                                                        
## bedrooms5                                             76.715           
##                                                      (101.549)         
##                                                                        
## bedrooms6                                             15.157           
##                                                      (113.316)         
##                                                                        
## bedrooms7                                           504.975***         
##                                                      (144.207)         
##                                                                        
## bedrooms8                                           -514.822***        
##                                                      (152.477)         
##                                                                        
## bedrooms9                                            -344.359*         
##                                                      (203.917)         
##                                                                        
## bathrooms0.0                                          -76.697          
##                                                      (70.481)          
##                                                                        
## bathrooms0.5                                          -88.476          
##                                                      (60.055)          
##                                                                        
## bathrooms1.0                                          -59.145          
##                                                      (55.886)          
##                                                                        
## bathrooms1.5                                          -49.244          
##                                                      (56.059)          
##                                                                        
## bathrooms10.0                                          8.168           
##                                                      (182.690)         
##                                                                        
## bathrooms100.5                                       -116.248          
##                                                      (176.849)         
##                                                                        
## bathrooms15.0                                        -137.624          
##                                                      (174.450)         
##                                                                        
## bathrooms2.0                                          -34.299          
##                                                      (56.360)          
##                                                                        
## bathrooms2.5                                          -21.244          
##                                                      (57.929)          
##                                                                        
## bathrooms3.0                                          28.429           
##                                                      (61.656)          
##                                                                        
## bathrooms3.5                                          83.352           
##                                                      (70.371)          
##                                                                        
## bathrooms4.0                                        315.551***         
##                                                      (91.468)          
##                                                                        
## bathrooms4.5                                          -8.929           
##                                                      (180.469)         
##                                                                        
## bathrooms5.0                                          82.262           
##                                                      (195.918)         
##                                                                        
## bathrooms5.5                                       1,481.201***        
##                                                      (178.642)         
##                                                                        
## bathrooms7.0                                         -107.164          
##                                                      (174.378)         
##                                                                        
## bathrooms8.0                                        -285.334**         
##                                                      (135.676)         
##                                                                        
## accommodates10                                      209.175***         
##                                                      (53.593)          
##                                                                        
## accommodates11                                      418.217***         
##                                                      (122.176)         
##                                                                        
## accommodates12                                      196.275***         
##                                                      (48.337)          
##                                                                        
## accommodates14                                      300.694***         
##                                                      (94.437)          
##                                                                        
## accommodates16                                      783.509***         
##                                                      (67.788)          
##                                                                        
## accommodates17                                                         
##                                                                        
##                                                                        
## accommodates2                                          7.773           
##                                                      (10.631)          
##                                                                        
## accommodates3                                          8.752           
##                                                      (11.937)          
##                                                                        
## accommodates4                                        35.130***         
##                                                      (11.786)          
##                                                                        
## accommodates5                                        33.398**          
##                                                      (16.250)          
##                                                                        
## accommodates6                                        94.857***         
##                                                      (16.171)          
##                                                                        
## accommodates7                                        79.946**          
##                                                      (31.622)          
##                                                                        
## accommodates8                                       107.201***         
##                                                      (26.719)          
##                                                                        
## accommodates9                                         25.491           
##                                                      (105.262)         
##                                                                        
## poolPool                                               1.631           
##                                                      (19.416)          
##                                                                        
## parkingParking                                         2.136           
##                                                       (3.894)          
##                                                                        
## kitchenNo kitchen                                    -13.440**         
##                                                       (6.013)          
##                                                                        
## ACNo AC                                              -14.540**         
##                                                       (7.282)          
##                                                                        
## fireplaceNo Fireplace                               -18.099***         
##                                                       (6.495)          
##                                                                        
## BuurtAalsmeerwegbuurt West                             1.403           
##                                                      (28.709)          
##                                                                        
## BuurtAlexanderplein e.o.                              -47.035          
##                                                      (70.319)          
##                                                                        
## BuurtAmstel III deel A/B Noord                        -4.541           
##                                                      (191.532)         
##                                                                        
## BuurtAmstelglorie                                     -1.654           
##                                                      (68.502)          
##                                                                        
## BuurtAmstelkwartier Noord                             -32.537          
##                                                      (45.589)          
##                                                                        
## BuurtAmstelkwartier West                              -72.527          
##                                                      (83.942)          
##                                                                        
## BuurtAmstelkwartier Zuid                              -61.646          
##                                                      (170.755)         
##                                                                        
## BuurtAmstelpark                                       -61.556          
##                                                      (173.435)         
##                                                                        
## BuurtAmstelveldbuurt                                  46.734           
##                                                      (49.255)          
##                                                                        
## BuurtAmsterdamse Bos                                  -62.809          
##                                                      (67.388)          
##                                                                        
## BuurtAmsterdamse Poort                                -8.093           
##                                                      (95.469)          
##                                                                        
## BuurtAndreasterrein                                   -43.474          
##                                                      (61.624)          
##                                                                        
## BuurtAnjeliersbuurt Noord                             -27.695          
##                                                      (57.915)          
##                                                                        
## BuurtAnjeliersbuurt Zuid                               2.653           
##                                                      (55.467)          
##                                                                        
## BuurtArchitectenbuurt                                 -70.523          
##                                                      (48.788)          
##                                                                        
## BuurtBalboaplein e.o.                                 -52.967          
##                                                      (39.152)          
##                                                                        
## BuurtBanne Noordoost                                  -43.423          
##                                                      (88.954)          
##                                                                        
## BuurtBanne Noordwest                                  -18.979          
##                                                      (93.061)          
##                                                                        
## BuurtBanne Zuidoost                                   -63.663          
##                                                      (73.191)          
##                                                                        
## BuurtBanne Zuidwest                                   -55.083          
##                                                      (79.141)          
##                                                                        
## BuurtBanpleinbuurt                                    -5.583           
##                                                      (55.285)          
##                                                                        
## BuurtBedrijvencentrum Osdorp                          -87.002          
##                                                      (167.991)         
##                                                                        
## BuurtBedrijvencentrum Westerkwartier                 -108.398          
##                                                      (82.790)          
##                                                                        
## BuurtBedrijvengebied Cruquiusweg                     -104.845          
##                                                      (88.915)          
##                                                                        
## BuurtBedrijvengebied Veelaan                          -69.415          
##                                                      (167.233)         
##                                                                        
## BuurtBedrijvengebied Zeeburgerkade                    -70.630          
##                                                      (99.810)          
##                                                                        
## BuurtBedrijvenpark Lutkemeer                         -113.389          
##                                                      (171.356)         
##                                                                        
## BuurtBedrijventerrein Hamerstraat                     -95.012          
##                                                      (72.679)          
##                                                                        
## BuurtBedrijventerrein Landlust                        -62.578          
##                                                      (58.932)          
##                                                                        
## BuurtBedrijventerrein Schinkel                        -48.076          
##                                                      (44.330)          
##                                                                        
## BuurtBeethovenbuurt                                   -54.068          
##                                                      (56.290)          
##                                                                        
## BuurtBegijnhofbuurt                                   30.236           
##                                                      (60.071)          
##                                                                        
## BuurtBelgi< U+00EB> plein e.o.                        -81.041          
##                                                      (100.056)         
##                                                                        
## BuurtBellamybuurt Noord                               -23.182          
##                                                      (36.986)          
##                                                                        
## BuurtBellamybuurt Zuid                                -33.517          
##                                                      (34.305)          
##                                                                        
## BuurtBertelmanpleinbuurt                               4.467           
##                                                      (52.811)          
##                                                                        
## BuurtBetondorp                                        -68.415          
##                                                      (58.033)          
##                                                                        
## BuurtBG-terrein e.o.                                  25.034           
##                                                      (55.346)          
##                                                                        
## BuurtBijlmermuseum Noord                              -38.851          
##                                                      (120.521)         
##                                                                        
## BuurtBijlmermuseum Zuid                              -105.384          
##                                                      (112.102)         
##                                                                        
## BuurtBijlmerpark Oost                                 193.697          
##                                                      (143.177)         
##                                                                        
## BuurtBlauwe Zand                                      -88.720          
##                                                      (57.982)          
##                                                                        
## BuurtBloemenbuurt Noord                               -16.768          
##                                                      (63.634)          
##                                                                        
## BuurtBloemenbuurt Zuid                                -58.400          
##                                                      (61.171)          
##                                                                        
## BuurtBloemgrachtbuurt                                 -25.065          
##                                                      (52.890)          
##                                                                        
## BuurtBorgerbuurt                                      -32.807          
##                                                      (34.797)          
##                                                                        
## BuurtBorneo                                           -60.796          
##                                                      (40.743)          
##                                                                        
## BuurtBosleeuw                                         -75.038          
##                                                      (47.460)          
##                                                                        
## BuurtBretten Oost                                     -29.310          
##                                                      (174.486)         
##                                                                        
## BuurtBuiksloterbreek                                  -42.325          
##                                                      (90.105)          
##                                                                        
## BuurtBuiksloterdijk West                              -93.816          
##                                                      (93.937)          
##                                                                        
## BuurtBuiksloterham                                    -51.745          
##                                                      (84.852)          
##                                                                        
## BuurtBuikslotermeer Noord                             -76.067          
##                                                      (82.133)          
##                                                                        
## BuurtBuikslotermeerplein                             -111.532          
##                                                      (75.971)          
##                                                                        
## BuurtBuitenveldert Midden Zuid                         9.429           
##                                                      (51.298)          
##                                                                        
## BuurtBuitenveldert Oost Midden                        -74.912          
##                                                      (57.612)          
##                                                                        
## BuurtBuitenveldert West Midden                        46.903           
##                                                      (82.367)          
##                                                                        
## BuurtBuitenveldert Zuidoost                           -35.489          
##                                                      (57.465)          
##                                                                        
## BuurtBuitenveldert Zuidwest                           -58.298          
##                                                      (50.562)          
##                                                                        
## BuurtBurgemeester Tellegenbuurt Oost                  -2.106           
##                                                      (37.861)          
##                                                                        
## BuurtBurgemeester Tellegenbuurt West                  -51.520          
##                                                      (40.586)          
##                                                                        
## BuurtBurgwallen Oost                                  81.624           
##                                                      (53.609)          
##                                                                        
## BuurtBuurt 10                                         -93.015          
##                                                      (86.815)          
##                                                                        
## BuurtBuurt 2                                          -25.312          
##                                                      (69.845)          
##                                                                        
## BuurtBuurt 3                                          -91.835          
##                                                      (56.971)          
##                                                                        
## BuurtBuurt 4 Oost                                    -102.163          
##                                                      (73.884)          
##                                                                        
## BuurtBuurt 5 Noord                                    -68.043          
##                                                      (95.191)          
##                                                                        
## BuurtBuurt 5 Zuid                                     -52.555          
##                                                      (74.122)          
##                                                                        
## BuurtBuurt 6                                         -116.742          
##                                                      (106.756)         
##                                                                        
## BuurtBuurt 7                                          -86.289          
##                                                      (96.046)          
##                                                                        
## BuurtBuurt 8                                          -81.229          
##                                                      (77.113)          
##                                                                        
## BuurtBuurt 9                                          -91.370          
##                                                      (127.863)         
##                                                                        
## BuurtBuyskade e.o.                                    -69.324          
##                                                      (45.796)          
##                                                                        
## BuurtCalandlaan/Lelylaan                              -68.502          
##                                                      (72.815)          
##                                                                        
## BuurtCentrumeiland                                    -45.043          
##                                                      (179.191)         
##                                                                        
## BuurtCircus/Kermisbuurt                               -55.716          
##                                                      (130.018)         
##                                                                        
## BuurtCoenhaven/Mercuriushaven                         -69.130          
##                                                      (177.180)         
##                                                                        
## BuurtColumbusplein e.o.                               -21.904          
##                                                      (35.906)          
##                                                                        
## BuurtConcertgebouwbuurt                               -14.515          
##                                                      (39.816)          
##                                                                        
## BuurtCornelis Douwesterrein                           -65.692          
##                                                      (139.038)         
##                                                                        
## BuurtCornelis Schuytbuurt                             17.667           
##                                                      (34.936)          
##                                                                        
## BuurtCornelis Troostbuurt                             -28.753          
##                                                      (33.590)          
##                                                                        
## BuurtCremerbuurt Oost                                 -41.347          
##                                                      (31.053)          
##                                                                        
## BuurtCremerbuurt West                                -52.076*          
##                                                      (28.008)          
##                                                                        
## BuurtCzaar Peterbuurt                                 -33.857          
##                                                      (41.687)          
##                                                                        
## BuurtD-buurt                                          35.162           
##                                                      (95.612)          
##                                                                        
## BuurtDa Costabuurt Noord                              -32.094          
##                                                      (34.073)          
##                                                                        
## BuurtDa Costabuurt Zuid                               -46.831          
##                                                      (34.455)          
##                                                                        
## BuurtDapperbuurt Noord                               -51.418*          
##                                                      (30.973)          
##                                                                        
## BuurtDapperbuurt Zuid                                -55.817*          
##                                                      (31.853)          
##                                                                        
## BuurtDe Aker Oost                                     -41.294          
##                                                      (39.692)          
##                                                                        
## BuurtDe Aker West                                     -86.352          
##                                                      (58.010)          
##                                                                        
## BuurtDe Bongerd                                       -37.062          
##                                                      (76.232)          
##                                                                        
## BuurtDe Eenhoorn                                      -51.685          
##                                                      (53.186)          
##                                                                        
## BuurtDe Kleine Wereld                                -106.586          
##                                                      (87.989)          
##                                                                        
## BuurtDe Klenckebuurt                                  -16.429          
##                                                      (102.912)         
##                                                                        
## BuurtDe Omval                                         -65.182          
##                                                      (69.224)          
##                                                                        
## BuurtDe Punt                                          -71.932          
##                                                      (54.260)          
##                                                                        
## BuurtDe Wester Quartier                               -37.469          
##                                                      (41.018)          
##                                                                        
## BuurtDe Wetbuurt                                      -34.231          
##                                                      (52.329)          
##                                                                        
## BuurtDe Wittenbuurt Noord                             -39.834          
##                                                      (52.573)          
##                                                                        
## BuurtDe Wittenbuurt Zuid                              -81.458          
##                                                      (64.427)          
##                                                                        
## BuurtDelflandpleinbuurt Oost                          -30.609          
##                                                      (70.793)          
##                                                                        
## BuurtDelflandpleinbuurt West                        -101.567**         
##                                                      (41.809)          
##                                                                        
## BuurtDen Texbuurt                                     -41.966          
##                                                      (51.452)          
##                                                                        
## BuurtDiamantbuurt                                     -6.915           
##                                                      (36.169)          
##                                                                        
## BuurtDiepenbrockbuurt                                 -74.545          
##                                                      (88.826)          
##                                                                        
## BuurtDon Bosco                                       -77.977**         
##                                                      (38.000)          
##                                                                        
## BuurtDorp Driemond                                    50.279           
##                                                      (138.425)         
##                                                                        
## BuurtDorp Sloten                                      -54.051          
##                                                      (58.499)          
##                                                                        
## BuurtDriehoekbuurt                                    -18.166          
##                                                      (58.730)          
##                                                                        
## BuurtDuivelseiland                                     6.851           
##                                                      (40.525)          
##                                                                        
## BuurtDurgerdam                                        -63.386          
##                                                      (58.889)          
##                                                                        
## BuurtE-buurt                                          -38.370          
##                                                      (85.445)          
##                                                                        
## BuurtEcowijk                                          -28.865          
##                                                      (60.793)          
##                                                                        
## BuurtEendrachtspark                                   -6.526           
##                                                      (173.795)         
##                                                                        
## BuurtElandsgrachtbuurt                                -1.804           
##                                                      (50.298)          
##                                                                        
## BuurtElzenhagen Noord                                 -64.797          
##                                                      (72.228)          
##                                                                        
## BuurtElzenhagen Zuid                                  -41.386          
##                                                      (173.990)         
##                                                                        
## BuurtEmanuel van Meterenbuurt                         -52.158          
##                                                      (51.541)          
##                                                                        
## BuurtEntrepot-Noordwest                               -35.972          
##                                                      (51.927)          
##                                                                        
## BuurtErasmusparkbuurt Oost                            45.693           
##                                                      (48.365)          
##                                                                        
## BuurtErasmusparkbuurt West                            -56.729          
##                                                      (42.806)          
##                                                                        
## BuurtF-buurt                                          -0.335           
##                                                      (86.423)          
##                                                                        
## BuurtFannius Scholtenbuurt                            -61.424          
##                                                      (48.823)          
##                                                                        
## BuurtFelix Meritisbuurt                               31.263           
##                                                      (51.913)          
##                                                                        
## BuurtFilips van Almondekwartier                       -43.688          
##                                                      (43.453)          
##                                                                        
## BuurtFlevopark                                        -59.166          
##                                                      (88.587)          
##                                                                        
## BuurtFrankendael                                      64.865           
##                                                      (68.162)          
##                                                                        
## BuurtFrans Halsbuurt                                  -13.930          
##                                                      (31.976)          
##                                                                        
## BuurtFrederik Hendrikbuurt Noord                      -39.408          
##                                                      (41.701)          
##                                                                        
## BuurtFrederik Hendrikbuurt Zuidoost                   -58.708          
##                                                      (38.281)          
##                                                                        
## BuurtFrederik Hendrikbuurt Zuidwest                   -40.922          
##                                                      (45.182)          
##                                                                        
## BuurtFrederikspleinbuurt                              55.140           
##                                                      (50.340)          
##                                                                        
## BuurtG-buurt Noord                                    -31.792          
##                                                      (106.742)         
##                                                                        
## BuurtG-buurt Oost                                      1.524           
##                                                      (82.777)          
##                                                                        
## BuurtG-buurt West                                     -6.552           
##                                                      (82.610)          
##                                                                        
## BuurtGaasperdam Noord                                 19.986           
##                                                      (127.772)         
##                                                                        
## BuurtGaasperdam Zuid                                  19.012           
##                                                      (142.559)         
##                                                                        
## BuurtGaasperpark                                      16.169           
##                                                      (153.992)         
##                                                                        
## BuurtGaasperplas                                      44.538           
##                                                      (147.871)         
##                                                                        
## BuurtGein Noordoost                                   46.823           
##                                                      (125.475)         
##                                                                        
## BuurtGein Noordwest                                   -29.806          
##                                                      (135.780)         
##                                                                        
## BuurtGein Zuidwest                                    17.800           
##                                                      (170.730)         
##                                                                        
## BuurtGein Zuioost                                      0.174           
##                                                      (143.856)         
##                                                                        
## BuurtGelderlandpleinbuurt                           -119.643**         
##                                                      (47.359)          
##                                                                        
## BuurtGerard Doubuurt                                  -38.861          
##                                                      (31.747)          
##                                                                        
## BuurtGeuzenhofbuurt                                  -78.577*          
##                                                      (40.630)          
##                                                                        
## BuurtGibraltarbuurt                                   -45.596          
##                                                      (51.100)          
##                                                                        
## BuurtGouden Bocht                                     -39.145          
##                                                      (62.008)          
##                                                                        
## BuurtGroenmarktkadebuurt                              -45.046          
##                                                      (58.139)          
##                                                                        
## BuurtGrunder/Koningshoef                              15.032           
##                                                      (91.839)          
##                                                                        
## BuurtHaarlemmerbuurt Oost                           168.960***         
##                                                      (59.813)          
##                                                                        
## BuurtHaarlemmerbuurt West                             -48.769          
##                                                      (60.210)          
##                                                                        
## BuurtHakfort/Huigenbos                                -11.225          
##                                                      (124.649)         
##                                                                        
## BuurtHarmoniehofbuurt                                 -7.877           
##                                                      (65.501)          
##                                                                        
## BuurtHaveneiland Noord                                30.279           
##                                                      (80.735)          
##                                                                        
## BuurtHaveneiland Noordoost                            -51.718          
##                                                      (63.587)          
##                                                                        
## BuurtHaveneiland Noordwest                            -45.440          
##                                                      (61.558)          
##                                                                        
## BuurtHaveneiland Oost                                 -29.884          
##                                                      (72.248)          
##                                                                        
## BuurtHaveneiland Zuidwest/Rieteiland West             -71.716          
##                                                      (62.505)          
##                                                                        
## BuurtHelmersbuurt Oost                                -44.176          
##                                                      (31.605)          
##                                                                        
## BuurtHemelrijk                                        -14.633          
##                                                      (59.138)          
##                                                                        
## BuurtHemonybuurt                                     -52.585*          
##                                                      (29.626)          
##                                                                        
## BuurtHercules Seghersbuurt                            -19.023          
##                                                      (34.863)          
##                                                                        
## BuurtHet Funen                                        -64.261          
##                                                      (49.554)          
##                                                                        
## BuurtHiltonbuurt                                     -115.666          
##                                                      (86.677)          
##                                                                        
## BuurtHolendrecht Oost                                 -9.168           
##                                                      (126.487)         
##                                                                        
## BuurtHolendrecht West                                 11.873           
##                                                      (196.464)         
##                                                                        
## BuurtHolysloot                                         3.331           
##                                                      (105.911)         
##                                                                        
## BuurtHondecoeterbuurt                                 -18.193          
##                                                      (41.013)          
##                                                                        
## BuurtHoptille                                          3.359           
##                                                      (118.063)         
##                                                                        
## BuurtHouthavens Oost                                  -46.999          
##                                                      (67.308)          
##                                                                        
## BuurtHouthavens West                                  -70.725          
##                                                      (70.874)          
##                                                                        
## BuurtIJplein e.o.                                     -52.800          
##                                                      (49.927)          
##                                                                        
## BuurtIJsbaanpad e.o.                                  -30.657          
##                                                      (52.292)          
##                                                                        
## BuurtIJselbuurt Oost                                  -39.663          
##                                                      (38.682)          
##                                                                        
## BuurtIJselbuurt West                                  -56.446          
##                                                      (42.978)          
##                                                                        
## BuurtJacob Geelbuurt                                  -99.672          
##                                                      (88.165)          
##                                                                        
## BuurtJacques Veldmanbuurt                             -50.879          
##                                                      (38.370)          
##                                                                        
## BuurtJan Maijenbuurt                                  -40.905          
##                                                      (41.809)          
##                                                                        
## BuurtJava-eiland                                    -137.086**         
##                                                      (53.839)          
##                                                                        
## BuurtJohan Jongkindbuurt                              -39.710          
##                                                      (88.024)          
##                                                                        
## BuurtJohannnes Vermeerbuurt                           -4.018           
##                                                      (38.854)          
##                                                                        
## BuurtJohn Franklinbuurt                               -61.730          
##                                                      (43.456)          
##                                                                        
## BuurtJulianapark                                      -51.801          
##                                                      (88.393)          
##                                                                        
## BuurtK-buurt Midden                                   142.691          
##                                                      (142.696)         
##                                                                        
## BuurtK-buurt Zuidoost                                 -39.373          
##                                                      (107.828)         
##                                                                        
## BuurtK-buurt Zuidwest                                 -77.321          
##                                                      (186.749)         
##                                                                        
## BuurtKadijken                                         -12.447          
##                                                      (53.456)          
##                                                                        
## BuurtKadoelen                                         -18.828          
##                                                      (86.597)          
##                                                                        
## BuurtKalverdriehoek                                   -29.309          
##                                                      (57.460)          
##                                                                        
## BuurtKantershof                                       -17.788          
##                                                      (100.954)         
##                                                                        
## BuurtKattenburg                                       -83.419          
##                                                      (58.374)          
##                                                                        
## BuurtKazernebuurt                                     -49.238          
##                                                      (65.612)          
##                                                                        
## BuurtKelbergen                                        -33.115          
##                                                      (128.107)         
##                                                                        
## BuurtKNSM-eiland                                     -88.446*          
##                                                      (45.565)          
##                                                                        
## BuurtKolenkitbuurt Noord                              -81.234          
##                                                      (65.059)          
##                                                                        
## BuurtKolenkitbuurt Zuid                               -85.913          
##                                                      (52.995)          
##                                                                        
## BuurtKoningin Wilhelminaplein                        -87.767*          
##                                                      (46.222)          
##                                                                        
## BuurtKop Zeedijk                                      20.548           
##                                                      (58.900)          
##                                                                        
## BuurtKop Zuidas                                       -94.032          
##                                                      (73.532)          
##                                                                        
## BuurtKortenaerkwartier                                -46.382          
##                                                      (39.752)          
##                                                                        
## BuurtKortvoort                                        24.587           
##                                                      (122.324)         
##                                                                        
## BuurtKromme Mijdrechtbuurt                            -59.686          
##                                                      (42.478)          
##                                                                        
## BuurtL-buurt                                          -0.033           
##                                                      (103.151)         
##                                                                        
## BuurtLaan van Spartaan                                -60.158          
##                                                      (54.038)          
##                                                                        
## BuurtLandelijk gebied Driemond                        134.882          
##                                                      (139.309)         
##                                                                        
## BuurtLandlust Noord                                   -44.239          
##                                                      (50.416)          
##                                                                        
## BuurtLandlust Zuid                                    -36.924          
##                                                      (43.108)          
##                                                                        
## BuurtLangestraat e.o.                                 26.216           
##                                                      (57.315)          
##                                                                        
## BuurtLastage                                          -21.981          
##                                                      (56.319)          
##                                                                        
## BuurtLegmeerpleinbuurt                              153.194***         
##                                                      (35.608)          
##                                                                        
## BuurtLeidsebuurt Noordoost                             1.609           
##                                                      (49.329)          
##                                                                        
## BuurtLeidsebuurt Noordwest                            -9.192           
##                                                      (56.405)          
##                                                                        
## BuurtLeidsebuurt Zuidoost                             -14.323          
##                                                      (55.658)          
##                                                                        
## BuurtLeidsebuurt Zuidwest                             -46.921          
##                                                      (60.553)          
##                                                                        
## BuurtLeidsegracht Noord                               -7.108           
##                                                      (55.951)          
##                                                                        
## BuurtLeidsegracht Zuid                                -1.360           
##                                                      (55.093)          
##                                                                        
## BuurtLeliegracht e.o.                                 -13.215          
##                                                      (55.021)          
##                                                                        
## BuurtLinnaeusparkbuurt                                -43.262          
##                                                      (38.265)          
##                                                                        
## BuurtLizzy Ansinghbuurt                               -47.156          
##                                                      (36.159)          
##                                                                        
## BuurtLoenermark                                       -60.058          
##                                                      (85.109)          
##                                                                        
## BuurtLootsbuurt                                       -29.535          
##                                                      (35.518)          
##                                                                        
## BuurtLouis Crispijnbuurt                              -76.475          
##                                                      (64.369)          
##                                                                        
## BuurtLucas/Andreasziekenhuis e.o.                     -19.468          
##                                                      (83.863)          
##                                                                        
## BuurtMarathonbuurt Oost                               -34.897          
##                                                      (41.280)          
##                                                                        
## BuurtMarathonbuurt West                              -71.994**         
##                                                      (34.793)          
##                                                                        
## BuurtMarcanti                                         -60.840          
##                                                      (54.523)          
##                                                                        
## BuurtMarine-Etablissement                             -80.774          
##                                                      (62.136)          
##                                                                        
## BuurtMarjoleinterrein                                 -16.077          
##                                                      (113.388)         
##                                                                        
## BuurtMarkengouw Midden                                -58.649          
##                                                      (71.584)          
##                                                                        
## BuurtMarkengouw Noord                                 49.874           
##                                                      (128.345)         
##                                                                        
## BuurtMarkengouw Zuid                                 -105.698          
##                                                      (172.290)         
##                                                                        
## BuurtMarkthallen                                      -92.080          
##                                                      (78.675)          
##                                                                        
## BuurtMarnixbuurt Midden                               -31.627          
##                                                      (64.673)          
##                                                                        
## BuurtMarnixbuurt Noord                                -28.677          
##                                                      (62.160)          
##                                                                        
## BuurtMarnixbuurt Zuid                                 -11.870          
##                                                      (60.115)          
##                                                                        
## BuurtMedisch Centrum Slotervaart                      -75.195          
##                                                      (167.229)         
##                                                                        
## BuurtMeer en Oever                                    -80.474          
##                                                      (69.438)          
##                                                                        
## BuurtMercatorpark                                     -46.181          
##                                                      (76.533)          
##                                                                        
## BuurtMiddelveldsche Akerpolder                        -73.965          
##                                                      (79.071)          
##                                                                        
## BuurtMiddenmeer Noord                                -73.742*          
##                                                      (40.023)          
##                                                                        
## BuurtMiddenmeer Zuid                                 -65.575*          
##                                                      (35.463)          
##                                                                        
## BuurtMinervabuurt Midden                              35.249           
##                                                      (47.702)          
##                                                                        
## BuurtMinervabuurt Noord                               63.274           
##                                                      (48.730)          
##                                                                        
## BuurtMinervabuurt Zuid                               -87.568*          
##                                                      (47.479)          
##                                                                        
## BuurtMolenwijk                                        -30.699          
##                                                      (146.658)         
##                                                                        
## BuurtMuseumplein                                      -54.325          
##                                                      (72.465)          
##                                                                        
## BuurtNDSM terrein                                   256.810***         
##                                                      (92.226)          
##                                                                        
## BuurtNes e.o.                                         -26.179          
##                                                      (57.310)          
##                                                                        
## BuurtNieuw Sloten Noordoost                           -58.742          
##                                                      (79.612)          
##                                                                        
## BuurtNieuw Sloten Noordwest                           28.054           
##                                                      (52.534)          
##                                                                        
## BuurtNieuw Sloten Zuidoost                           -116.953          
##                                                      (80.523)          
##                                                                        
## BuurtNieuw Sloten Zuidwest                            -72.400          
##                                                      (64.069)          
##                                                                        
## BuurtNieuwe Diep/Diemerpark                           -56.207          
##                                                      (81.982)          
##                                                                        
## BuurtNieuwe Kerk e.o.                                 -38.704          
##                                                      (56.052)          
##                                                                        
## BuurtNieuwe Meer                                      -57.716          
##                                                      (171.302)         
##                                                                        
## BuurtNieuwe Oosterbegraafplaats                       -92.110          
##                                                      (120.022)         
##                                                                        
## BuurtNieuwendammerdijk Oost                           -69.899          
##                                                      (69.200)          
##                                                                        
## BuurtNieuwendammerdijk Zuid                           -63.782          
##                                                      (86.055)          
##                                                                        
## BuurtNieuwendammmerdijk West                          -62.343          
##                                                      (56.569)          
##                                                                        
## BuurtNieuwendijk Noord                                -16.111          
##                                                      (63.280)          
##                                                                        
## BuurtNieuwmarkt                                       92.070*          
##                                                      (55.065)          
##                                                                        
## BuurtNintemanterrein                                  -88.337          
##                                                      (128.232)         
##                                                                        
## BuurtNoorder IJplas                                   83.883           
##                                                      (290.377)         
##                                                                        
## BuurtNoorderstrook Oost                               -42.108          
##                                                      (172.973)         
##                                                                        
## BuurtNoorderstrook West                               -16.123          
##                                                      (134.515)         
##                                                                        
## BuurtNoordoever Sloterplas                            -56.713          
##                                                      (56.437)          
##                                                                        
## BuurtNoordoostkwadrant Indische buurt                -81.905**         
##                                                      (33.229)          
##                                                                        
## BuurtNoordwestkwadrant Indische buurt Noord          -53.307*          
##                                                      (29.894)          
##                                                                        
## BuurtNoordwestkwadrant Indische buurt Zuid           -65.367**         
##                                                      (30.596)          
##                                                                        
## BuurtOlympisch Stadion e.o.                          -93.596*          
##                                                      (55.320)          
##                                                                        
## BuurtOokmeer                                          -59.453          
##                                                      (92.291)          
##                                                                        
## BuurtOostelijke Handelskade                           -49.669          
##                                                      (66.172)          
##                                                                        
## BuurtOostenburg                                       -27.127          
##                                                      (41.295)          
##                                                                        
## BuurtOosterdokseiland                                 45.423           
##                                                      (77.058)          
##                                                                        
## BuurtOosterpark                                       -16.783          
##                                                      (50.540)          
##                                                                        
## BuurtOosterparkbuurt Noordwest                        -37.238          
##                                                      (30.520)          
##                                                                        
## BuurtOosterparkbuurt Zuidoost                         -42.275          
##                                                      (31.240)          
##                                                                        
## BuurtOosterparkbuurt Zuidwest                         -35.832          
##                                                      (35.109)          
##                                                                        
## BuurtOostoever Sloterplas                             -63.692          
##                                                      (54.085)          
##                                                                        
## BuurtOostpoort                                       -69.129*          
##                                                      (38.291)          
##                                                                        
## BuurtOostzanerdijk                                    -77.077          
##                                                      (108.759)         
##                                                                        
## BuurtOrteliusbuurt Midden                            -78.849*          
##                                                      (41.479)          
##                                                                        
## BuurtOrteliusbuurt Noord                              -64.742          
##                                                      (45.804)          
##                                                                        
## BuurtOrteliusbuurt Zuid                               -49.792          
##                                                      (38.899)          
##                                                                        
## BuurtOsdorp Midden Noord                              -56.360          
##                                                      (80.606)          
##                                                                        
## BuurtOsdorp Midden Zuid                               -34.301          
##                                                      (73.680)          
##                                                                        
## BuurtOsdorp Zuidoost                                  -61.906          
##                                                      (50.321)          
##                                                                        
## BuurtOsdorper Binnenpolder                           -109.020          
##                                                      (92.168)          
##                                                                        
## BuurtOsdorper Bovenpolder                            -109.910          
##                                                      (108.697)         
##                                                                        
## BuurtOsdorpplein e.o.                                 -68.254          
##                                                      (68.662)          
##                                                                        
## BuurtOude Kerk e.o.                                   21.745           
##                                                      (56.376)          
##                                                                        
## BuurtOveramstel                                       -25.204          
##                                                      (177.010)         
##                                                                        
## BuurtOverbraker Binnenpolder                          -78.804          
##                                                      (99.630)          
##                                                                        
## BuurtOverhoeks                                        -96.091          
##                                                      (77.307)          
##                                                                        
## BuurtOvertoomse Veld Noord                           -84.592*          
##                                                      (46.227)          
##                                                                        
## BuurtOvertoomse Veld Zuid                            -79.196*          
##                                                      (46.167)          
##                                                                        
## BuurtP.C. Hooftbuurt                                  33.493           
##                                                      (45.551)          
##                                                                        
## BuurtPapaverweg e.o.                                  -1.720           
##                                                      (63.747)          
##                                                                        
## BuurtParamariboplein e.o.                            -62.525**         
##                                                      (31.157)          
##                                                                        
## BuurtPark de Meer                                     -67.786          
##                                                      (61.566)          
##                                                                        
## BuurtPark Haagseweg                                  -126.367          
##                                                      (121.935)         
##                                                                        
## BuurtParooldriehoek                                   -42.666          
##                                                      (50.023)          
##                                                                        
## BuurtPasseerdersgrachtbuurt                           37.926           
##                                                      (57.063)          
##                                                                        
## BuurtPieter van der Doesbuurt                         -61.746          
##                                                      (42.345)          
##                                                                        
## BuurtPlan van Gool                                    -51.741          
##                                                      (67.691)          
##                                                                        
## BuurtPlanciusbuurt Noord                             199.429**         
##                                                      (79.144)          
##                                                                        
## BuurtPlanciusbuurt Zuid                               -90.205          
##                                                      (131.440)         
##                                                                        
## BuurtPlantage                                         -9.430           
##                                                      (51.378)          
##                                                                        
## BuurtPostjeskade e.o.                                -59.282*          
##                                                      (32.743)          
##                                                                        
## BuurtPrinses Irenebuurt                              -104.080*         
##                                                      (53.677)          
##                                                                        
## BuurtRAI                                              -60.302          
##                                                      (77.332)          
##                                                                        
## BuurtRansdorp                                         13.891           
##                                                      (89.593)          
##                                                                        
## BuurtRapenburg                                        -32.319          
##                                                      (54.965)          
##                                                                        
## BuurtRechte H-buurt                                   25.609           
##                                                      (107.869)         
##                                                                        
## BuurtReguliersbuurt                                   -2.914           
##                                                      (67.726)          
##                                                                        
## BuurtReigersbos Midden                                50.416           
##                                                      (138.296)         
##                                                                        
## BuurtReigersbos Noord                                 38.320           
##                                                      (129.213)         
##                                                                        
## BuurtReigersbos Zuid                                   5.167           
##                                                      (152.092)         
##                                                                        
## BuurtRembrandtpark Noord                              -47.794          
##                                                      (52.395)          
##                                                                        
## BuurtRembrandtpark Zuid                               -27.469          
##                                                      (43.999)          
##                                                                        
## BuurtRembrandtpleinbuurt                              -6.850           
##                                                      (55.021)          
##                                                                        
## BuurtRI Oost terrein                                  -35.220          
##                                                      (48.334)          
##                                                                        
## BuurtRieteiland Oost                                  -45.044          
##                                                      (103.480)         
##                                                                        
## BuurtRietlanden                                       -28.240          
##                                                      (47.668)          
##                                                                        
## BuurtRijnbuurt Midden                                 -58.765          
##                                                      (44.235)          
##                                                                        
## BuurtRijnbuurt Oost                                   -24.277          
##                                                      (41.770)          
##                                                                        
## BuurtRijnbuurt West                                   -20.722          
##                                                      (58.514)          
##                                                                        
## BuurtRobert Scottbuurt Oost                           -68.410          
##                                                      (47.662)          
##                                                                        
## BuurtRobert Scottbuurt West                           -53.322          
##                                                      (46.671)          
##                                                                        
## BuurtRode Kruisbuurt                                 -113.011          
##                                                      (107.268)         
##                                                                        
## BuurtSarphatiparkbuurt                                -18.431          
##                                                      (29.159)          
##                                                                        
## BuurtSarphatistrook                                   -21.387          
##                                                      (47.515)          
##                                                                        
## BuurtScheepvaarthuisbuurt                             -49.939          
##                                                      (55.242)          
##                                                                        
## BuurtScheldebuurt Midden                             -66.610*          
##                                                      (40.432)          
##                                                                        
## BuurtScheldebuurt Oost                                -52.420          
##                                                      (42.676)          
##                                                                        
## BuurtScheldebuurt West                                -56.412          
##                                                      (39.494)          
##                                                                        
## BuurtSchellingwoude Oost                              -74.026          
##                                                      (52.598)          
##                                                                        
## BuurtSchellingwoude West                              -33.126          
##                                                      (77.754)          
##                                                                        
## BuurtSchinkelbuurt Noord                             -60.626**         
##                                                      (29.268)          
##                                                                        
## BuurtSchinkelbuurt Zuid                               -47.684          
##                                                      (38.602)          
##                                                                        
## BuurtSchipluidenbuurt                                 -80.845          
##                                                      (87.593)          
##                                                                        
## BuurtScience Park Noord                               -85.771          
##                                                      (52.252)          
##                                                                        
## BuurtScience Park Zuid                                -31.913          
##                                                      (121.237)         
##                                                                        
## BuurtSlotermeer Zuid                                  -26.295          
##                                                      (61.936)          
##                                                                        
## BuurtSloterpark                                      -109.692          
##                                                      (80.759)          
##                                                                        
## BuurtSloterweg e.o.                                   -29.744          
##                                                      (76.438)          
##                                                                        
## BuurtSpaarndammerbuurt Midden                         -48.619          
##                                                      (66.511)          
##                                                                        
## BuurtSpaarndammerbuurt Noordoost                      -72.359          
##                                                      (59.274)          
##                                                                        
## BuurtSpaarndammerbuurt Noordwest                      -65.514          
##                                                      (71.583)          
##                                                                        
## BuurtSpaarndammerbuurt Zuidoost                       -46.163          
##                                                      (60.380)          
##                                                                        
## BuurtSpaarndammerbuurt Zuidwest                       -61.116          
##                                                      (57.782)          
##                                                                        
## BuurtSpiegelbuurt                                     -27.023          
##                                                      (51.038)          
##                                                                        
## BuurtSporenburg                                       -55.554          
##                                                      (41.916)          
##                                                                        
## BuurtSportpark Middenmeer Noord                       -32.591          
##                                                      (99.842)          
##                                                                        
## BuurtSportpark Middenmeer Zuid                       -131.982          
##                                                      (99.289)          
##                                                                        
## BuurtSportpark Voorland                               85.830           
##                                                      (167.968)         
##                                                                        
## BuurtSpuistraat Noord                                  0.904           
##                                                      (56.516)          
##                                                                        
## BuurtSpuistraat Zuid                                  32.642           
##                                                      (57.454)          
##                                                                        
## BuurtStaalmanbuurt                                    -59.548          
##                                                      (40.667)          
##                                                                        
## BuurtStaatsliedenbuurt Noordoost                      -68.186          
##                                                      (54.555)          
##                                                                        
## BuurtStationsplein e.o.                               -46.282          
##                                                      (173.062)         
##                                                                        
## BuurtSteigereiland Noord                              -49.491          
##                                                      (55.328)          
##                                                                        
## BuurtSteigereiland Zuid                               -34.645          
##                                                      (47.806)          
##                                                                        
## BuurtSurinamepleinbuurt                              -78.896**         
##                                                      (36.776)          
##                                                                        
## BuurtSwammerdambuurt                                  -39.870          
##                                                      (31.832)          
##                                                                        
## BuurtTeleport                                        -118.323          
##                                                      (130.491)         
##                                                                        
## BuurtTerrasdorp                                       -31.525          
##                                                      (77.992)          
##                                                                        
## BuurtTransvaalbuurt Oost                              -36.171          
##                                                      (31.724)          
##                                                                        
## BuurtTransvaalbuurt West                             -66.769*          
##                                                      (34.941)          
##                                                                        
## BuurtTrompbuurt                                       -53.360          
##                                                      (39.859)          
##                                                                        
## BuurtTuindorp Amstelstation                           -15.232          
##                                                      (71.493)          
##                                                                        
## BuurtTuindorp Frankendael                           -109.289**         
##                                                      (53.993)          
##                                                                        
## BuurtTuindorp Nieuwendam Oost                         -58.542          
##                                                      (57.391)          
##                                                                        
## BuurtTuindorp Nieuwendam West                         -53.332          
##                                                      (69.638)          
##                                                                        
## BuurtTuindorp Oostzaan Oost                           -37.804          
##                                                      (87.327)          
##                                                                        
## BuurtTuindorp Oostzaan West                           -48.146          
##                                                      (128.818)         
##                                                                        
## BuurtTwiske Oost                                      111.260          
##                                                      (147.206)         
##                                                                        
## BuurtTwiske West                                      -12.236          
##                                                      (105.979)         
##                                                                        
## BuurtUilenburg                                        -29.799          
##                                                      (55.766)          
##                                                                        
## BuurtUtrechtsebuurt Zuid                              21.866           
##                                                      (51.339)          
##                                                                        
## BuurtValeriusbuurt Oost                                6.817           
##                                                      (44.294)          
##                                                                        
## BuurtValeriusbuurt West                               -49.855          
##                                                      (37.695)          
##                                                                        
## BuurtValkenburg                                       -36.317          
##                                                      (57.688)          
##                                                                        
## BuurtVan Brakelkwartier                               -39.536          
##                                                      (51.601)          
##                                                                        
## BuurtVan der Helstpleinbuurt                          -44.095          
##                                                      (30.487)          
##                                                                        
## BuurtVan der Kunbuurt                                 -20.581          
##                                                      (75.096)          
##                                                                        
## BuurtVan der Pekbuurt                                 -50.042          
##                                                      (53.066)          
##                                                                        
## BuurtVan Loonbuurt                                    -7.171           
##                                                      (50.327)          
##                                                                        
## BuurtVan Tuyllbuurt                                   -44.009          
##                                                      (33.527)          
##                                                                        
## BuurtVelserpolder West                                28.348           
##                                                      (80.615)          
##                                                                        
## BuurtVeluwebuurt                                      -60.794          
##                                                      (69.439)          
##                                                                        
## BuurtVenserpolder Oost                                -28.936          
##                                                      (73.363)          
##                                                                        
## BuurtVliegenbos                                       -6.294           
##                                                      (59.410)          
##                                                                        
## BuurtVogelbuurt Noord                                 -71.456          
##                                                      (61.340)          
##                                                                        
## BuurtVogelbuurt Zuid                                  -18.066          
##                                                      (47.588)          
##                                                                        
## BuurtVogeltjeswei                                     51.135           
##                                                      (139.185)         
##                                                                        
## BuurtVondelpark Oost                                  -26.121          
##                                                      (85.869)          
##                                                                        
## BuurtVondelpark West                                  -55.508          
##                                                      (55.020)          
##                                                                        
## BuurtVondelparkbuurt Midden                           -24.871          
##                                                      (37.544)          
##                                                                        
## BuurtVondelparkbuurt Oost                             -35.075          
##                                                      (36.961)          
##                                                                        
## BuurtVondelparkbuurt West                             -26.193          
##                                                      (31.084)          
##                                                                        
## BuurtVU-kwartier                                      -52.747          
##                                                      (88.617)          
##                                                                        
## BuurtWalvisbuurt                                      -13.805          
##                                                      (110.385)         
##                                                                        
## BuurtWaterloopleinbuurt                               -20.338          
##                                                      (59.915)          
##                                                                        
## BuurtWeesperbuurt                                     -16.532          
##                                                      (46.578)          
##                                                                        
## BuurtWeespertrekvaart                                 -72.137          
##                                                      (65.568)          
##                                                                        
## BuurtWeesperzijde Midden/Zuid                         -49.829          
##                                                      (34.464)          
##                                                                        
## BuurtWerengouw Midden                                 -50.120          
##                                                      (61.254)          
##                                                                        
## BuurtWerengouw Noord                                  -36.424          
##                                                      (126.446)         
##                                                                        
## BuurtWerengouw Zuid                                   -75.029          
##                                                      (69.349)          
##                                                                        
## BuurtWestelijke eilanden                              -9.536           
##                                                      (61.139)          
##                                                                        
## BuurtWesterdokseiland                                 -73.945          
##                                                      (54.242)          
##                                                                        
## BuurtWestergasfabriek                                 -42.459          
##                                                      (70.428)          
##                                                                        
## BuurtWesterstaatsman                                  -48.431          
##                                                      (47.938)          
##                                                                        
## BuurtWestlandgrachtbuurt                             -70.155**         
##                                                      (30.578)          
##                                                                        
## BuurtWeteringbuurt                                    -3.612           
##                                                      (47.019)          
##                                                                        
## BuurtWG-terrein                                       -23.361          
##                                                      (32.934)          
##                                                                        
## BuurtWielingenbuurt                                   -61.859          
##                                                      (45.737)          
##                                                                        
## BuurtWildeman                                         -47.979          
##                                                      (62.072)          
##                                                                        
## BuurtWillemsparkbuurt Noord                           -13.429          
##                                                      (38.347)          
##                                                                        
## BuurtWillibrordusbuurt                                -26.745          
##                                                      (32.094)          
##                                                                        
## BuurtWittenburg                                       -32.771          
##                                                      (45.123)          
##                                                                        
## BuurtWoon- en Groengebied Sloterdijk                  -67.715          
##                                                      (90.874)          
##                                                                        
## BuurtZaagpoortbuurt                                   -49.088          
##                                                      (61.330)          
##                                                                        
## BuurtZamenhofstraat e.o.                              -66.876          
##                                                      (170.451)         
##                                                                        
## BuurtZeeburgerdijk Oost                               -72.896          
##                                                      (121.748)         
##                                                                        
## BuurtZeeburgereiland Noordoost                        -3.775           
##                                                      (102.027)         
##                                                                        
## BuurtZeeburgereiland Noordwest                        -27.687          
##                                                      (89.028)          
##                                                                        
## BuurtZeeburgereiland Zuidoost                         -36.626          
##                                                      (169.115)         
##                                                                        
## BuurtZeeburgereiland Zuidwest                         -69.398          
##                                                      (68.988)          
##                                                                        
## BuurtZeeheldenbuurt                                   -74.999          
##                                                      (55.385)          
##                                                                        
## BuurtZorgvlied                                        -94.071          
##                                                      (104.717)         
##                                                                        
## BuurtZuidas Noord                                     -83.150          
##                                                      (88.421)          
##                                                                        
## BuurtZuidas Zuid                                      -26.104          
##                                                      (61.132)          
##                                                                        
## BuurtZuiderhof                                        -75.118          
##                                                      (168.428)         
##                                                                        
## BuurtZuiderkerkbuurt                                   2.304           
##                                                      (54.491)          
##                                                                        
## BuurtZuidoostkwadrant Indische buurt                 -70.646*          
##                                                      (37.445)          
##                                                                        
## BuurtZuidwestkwadrant Indische buurt                  -54.647          
##                                                      (37.625)          
##                                                                        
## BuurtZuidwestkwadrant Osdorp Noord                    -84.459          
##                                                      (68.262)          
##                                                                        
## BuurtZuidwestkwadrant Osdorp Zuid                    -72.124*          
##                                                      (41.973)          
##                                                                        
## BuurtZunderdorp                                       -33.482          
##                                                      (82.884)          
##                                                                        
## host_is_superhostf                                    31.685           
##                                                      (84.154)          
##                                                                        
## host_is_superhostt                                    23.815           
##                                                      (84.235)          
##                                                                        
## room_typePrivate room                               -32.607***         
##                                                       (4.617)          
##                                                                        
## room_typeShared room                                  -13.293          
##                                                      (25.084)          
##                                                                        
## property_typeApartment                                -14.855          
##                                                      (25.359)          
##                                                                        
## property_typeBarn                                    -102.173          
##                                                      (88.710)          
##                                                                        
## property_typeBed and breakfast                        -15.156          
##                                                      (26.986)          
##                                                                        
## property_typeBoat                                     19.512           
##                                                      (27.354)          
##                                                                        
## property_typeBoutique hotel                            2.490           
##                                                      (43.863)          
##                                                                        
## property_typeBungalow                                 -6.151           
##                                                      (67.896)          
##                                                                        
## property_typeCabin                                    -6.376           
##                                                      (52.587)          
##                                                                        
## property_typeCamper/RV                                -69.812          
##                                                      (130.124)         
##                                                                        
## property_typeCampsite                                 -47.158          
##                                                      (126.653)         
##                                                                        
## property_typeCasa particular (Cuba)                   -17.525          
##                                                      (78.901)          
##                                                                        
## property_typeCastle                                   30.529           
##                                                      (168.702)         
##                                                                        
## property_typeChalet                                   -33.467          
##                                                      (100.904)         
##                                                                        
## property_typeCondominium                              -13.566          
##                                                      (27.189)          
##                                                                        
## property_typeCottage                                  -30.833          
##                                                      (57.238)          
##                                                                        
## property_typeEarth house                              -72.598          
##                                                      (167.475)         
##                                                                        
## property_typeGuest suite                              -26.590          
##                                                      (29.070)          
##                                                                        
## property_typeGuesthouse                               -23.336          
##                                                      (37.865)          
##                                                                        
## property_typeHostel                                   -19.507          
##                                                      (92.605)          
##                                                                        
## property_typeHotel                                  238.210***         
##                                                      (73.362)          
##                                                                        
## property_typeHouse                                    -19.031          
##                                                      (26.041)          
##                                                                        
## property_typeHouseboat                                -0.206           
##                                                      (28.167)          
##                                                                        
## property_typeLighthouse                              416.346**         
##                                                      (178.880)         
##                                                                        
## property_typeLoft                                     13.029           
##                                                      (26.832)          
##                                                                        
## property_typeNature lodge                             14.917           
##                                                      (179.227)         
##                                                                        
## property_typeOther                                    -18.139          
##                                                      (35.428)          
##                                                                        
## property_typeServiced apartment                       23.628           
##                                                      (33.517)          
##                                                                        
## property_typeTent                                     -12.215          
##                                                      (187.455)         
##                                                                        
## property_typeTiny house                                9.720           
##                                                      (79.513)          
##                                                                        
## property_typeTownhouse                                -26.472          
##                                                      (26.465)          
##                                                                        
## property_typeVilla                                    -10.312          
##                                                      (41.976)          
##                                                                        
## bed_typeCouch                                         -30.815          
##                                                      (122.179)         
##                                                                        
## bed_typeFuton                                         -6.669           
##                                                      (81.656)          
##                                                                        
## bed_typePull-out Sofa                                  0.985           
##                                                      (77.276)          
##                                                                        
## bed_typeReal Bed                                      16.282           
##                                                      (75.233)          
##                                                                        
## minimum_nights                                         0.304           
##                                                       (0.200)          
##                                                                        
## dist.museum                                            0.002           
##                                                       (0.011)          
##                                                                        
## dist.supermarkets                                      0.002           
##                                                       (0.015)          
##                                                                        
## Unescowithin                                          -10.741          
##                                                      (35.152)          
##                                                                        
## dist.metro                                            -0.021           
##                                                       (0.015)          
##                                                                        
## dist.plaza                                            -0.017           
##                                                       (0.014)          
##                                                                        
## dist.nightclub                                         0.009           
##                                                       (0.013)          
##                                                                        
## dist.beach                                            -0.008           
##                                                       (0.012)          
##                                                                        
## dist.parks                                                             
##                                                                        
##                                                                        
## name.brightnot bright                                 -3.885           
##                                                       (6.404)          
##                                                                        
## name.spaciousspacious                                 7.942*           
##                                                       (4.403)          
##                                                                        
## name.luxurynot luxury                               -34.059***         
##                                                       (7.140)          
##                                                                        
## amenities.number                                      0.469**          
##                                                       (0.188)          
##                                                                        
## lagPrice                                              -0.035*          
##                                                       (0.018)          
##                                                                        
## Constant                                            -4,033.317         
##                                                    (28,094.420)        
##                                                                        
## -----------------------------------------------------------------------
## Observations                                          13,232           
## R2                                                     0.153           
## Adjusted R2                                            0.116           
## Residual Std. Error                            164.612 (df = 12687)    
## F Statistic                                 4.207*** (df = 544; 12687) 
## =======================================================================
## Note:                                       *p<0.1; **p<0.05; ***p<0.01

The R-squared of this model is 0.153 (adjusted R-squared 0.116), which means that the regression explains only about 15% of the variance in listing prices in Amsterdam. In terms of accuracy, the algorithm doesn’t perform well enough.
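As a reminder of what the R-squared statistic measures, the following minimal sketch recomputes it by hand from the residuals of a fitted model. It uses R's built-in `mtcars` data and a toy formula (`mpg ~ wt`) purely for illustration; neither is part of this report's Airbnb data or model.

```r
# Toy illustration on R's built-in mtcars data, not the Airbnb listings.
fit <- lm(mpg ~ wt, data = mtcars)

# Residual and total sums of squares for the response variable.
ss_res <- sum(residuals(fit)^2)
ss_tot <- sum((mtcars$mpg - mean(mtcars$mpg))^2)

# R-squared is the share of variance in the response explained by the model.
r2_manual <- 1 - ss_res / ss_tot

# The hand-computed value agrees with the one reported by summary().
print(all.equal(r2_manual, summary(fit)$r.squared))
```

Read this way, an R-squared of 0.153 is a statement about explained variance, not about the share of listings that are predicted correctly.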

# Absolute error (AE) and absolute percentage error (APE) of the predicted price
ams.test.price.table <- ams.test.table.all %>%
  dplyr::select(id, month, price.Predict, each_month_price, Buurt) %>%
  mutate(AE = abs(each_month_price - price.Predict),
         APE = AE / each_month_price)
ggplot(ams.test.price.table, aes(x=APE)) + 
  labs(title = "APE Distribution",caption = "Figure XX. A histogram of APE") +
  geom_histogram()+
  plotTheme()

ggplot(ams.test.price.table%>%
         filter(APE<1.5), aes(x=APE)) + 
  labs(title = "APE Distribution",caption = "Figure XX. A histogram of APE") +
  geom_histogram()+
  plotTheme()

The absolute percentage errors (APE) of price for the test set have a positively skewed distribution. Most APEs are close to 0.15, and fewer than 7% of the APEs are higher than 1.5. APEs higher than 10 might be caused by outliers, whose prices are usually extremely high or low.
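The share of APEs above a threshold can be computed directly as the mean of a logical vector. The sketch below shows this on a small hypothetical data frame; the four rows are made-up values for illustration only, and only the column names `each_month_price` and `price.Predict` follow the report's code.

```r
# Hypothetical toy values for illustration only; not taken from the report's data.
toy <- data.frame(
  each_month_price = c(100, 80, 250, 40),
  price.Predict    = c(115, 90, 180, 105)
)

# Absolute error and absolute percentage error, as defined in the report's code.
toy$AE  <- abs(toy$each_month_price - toy$price.Predict)
toy$APE <- toy$AE / toy$each_month_price

# Share of observations whose APE exceeds the threshold (1.5, as in the text).
share_above <- mean(toy$APE > 1.5)

print(toy)
print(share_above)
```

In this toy example the third row has the largest absolute error, but the low-priced fourth row has the largest APE, because dividing by a small actual price inflates the percentage error.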

  • MAE and MAPE by month
ams.test.table.all %>% 
  drop_na(price.AbsError)%>%
  group_by(month)%>%
  summarise(MAE=mean(price.AbsError),
            MAPE = mean(price.AbsError/each_month_price))%>%
  kable() %>% kable_styling()
month MAE MAPE
1 70.38180 0.4851576
2 69.99850 0.4803573
3 70.05682 0.4800525
4 71.33608 0.4851716
5 70.89487 0.4826700
6 71.01547 0.4829015
7 71.01024 0.4825626
8 70.91579 0.4822727
9 71.07073 0.4827872
10 70.72211 0.4811256
11 70.54772 0.4802203
12 70.60160 0.4873070
ams.test.table.all %>% 
    drop_na(price.AbsError)%>%
  group_by(month)%>%
  summarise(MAE=mean(price.AbsError),
            MAPE = mean(price.AbsError/each_month_price))%>%
ggplot(aes(month,MAPE)) + 
      geom_line(size = 1.1,colour = "#4757a2") + 
      labs(title = "MAPE by Month", 
           subtitle = "Amsterdam Airbnb price by month prediction",  
           x = "Month", y= "MAPE") +
      plotTheme()

The figure above shows how the prediction error changes over the year. On the test set, MAPE is highest in December (0.487) and lowest in March (0.480), with November a close second-lowest; the spread across months is less than one percentage point. The prediction is not accurate enough either, as all monthly MAPEs are higher than 48%.
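For reference, the two error measures summarised by month above can be written out as follows. This is a sketch of the definitions implied by the code, assuming that `price.AbsError` holds the absolute difference between the observed price `each_month_price` (written $y_{i,m}$) and the predicted price `price.Predict` (written $\hat{y}_{i,m}$), and that $n_m$ is the number of test listings with a non-missing error in month $m$. The symbols are introduced here for illustration and do not appear in the original report.

```latex
\mathrm{MAE}_m = \frac{1}{n_m} \sum_{i=1}^{n_m} \left| y_{i,m} - \hat{y}_{i,m} \right|,
\qquad
\mathrm{MAPE}_m = \frac{1}{n_m} \sum_{i=1}^{n_m} \frac{\left| y_{i,m} - \hat{y}_{i,m} \right|}{y_{i,m}}
```

Because each term of the MAPE is divided by the observed price, listings with very low prices contribute disproportionately to it, while the MAE is dominated by listings with large absolute deviations.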

  • MAE and MAPE by Buurt (April)
ams.test.prediction[[4]]%>%
  group_by(Buurt) %>%
  summarize(mean.MAPE = mean(price.AbsError/each_month_price, na.rm = TRUE),
            mean.MAE = mean(price.AbsError, na.rm = TRUE)) %>% kable() %>% kable_styling()
Buurt mean.MAPE mean.MAE
Aalsmeerwegbuurt Oost 0.8482999 79.579513
Aalsmeerwegbuurt West 0.5600389 73.681524
Alexanderplein e.o. 0.0799265 12.218879
Amstelglorie 0.2849965 71.249123
Amstelkwartier Noord 0.4716919 49.099742
Amstelveldbuurt 0.7942906 91.008419
Amsterdamse Bos 0.1334867 28.376262
Amsterdamse Poort 0.6354682 78.706971
Andreasterrein 0.7963976 66.762241
Anjeliersbuurt Noord 0.4586842 76.110797
Anjeliersbuurt Zuid 0.4579480 57.785337
Architectenbuurt 0.2833719 45.718874
Balboaplein e.o. 0.5926541 51.735064
Banne Noordoost 0.7953534 43.707293
Banne Noordwest 0.2068969 35.066160
Banne Zuidoost 0.4469958 36.459125
Banne Zuidwest 0.1587063 24.094442
Banpleinbuurt 0.7215072 84.288270
Bedrijventerrein Hamerstraat 0.2743106 37.340776
Bedrijventerrein Landlust 0.2595329 42.056001
Bedrijventerrein Schinkel 0.1204660 18.124364
Beethovenbuurt 0.7347758 101.773593
Begijnhofbuurt 0.8888703 97.085111
Belgiëplein e.o. 0.2195546 28.040285
Bellamybuurt Noord 0.5088203 64.597126
Bellamybuurt Zuid 0.3558700 48.095118
Bertelmanpleinbuurt 0.7582130 99.690054
Betondorp 0.2302631 46.859817
BG-terrein e.o. 0.4931558 78.759158
Bijlmermuseum Noord 0.1038183 16.971173
Bijlmermuseum Zuid 1.4721792 88.330751
Blauwe Zand 0.6842318 88.731578
Bloemenbuurt Noord 0.4583512 54.902015
Bloemenbuurt Zuid 0.3833271 63.303796
Bloemgrachtbuurt 0.3603517 60.133303
Borgerbuurt 0.4520868 72.693456
Borneo 0.5697911 77.888762
Bosleeuw 0.4316624 58.594925
Buiksloterbreek 0.5908330 90.940848
Buikslotermeer Noord 0.5250506 38.523545
Buikslotermeerplein 0.1971705 17.444029
Buitenveldert Midden Zuid 0.7014925 83.373883
Buitenveldert Oost Midden 0.3316740 61.735455
Buitenveldert Zuidoost 0.5863596 54.418563
Buitenveldert Zuidwest 0.6431069 113.784307
Burgemeester Tellegenbuurt Oost 0.8312302 88.744462
Burgemeester Tellegenbuurt West 0.5175902 65.932065
Burgwallen Oost 0.6324443 124.664085
Buurt 2 0.4738365 52.628819
Buurt 3 0.3915345 48.559641
Buurt 4 Oost 0.8100930 104.535326
Buurt 5 Noord 2.0632156 72.571100
Buurt 5 Zuid 1.1169185 93.209349
Buurt 6 0.2406804 71.722746
Buurt 7 0.5235892 62.307113
Buurt 8 0.3669519 60.828894
Buurt 9 0.1850751 36.829945
Buyskade e.o. 0.4318008 61.477978
Calandlaan/Lelylaan 0.9917287 59.920757
Columbusplein e.o. 0.6254155 63.934976
Concertgebouwbuurt 0.4735893 67.967516
Cornelis Schuytbuurt 0.5790923 113.141056
Cornelis Troostbuurt 0.3969562 59.774625
Cremerbuurt Oost 0.3302053 59.702585
Cremerbuurt West 0.4878790 57.049963
Czaar Peterbuurt 0.3504581 46.833566
D-buurt 0.6769604 270.784176
Da Costabuurt Noord 0.5006239 83.813219
Da Costabuurt Zuid 0.4270985 50.522190
Dapperbuurt Noord 0.2907810 50.664405
Dapperbuurt Zuid 0.5540819 81.721536
De Aker Oost 0.2619664 40.816218
De Aker West 0.4847173 51.996913
De Bongerd 0.6171991 60.954868
De Eenhoorn 0.5492506 66.388288
De Kleine Wereld 0.3985720 85.664065
De Klenckebuurt 0.9342866 70.071495
De Omval 0.3145614 114.391229
De Punt 0.2417326 26.973728
De Wester Quartier 0.4425673 79.237986
De Wetbuurt 0.4491094 55.644755
De Wittenbuurt Noord 0.4543091 84.574308
De Wittenbuurt Zuid 0.3591843 87.693271
Delflandpleinbuurt Oost 0.4538331 36.697761
Delflandpleinbuurt West 0.2130858 36.328263
Den Texbuurt 0.5963689 112.729259
Diamantbuurt 0.7306635 73.548507
Diepenbrockbuurt 0.3696382 65.094564
Don Bosco 0.4472811 52.971588
Dorp Sloten 1.0370500 97.714523
Driehoekbuurt 0.4568119 51.140963
Duivelseiland 0.4856961 79.323892
Durgerdam 0.6055265 47.145776
E-buurt 0.3245683 118.352711
Ecowijk 0.4415434 54.110907
Elandsgrachtbuurt 0.4887473 86.326062
Elzenhagen Noord 0.7784932 61.339132
Emanuel van Meterenbuurt 0.7411112 70.028660
Entrepot-Noordwest 0.2624291 45.352113
Erasmusparkbuurt Oost 1.2002243 124.544878
Erasmusparkbuurt West 0.4805604 68.561989
F-buurt 0.3331667 93.866256
Fannius Scholtenbuurt 0.3749251 44.453507
Felix Meritisbuurt 0.3778189 63.279907
Filips van Almondekwartier 0.2922553 56.382785
Frankendael 0.8106249 113.097300
Frans Halsbuurt 0.4691876 61.681336
Frederik Hendrikbuurt Noord 0.4587901 66.475724
Frederik Hendrikbuurt Zuidoost 0.3511195 76.532073
Frederik Hendrikbuurt Zuidwest 0.3379898 63.883093
Frederikspleinbuurt 1.1527539 116.575727
G-buurt Noord 0.6484979 70.794349
G-buurt Oost 1.2639613 151.702362
G-buurt West 0.9167907 90.964192
Gaasperdam Noord 0.3872064 56.636388
Gaasperpark 0.1515962 13.997378
Gaasperplas 0.1340792 26.876783
Gein Noordoost 1.1302352 75.598301
Gein Noordwest 0.2536248 22.018684
Gein Zuioost 0.3455533 18.385441
Gelderlandpleinbuurt 0.4478324 100.145434
Gerard Doubuurt 0.4537865 65.958462
Geuzenhofbuurt 0.4222017 61.644382
Gibraltarbuurt 0.4317405 58.729588
Gouden Bocht 0.2361294 52.235650
Groenmarktkadebuurt 0.5715233 52.246039
Grunder/Koningshoef 0.5257551 262.877527
Haarlemmerbuurt Oost 0.8266043 154.160345
Haarlemmerbuurt West 0.4043421 69.869784
Hakfort/Huigenbos 0.2441623 32.620247
Harmoniehofbuurt 0.2353300 43.145657
Haveneiland Noord 1.5863567 124.485870
Haveneiland Noordoost 0.3128995 48.176464
Haveneiland Noordwest 0.4092326 70.127878
Haveneiland Oost 0.2933539 69.587172
Haveneiland Zuidwest/Rieteiland West 0.7364491 106.982657
Helmersbuurt Oost 0.4618901 79.713689
Hemelrijk 0.4317231 96.111283
Hemonybuurt 0.4423449 139.205114
Hercules Seghersbuurt 0.7195793 77.263532
Het Funen 0.2962107 41.026052
Holendrecht Oost 0.9216839 174.437212
Hondecoeterbuurt 0.6042666 69.750940
Hoptille 0.7155618 241.240931
Houthavens Oost 0.2478251 28.643737
Houthavens West 0.5580632 131.384067
IJplein e.o. 0.7746175 64.807206
IJsbaanpad e.o. 0.2944515 58.405035
IJselbuurt Oost 0.4040523 59.208971
IJselbuurt West 0.2250278 48.056648
Jacob Geelbuurt 0.2435287 60.882181
Jacques Veldmanbuurt 0.6149482 80.942563
Jan Maijenbuurt 0.5445055 67.156018
Java-eiland 0.6497427 155.364872
Johan Jongkindbuurt 0.4574941 49.439861
Johannnes Vermeerbuurt 0.5919263 67.143663
John Franklinbuurt 0.3455264 61.693945
Julianapark 0.7111052 52.277001
K-buurt Zuidoost 0.1404654 12.314133
Kadijken 0.6919844 78.682644
Kadoelen 0.5597032 68.070761
Kalverdriehoek 0.4857547 92.341819
Kantershof 0.2474689 20.330889
Kattenburg 0.2829312 46.169090
Kazernebuurt 0.3308971 111.184260
KNSM-eiland 0.3957661 60.799917
Kolenkitbuurt Noord 0.3379981 77.143240
Kolenkitbuurt Zuid 0.5293842 64.627318
Koningin Wilhelminaplein 0.3467333 77.314123
Kop Zeedijk 0.7679214 84.459160
Kop Zuidas 0.5471934 180.726880
Kortenaerkwartier 0.4395164 85.457909
Kortvoort 0.3171244 31.078190
Kromme Mijdrechtbuurt 0.4014451 55.302712
L-buurt 0.2862584 37.588169
Laan van Spartaan 0.4326749 55.661925
Landlust Noord 0.4031663 50.468995
Landlust Zuid 0.4596964 55.699365
Langestraat e.o. 0.5717155 98.583127
Lastage 0.3207628 93.175090
Legmeerpleinbuurt 1.1525055 205.971666
Leidsebuurt Noordoost 0.5124568 86.971748
Leidsebuurt Noordwest 0.6009920 54.444048
Leidsebuurt Zuidoost 0.4765612 83.620457
Leidsebuurt Zuidwest 0.3178246 72.987227
Leidsegracht Noord 0.4067210 72.323268
Leidsegracht Zuid 0.4505997 49.494991
Leliegracht e.o. 0.4090148 61.579954
Linnaeusparkbuurt 0.6756746 63.628615
Lizzy Ansinghbuurt 0.4101128 62.669200
Loenermark 0.0204618 3.620532
Lootsbuurt 0.4705289 314.834424
Louis Crispijnbuurt 0.4821637 103.663073
Lucas/Andreasziekenhuis e.o. 0.3439832 117.529657
Marathonbuurt Oost 0.5796580 75.706013
Marathonbuurt West 0.4207985 52.653848
Marcanti 0.3545225 46.444569
Marine-Etablissement 0.3776606 126.622811
Markengouw Midden 0.4168370 150.972435
Markthallen 0.3960551 53.846668
Marnixbuurt Midden 0.3352077 82.333830
Marnixbuurt Noord 0.3809387 53.775860
Marnixbuurt Zuid 0.8561556 136.417082
Meer en Oever 0.1338822 10.752959
Mercatorpark 0.2872377 36.743469
Middelveldsche Akerpolder 0.4887032 25.773783
Middenmeer Noord 0.3671617 83.669745
Middenmeer Zuid 0.5353730 60.452547
Minervabuurt Midden 0.8913877 115.931465
Minervabuurt Noord 0.7229124 85.152927
Minervabuurt Zuid 0.7734265 140.297673
Molenwijk 0.1747834 29.713184
Museumplein 0.0183529 2.165643
NDSM terrein 2.2758186 552.898138
Nes e.o. 0.4708715 60.214050
Nieuw Sloten Noordoost 0.3521263 59.569457
Nieuw Sloten Noordwest 1.1000836 93.537092
Nieuw Sloten Zuidoost 0.3597568 33.285712
Nieuw Sloten Zuidwest 0.6293183 56.009331
Nieuwe Kerk e.o. 0.3604292 58.210499
Nieuwendammerdijk Oost 0.2469759 45.954117
Nieuwendammmerdijk West 0.3336743 62.211233
Nieuwendijk Noord 0.4296475 55.746268
Nieuwmarkt 0.8617167 127.071381
Noordoever Sloterplas 0.3170490 52.520363
Noordoostkwadrant Indische buurt 0.3781155 46.717299
Noordwestkwadrant Indische buurt Noord 0.4542709 51.977908
Noordwestkwadrant Indische buurt Zuid 0.4431618 82.774067
Olympisch Stadion e.o. 0.1692985 20.432150
Ookmeer 1.3271658 53.086633
Oostelijke Handelskade 0.2350411 33.784755
Oostenburg 0.4914992 70.935988
Oosterpark 0.5544603 50.825482
Oosterparkbuurt Noordwest 0.4262980 46.724698
Oosterparkbuurt Zuidoost 0.5386380 63.069634
Oosterparkbuurt Zuidwest 0.5681674 66.860011
Oostoever Sloterplas 0.6189193 80.683528
Oostpoort 0.2530412 36.154703
Oostzanerdijk 0.5860497 58.575392
Orteliusbuurt Midden 0.3374337 65.416746
Orteliusbuurt Noord 0.3548012 55.990240
Orteliusbuurt Zuid 0.4936860 67.559934
Osdorp Midden Noord 0.3853096 45.668149
Osdorp Midden Zuid 1.2103963 105.292117
Osdorp Zuidoost 0.5783300 43.569331
Osdorper Binnenpolder 0.0101763 1.690962
Osdorpplein e.o. 0.1758429 22.824592
Oude Kerk e.o. 0.3575976 70.234766
Overhoeks 0.7858372 196.459302
Overtoomse Veld Noord 0.6379286 111.874307
Overtoomse Veld Zuid 0.2386703 27.142563
P.C. Hooftbuurt 0.4664210 106.138925
Papaverweg e.o. 0.5339062 127.451147
Paramariboplein e.o. 0.4478085 62.949053
Park de Meer 0.1239049 28.022190
Parooldriehoek 0.3859117 34.765705
Passeerdersgrachtbuurt 0.8228697 109.032316
Pieter van der Doesbuurt 0.3782911 117.747709
Plan van Gool 0.4913810 58.311358
Planciusbuurt Noord 2.8902092 289.020922
Plantage 0.6343966 80.655213
Postjeskade e.o. 0.4419432 53.149338
Prinses Irenebuurt 0.7170671 93.014561
RAI 0.3241991 43.841165
Ransdorp 0.6508467 65.735515
Rapenburg 0.2795397 41.633865
Rechte H-buurt 0.9461994 47.309972
Reguliersbuurt 0.5647622 114.614390
Reigersbos Midden 0.6220575 79.687605
Reigersbos Noord 0.2494638 29.436723
Rembrandtpark Noord 0.2279013 20.119756
Rembrandtpark Zuid 0.3756689 58.815867
Rembrandtpleinbuurt 0.5770631 83.680500
RI Oost terrein 0.3942268 76.361052
Rietlanden 1.1021497 84.794203
Rijnbuurt Midden 0.3535228 46.671466
Rijnbuurt Oost 0.4668837 53.972267
Rijnbuurt West 0.5145902 73.108918
Robert Scottbuurt Oost 0.3875369 47.877897
Robert Scottbuurt West 0.4228762 49.869398
Rode Kruisbuurt 0.5042654 130.604741
Sarphatiparkbuurt 0.5582229 81.440191
Sarphatistrook 0.5018880 79.881132
Scheepvaarthuisbuurt 0.3099189 75.401826
Scheldebuurt Midden 0.6285853 57.485704
Scheldebuurt Oost 0.4499047 85.805308
Scheldebuurt West 0.4864030 62.244387
Schellingwoude Oost 0.4425946 59.332013
Schellingwoude West 0.4969251 55.962510
Schinkelbuurt Noord 0.3246526 51.525083
Schinkelbuurt Zuid 0.3732498 82.881073
Science Park Noord 0.6192178 112.995634
Science Park Zuid 0.4399238 84.025446
Slotermeer Zuid 1.2621779 82.001443
Sloterweg e.o. 0.6498752 227.456317
Spaarndammerbuurt Midden 0.3160943 33.339270
Spaarndammerbuurt Noordoost 0.3279943 37.079530
Spaarndammerbuurt Noordwest 1.0297800 115.245887
Spaarndammerbuurt Zuidoost 0.3871032 79.184414
Spaarndammerbuurt Zuidwest 0.3195916 49.860548
Spiegelbuurt 0.2792729 55.409908
Sporenburg 0.4414433 53.534428
Spuistraat Noord 0.4640802 61.553048
Spuistraat Zuid 0.7098804 131.061108
Staalmanbuurt 0.4505089 90.206901
Staatsliedenbuurt Noordoost 0.7669482 67.225401
Steigereiland Noord 0.2776520 40.480090
Steigereiland Zuid 0.6908766 61.601743
Surinamepleinbuurt 0.7000518 88.760202
Swammerdambuurt 0.2685603 37.035588
Terrasdorp 0.5857807 99.179897
Transvaalbuurt Oost 0.5223970 63.690756
Transvaalbuurt West 0.3684852 47.644309
Trompbuurt 0.3605183 56.770372
Tuindorp Frankendael 0.7056819 41.040293
Tuindorp Nieuwendam Oost 0.3927176 86.976683
Tuindorp Nieuwendam West 0.3615555 55.181241
Tuindorp Oostzaan Oost 0.4014273 80.419388
Tuindorp Oostzaan West 1.0461332 146.458653
Twiske West 0.5448704 58.659654
Uilenburg 0.4159883 114.104819
Utrechtsebuurt Zuid 0.4662460 79.835607
Valeriusbuurt Oost 0.5591964 111.103936
Valeriusbuurt West 0.6877767 87.974913
Valkenburg 0.3669002 59.565454
Van Brakelkwartier 0.3390716 46.385607
Van der Helstpleinbuurt 0.5036071 79.075316
Van der Pekbuurt 0.3776711 68.033958
Van Loonbuurt 0.6861531 91.968889
Van Tuyllbuurt 0.6991448 71.507267
Velserpolder West 0.8826984 85.985805
Veluwebuurt 0.8286163 43.745202
Venserpolder Oost 0.6134227 179.403041
Vliegenbos 0.5818375 182.981681
Vogelbuurt Noord 0.3149512 61.686386
Vogelbuurt Zuid 0.5934294 96.220170
Vondelpark West 0.3750202 73.658552
Vondelparkbuurt Midden 0.4511767 78.040057
Vondelparkbuurt Oost 0.3997721 60.698904
Vondelparkbuurt West 0.4052725 62.194751
VU-kwartier 0.2805235 40.440631
Walvisbuurt 0.2959902 17.759410
Waterloopleinbuurt 0.3216620 48.344516
Weesperbuurt 0.4317617 73.956656
Weespertrekvaart 0.3828840 114.087231
Weesperzijde Midden/Zuid 0.4406814 63.144185
Werengouw Midden 0.3985734 45.940976
Werengouw Zuid 1.1130987 100.951415
Westelijke eilanden 0.6613661 86.072107
Westerdokseiland 0.3847410 74.119086
Westergasfabriek 0.4198981 62.299820
Westerstaatsman 0.4444843 55.513248
Westlandgrachtbuurt 0.3919547 52.376319
Weteringbuurt 0.5481243 91.275984
WG-terrein 0.5633897 84.782358
Wielingenbuurt 0.3816397 62.559760
Wildeman 2.3677661 148.591874
Willemsparkbuurt Noord 0.4455301 96.628472
Willibrordusbuurt 0.3994362 52.051667
Wittenburg 0.4384509 47.696502
Woon- en Groengebied Sloterdijk 0.6242589 59.292091
Zaagpoortbuurt 0.4231808 85.207364
Zeeburgereiland Zuidwest 0.3232985 29.116638
Zeeheldenbuurt 0.3617490 69.435192
Zuidas Zuid 1.2330960 123.472047
Zuiderkerkbuurt 0.5601882 121.844265
Zuidoostkwadrant Indische buurt 0.2654702 51.187997
Zuidwestkwadrant Indische buurt 0.4932241 51.022590
Zuidwestkwadrant Osdorp Noord 0.2924177 58.483548
Zuidwestkwadrant Osdorp Zuid 0.4055739 44.095377
Zunderdorp 0.2000008 26.373806
ams.test.prediction[[4]]%>%
  group_by(Buurt) %>%
  summarize(mean.MAPE = mean(price.AbsError/each_month_price, na.rm = T),
            mean.MAE = mean(price.AbsError, na.rm = T)) %>%
  ungroup() %>% 
  left_join(neighbor2,by = "Buurt") %>%
    st_sf() %>%
    ggplot() + 
      geom_sf(aes(fill = mean.MAPE),colour = 'transparent') +
      scale_fill_gradient(low = palette5[1], high = palette5[5],
                          name = "MAPE") +
      labs(title = "Mean test set MAPE by Buurt",
           subtitle = "April, 2018") +
      mapTheme

The figure and table above show that our model generalizes across space, but its accuracy is weaker than its generalizability. This is probably because we include many spatial features but, for lack of data, miss other key factors such as time effects.
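For reference, the two error columns summarized by Buurt can be written out as follows. This is a sketch that assumes `price.AbsError` is the absolute difference between the observed and the predicted monthly price (its definition is not shown in this section); the symbols $S_b$ (test-set observations in Buurt $b$), $p_i$ (observed price) and $\hat{p}_i$ (predicted price) are notation introduced here for illustration.

```latex
\mathrm{MAE}_b = \frac{1}{|S_b|} \sum_{i \in S_b} \left| p_i - \hat{p}_i \right|,
\qquad
\mathrm{MAPE}_b = \frac{1}{|S_b|} \sum_{i \in S_b} \frac{\left| p_i - \hat{p}_i \right|}{p_i}
```

Because MAPE divides by the observed price, a Buurt with cheap listings can show a high MAPE even when its MAE is small.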

5.1.3 Occupancy Regression

occupancy3 <- merge(occupancy3, listing.sf.neighbor2[c("id", "Buurt")], by = "id")

set.seed(5164)

month.var <- c(1:12)

Occupancy.monthList <- list()
ams.train <- list()
ams.test <- list()
ams.test.prediction <- list()
ams.test.table <- list()

Jan_occupancy <- st_drop_geometry(price_panel_lag)%>%
  filter(month == 1)%>%
  mutate(bathrooms = as.numeric(bathrooms),
         bedrooms = as.numeric(bedrooms))

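# Stratified 60/40 split defined on the January rows. The same row indices are
# reused for every month in the loop below, which assumes each monthly subset
# holds the same listings in the same order.
inTrain <- createDataPartition(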
              y = paste(Jan_occupancy$pool,Jan_occupancy$Buurt,Jan_occupancy$property_type,
                        Jan_occupancy$host_is_superhost), 
              p = .60, list = FALSE)

for (i in month.var){
Occupancy.monthList[[i]] <- 
  st_drop_geometry(price_panel_lag) %>% 
  mutate(bathrooms = as.numeric(bathrooms),
         bedrooms = as.numeric(bedrooms) )%>%
  filter(month == i) 

ams.train[[i]] <- Occupancy.monthList[[i]][inTrain,] 
ams.test[[i]]  <- Occupancy.monthList[[i]][-inTrain,]

reg.occupancy <- lm(monthly_occupancy ~ ., 
                data = ams.train[[i]] %>% 
                dplyr::select(monthly_occupancy, beds, bedrooms, bathrooms, accommodates,
                              pool, parking, kitchen, AC, fireplace,
                              Buurt,host_is_superhost,
                              room_type,property_type,bed_type,
                              minimum_nights,dist.museum,dist.supermarkets,
                              Unesco,dist.metro,dist.plaza, dist.nightclub,
                              dist.beach, dist.parks,
                              name.bright, name.spacious,name.luxury,
                              amenities.number))

ams.test.prediction[[i]] <-
  ams.test[[i]] %>%
  mutate(occupancy.Predict =  predict(reg.occupancy, ams.test[[i]]),
         occupancy.AbsError = abs(monthly_occupancy - occupancy.Predict))

if(i ==1){
    ams.test.table.all <- ams.test.prediction[[i]]
    }else{
      ams.test.table[[i]] <- ams.test.prediction[[i]]
      ams.test.table.all <- rbind(ams.test.table.all,ams.test.table[[i]])
    }
      
}
ams.test.occupancy.table <- ams.test.table.all %>%
  dplyr::select(id, month, occupancy.Predict, monthly_occupancy, Buurt) %>%
  mutate(AE = abs(monthly_occupancy-occupancy.Predict),
         APE = abs(monthly_occupancy-occupancy.Predict)/monthly_occupancy)
ggplot(ams.test.occupancy.table, aes(x=APE)) + 
  labs(title = "APE Distribution",caption = "Figure XX. A histogram of APE") +
  geom_histogram()+
  plotTheme()

ggplot(ams.test.occupancy.table %>%
        filter(APE<1.5),
         aes(x=APE)) + 
  labs(title = "APE Distribution",caption = "Figure XX. A histogram of APE") +
  geom_histogram()+
  plotTheme()

The absolute percentage errors (APEs) of occupancy on the test set have a positively skewed distribution. Most APEs are close to 0.15, and fewer than 6% of the APEs are higher than 1.5. APEs higher than 10 are probably caused by outliers whose observed occupancy is very low (e.g. 0 per month).
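The effect of low observed occupancy on APE can be illustrated with a minimal sketch. The data frame `toy` and its values are hypothetical and only mirror the column names used above; the guard at the end is one possible choice, not the approach taken in this report.

```r
# Toy monthly occupancy data (hypothetical values, not from the Airbnb panel)
toy <- data.frame(
  monthly_occupancy = c(20, 10, 4, 0),
  occupancy.Predict = c(23, 9, 5, 2)
)

# Absolute error and absolute percentage error, mirroring the mutate() call above
toy$AE  <- abs(toy$monthly_occupancy - toy$occupancy.Predict)
toy$APE <- toy$AE / toy$monthly_occupancy

# The listing with zero observed occupancy yields AE / 0 = Inf
print(toy)

# One possible guard: treat APE as undefined (NA) when observed occupancy is 0
toy$APE.guarded <- ifelse(toy$monthly_occupancy > 0, toy$APE, NA_real_)
mean(toy$APE.guarded, na.rm = TRUE)
```

With a guard like this the mean APE stays finite, at the cost of leaving vacant months out of the summary.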

5.1.4 Calculate annual revenue

ams.test.revenue.table <-
  merge(ams.test.occupancy.table[c("id","month","occupancy.Predict","monthly_occupancy","Buurt")],
        ams.test.price.table[c("id","month","price.Predict","each_month_price")],by=c("id","month")) %>%
  mutate(revenue = monthly_occupancy * each_month_price,
         predictRevenue = occupancy.Predict*price.Predict) %>%
  group_by(id, Buurt)%>%
  summarise(annualRevenue = sum(revenue),
            predictAnnualRev = sum(predictRevenue))%>%
  mutate(AE= abs(annualRevenue-predictAnnualRev),
         APE = abs(annualRevenue-predictAnnualRev)/annualRevenue)


ggplot(ams.test.revenue.table,
         aes(x=APE)) + 
  labs(title = "APE Distribution",caption = "Figure XX. A histogram of APE") +
  geom_histogram()+
  plotTheme()

ggplot(ams.test.revenue.table %>%
        filter(APE<1.5),
         aes(x=APE)) + 
  labs(title = "APE Distribution",caption = "Figure XX. A histogram of APE") +
  geom_histogram()+
  plotTheme()

Most annual revenues predicted by our first approach have an APE close to 0.15, which indicates accurate prediction. However, some predicted revenues still have an APE higher than 1, which may cause problems in our use case.
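In symbols, the revenue calculation in this subsection amounts to the following. The notation is introduced here for illustration: $o_{i,m}$ and $p_{i,m}$ are the observed occupancy and price of listing $i$ in month $m$, hats mark the model predictions, and the sums run over the twelve months when the panel is complete for that listing.

```latex
R_i = \sum_{m=1}^{12} o_{i,m}\, p_{i,m},
\qquad
\hat{R}_i = \sum_{m=1}^{12} \hat{o}_{i,m}\, \hat{p}_{i,m},
\qquad
\mathrm{APE}_i = \frac{\left| R_i - \hat{R}_i \right|}{R_i}
```

Since the errors of the price and occupancy models multiply within each month, a listing can have a large revenue APE even when each model is only moderately off, and the APE is undefined when $R_i = 0$.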

  • plot error by neighborhood2
ams.test.revenue.table%>%
  filter(APE<1.5)%>%
  group_by(Buurt) %>%
  summarize(mean.APE = mean(APE, na.rm = T)) %>%
  ungroup() %>% 
  left_join(neighbor2,by = "Buurt") %>%
    st_sf() %>%
    ggplot() + 
      geom_sf(aes(fill = mean.APE),colour = 'transparent') +
      scale_fill_gradient(low = palette5[1], high = palette5[5],
                          name = "MAPE") +
      labs(title = "Mean test set MAPE by Buurt",
           subtitle = "2019") +
      mapTheme

High prediction MAPE occurs on the outskirts of Amsterdam. Far from the city center, listings on the outskirts are seldom occupied, since population density is usually lower there. The following analysis supports this speculation: the listings with high MAPEs are mainly vacant throughout the year and thus have little revenue.

revenue_panel <- merge(revenue_panel, listing.sf.neighbor2[c("id", "Buurt")], by = "id")

revenue_panel%>%
  group_by(Buurt) %>%
  summarize(occupancy = mean(monthly_occupancy, na.rm = T)) %>%
  ungroup() %>% 
  left_join(neighbor2,by = "Buurt") %>%
    st_sf() %>%
    ggplot() + 
      geom_sf(aes(fill = occupancy),colour = 'transparent') +
      scale_fill_gradient(low = palette5[1], high = palette5[5],
                          name = "occupancy") +
      labs(title = "Occupancy by Buurt",
           subtitle = "2019") +
      mapTheme

Those areas with low occupancy are almost the same as those with high MAPE.

5.2 Predict Annual Revenue

5.2.1 Revenue

annualrevenue <- revenue_panel %>%
  group_by(id) %>%
  summarise(annual_revenue = sum(revenue))

annualrevenue<- left_join(details.sf, annualrevenue,by="id")%>%
    filter(!id %in% no_price)

5.2.2 Annual Revenue Spatial Lag

annualrevenue<- annualrevenue %>%
  drop_na(annual_revenue)

coords <- st_coordinates(annualrevenue) 

neighborList <- knn2nb(knearneigh(coords, 5))

spatialWeights <- nb2listw(neighborList, style="W")



annualrevenue$lagRevenue <- lag.listw(spatialWeights, annualrevenue$annual_revenue)
ggplot(annualrevenue)+
  geom_point(aes(x = lagRevenue, y = annual_revenue), alpha = 0.26)+
  geom_smooth(aes(x = lagRevenue, y =annual_revenue), method = "lm", se= FALSE, color = "orange")+
  labs(title="Revenue as a function of lagRevenue",
      caption = "Figure xx. Scatterplots of revenue and lagRevenue")+
  plotTheme()

The figure above shows that annual revenue is correlated with lagged annual revenue, but not strongly. Some listings with high annual revenue are surrounded by listings with much lower annual revenue; for these, lagged annual revenue may be a misleading predictor. Setting these listings aside, most listings' annual revenues are similar to those of their neighbors.
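What the spatial lag measures can be sketched in base R without spdep. The coordinates and revenues below are made up, and `k = 2` is chosen only to keep the example small (the report uses 5 neighbors). With row-standardized weights, each of the k nearest neighbors gets weight 1/k, so the lag is simply the mean revenue of those neighbors.

```r
# Hypothetical point coordinates and annual revenues for six listings
coords  <- cbind(x = c(0, 1, 2, 10, 11, 12), y = c(0, 0, 0, 0, 0, 0))
revenue <- c(100, 200, 300, 1000, 1100, 5000)
k <- 2  # number of nearest neighbors (illustrative; the report uses 5)

# Pairwise Euclidean distances; a listing is not its own neighbor
d <- as.matrix(dist(coords))
diag(d) <- Inf

# For each listing, average the revenue of its k nearest neighbors.
# Equal weights of 1/k correspond to row-standardized weights (style = "W").
lagRevenue <- apply(d, 1, function(row) mean(revenue[order(row)[1:k]]))

data.frame(revenue, lagRevenue)
```

In this toy data the last listing earns 5000 while its neighbors average 1050, which is the kind of case where the lag is a poor guide to a listing's own revenue.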

5.2.3 Annual revenue regression

annualrevenue <- merge(annualrevenue,listing.sf.neighbor2[c("id", "Buurt")], by = "id")

#Split training and test set
set.seed(31497)

inTrain <- caret::createDataPartition(
  y = st_drop_geometry(annualrevenue)$annual_revenue, 
  p = .6, list = FALSE)

annualrevenue.training <- st_drop_geometry(annualrevenue)[inTrain,]
annualrevenue.test     <- st_drop_geometry(annualrevenue)[-inTrain,]


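# Note: the model is fit on the full data set (training and test rows together),
# so the test-set error reported below is not a true out-of-sample estimate.
reg.annualrevenue <- lm(annual_revenue ~ ., data = st_drop_geometry(annualrevenue) %>% 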
             dplyr::select(annual_revenue,beds, bedrooms, bathrooms, accommodates,
                              pool, parking, kitchen, AC, fireplace,
                              Buurt,host_is_superhost,
                              room_type,property_type,bed_type,
                              minimum_nights,dist.museum,dist.supermarkets,
                              Unesco,dist.metro,dist.plaza, dist.nightclub,
                              dist.beach, dist.parks,
                              name.bright, name.spacious,name.luxury,
                              amenities.number,lagRevenue)
) 

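annualrev_predict_test <- annualrevenue.test %>%
  mutate(Prediction = predict(reg.annualrevenue, newdata = annualrevenue.test)) %>%
  # Replace non-positive predictions with the mean training revenue
  mutate(Prediction = ifelse(Prediction > 0, Prediction, mean(annualrevenue.training$annual_revenue)))%>%
  # Drop listings with zero observed revenue and rows without a prediction
  filter(annual_revenue!=0)%>%
  drop_na(Prediction)%>%
  # Note: APE is scaled by the prediction here, not by the observed revenue as in 5.1
  mutate(AE = abs(Prediction-annual_revenue),
         APE = AE/Prediction)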

test_result <- data.frame(MAE = c(mean(annualrev_predict_test$AE, na.rm=T)),
              MAPE = c(scales::percent(mean(annualrev_predict_test$APE, na.rm=T)))) 

test_result %>%
  kable(caption = "Figure 8. Mean absolute error and MAPE for a single test set")%>%
  kable_styling("striped", full_width = F)
Figure 8. Mean absolute error and MAPE for a single test set

MAE        MAPE
25161.32   49%
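For comparison, a holdout evaluation in which the model sees only the training partition can be sketched as follows. Everything here is simulated: the data frame `toy`, its two predictors and the coefficients are hypothetical, and base R `sample()` stands in for `caret::createDataPartition()`. APE is computed relative to the observed revenue.

```r
set.seed(1)

# Simulated listings (hypothetical columns and coefficients)
n <- 500
toy <- data.frame(
  accommodates = sample(1:6, n, replace = TRUE),
  dist.center  = runif(n, 0, 10)
)
toy$annual_revenue <- 20000 + 5000 * toy$accommodates -
  1000 * toy$dist.center + rnorm(n, sd = 3000)

# 60/40 split; base R sample() stands in for caret::createDataPartition()
train.idx <- sample(seq_len(n), size = round(0.6 * n))
toy.train <- toy[train.idx, ]
toy.test  <- toy[-train.idx, ]

# Fit on the training rows only
fit <- lm(annual_revenue ~ accommodates + dist.center, data = toy.train)

# Score the held-out rows; APE is relative to the observed revenue
pred <- predict(fit, newdata = toy.test)
AE   <- abs(pred - toy.test$annual_revenue)
APE  <- AE / toy.test$annual_revenue

c(MAE = mean(AE), MAPE = mean(APE))
```

Keeping the test rows out of the fit means the resulting MAE and MAPE describe how the model behaves on listings it has not seen.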
stargazer(reg.annualrevenue,
          type = "text",
          title ="Regression Output",
          single.row = TRUE,
          out.header = TRUE)
## 
## Regression Output
## =======================================================================
##                                                 Dependent variable:    
##                                             ---------------------------
##                                                   annual_revenue       
## -----------------------------------------------------------------------
## beds                                          -2,671.288*** (641.807)  
## bedrooms0                                     7,342.858 (21,428.930)   
## bedrooms1                                     9,963.485 (21,362.120)   
## bedrooms10                                   72,715.150* (38,608.570)  
## bedrooms11                                    49,066.300 (63,071.530)  
## bedrooms12                                    15,007.930 (50,473.470)  
## bedrooms2                                     16,064.770 (21,396.530)  
## bedrooms3                                     20,739.420 (21,467.850)  
## bedrooms4                                    38,146.470* (21,657.710)  
## bedrooms5                                     36,992.390 (22,902.180)  
## bedrooms6                                    57,290.620** (26,905.420) 
## bedrooms7                                     15,018.830 (39,087.540)  
## bedrooms8                                     20,379.390 (36,037.210)  
## bedrooms9                                     66,234.090 (62,880.770)  
## bathrooms0.0                                  3,294.723 (22,069.200)   
## bathrooms0.5                                  6,052.833 (19,028.410)   
## bathrooms1.0                                  10,489.170 (17,916.220)  
## bathrooms1.5                                  14,352.940 (17,955.390)  
## bathrooms10.0                                 37,882.310 (60,834.130)  
## bathrooms100.5                                17,725.990 (59,498.610)  
## bathrooms15.0                                 -7,965.049 (58,857.190)  
## bathrooms2.0                                  17,233.890 (18,032.780)  
## bathrooms2.5                                  15,986.710 (18,425.190)  
## bathrooms3.0                                  16,752.270 (19,127.040)  
## bathrooms3.5                                 48,464.670** (21,739.600) 
## bathrooms4.0                                  14,764.660 (26,643.960)  
## bathrooms4.5                                  70,751.860 (60,349.580)  
## bathrooms5.0                                   -148.185 (59,469.590)   
## bathrooms5.5                                  63,896.570 (59,879.500)  
## bathrooms7.0                                 -13,590.590 (58,842.490)  
## bathrooms8.0                                  21,592.720 (44,920.160)  
## accommodates10                               39,520.710** (16,124.190) 
## accommodates11                               -12,999.070 (33,845.480)  
## accommodates12                                1,466.921 (15,123.930)   
## accommodates14                                46,613.710 (29,998.690)  
## accommodates16                              106,032.300*** (18,462.190)
## accommodates17                               100,911.400 (84,571.250)  
## accommodates2                                  2,820.207 (2,952.888)   
## accommodates3                                  3,662.260 (3,310.069)   
## accommodates4                                 8,073.109** (3,271.736)  
## accommodates5                                13,417.040*** (4,527.325) 
## accommodates6                                17,891.620*** (4,547.433) 
## accommodates7                                  9,468.115 (8,760.905)   
## accommodates8                                24,053.570*** (7,336.879) 
## accommodates9                                 1,914.108 (26,627.430)   
## poolPool                                      -2,777.525 (6,509.817)   
## parkingParking                               -2,874.765*** (1,070.198) 
## kitchenNo kitchen                            -8,183.811*** (1,703.067) 
## ACNo AC                                        -261.767 (2,032.951)    
## fireplaceNo Fireplace                         -1,195.410 (1,859.437)   
## BuurtAalsmeerwegbuurt West                    -10,523.100 (7,804.907)  
## BuurtAlexanderplein e.o.                     -24,882.970 (19,954.410)  
## BuurtAmstel III deel A/B Noord                 -430.237 (62,233.020)   
## BuurtAmstelglorie                             11,343.320 (20,827.930)  
## BuurtAmstelkwartier Noord                    -10,172.430 (12,588.370)  
## BuurtAmstelkwartier West                     -14,576.170 (27,371.590)  
## BuurtAmstelkwartier Zuid                      1,601.040 (57,331.830)   
## BuurtAmstelpark                               -5,149.907 (57,959.950)  
## BuurtAmstelveldbuurt                          1,681.415 (13,232.570)   
## BuurtAmsterdamse Bos                        -77,124.290*** (19,345.770)
## BuurtAmsterdamse Poort                        -3,501.498 (26,931.050)  
## BuurtAndreasterrein                          -23,813.090 (17,137.610)  
## BuurtAnjeliersbuurt Noord                     -4,074.862 (15,723.160)  
## BuurtAnjeliersbuurt Zuid                      -3,114.607 (15,017.990)  
## BuurtArchitectenbuurt                        -17,115.000 (13,777.920)  
## BuurtBalboaplein e.o.                        -11,599.170 (10,767.450)  
## BuurtBanne Noordoost                          -7,766.115 (24,985.290)  
## BuurtBanne Noordwest                          2,616.713 (26,395.470)   
## BuurtBanne Zuidoost                           -6,753.600 (20,643.230)  
## BuurtBanne Zuidwest                          -19,005.510 (22,106.820)  
## BuurtBanpleinbuurt                           -25,224.880 (15,873.240)  
## BuurtBedrijvencentrum Osdorp                 -23,106.150 (56,697.970)  
## BuurtBedrijvencentrum Westerkwartier         -21,517.960 (26,471.590)  
## BuurtBedrijvengebied Cruquiusweg              -8,081.281 (29,446.040)  
## BuurtBedrijvengebied Veelaan                 -29,944.570 (56,501.610)  
## BuurtBedrijvengebied Zeeburgerkade           -50,772.130 (33,349.320)  
## BuurtBedrijvenpark Lutkemeer                 -43,575.480 (57,533.970)  
## BuurtBedrijventerrein Hamerstraat            -15,853.680 (21,425.930)  
## BuurtBedrijventerrein Landlust                -8,897.556 (16,318.010)  
## BuurtBedrijventerrein Schinkel               -20,785.030* (12,552.680) 
## BuurtBeethovenbuurt                          -22,603.440 (16,221.330)  
## BuurtBegijnhofbuurt                            225.913 (16,650.810)    
## BuurtBelgiëplein e.o.                        -22,397.800 (26,444.780)  
## BuurtBellamybuurt Noord                        -909.526 (10,198.140)   
## BuurtBellamybuurt Zuid                        -2,249.562 (9,440.483)   
## BuurtBertelmanpleinbuurt                      1,137.986 (14,916.220)   
## BuurtBetondorp                               -11,089.760 (16,450.140)  
## BuurtBG-terrein e.o.                          9,704.575 (15,087.640)   
## BuurtBijlmermuseum Noord                     -10,763.380 (32,220.670)  
## BuurtBijlmermuseum Zuid                      -43,173.190 (32,909.930)  
## BuurtBijlmerpark Oost                         66,042.360 (45,895.390)  
## BuurtBlauwe Zand                             -12,311.680 (16,232.650)  
## BuurtBloemenbuurt Noord                       3,125.184 (17,826.210)   
## BuurtBloemenbuurt Zuid                        -7,305.736 (17,238.430)  
## BuurtBloemgrachtbuurt                         -4,484.730 (14,294.550)  
## BuurtBorgerbuurt                               -924.264 (9,552.943)    
## BuurtBorneo                                  -15,317.980 (11,267.590)  
## BuurtBosleeuw                                -14,456.500 (13,134.320)  
## BuurtBretten Oost                              -831.983 (58,193.890)   
## BuurtBuiksloterbreek                          -2,970.403 (26,008.490)  
## BuurtBuiksloterdijk West                     -23,531.560 (29,879.120)  
## BuurtBuiksloterham                           -13,796.630 (27,002.440)  
## BuurtBuikslotermeer Noord                     -8,379.752 (24,018.040)  
## BuurtBuikslotermeerplein                     -29,687.010 (21,334.120)  
## BuurtBuitenveldert Midden Zuid               -21,609.620 (14,184.760)  
## BuurtBuitenveldert Oost Midden               -17,388.410 (15,745.550)  
## BuurtBuitenveldert West Midden                15,215.720 (27,008.260)  
## BuurtBuitenveldert Zuidoost                  -12,245.850 (15,909.460)  
## BuurtBuitenveldert Zuidwest                  -10,677.840 (14,122.150)  
## BuurtBurgemeester Tellegenbuurt Oost         -14,126.250 (10,373.570)  
## BuurtBurgemeester Tellegenbuurt West         -11,761.750 (11,131.620)  
## BuurtBurgwallen Oost                          -7,124.753 (14,545.990)  
## BuurtBuurt 10                                -49,272.390* (27,028.510) 
## BuurtBuurt 2                                  -3,208.065 (19,471.730)  
## BuurtBuurt 3                                 -19,373.720 (15,908.220)  
## BuurtBuurt 4 Oost                            -24,511.840 (20,953.600)  
## BuurtBuurt 5 Noord                           -20,752.410 (26,412.340)  
## BuurtBuurt 5 Zuid                            -32,229.330 (21,084.430)  
## BuurtBuurt 6                                 -27,135.340 (31,046.490)  
## BuurtBuurt 7                                 -34,281.930 (28,429.700)  
## BuurtBuurt 8                                 -16,156.210 (21,958.550)  
## BuurtBuurt 9                                 -16,964.860 (35,541.790)  
## BuurtBuyskade e.o.                           -12,279.100 (12,600.530)  
## BuurtCalandlaan/Lelylaan                     -33,808.420* (20,197.160) 
## BuurtCentrumeiland                           -25,793.050 (59,284.920)  
## BuurtCircus/Kermisbuurt                      -13,058.690 (40,757.170)  
## BuurtCoenhaven/Mercuriushaven                -34,416.310 (58,802.010)  
## BuurtColumbusplein e.o.                       -2,952.314 (9,857.044)   
## BuurtConcertgebouwbuurt                       -9,094.406 (11,075.090)  
## BuurtCornelis Douwesterrein                  -20,026.060 (44,903.360)  
## BuurtCornelis Schuytbuurt                      3,470.345 (9,438.897)   
## BuurtCornelis Troostbuurt                     -6,256.948 (9,171.211)   
## BuurtCremerbuurt Oost                         -3,295.712 (8,494.607)   
## BuurtCremerbuurt West                         -8,629.172 (7,667.895)   
## BuurtCzaar Peterbuurt                         -7,497.579 (11,548.520)  
## BuurtD-buurt                                  26,579.750 (28,482.170)  
## BuurtDa Costabuurt Noord                      -7,799.582 (9,380.354)   
## BuurtDa Costabuurt Zuid                       -6,725.658 (9,521.668)   
## BuurtDapperbuurt Noord                        -10,995.240 (8,415.776)  
## BuurtDapperbuurt Zuid                         -10,841.580 (8,662.347)  
## BuurtDe Aker Oost                            -21,635.680* (11,364.590) 
## BuurtDe Aker West                            -24,160.990 (17,594.350)  
## BuurtDe Bongerd                              -13,595.790 (21,343.190)  
## BuurtDe Eenhoorn                             -12,343.710 (14,937.750)  
## BuurtDe Kleine Wereld                        -14,651.340 (25,346.550)  
## BuurtDe Klenckebuurt                         -36,650.800 (29,979.850)  
## BuurtDe Omval                                 -3,912.905 (19,788.380)  
## BuurtDe Punt                                 -27,513.780* (15,488.890) 
## BuurtDe Wester Quartier                       1,400.198 (11,347.670)   
## BuurtDe Wetbuurt                             -10,233.970 (14,688.540)  
## BuurtDe Wittenbuurt Noord                      832.239 (14,557.540)    
## BuurtDe Wittenbuurt Zuid                     -10,504.270 (17,842.630)  
## BuurtDelflandpleinbuurt Oost                  -9,969.239 (20,665.990)  
## BuurtDelflandpleinbuurt West                -25,056.080** (11,622.700) 
## BuurtDen Texbuurt                            -10,499.230 (13,911.720)  
## BuurtDiamantbuurt                               71.128 (9,925.603)     
## BuurtDiepenbrockbuurt                        -32,189.750 (26,567.750)  
## BuurtDon Bosco                               -13,654.840 (10,456.700)  
## BuurtDorp Driemond                            -2,926.733 (41,426.780)  
## BuurtDorp Sloten                             -21,748.010 (17,181.300)  
## BuurtDriehoekbuurt                           -12,461.040 (15,919.780)  
## BuurtDuivelseiland                            -7,903.559 (11,109.180)  
## BuurtDurgerdam                               -11,529.800 (17,536.860)  
## BuurtE-buurt                                 -19,935.350 (24,760.610)  
## BuurtEcowijk                                  -4,896.178 (17,397.800)  
## BuurtEendrachtspark                           4,198.607 (58,029.200)   
## BuurtElandsgrachtbuurt                        -1,788.972 (13,551.110)  
## BuurtElzenhagen Noord                         -5,226.040 (20,332.710)  
## BuurtElzenhagen Zuid                         -68,376.370 (58,114.570)  
## BuurtEmanuel van Meterenbuurt                 -8,969.874 (14,855.890)  
## BuurtEntrepot-Noordwest                      -10,606.900 (14,329.540)  
## BuurtErasmusparkbuurt Oost                    14,338.910 (13,401.500)  
## BuurtErasmusparkbuurt West                    -7,730.713 (11,829.840)  
## BuurtF-buurt                                  -6,800.002 (25,126.420)  
## BuurtFannius Scholtenbuurt                   -13,728.050 (13,472.890)  
## BuurtFelix Meritisbuurt                       1,739.221 (14,049.860)   
## BuurtFilips van Almondekwartier               -1,907.804 (11,974.300)  
## BuurtFlevopark                               -14,972.820 (29,404.030)  
## BuurtFrankendael                              31,401.110 (20,127.750)  
## BuurtFrans Halsbuurt                           1,259.693 (8,723.803)   
## BuurtFrederik Hendrikbuurt Noord              -6,999.611 (11,506.140)  
## BuurtFrederik Hendrikbuurt Zuidoost           -4,542.340 (10,580.100)  
## BuurtFrederik Hendrikbuurt Zuidwest           -1,375.316 (12,607.780)  
## BuurtFrederikspleinbuurt                      9,691.447 (13,715.510)   
## BuurtG-buurt Noord                           -15,376.980 (31,338.850)  
## BuurtG-buurt Oost                            -13,540.540 (23,414.710)  
## BuurtG-buurt West                             -5,975.510 (23,326.650)  
## BuurtGaasperdam Noord                         -2,979.015 (36,375.950)  
## BuurtGaasperdam Zuid                          -1,464.797 (42,337.610)  
## BuurtGaasperpark                             -20,787.690 (42,906.160)  
## BuurtGaasperplas                             -20,788.890 (42,439.010)  
## BuurtGein Noordoost                           -6,547.009 (35,386.800)  
## BuurtGein Noordwest                          -21,895.860 (38,348.200)  
## BuurtGein Zuidwest                            -1,459.335 (52,781.990)  
## BuurtGein Zuioost                            -10,123.240 (40,767.200)  
## BuurtGelderlandpleinbuurt                    -23,520.350* (12,925.620) 
## BuurtGerard Doubuurt                          -10,741.760 (8,664.883)  
## BuurtGeuzenhofbuurt                          -14,851.050 (11,191.250)  
## BuurtGibraltarbuurt                           -9,107.259 (14,107.580)  
## BuurtGouden Bocht                             -3,183.889 (17,371.140)  
## BuurtGroenmarktkadebuurt                     -10,587.100 (15,947.560)  
## BuurtGrunder/Koningshoef                      -5,688.314 (27,054.450)  
## BuurtHaarlemmerbuurt Oost                   42,647.680*** (16,267.920) 
## BuurtHaarlemmerbuurt West                     -9,449.227 (16,404.040)  
## BuurtHakfort/Huigenbos                        -2,052.775 (34,819.390)  
## BuurtHarmoniehofbuurt                         -4,092.068 (19,493.450)  
## BuurtHaveneiland Noord                         -621.714 (22,195.310)   
## BuurtHaveneiland Noordoost                   -21,969.150 (17,468.030)  
## BuurtHaveneiland Noordwest                   -19,417.160 (17,406.510)  
## BuurtHaveneiland Oost                        -17,617.610 (19,991.670)  
## BuurtHaveneiland Zuidwest/Rieteiland West    -20,231.860 (17,218.040)  
## BuurtHelmersbuurt Oost                        -7,942.993 (8,654.582)   
## BuurtHemelrijk                                 390.203 (16,217.710)    
## BuurtHemonybuurt                                 7.559 (8,072.055)     
## BuurtHercules Seghersbuurt                    -5,391.654 (9,526.823)   
## BuurtHet Funen                               -14,477.950 (13,663.260)  
## BuurtHiltonbuurt                             -28,619.390 (28,919.610)  
## BuurtHolendrecht Oost                          190.590 (36,637.270)    
## BuurtHolendrecht West                         1,783.520 (63,424.860)   
## BuurtHolysloot                                -9,220.498 (32,644.120)  
## BuurtHondecoeterbuurt                         -5,980.578 (11,199.100)  
## BuurtHoptille                                 24,553.480 (32,807.060)  
## BuurtHouthavens Oost                         -17,230.820 (19,512.820)  
## BuurtHouthavens West                         -19,223.150 (19,644.550)  
## BuurtIJplein e.o.                            -15,082.070 (13,835.130)  
## BuurtIJsbaanpad e.o.                          -3,403.390 (15,019.200)  
## BuurtIJselbuurt Oost                         -15,018.140 (10,658.060)  
## BuurtIJselbuurt West                         -10,117.890 (11,786.230)  
## BuurtJacob Geelbuurt                         -11,395.160 (26,524.250)  
## BuurtJacques Veldmanbuurt                     -7,889.456 (10,611.590)  
## BuurtJan Maijenbuurt                          -5,242.083 (11,529.740)  
## BuurtJava-eiland                             -20,424.690 (15,088.780)  
## BuurtJohan Jongkindbuurt                      -6,640.001 (26,438.160)  
## BuurtJohannnes Vermeerbuurt                   -7,188.490 (10,547.800)  
## BuurtJohn Franklinbuurt                       -5,113.546 (11,945.290)  
## BuurtJulianapark                             -18,605.740 (24,432.080)  
## BuurtK-buurt Midden                           44,518.170 (45,769.050)  
## BuurtK-buurt Zuidoost                        -21,640.130 (31,659.530)  
## BuurtK-buurt Zuidwest                        -29,823.320 (61,104.970)  
## BuurtKadijken                                 -8,771.264 (14,519.660)  
## BuurtKadoelen                                 -2,071.137 (24,521.440)  
## BuurtKalverdriehoek                          -14,518.300 (15,813.240)  
## BuurtKantershof                              -16,542.300 (28,929.380)  
## BuurtKattenburg                              -16,012.720 (16,375.980)  
## BuurtKazernebuurt                             -4,025.522 (18,444.650)  
## BuurtKelbergen                               -18,170.610 (40,227.490)  
## BuurtKNSM-eiland                             -20,531.010 (12,703.520)  
## BuurtKolenkitbuurt Noord                      -2,184.970 (17,600.200)  
## BuurtKolenkitbuurt Zuid                      -16,275.300 (14,718.360)  
## BuurtKoningin Wilhelminaplein                 -8,858.745 (12,762.340)  
## BuurtKop Zeedijk                              1,422.760 (15,989.520)   
## BuurtKop Zuidas                              -20,550.920 (21,501.400)  
## BuurtKortenaerkwartier                        -3,236.467 (10,942.110)  
## BuurtKortvoort                                 -879.992 (35,219.440)   
## BuurtKromme Mijdrechtbuurt                   -11,023.870 (11,643.800)  
## BuurtL-buurt                                 -14,341.890 (28,805.100)  
## BuurtLaan van Spartaan                       -17,454.220 (15,192.190)  
## BuurtLandelijk gebied Driemond                27,276.040 (43,210.990)  
## BuurtLandlust Noord                           -6,630.044 (13,904.980)  
## BuurtLandlust Zuid                            -4,415.774 (11,877.330)  
## BuurtLangestraat e.o.                         -3,282.268 (15,604.520)  
## BuurtLastage                                  2,393.746 (15,419.450)   
## BuurtLegmeerpleinbuurt                       50,891.130*** (9,804.688) 
## BuurtLeidsebuurt Noordoost                    -9,783.578 (13,290.460)  
## BuurtLeidsebuurt Noordwest                   -19,168.780 (15,379.050)  
## BuurtLeidsebuurt Zuidoost                     -8,371.650 (15,052.400)  
## BuurtLeidsebuurt Zuidwest                    -18,956.850 (16,507.210)  
## BuurtLeidsegracht Noord                       -1,574.186 (15,331.890)  
## BuurtLeidsegracht Zuid                       -11,818.750 (15,154.090)  
## BuurtLeliegracht e.o.                         -7,836.597 (14,949.540)  
## BuurtLinnaeusparkbuurt                       -12,169.130 (10,568.980)  
## BuurtLizzy Ansinghbuurt                       -11,882.950 (9,845.180)  
## BuurtLoenermark                              -10,132.680 (24,582.830)  
## BuurtLootsbuurt                              35,308.470*** (9,759.124) 
## BuurtLouis Crispijnbuurt                      -5,319.617 (17,783.320)  
## BuurtLucas/Andreasziekenhuis e.o.             4,427.061 (23,867.170)   
## BuurtMarathonbuurt Oost                       -6,973.200 (11,224.020)  
## BuurtMarathonbuurt West                      -21,822.040** (9,374.770) 
## BuurtMarcanti                                -11,195.690 (14,924.330)  
## BuurtMarine-Etablissement                     -3,004.244 (17,214.840)  
## BuurtMarjoleinterrein                          -159.116 (36,598.560)   
## BuurtMarkengouw Midden                       -17,941.570 (20,297.970)  
## BuurtMarkengouw Noord                         18,851.390 (42,326.270)  
## BuurtMarkengouw Zuid                         -69,229.180 (57,705.610)  
## BuurtMarkthallen                              2,254.054 (21,793.010)   
## BuurtMarnixbuurt Midden                        637.679 (17,663.240)    
## BuurtMarnixbuurt Noord                        -3,767.462 (16,981.730)  
## BuurtMarnixbuurt Zuid                        -17,059.970 (16,652.990)  
## BuurtMedisch Centrum Slotervaart             -18,261.710 (56,504.860)  
## BuurtMeer en Oever                           -17,143.250 (18,775.970)  
## BuurtMercatorpark                             -2,671.739 (22,215.420)  
## BuurtMiddelveldsche Akerpolder               -23,204.510 (24,277.450)  
## BuurtMiddenmeer Noord                         -9,214.355 (10,949.920)  
## BuurtMiddenmeer Zuid                         -20,081.270** (9,703.122) 
## BuurtMinervabuurt Midden                      -3,504.232 (13,435.620)  
## BuurtMinervabuurt Noord                       3,887.702 (13,652.830)   
## BuurtMinervabuurt Zuid                       -24,881.880* (12,821.910) 
## BuurtMolenwijk                               -11,028.040 (41,181.700)  
## BuurtMuseumplein                              -9,497.573 (22,389.170)  
## BuurtNDSM terrein                             -7,426.501 (24,835.070)  
## BuurtNes e.o.                                 2,951.705 (15,575.480)   
## BuurtNieuw Sloten Noordoost                    -713.447 (22,717.700)   
## BuurtNieuw Sloten Noordwest                   -9,167.478 (15,097.410)  
## BuurtNieuw Sloten Zuidoost                   -22,787.170 (22,779.590)  
## BuurtNieuw Sloten Zuidwest                   -13,631.110 (19,864.300)  
## BuurtNieuwe Diep/Diemerpark                  -27,212.900 (26,515.340)  
## BuurtNieuwe Kerk e.o.                         -3,790.079 (15,213.340)  
## BuurtNieuwe Meer                             -20,139.540 (57,423.030)  
## BuurtNieuwe Oosterbegraafplaats              -44,411.240 (40,348.720)  
## BuurtNieuwendammerdijk Oost                  -20,275.170 (20,133.280)  
## BuurtNieuwendammerdijk Zuid                   -7,611.817 (27,907.100)  
## BuurtNieuwendammmerdijk West                 -11,472.800 (15,983.480)  
## BuurtNieuwendijk Noord                        1,370.748 (17,438.240)   
## BuurtNieuwmarkt                               15,452.400 (15,049.820)  
## BuurtNintemanterrein                          -9,596.249 (42,317.120)  
## BuurtNoorder IJplas                          -118,894.100 (79,393.810) 
## BuurtNoorderstrook Oost                      -30,367.050 (57,900.040)  
## BuurtNoorderstrook West                       17,748.550 (43,869.240)  
## BuurtNoordoever Sloterplas                   -14,164.410 (15,565.260)  
## BuurtNoordoostkwadrant Indische buurt        -20,710.460** (8,977.755) 
## BuurtNoordwestkwadrant Indische buurt Noord   -13,200.650 (8,108.074)  
## BuurtNoordwestkwadrant Indische buurt Zuid    -8,457.326 (8,290.103)   
## BuurtOlympisch Stadion e.o.                  -25,243.850 (15,884.360)  
## BuurtOokmeer                                 -31,512.250 (27,406.510)  
## BuurtOostelijke Handelskade                  -20,738.130 (18,561.340)  
## BuurtOostenburg                                -91.319 (11,388.220)    
## BuurtOosterdokseiland                         29,577.270 (24,009.020)  
## BuurtOosterpark                                -26.412 (14,205.260)    
## BuurtOosterparkbuurt Noordwest                -5,003.169 (8,302.038)   
## BuurtOosterparkbuurt Zuidoost                 -11,494.110 (8,488.890)  
## BuurtOosterparkbuurt Zuidwest                 -2,799.048 (9,523.040)   
## BuurtOostoever Sloterplas                    -12,305.400 (15,657.010)  
## BuurtOostpoort                               -11,959.610 (10,549.310)  
## BuurtOostzanerdijk                           -41,049.390 (30,594.570)  
## BuurtOrteliusbuurt Midden                    -10,060.020 (11,376.470)  
## BuurtOrteliusbuurt Noord                      -9,962.109 (12,475.650)  
## BuurtOrteliusbuurt Zuid                       -4,687.338 (10,694.550)  
## BuurtOsdorp Midden Noord                     -18,347.060 (21,721.090)  
## BuurtOsdorp Midden Zuid                       -3,147.268 (21,444.700)  
## BuurtOsdorp Zuidoost                         -21,676.350 (14,056.950)  
## BuurtOsdorper Binnenpolder                   -45,039.420 (28,189.230)  
## BuurtOsdorper Bovenpolder                    -29,794.400 (35,507.190)  
## BuurtOsdorpplein e.o.                        -23,090.350 (18,664.760)  
## BuurtOude Kerk e.o.                           18,407.620 (15,306.030)  
## BuurtOveramstel                              -25,971.760 (58,985.110)  
## BuurtOverbraker Binnenpolder                 -29,042.730 (32,035.400)  
## BuurtOverhoeks                               -14,629.690 (23,469.490)  
## BuurtOvertoomse Veld Noord                   -11,258.580 (12,678.850)  
## BuurtOvertoomse Veld Zuid                    -20,831.400 (12,668.760)  
## BuurtP.C. Hooftbuurt                         -13,185.290 (12,812.790)  
## BuurtPapaverweg e.o.                          -5,062.001 (17,738.150)  
## BuurtParamariboplein e.o.                     -8,729.570 (8,514.198)   
## BuurtPark de Meer                            -18,276.300 (17,931.650)  
## BuurtPark Haagseweg                          -21,272.260 (40,875.390)  
## BuurtParooldriehoek                          -14,954.730 (14,114.080)  
## BuurtPasseerdersgrachtbuurt                   -7,261.779 (15,515.660)  
## BuurtPieter van der Doesbuurt                 -7,542.986 (11,684.560)  
## BuurtPlan van Gool                            -4,951.264 (18,981.630)  
## BuurtPlanciusbuurt Noord                     -36,874.130 (23,965.410)  
## BuurtPlanciusbuurt Zuid                      -22,306.930 (42,883.670)  
## BuurtPlantage                                 -8,667.484 (13,907.580)  
## BuurtPostjeskade e.o.                         -8,606.942 (8,908.611)   
## BuurtPrinses Irenebuurt                      -21,149.750 (14,979.390)  
## BuurtRAI                                     -17,120.780 (22,332.270)  
## BuurtRansdorp                                -22,263.000 (27,201.400)  
## BuurtRapenburg                               -15,534.420 (14,863.240)  
## BuurtRechte H-buurt                           4,435.680 (31,724.250)   
## BuurtReguliersbuurt                          -12,688.720 (19,003.780)  
## BuurtReigersbos Midden                        8,493.168 (38,656.990)   
## BuurtReigersbos Noord                         7,511.086 (37,193.690)   
## BuurtReigersbos Zuid                          -5,288.695 (45,431.380)  
## BuurtRembrandtpark Noord                     -16,221.700 (14,608.340)  
## BuurtRembrandtpark Zuid                       -6,137.339 (12,196.060)  
## BuurtRembrandtpleinbuurt                      5,741.425 (15,103.520)   
## BuurtRI Oost terrein                          -1,699.868 (13,403.220)  
## BuurtRieteiland Oost                         -34,610.600 (31,954.070)  
## BuurtRietlanden                              -15,122.680 (13,246.880)  
## BuurtRijnbuurt Midden                        -17,828.720 (12,202.000)  
## BuurtRijnbuurt Oost                           -5,093.010 (11,445.190)  
## BuurtRijnbuurt West                          -11,762.060 (16,077.490)  
## BuurtRobert Scottbuurt Oost                   -9,702.014 (13,125.380)  
## BuurtRobert Scottbuurt West                   -7,398.977 (12,843.880)  
## BuurtRode Kruisbuurt                         -12,267.370 (31,189.830)  
## BuurtSarphatiparkbuurt                        -1,488.914 (7,958.067)   
## BuurtSarphatistrook                           -6,433.254 (12,729.460)  
## BuurtScheepvaarthuisbuurt                     -8,545.844 (15,097.800)  
## BuurtScheldebuurt Midden                     -17,424.510 (11,095.550)  
## BuurtScheldebuurt Oost                        -4,894.653 (11,718.780)  
## BuurtScheldebuurt West                       -20,817.630* (10,846.200) 
## BuurtSchellingwoude Oost                     -22,986.360 (15,071.530)  
## BuurtSchellingwoude West                     -21,875.230 (22,580.170)  
## BuurtSchinkelbuurt Noord                      -12,570.720 (7,932.291)  
## BuurtSchinkelbuurt Zuid                       -8,773.009 (10,551.000)  
## BuurtSchipluidenbuurt                        -25,085.440 (29,160.650)  
## BuurtScience Park Noord                      -14,025.700 (14,253.000)  
## BuurtScience Park Zuid                        -8,987.078 (33,534.720)  
## BuurtSlotermeer Zuid                          -8,959.439 (17,293.280)  
## BuurtSloterpark                              -35,670.930 (25,954.150)  
## BuurtSloterweg e.o.                           15,353.880 (22,989.150)  
## BuurtSpaarndammerbuurt Midden                -16,651.530 (18,538.160)  
## BuurtSpaarndammerbuurt Noordoost             -17,295.380 (16,359.230)  
## BuurtSpaarndammerbuurt Noordwest             -22,250.110 (19,695.480)  
## BuurtSpaarndammerbuurt Zuidoost               -7,744.012 (16,580.520)  
## BuurtSpaarndammerbuurt Zuidwest              -12,021.750 (15,891.960)  
## BuurtSpiegelbuurt                             -3,333.827 (13,711.910)  
## BuurtSporenburg                              -21,145.350* (11,679.690) 
## BuurtSportpark Middenmeer Noord               -4,332.089 (33,349.470)  
## BuurtSportpark Middenmeer Zuid               -35,627.330 (33,230.850)  
## BuurtSportpark Voorland                       47,930.460 (56,671.490)  
## BuurtSpuistraat Noord                         -3,873.748 (15,311.280)  
## BuurtSpuistraat Zuid                          13,044.390 (15,767.000)  
## BuurtStaalmanbuurt                            -8,041.261 (11,277.820)  
## BuurtStaatsliedenbuurt Noordoost             -23,021.170 (15,112.890)  
## BuurtStationsplein e.o.                      -45,898.790 (57,772.730)  
## BuurtSteigereiland Noord                     -17,613.590 (15,449.510)  
## BuurtSteigereiland Zuid                      -13,763.690 (13,162.780)  
## BuurtSurinamepleinbuurt                      -14,333.230 (10,153.200)  
## BuurtSwammerdambuurt                          -2,817.918 (8,710.319)   
## BuurtTeleport                                -33,036.530 (42,831.450)  
## BuurtTerrasdorp                               -6,277.521 (21,802.840)  
## BuurtTransvaalbuurt Oost                      -2,033.633 (8,653.721)   
## BuurtTransvaalbuurt West                      -12,214.380 (9,507.166)  
## BuurtTrompbuurt                               -6,089.512 (10,929.240)  
## BuurtTuindorp Amstelstation                   8,716.206 (23,252.370)   
## BuurtTuindorp Frankendael                   -30,133.200** (15,319.610) 
## BuurtTuindorp Nieuwendam Oost                 -4,455.441 (15,999.460)  
## BuurtTuindorp Nieuwendam West                 -4,193.887 (20,137.340)  
## BuurtTuindorp Oostzaan Oost                   -4,388.822 (24,322.970)  
## BuurtTuindorp Oostzaan West                  -28,878.280 (37,226.030)  
## BuurtTwiske Oost                              37,675.010 (46,980.680)  
## BuurtTwiske West                             -17,394.130 (29,949.710)  
## BuurtUilenburg                                1,994.829 (15,172.440)   
## BuurtUtrechtsebuurt Zuid                      -1,389.929 (14,016.330)  
## BuurtValeriusbuurt Oost                        323.068 (12,438.950)    
## BuurtValeriusbuurt West                      -10,380.730 (10,338.540)  
## BuurtValkenburg                               -9,811.787 (15,946.620)  
## BuurtVan Brakelkwartier                       -9,692.878 (14,038.960)  
## BuurtVan der Helstpleinbuurt                 -14,635.590* (8,301.621)  
## BuurtVan der Kunbuurt                         10,595.920 (24,628.600)  
## BuurtVan der Pekbuurt                        -12,857.410 (14,785.060)  
## BuurtVan Loonbuurt                           -16,207.540 (13,633.780)  
## BuurtVan Tuyllbuurt                          -16,105.840* (9,069.247)  
## BuurtVelserpolder West                        14,467.650 (22,333.130)  
## BuurtVeluwebuurt                             -24,144.810 (19,833.220)  
## BuurtVenserpolder Oost                        11,672.310 (20,431.000)  
## BuurtVliegenbos                               11,228.420 (16,842.880)  
## BuurtVogelbuurt Noord                        -13,637.700 (17,361.950)  
## BuurtVogelbuurt Zuid                          3,918.417 (13,183.060)   
## BuurtVogeltjeswei                             20,321.210 (44,905.370)  
## BuurtVondelpark Oost                         -12,525.380 (28,746.920)  
## BuurtVondelpark West                          -7,704.670 (16,260.600)  
## BuurtVondelparkbuurt Midden                   3,184.597 (10,487.350)   
## BuurtVondelparkbuurt Oost                     -7,768.348 (10,188.400)  
## BuurtVondelparkbuurt West                     -3,428.592 (8,504.418)   
## BuurtVU-kwartier                              -8,255.928 (26,606.350)  
## BuurtWalvisbuurt                             -10,282.530 (32,162.300)  
## BuurtWaterloopleinbuurt                        -995.185 (17,082.000)   
## BuurtWeesperbuurt                            -10,143.520 (12,464.880)  
## BuurtWeespertrekvaart                         -3,613.675 (18,122.740)  
## BuurtWeesperzijde Midden/Zuid                 -11,510.980 (9,416.461)  
## BuurtWerengouw Midden                        -15,261.060 (17,188.640)  
## BuurtWerengouw Noord                            44.493 (41,870.010)    
## BuurtWerengouw Zuid                          -24,620.540 (19,997.120)  
## BuurtWestelijke eilanden                      -7,478.560 (16,531.650)  
## BuurtWesterdokseiland                        -13,114.810 (15,025.260)  
## BuurtWestergasfabriek                         -1,686.973 (19,624.720)  
## BuurtWesterstaatsman                          -6,785.331 (13,240.610)  
## BuurtWestlandgrachtbuurt                     -14,308.750* (8,310.584)  
## BuurtWeteringbuurt                            -6,079.378 (12,600.200)  
## BuurtWG-terrein                                2,112.995 (9,052.888)   
## BuurtWielingenbuurt                          -17,555.950 (12,514.190)  
## BuurtWildeman                                -26,263.310 (17,658.100)  
## BuurtWillemsparkbuurt Noord                   2,716.918 (10,455.200)   
## BuurtWillibrordusbuurt                        -2,460.136 (8,715.516)   
## BuurtWittenburg                              -10,007.930 (12,385.250)  
## BuurtWoon- en Groengebied Sloterdijk         -18,757.580 (25,720.690)  
## BuurtZaagpoortbuurt                           -8,867.990 (16,795.760)  
## BuurtZamenhofstraat e.o.                     -18,760.870 (57,239.360)  
## BuurtZeeburgerdijk Oost                      -10,445.620 (40,747.330)  
## BuurtZeeburgereiland Noordoost                14,361.180 (33,868.850)  
## BuurtZeeburgereiland Noordwest                -9,004.920 (29,499.000)  
## BuurtZeeburgereiland Zuidoost                 10,913.910 (56,943.330)  
## BuurtZeeburgereiland Zuidwest                -19,883.710 (20,474.900)  
## BuurtZeeheldenbuurt                          -13,671.810 (15,397.250)  
## BuurtZorgvlied                               -32,237.310 (34,529.890)  
## BuurtZuidas Noord                            -15,548.150 (29,352.630)  
## BuurtZuidas Zuid                              -5,044.906 (17,472.730)  
## BuurtZuiderhof                               -23,179.270 (56,833.920)  
## BuurtZuiderkerkbuurt                          -3,648.583 (14,898.720)  
## BuurtZuidoostkwadrant Indische buurt         -14,939.670 (10,191.320)  
## BuurtZuidwestkwadrant Indische buurt         -10,862.380 (10,233.330)  
## BuurtZuidwestkwadrant Osdorp Noord           -19,933.180 (20,170.920)  
## BuurtZuidwestkwadrant Osdorp Zuid            -18,923.320 (11,841.530)  
## BuurtZunderdorp                               -6,302.578 (23,623.350)  
## host_is_superhostf                            15,894.890 (28,426.430)  
## host_is_superhostt                            12,785.280 (28,446.140)  
## room_typePrivate room                       -11,802.700*** (1,265.647) 
## room_typeShared room                          -4,815.139 (7,260.233)   
## property_typeApartment                       43,082.970*** (7,248.714) 
## property_typeBarn                             28,276.010 (29,530.720)  
## property_typeBed and breakfast               33,624.280*** (7,791.096) 
## property_typeBoat                            50,957.830*** (7,908.296) 
## property_typeBoutique hotel                  32,061.260** (13,169.310) 
## property_typeBungalow                         29,639.260 (21,991.320)  
## property_typeCabin                           35,468.430** (17,122.620) 
## property_typeCamper/RV                        24,537.630 (43,112.450)  
## property_typeCampsite                         42,959.310 (42,505.690)  
## property_typeCasa particular (Cuba)          48,560.900* (26,281.760)  
## property_typeCastle                           54,056.800 (56,916.680)  
## property_typeChalet                           24,922.210 (33,757.540)  
## property_typeCondominium                     44,622.510*** (7,902.195) 
## property_typeCottage                         32,902.680* (18,750.250)  
## property_typeEarth house                      19,470.900 (56,588.910)  
## property_typeGuest suite                     37,283.640*** (8,643.844) 
## property_typeGuesthouse                     45,351.830*** (11,713.960) 
## property_typeHostel                           35,166.730 (30,552.600)  
## property_typeHotel                            19,776.940 (24,296.300)  
## property_typeHouse                           42,527.720*** (7,454.560) 
## property_typeHouseboat                       46,622.500*** (8,233.094) 
## property_typeLighthouse                       9,220.159 (58,873.520)   
## property_typeLoft                            43,902.860*** (7,771.848) 
## property_typeNature lodge                     22,368.520 (60,601.140)  
## property_typeOther                          38,507.460*** (10,896.820) 
## property_typeServiced apartment              18,652.390* (10,220.100)  
## property_typeTent                             59,151.250 (59,422.320)  
## property_typeTiny house                      50,278.150* (26,448.660)  
## property_typeTownhouse                       45,168.620*** (7,616.014) 
## property_typeVilla                          41,410.970*** (12,796.160) 
## bed_typeCouch                                 2,500.072 (31,781.630)   
## bed_typeFuton                                  -852.284 (16,890.650)   
## bed_typePull-out Sofa                          109.502 (15,404.320)    
## bed_typeReal Bed                              3,544.437 (14,644.340)   
## minimum_nights                                  -98.868*** (32.544)    
## dist.museum                                       -0.187 (3.144)       
## dist.supermarkets                                 -3.132 (4.352)       
## Unescowithin                                  -1,981.182 (9,288.370)   
## dist.metro                                        -1.434 (4.236)       
## dist.plaza                                        -3.963 (3.786)       
## dist.nightclub                                     4.178 (3.563)       
## dist.beach                                        -4.586 (3.441)       
## dist.parks                                                             
## name.brightnot bright                         -1,759.521 (1,726.801)   
## name.spaciousspacious                        4,269.843*** (1,213.611)  
## name.luxurynot luxury                         -3,080.513 (1,989.613)   
## amenities.number                                 -40.087 (52.228)      
## lagRevenue                                       -0.078*** (0.016)     
## Constant                                    465,051.500 (7,856,334.000)
## -----------------------------------------------------------------------
## Observations                                          19,980           
## R2                                                     0.072           
## Adjusted R2                                            0.046           
## Residual Std. Error                           55,927.570 (df = 19434)  
## F Statistic                                 2.776*** (df = 545; 19434) 
## =======================================================================
## Note:                                       *p<0.1; **p<0.05; ***p<0.01
# histogram of the absolute percentage error (APE) over the whole test set
ggplot(annualrev_predict_test,
         aes(x=APE)) + 
  labs(title = "APE Distribution",caption = "Figure XX. A histogram of APE") +
  geom_histogram()+
  plotTheme()

# same histogram, restricted to listings with APE < 1.5 to drop extreme errors
ggplot(annualrev_predict_test %>%
        filter(APE<1.5),
         aes(x=APE)) + 
  labs(title = "APE Distribution",caption = "Figure XX. A histogram of APE") +
  geom_histogram()+
  plotTheme()

  • Plot error by neighborhood (Buurt)
# mean APE per Buurt (listings with APE >= 1.5 excluded), joined to the Buurt polygons and mapped
annualrev_predict_test %>%
  filter(APE < 1.5) %>%
  group_by(Buurt) %>%
  summarize(mean.APE = mean(APE, na.rm = TRUE)) %>%
  ungroup() %>% 
  left_join(neighbor2, by = "Buurt") %>%
  st_sf() %>%
  ggplot() + 
    geom_sf(aes(fill = mean.APE), colour = 'transparent') +
    scale_fill_gradient(low = palette5[1], high = palette5[5],
                        name = "MAPE") +
    labs(title = "Mean test set MAPE by Buurt",
         subtitle = "2019") +
    mapTheme

Compared to approach 1, this approach generalizes better across space. Fewer neighborhoods have a high MAPE, and those that do are scattered along the outskirts of the city.
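To make the quantity on the map concrete, here is a minimal sketch of how a mean APE per neighborhood can be computed. It assumes APE is the absolute error divided by the observed value, which is the usual definition but is not restated in this section. The data frame `toy`, its neighborhood labels and its revenue values are made up for illustration.

```r
# Toy data: observed and predicted annual revenue for four listings in two
# hypothetical neighborhoods (all values are invented for illustration).
toy <- data.frame(
  Buurt     = c("A", "A", "B", "B"),
  observed  = c(100, 200, 100, 400),
  predicted = c(110, 180, 150, 300)
)

# Absolute percentage error per listing (assumed definition: |obs - pred| / obs)
toy$APE <- abs(toy$observed - toy$predicted) / toy$observed

# Mean APE (MAPE) per neighborhood
mape_by_buurt <- tapply(toy$APE, toy$Buurt, mean)
print(mape_by_buurt)
```

In this toy example neighborhood B has the higher MAPE because one cheap listing is badly over-predicted, which is the same effect that motivates filtering out extreme APE values before mapping.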

5.3 Cross Validation (for annual revenue prediction only)

We use k-fold cross-validation (k = 20) and compare the full model against a baseline regression to see how much the added features improve the prediction.
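As a rough illustration of what k-fold cross-validation does, the sketch below runs it by hand in base R. It is not the code used in this report: the synthetic data frame `toy`, the model `y ~ x` and the choice of k = 5 are assumptions made only to keep the example small. The mean and standard deviation of the per-fold MAE play the same role as the MAE and MAESD columns that caret reports.

```r
set.seed(1)

# Synthetic data (an assumption for illustration, not the Airbnb data)
n     <- 200
toy   <- data.frame(x = runif(n, 0, 10))
toy$y <- 3 + 2 * toy$x + rnorm(n)

k <- 5
# Randomly assign each row to one of k folds of equal size
folds <- sample(rep(seq_len(k), length.out = n))

# For each fold: fit on the other k - 1 folds, predict the held-out fold,
# and record the mean absolute error on that fold
fold_mae <- sapply(seq_len(k), function(i) {
  train_set <- toy[folds != i, ]
  test_set  <- toy[folds == i, ]
  fit  <- lm(y ~ x, data = train_set)
  pred <- predict(fit, newdata = test_set)
  mean(abs(test_set$y - pred))
})

cv_mae   <- mean(fold_mae)  # average error across folds
cv_maesd <- sd(fold_mae)    # how much the error varies from fold to fold
print(c(MAE = cv_mae, MAESD = cv_maesd))
```

A low MAESD relative to the MAE suggests the model performs similarly on every held-out fold, which is what we look for when judging generalizability.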

5.3.1 Standard k-fold cross-validation

# calculate annual revenue and join it back to details.sf
annualrevenue.raw <- revenue_panel %>% 
  group_by(id) %>% 
  summarise(annual_revenue = sum(revenue)) %>% 
  dplyr::select(id, annual_revenue)

annualrevenue <- left_join(details.sf,annualrevenue.raw,by = "id")
annualrevenue <- merge(annualrevenue, listing.sf.neighbor2[c("id", "Buurt")], by = "id")

annualrevenue$Buurt <- tidyr::replace_na(annualrevenue$Buurt, "NA")
# use caret package cross-validation method
fitControl <- trainControl(method = "cv", 
                           number = 20,
                           # savePredictions differs from book
                           savePredictions = TRUE)

set.seed(856)

# run the regression using k-fold CV

# annual revenue
reg.cv.revenue <- 
  train(annual_revenue ~ ., data = st_drop_geometry(annualrevenue) %>% 
    dplyr::select(annual_revenue,beds, bedrooms, bathrooms, accommodates,
                              pool, parking, kitchen, AC, fireplace,
                              Buurt,host_is_superhost,
                              room_type,property_type,bed_type,
                              minimum_nights,dist.museum,dist.supermarkets,
                              Unesco,dist.metro,dist.plaza, dist.nightclub,
                              dist.beach, dist.parks,
                              name.bright, name.spacious,name.luxury,
                              amenities.number
                  )%>%
      na.omit(), 
     method = "lm", 
     trControl = fitControl, 
     na.action = na.pass)

revenue.cv.MAE <- reg.cv.revenue$results$MAE
revenue.cv.MAESD <- reg.cv.revenue$results$MAESD

revenue.cvtable <- matrix(ncol = 2, c(revenue.cv.MAE, revenue.cv.MAESD), byrow = F)
rownames(revenue.cvtable) <- "Value"
colnames(revenue.cvtable) <- c("MAE", "MAESD")

revenue.cvtable %>% 
  kable(caption = "Table of MAE & MAESD for k-fold cross-validation (annual revenue)") %>%
  kable_styling("striped", full_width = F)

Table of MAE & MAESD for k-fold cross-validation (annual revenue)

        MAE        MAESD
Value   25666.97   1565.045

reg.cv.revenue.base <- 
  train(annual_revenue ~ ., data = st_drop_geometry(annualrevenue) %>% 
    dplyr::select(annual_revenue,beds, bedrooms, bathrooms, accommodates)%>%
      na.omit(), 
     method = "lm", 
     trControl = fitControl, 
     na.action = na.pass)

revenue.cv.MAE <- reg.cv.revenue.base$results$MAE
revenue.cv.MAESD <- reg.cv.revenue.base$results$MAESD

revenue.cvtable <- matrix(ncol = 2, c(revenue.cv.MAE, revenue.cv.MAESD), byrow = F)
rownames(revenue.cvtable) <- "Value"
colnames(revenue.cvtable) <- c("MAE", "MAESD")

revenue.cvtable %>% 
  kable(caption = "Table of MAE & MAESD for k-fold cross-validation (annual revenue. base)") %>%
  kable_styling("striped", full_width = F)

Table of MAE & MAESD for k-fold cross-validation (annual revenue. base)

        MAE        MAESD
Value   26091.55   1132.515

Adding the new features lowers the MAE for annual revenue (from about 26,092 to 25,667), but it also increases the MAESD (from about 1,133 to 1,565).

# price
reg.cv.price <- 
  train(price ~ ., data = st_drop_geometry(annualrevenue) %>% 
    dplyr::select(price,beds, bedrooms, bathrooms, accommodates,
                              pool, parking, kitchen, AC, fireplace,
                              Buurt,host_is_superhost,
                              room_type,property_type,bed_type,
                              minimum_nights,dist.museum,dist.supermarkets,
                              Unesco,dist.metro,dist.plaza, dist.nightclub,
                              dist.beach, dist.parks,
                              name.bright, name.spacious,name.luxury,
                              amenities.number
                  )%>%
      na.omit(), 
     method = "lm", 
     trControl = fitControl, 
     na.action = na.pass)

price.cv.MAE <- reg.cv.price$results$MAE
price.cv.MAESD <- reg.cv.price$results$MAESD

price.cvtable <- matrix(ncol = 2, c(price.cv.MAE, price.cv.MAESD), byrow = F)
rownames(price.cvtable) <- "Value"
colnames(price.cvtable) <- c("MAE", "MAESD")

price.cvtable %>% 
  kable(caption = "Table of MAE & MAESD for k-fold cross-validation (price)") %>%
  kable_styling("striped", full_width = F)

Table of MAE & MAESD for k-fold cross-validation (price)

        MAE        MAESD
Value   43.45604   2.066383

reg.cv.price.base <- 
  train(price ~ ., data = st_drop_geometry(annualrevenue) %>% 
    dplyr::select(price,beds, bedrooms, bathrooms, accommodates), 
     method = "lm", 
     trControl = fitControl, 
     na.action = na.pass)


price.cv.MAE <- reg.cv.price.base$results$MAE
price.cv.MAESD <- reg.cv.price.base$results$MAESD

price.cvtable <- matrix(ncol = 2, c(price.cv.MAE, price.cv.MAESD), byrow = F)
rownames(price.cvtable) <- "Value"
colnames(price.cvtable) <- c("MAE", "MAESD")

price.cvtable %>% 
  kable(caption = "Table of MAE & MAESD for k-fold cross-validation (price.base)") %>%
  kable_styling("striped", full_width = F)

Table of MAE & MAESD for k-fold cross-validation (price.base)

        MAE        MAESD
Value   48.94492   1.541473

Adding the new features lowers the MAE for price (from about 48.94 to 43.46), but, as with annual revenue, the MAESD increases (from about 1.54 to 2.07).
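
To put the MAE differences on a common scale, the relative improvement over each baseline can be computed from the four tables above. This is a minimal sketch; the data frame `mae` and the column `improvement_pct` are names introduced here for illustration only.

```r
# MAE values copied from the four cross-validation tables above.
mae <- data.frame(
  target   = c("annual revenue", "price"),
  baseline = c(26091.55, 48.94492),
  full     = c(25666.97, 43.45604)
)

# Relative improvement: share of the baseline MAE removed by the added features.
mae$improvement_pct <- round(100 * (mae$baseline - mae$full) / mae$baseline, 2)
mae
# improvement_pct: 1.63 for annual revenue, 11.21 for price
```

On these numbers, the added features reduce the revenue MAE by under 2% and the price MAE by about 11%.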

reg.cv.revenue.resample <- reg.cv.revenue$resample
reg.cv.revenue.base.resample <- reg.cv.revenue.base$resample
reg.cv.price.resample <- reg.cv.price$resample
reg.cv.price.base.resample <- reg.cv.price.base$resample


var_list <- list()

var_list[[1]] <- ggplot(reg.cv.revenue.resample, aes(x=MAE)) + geom_histogram(color = "grey30", fill = "#4757A2", bins = 50) + 
  ylim(0,4)+xlim(22000,30000)+
  labs(title="Histogram of Mean Absolute Error Across 20 Folds, Revenue") +
  plotTheme()

var_list[[2]] <- ggplot(reg.cv.revenue.base.resample, aes(x=MAE)) + geom_histogram(color = "grey30", fill = "#4757A2", bins = 50) + 
  ylim(0,4)+xlim(22000,30000)+
  labs(title="Histogram of Mean Absolute Error Across 20 Folds, Revenue Baseline") +
  plotTheme()

var_list[[3]] <- ggplot(reg.cv.price.resample, aes(x=MAE)) + geom_histogram(color = "grey30", fill = "#4757A2", bins = 50) + 
  ylim(0,4)+xlim(37,56)+
  labs(title="Histogram of Mean Absolute Error Across 20 Folds, Price") +
  plotTheme()

var_list[[4]] <- ggplot(reg.cv.price.base.resample, aes(x=MAE)) + geom_histogram(color = "grey30", fill = "#4757A2", bins = 50) + 
  ylim(0,4)+xlim(37,56)+
  labs(title="Histogram of Mean Absolute Error Across 20 Folds, Price Baseline") +
  plotTheme()

do.call(grid.arrange,c(var_list, ncol = 2, top = "Histogram of MAEs"))

The histograms above support these conclusions: the regressions with the new features perform better than the baselines, as their MAE distributions shift to the left (toward lower values).

6. Discussion

6.1 How does our analysis meet the use case we set out to address?

The goal of the algorithm is to predict the direct economic income that a new Airbnb listing can bring back to the community. We also want to inform residents about how the occupancy rate changes over time, so that they know when visitors will be staying in the neighborhood. Generally speaking, our algorithm succeeded in predicting revenue with an acceptable error of around 38%, and the errors occur mainly on the fringe of Amsterdam, where Airbnb density is lower and outliers are concentrated. However, we do not predict the occupancy rate very well, probably because more subjective data, such as ratings and comments, are not included.

6.2 How to improve the model?

Overall, our algorithm is not accurate enough, and our predictions of price are better than those of occupancy and revenue. That is mainly because occupancy is hard to predict without previous data (time lag). One way to predict occupancy is to forecast next week's or next month's occupancy from the current week's or month's, and to repeat this for a whole year. This approach seems better than ours because occupancy is strongly related to time, not only space. However, it does not fit our use case, since we have to predict the annual revenue of a new listing: we can neither obtain its previous occupancy nor predict it month by month. To improve our algorithm while ensuring that it still works for our use case, we suggest trying the following approaches:

  • Clean the data

There are quite a few outliers in the data set. Some listings have extremely high prices in certain months with zero occupancy. We excluded some of these outliers but not all of them, and the remaining ones account for some extremely large errors. To further improve our model, we will try to find what these outliers have in common and remove them.

  • Include more features

As we mentioned before, occupancy is more volatile than price and depends less on physical features. Price itself may also influence occupancy. Because we did not find crime and population data at a smaller geography, we did not test generalizability across different socio-economic contexts, which are also likely to influence price and occupancy.

  • Use other regressions

During our research, we found that the relationship between some features and the dependent variable is not linear. We tried logarithmic and reciprocal transformations of the variables but did not make much progress. In the Airbnb prediction cases that we reviewed, some other models, such as XGBoost and Random Forest, perform better than OLS. Using these models might also improve our predictions.
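
The "Clean the data" suggestion above could start from a simple rule that flags listing-months combining an extreme price with zero occupancy. The sketch below is only one possible rule, not the filter used in this report: the toy `panel` data frame, its column names, and the choice of a Tukey fence (Q3 + 1.5 × IQR) as the price threshold are all assumptions made for illustration.

```r
# Toy monthly panel; all names and values are hypothetical.
panel <- data.frame(
  id        = rep(1:4, each = 3),
  month     = rep(1:3, times = 4),
  price     = c(120, 125, 130,  90, 95, 2500,  150, 155, 160,  80, 3000, 85),
  occupancy = c(0.6, 0.7, 0.5,  0.4, 0.5, 0,   0.8, 0.7, 0.9,  0.3, 0,   0.4)
)

# Tukey fence: prices above Q3 + 1.5 * IQR count as extreme.
q <- quantile(panel$price, c(0.25, 0.75))
upper_fence <- q[[2]] + 1.5 * (q[[2]] - q[[1]])

# Flag listing-months that combine an extreme price with zero occupancy.
panel$suspect <- panel$price > upper_fence & panel$occupancy == 0

# Inspect the flagged rows before dropping them.
panel[panel$suspect, c("id", "month", "price", "occupancy")]
cleaned <- panel[!panel$suspect, ]
```

Inspecting the flagged rows before removing them makes it easier to look for what the outliers have in common, for example a shared host, property type, or season.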


Bibliography

amsterdam attractions | http://tour-pedia.org/about/datasets.html