Housing prices in the United States vary greatly depending on the region. Factors such as supply and demand, population growth, job opportunities, and local economic conditions all play a role in determining the cost of housing in a given area. In general, housing prices tend to be higher in urban areas and on the coasts, while rural areas and the Midwest have lower housing costs. The West Coast, particularly California, is notorious for having some of the highest housing prices in the country. San Francisco, Los Angeles, and San Diego are among the most expensive cities in the country, with median home prices well above 1 million dollars. The tech industry has driven up housing costs in the Bay Area, while entertainment and tourism play a significant role in Southern California’s high housing costs. New York City is another region with very high housing prices, driven by its status as a major financial center and a global hub of commerce and culture. The city’s limited space and high demand for housing have resulted in sky-high prices, with median home prices over 500,000 dollars. The suburbs around New York City, such as Westchester County and Long Island, also have relatively high housing costs due to their proximity to the city. In contrast, the Midwest generally has lower housing costs, with many cities offering affordable housing options. Cities such as Cleveland, Detroit, and Milwaukee have median home prices below 200,000 dollars, making them attractive options for first-time homebuyers and those looking for more affordable living. In addition, the Southern states, such as Texas, Florida, and Georgia, offer a range of affordable housing options in both urban and rural areas.
Overall, housing prices in the United States vary greatly by region. While some areas, such as the West Coast and New York City, have notoriously high housing costs, other regions offer more affordable options. Individuals and families should weigh their budget, job prospects, and other financial factors when deciding where to live.
Tracking trends in housing prices in the United States is an important task for individuals, businesses, and policymakers alike. This information can provide valuable insights into the health of the housing market and the broader economy, as well as help inform important financial decisions. One key reason for tracking housing price trends is to identify potential investment opportunities. Real estate investors can use this information to identify areas where housing prices are rising, indicating that there is likely to be strong demand for housing in the area. Conversely, investors can avoid areas where housing prices are falling, which may indicate that there is oversupply or weak demand for housing. Housing price trends can also provide important insights into the broader economy. In general, rising housing prices can indicate that the economy is strong, with low unemployment, strong consumer confidence, and a healthy housing market. Falling housing prices, on the other hand, can be a sign of economic weakness, with high unemployment, low consumer confidence, and a weak housing market. For policymakers, housing price trends can help inform decisions related to economic policy and regulation. For example, if housing prices are rising rapidly, policymakers may need to take steps to prevent a housing bubble, such as increasing interest rates or tightening lending standards. On the other hand, if housing prices are falling, policymakers may need to take steps to stimulate demand for housing, such as lowering interest rates or providing incentives for homebuyers. Finally, tracking housing price trends is important for individuals and families looking to buy or sell a home. By understanding how housing prices are trending in their local market, individuals can make informed decisions about when to buy or sell a home, and at what price. This can help them maximize their financial returns and make the most of their investment. 
Overall, tracking trends in housing prices in the United States is a crucial task with far-reaching implications for investors, policymakers, and individuals alike. By staying up-to-date on housing market trends, stakeholders can make informed decisions and take advantage of opportunities in the real estate market.
Sociological Forum, Vol. 32, No. 4, December 2017 DOI: 10.1111/socf.12378 © 2017 Eastern Sociological Society
This paper analyzes sociological and political factors that lead to differences in real estate prices. Typically, people focus on factors such as the size of the house, bedrooms, bathrooms, finished basements, location, and orientation of the house (north, south, etc.); this paper looks at less tangible factors like racism, segregation, and gentrification. Real Estate Agents (REAs) play a disproportionate role in determining the layout and makeup of neighborhoods: “While a large body of evidence suggests that REAs impact individual-level decisions in the search for housing, the mechanisms by which they do so are rarely defined and are generally based on dated data.” REAs tend to be concentrated in more affluent and white areas, where they tend to upsell houses and inflate property values through expensive sales. Consequently, Latino and Black communities, which have fewer REAs, see less of a benefit from inflated property values. For systemic and historic reasons, the housing market does not value Latino and Black neighborhoods as much as white ones.
DOI: 10.1111/ijsw.12425 Int J Soc Welfare 2020: 29: 321–334
This article looks at how positive environmental change in neighborhoods leads to gentrification (read: a rise in property values) and then pushes out lower-income and BIPOC residents. “For example, environmental gentrification refers to situations in which the cleanup of contaminated land or the installation of environmental amenities intentionally or unintentionally catalyzes increased housing costs, thereby contributing to the displacement of vulnerable residents.” The paper discusses the responsibility neoliberals and city planners have to keep social and economic equality in focus when revitalizing areas environmentally, since the resulting increase in housing values in lower-income areas pushes out lower-income residents.
Department of Economics, Carleton College, One North College St, Northfield, MN 55057. Journal of Regional Science, VOL. 55, NO. 4, 2015, pp. 644–670
This paper discusses the negative relationship between traffic noise and real estate prices. “The noise level at which such effects are observed does not have to be high. It has been shown that people exposed to traffic noise with a 24-hour average of 55 decibels (dBA) are found to be at a higher risk for hypertension (Barregard, Bonde, and Öhrström, 2009; Bodin et al., 2009), and those exposed to 60 dBA or greater are found to be at a higher risk for stroke (Sørensen et al., 2011).” The paper investigated about 40,000 households in the Twin Cities, MN, and correlated housing values with decibel levels recorded by noise recording machines. It uses a standardized noise model equation that the state of Minnesota uses to calculate noise levels. Additionally, the authors obtained aircraft routes and schedules for Twin Cities airspace to find the average noise level contributed by aircraft.
The Independent Review: VOLUME 13, NUMBER 4, SPRING 2009
This paper argues that unnecessary land-use restrictions led to a shortage of affordable housing (read: real estate was too expensive because land availability was artificially restricted). Mills argues that the government should ease restrictions and take a backseat on mortgage support to prevent excessive risk taking. The paper argues that rising housing prices cannot be attributed to construction costs, which have been falling consistently since 1980 due to better and cheaper materials, increased use of machinery, increased employment of migrant labor, and a shift from on-site construction to prefabricated parts made at a factory. Overall, the paper argues that outdated and confusing zoning regulations lead to huge initial development costs, urban sprawl, and a lack of easy and affordable housing development.
We are using the IPUMS USA Database to find economic data on American households.
This database contains the individual responses to each US census survey question. The data set we compiled has 1,560,255 observations across 31 variables, all from the year 2021. Our extract is organized by household, and our variable of interest is VALUEH, a numerical variable that records respondent households' estimates of the value of their home. To find the partial effects that drive housing values, we will regress VALUEH on many variables to identify the relevant drivers. Below is an overview of the variables of interest.
Variable Name | Variable Description |
---|---|
BUILTYR2 | The age of the structure |
CITYPOP | City population |
DENSITY | Population Density |
FARM | Is the property a farm: 1=non-farm, 2=farm |
HHINCOME | Household Income |
I(NFAMS**2) | Number of families living inside a household squared |
I(ROOMS**2) | Number of rooms in a house squared |
METRO | Metropolitan status of household: 0=metropolitan status indeterminable, 1= not in metropolitan area, 2= In center/principal city, 3= Not in central/principal city, 4= central/principal city status indeterminable |
NFAMS | Number of families living inside a household |
ROOMS | Number of rooms in a house |
UNITSSTR | Number of housing units in each structure |
MULTGEN | Number of generations in that household |
# import packages
import wooldridge as woo
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from stargazer.stargazer import Stargazer
from IPython.core.display import HTML
import statsmodels.formula.api as smf
from statsmodels.stats.outliers_influence import variance_inflation_factor
import scipy.stats as sci
import statsmodels.stats.api as sms
from linearmodels.iv import IV2SLS
# import data set
housing = pd.read_csv(r"C:\Users\jerry\OneDrive - Emory University\AtlantaJunior\Spring 2023\ECON 320\LAB\Data\usa_00002.csv")
Initially, our data set has roughly 1.5 million observations, so we filter out abnormal values that may affect our overall analyses.
# filtering and cleaning strange observations:
# drop rows where VALUEH is 0 or one of the sentinel codes 9999998 / 9999999
housing1 = housing[~housing['VALUEH'].isin([0, 9999998, 9999999])]
# listing some of our variables of interest
varofint = housing1[["VALUEH", "METRO", "HHINCOME", "NFAMS", "ROOMS", "BUILTYR2"]]
varofint.corr()
sns.pairplot(varofint)
[Figure: pair plot of VALUEH, METRO, HHINCOME, NFAMS, ROOMS, and BUILTYR2]
sns.barplot(data=housing1, x='MULTGEN',y='VALUEH').set(xlabel='Number of Generations', ylabel='Value (USD)', title='Housing Values Based on Number of Generations')
[Figure: bar plot titled "Housing Values Based on Number of Generations"; x-axis: Number of Generations, y-axis: Value (USD)]
From the figure above, we see that the housing value is generally higher as the number of generations living in that house increases.
sns.countplot(data=housing1, y='METRO').set(title = 'Count of Houses Based on Metro Status')
[Figure: count plot titled "Count of Houses Based on Metro Status"]
From the figure above, houses are commonplace in metropolitan areas, but are not necessarily within the central/principal city.
sns.set(rc={'figure.figsize':(12,8)})
sns.heatmap(pd.crosstab(housing1['FARM'], housing1['METRO'], values=housing1['VALUEH'], aggfunc='mean')).set(title='Average Housing Value Based on Farm and Metro Status')
[Figure: heatmap titled "Average Housing Value Based on Farm and Metro Status"]
sns.heatmap(pd.crosstab(housing1['MORTGAGE'], housing1['METRO'], values=housing1['VALUEH'], aggfunc='mean')).set(title='Average Housing Value Based on Mortgage and Metro Status')
[Figure: heatmap titled "Average Housing Value Based on Mortgage and Metro Status"]
In both heatmaps, houses in metropolitan areas are generally higher in value regardless of mortgage or farm status. Interestingly, the most valuable houses do not necessarily sit in the principal/central city, which is contrary to our initial belief that houses within the central city are valued the most.
Below is a preliminary regression of the log housing value on our variables of interest.
initreg = smf.ols(formula= 'np.log(VALUEH) ~ METRO + HHINCOME + NFAMS + ROOMS + BUILTYR2 + FARM', data= housing1).fit()
We include these variables and our rationale behind them:
Year Built: The recency of a house’s build date should increase its value
City Population, Population Density: These Population indicators were included to see if larger cities correlate with more valuable homes
Farm: This variable tries to capture the value of a farm
Number of Families, NFAMS^2: This variable should show that homes with larger capacities have more value. It could be confounded by multigenerational housing trends in lower-income, immigrant communities
Rooms, Rooms^2: This should be a major determinant of value, as each additional room adds living space. It is a good stand-in for area
Housing Units: This is a metric for homes that are rented, as buildings with multifamily capabilities can fit more people and thus increase in value
Metro Category (treatment of smallest metro): This Categorical variable should capture some of the effect of location on a house’s value, as we hypothesize that larger cities / towns have higher demand for homes, increasing valuations
Looking at our initial regression on the equation:
$$ \log(VALUEH) = \beta_0 + \beta_1*METRO + \beta_2*HHINCOME + \beta_3*NFAMS + \beta_4*ROOMS + \beta_5*BUILTYR2 + \beta_6*FARM $$
We find that the variables we chose have partial effects, but the model suffers from a low R^2 and from multicollinearity, so we must refine, recode, and clean our data before drawing any tangible conclusions from our statistical analyses. Our initial model likely suffers from misspecification, omitted variables, and significance errors, which we try to improve on below. We will address the misspecification by creating non-linear (quadratic) regressors, mainly for the variables NFAMS and ROOMS.
In addition, we are incorporating other regressors like CITYPOP and DENSITY, as they may contribute to the housing value. One variable that might be important is the square footage of the house. This can be a key predictor, especially in metro areas, since price per square foot is consistently priced in these areas. Another is room size, since rooms are not uniform in size. For example, a master bedroom is considerably larger than an office. These two variables are not included in our data set, so we cannot incorporate them in our model.
Finally, given that several of our regressors are categorical, we recode them so the model can distinguish each categorical value. We will run a joint significance hypothesis test where the null hypothesis is that each of the betas is zero, implying the regressors do not play a significant role in determining the value of the house. The alternative hypothesis is that at least one of the regressors plays a significant role in determining the value of the house.
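The categorical recoding and the joint significance test described above can be sketched as follows. This is a minimal illustration on synthetic data under an assumed data-generating process; the `toy` frame and the `LOGVALUE` column are hypothetical and are not part of the IPUMS extract. `C(METRO)` asks the formula interface to expand METRO into one dummy per category, and the restriction matrix `R` tests all METRO dummies jointly.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in data (assumption: NOT the IPUMS extract).
rng = np.random.default_rng(1)
n = 2_000
toy = pd.DataFrame({
    "METRO": rng.integers(0, 5, size=n),
    "ROOMS": rng.integers(1, 12, size=n),
})
# Hypothetical data-generating process for the log home value.
toy["LOGVALUE"] = (
    10 + 0.1 * toy["ROOMS"] + 0.3 * (toy["METRO"] == 2) + rng.normal(0, 1, size=n)
)

# C(METRO) creates one dummy per category, with METRO = 0 as the baseline.
fit = smf.ols("LOGVALUE ~ C(METRO) + ROOMS", data=toy).fit()

# Overall F-test: H0 says every slope coefficient is zero.
print("overall F:", fit.fvalue, "p-value:", fit.f_pvalue)

# Joint test of the METRO dummies only, via a restriction matrix R (H0: R @ beta = 0).
metro_terms = [name for name in fit.params.index if name.startswith("C(METRO)")]
R = np.zeros((len(metro_terms), len(fit.params)))
for row, name in enumerate(metro_terms):
    R[row, fit.params.index.get_loc(name)] = 1.0
joint = fit.f_test(R)
joint_f = float(np.squeeze(joint.fvalue))
joint_p = float(np.squeeze(joint.pvalue))
print("METRO dummies F:", joint_f, "p-value:", joint_p)
```

A small p-value on either test rejects the corresponding null that the tested coefficients are all zero.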
runreg = smf.ols(formula= 'np.log(VALUEH) ~ METRO + HHINCOME + NFAMS + ROOMS +'
'I(ROOMS**2) + BUILTYR2 + FARM', data = housing1).fit()
r3 = smf.ols(formula= 'np.log(VALUEH) ~ METRO + HHINCOME + NFAMS + I(NFAMS**2) + ROOMS +'
'I(ROOMS**2) + BUILTYR2 + FARM', data = housing1).fit()
r4 = smf.ols(formula= 'np.log(VALUEH) ~ METRO + HHINCOME + NFAMS + I(NFAMS**2) + ROOMS +'
'I(ROOMS**2) + BUILTYR2 + FARM + UNITSSTR', data = housing1).fit()
r5 = smf.ols(formula= 'np.log(VALUEH) ~ METRO + HHINCOME + NFAMS + I(NFAMS**2) + ROOMS + '
'I(ROOMS**2) + BUILTYR2 + FARM + UNITSSTR + DENSITY', data = housing1).fit()
r6 = smf.ols(formula= 'np.log(VALUEH) ~ METRO + HHINCOME + NFAMS + I(NFAMS**2) + ROOMS + '
'I(ROOMS**2) + BUILTYR2 + C(FARM) + UNITSSTR + DENSITY + CITYPOP', data = housing1).fit()
i2s = Stargazer([initreg, runreg, r3, r4, r5, r6])
HTML(i2s.render_html())
Dependent variable: np.log(VALUEH)

Variable | (1) | (2) | (3) | (4) | (5) | (6) |
---|---|---|---|---|---|---|
BUILTYR2 | 0.028*** (0.000) | 0.029*** (0.000) | 0.029*** (0.000) | 0.032*** (0.000) | 0.038*** (0.000) | 0.038*** (0.000) |
C(FARM)[T.2] | | | | | | 0.196*** (0.008) |
CITYPOP | | | | | | -0.000*** (0.000) |
DENSITY | | | | | 0.000*** (0.000) | 0.000*** (0.000) |
FARM | 0.132*** (0.009) | 0.131*** (0.009) | 0.131*** (0.009) | 0.150*** (0.008) | 0.196*** (0.008) | |
HHINCOME | 0.000*** (0.000) | 0.000*** (0.000) | 0.000*** (0.000) | 0.000*** (0.000) | 0.000** (0.000) | 0.000** (0.000) |
I(NFAMS ** 2) | | | -0.026*** (0.002) | -0.026*** (0.002) | -0.022*** (0.002) | -0.022*** (0.002) |
I(ROOMS ** 2) | | -0.004*** (0.000) | -0.004*** (0.000) | -0.007*** (0.000) | -0.008*** (0.000) | -0.008*** (0.000) |
Intercept | 10.839*** (0.011) | 10.618*** (0.012) | 10.526*** (0.014) | 9.628*** (0.014) | 9.647*** (0.014) | 9.845*** (0.011) |
METRO | 0.164*** (0.001) | 0.164*** (0.001) | 0.164*** (0.001) | 0.146*** (0.001) | 0.137*** (0.001) | 0.136*** (0.001) |
NFAMS | 0.062*** (0.004) | 0.063*** (0.004) | 0.180*** (0.009) | 0.179*** (0.008) | 0.138*** (0.008) | 0.138*** (0.008) |
ROOMS | 0.109*** (0.000) | 0.170*** (0.002) | 0.170*** (0.002) | 0.240*** (0.002) | 0.253*** (0.002) | 0.253*** (0.002) |
UNITSSTR | | | | 0.193*** (0.001) | 0.135*** (0.001) | 0.135*** (0.001) |
Observations | 933,339 | 933,339 | 933,339 | 933,339 | 933,339 | 933,339 |
R2 | 0.124 | 0.125 | 0.125 | 0.167 | 0.195 | 0.195 |
Adjusted R2 | 0.124 | 0.125 | 0.125 | 0.167 | 0.195 | 0.195 |
Residual Std. Error | 1.015 (df=933332) | 1.014 (df=933331) | 1.014 (df=933330) | 0.990 (df=933329) | 0.973 (df=933328) | 0.973 (df=933327) |
F Statistic | 22025.170*** (df=6; 933332) | 19088.069*** (df=7; 933331) | 16736.935*** (df=8; 933330) | 20811.403*** (df=9; 933329) | 22619.117*** (df=10; 933328) | 20579.510*** (df=11; 933327) |

Note: *p<0.1; **p<0.05; ***p<0.01. Standard errors in parentheses.
r1= smf.ols(formula= 'np.log(VALUEH) ~ C(METRO) + HHINCOME + NFAMS + ROOMS + '
'BUILTYR2 + FARM', data= housing1)
r2 = smf.ols(formula= 'np.log(VALUEH) ~ C(METRO) + HHINCOME + NFAMS + I(NFAMS**2) + ROOMS + '
'I(ROOMS**2) + BUILTYR2 + C(FARM) + UNITSSTR + DENSITY + CITYPOP', data = housing1)
initresults = r1.fit()
robinitresults = r1.fit(cov_type="HC3")
resr6 = r2.fit()
robr6 = r2.fit(cov_type="HC3")
i3s = Stargazer([initresults, robinitresults, resr6, robr6])
HTML(i3s.render_html())
Dependent variable: np.log(VALUEH)

Variable | (1) | (2) | (3) | (4) |
---|---|---|---|---|
BUILTYR2 | 0.033*** (0.000) | 0.033*** (0.000) | 0.038*** (0.000) | 0.038*** (0.000) |
C(FARM)[T.2] | | | 0.246*** (0.008) | 0.246*** (0.009) |
C(METRO)[T.1] | -0.073*** (0.004) | -0.073*** (0.004) | -0.069*** (0.004) | -0.069*** (0.004) |
C(METRO)[T.2] | 0.967*** (0.004) | 0.967*** (0.005) | 0.523*** (0.005) | 0.523*** (0.006) |
C(METRO)[T.3] | 0.655*** (0.003) | 0.655*** (0.003) | 0.538*** (0.003) | 0.538*** (0.003) |
C(METRO)[T.4] | 0.580*** (0.003) | 0.580*** (0.003) | 0.467*** (0.003) | 0.467*** (0.003) |
CITYPOP | | | -0.000*** (0.000) | -0.000*** (0.000) |
DENSITY | | | 0.000*** (0.000) | 0.000*** (0.000) |
FARM | 0.238*** (0.009) | 0.238*** (0.009) | | |
HHINCOME | -0.000 (0.000) | -0.000 (0.000) | 0.000 (0.000) | 0.000 (0.000) |
I(NFAMS ** 2) | | | -0.021*** (0.002) | -0.021*** (0.008) |
I(ROOMS ** 2) | | | -0.008*** (0.000) | -0.008*** (0.000) |
Intercept | 10.672*** (0.011) | 10.672*** (0.011) | 9.912*** (0.011) | 9.912*** (0.026) |
NFAMS | 0.036*** (0.004) | 0.036*** (0.005) | 0.124*** (0.008) | 0.124*** (0.032) |
ROOMS | 0.110*** (0.000) | 0.110*** (0.000) | 0.246*** (0.002) | 0.246*** (0.002) |
UNITSSTR | | | 0.129*** (0.001) | 0.129*** (0.001) |
Observations | 933,339 | 933,339 | 933,339 | 933,339 |
R2 | 0.165 | 0.165 | 0.208 | 0.208 |
Adjusted R2 | 0.165 | 0.165 | 0.208 | 0.208 |
Residual Std. Error | 0.991 (df=933329) | 0.991 (df=933329) | 0.965 (df=933324) | 0.965 (df=933324) |
F Statistic | 20452.078*** (df=9; 933329) | 20195.578*** (df=9; 933329) | 17500.132*** (df=14; 933324) | 15894.737*** (df=14; 933324) |

Note: *p<0.1; **p<0.05; ***p<0.01. Standard errors in parentheses.
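Columns (2) and (4) refit the same models with `cov_type="HC3"`, which leaves the coefficients unchanged and only replaces the standard errors with heteroskedasticity-robust ones. The sketch below illustrates that behaviour on synthetic data whose error spread grows with the regressor; the `toy` frame, the `LOGVALUE` column, and the data-generating process are assumptions for illustration, not the IPUMS extract.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in data (assumption: NOT the IPUMS extract).
rng = np.random.default_rng(2)
n = 3_000
rooms = rng.integers(1, 12, size=n).astype(float)
# The error spread grows with ROOMS, so the errors are heteroskedastic by construction.
noise = rng.normal(0.0, 0.1 * rooms, size=n)
toy = pd.DataFrame({"ROOMS": rooms, "LOGVALUE": 10 + 0.1 * rooms + noise})

model = smf.ols("LOGVALUE ~ ROOMS", data=toy)
classic = model.fit()               # default (non-robust) covariance
robust = model.fit(cov_type="HC3")  # heteroskedasticity-robust covariance

# Same point estimates, different standard errors.
comparison = pd.DataFrame({
    "coef": classic.params,
    "se_classic": classic.bse,
    "se_HC3": robust.bse,
})
print(comparison)
```

Because only the covariance estimator changes, any difference between paired columns of the table shows up in the standard errors and test statistics, not in the coefficients.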
Our Final Regression Model to predict housing values is:
$$ \begin{aligned} \log(VALUEH) = {} & \beta_0 + \beta_1*\text{Year Built} + \beta_2*\text{City Population} + \beta_3*\text{Population Density} + \beta_4*\text{Farm} \\ & + \beta_5*\text{Number of Families} + \beta_6*\text{NFAMS}^2 + \beta_7*\text{Rooms} + \beta_8*\text{Rooms}^2 + \beta_9*\text{Housing Units} \\ & + \beta_{10}*\text{Metro Category (treatment of smallest metro)} \end{aligned} $$
This is a log-transformed model that estimates the percentage effect of each independent variable on the value of a home.
There are some quadratic terms, as we found outsized nonlinear effects on home value correlating specifically to Rooms and Number of Families living in the home. All beta effects on our log linear model are ceteris paribus effects.
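In a log-linear model, reading a coefficient directly as a percentage (100 times beta) is an approximation that works well for small coefficients; the exact percentage change is 100 * (exp(beta) - 1). The short sketch below compares the two for a few coefficients copied from the final regression table; the helper name `exact_pct_change` is hypothetical.

```python
import math

# Coefficients copied from the final model in the regression table.
coefs = {"BUILTYR2": 0.038, "C(FARM)[T.2]": 0.246, "UNITSSTR": 0.129}


def exact_pct_change(beta: float) -> float:
    """Exact percentage change in the outcome for a one-unit change in a
    regressor when the dependent variable is in logs."""
    return 100.0 * (math.exp(beta) - 1.0)


for name, beta in coefs.items():
    approx = 100.0 * beta
    print(f"{name}: approx {approx:.1f}%, exact {exact_pct_change(beta):.1f}%")
```

For small coefficients such as 0.038 the two readings nearly coincide, while for 0.246 the exact effect is closer to 27.9% than to 24.6%.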
$$ \beta_0 = 9.912 $$ This shows that the baseline value of a house with every regressor set to zero is e^9.912, which equals roughly 20,170. However, this value is not economically meaningful, as no houses are built with zero rooms, families, etc.
$$ \beta_1 \ (\text{Year Built}) = 0.038 $$
This shows a roughly 3.8% increase in value for each one-unit increase in BUILTYR2, showing that recency increases value.
$$ \beta_2 \ (\text{City Population}) = -0.000 $$
$$ \beta_3 \ (\text{Population Density}) = 0.000 $$
These results are interesting: at full precision, city population has a minuscule negative effect on home value and population density a minuscule positive one. Both round to 0 but are still statistically significant. Would they hold predictive value in huge cities?
$$ \beta_4 \ (\text{Farm}) = 0.246 $$
This shows a roughly 24.6% higher value for farm properties (FARM = 2) relative to non-farm properties (FARM = 1), which are the baseline category in C(FARM).
$$ \beta_5 \ (\text{Number of Families}) = 0.124 $$
$$ \beta_6 \ (\text{Number of Families}^2) = -0.021 $$
This shows a roughly 12.4% increase in value for each additional family living in the household, with diminishing returns because the squared term is negative.
$$ \beta_7 \ (\text{Rooms}) = 0.246 $$
$$ \beta_8 \ (\text{Rooms}^2) = -0.008 $$
This shows a roughly 24.6% increase in value for each additional room, with diminishing returns because the squared term is negative.
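The quadratic terms for NFAMS and ROOMS mean the marginal effect changes with the level of the variable. For a term pair b_lin * x + b_sq * x^2, the marginal effect is b_lin + 2 * b_sq * x, and it reaches zero at x = -b_lin / (2 * b_sq). The sketch below applies these formulas to the rounded coefficients reported in the table; the function names are hypothetical, and the results inherit the rounding of the three-decimal coefficients.

```python
def marginal_effect(b_lin: float, b_sq: float, x: float) -> float:
    """Derivative of b_lin * x + b_sq * x**2 with respect to x."""
    return b_lin + 2.0 * b_sq * x


def turning_point(b_lin: float, b_sq: float) -> float:
    """Value of x at which the marginal effect is zero."""
    return -b_lin / (2.0 * b_sq)


# Rounded coefficients from the final model in the regression table.
nfams = (0.124, -0.021)
rooms = (0.246, -0.008)

print("NFAMS turning point:", round(turning_point(*nfams), 2))
print("ROOMS turning point:", round(turning_point(*rooms), 2))
print("Marginal effect of the 5th room:", round(marginal_effect(*rooms, 5), 3))
```

Under these rounded estimates the effect of an extra family turns negative just below three families, while the effect of an extra room stays positive until roughly fifteen rooms.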
$$ \beta_9 \ (\text{Housing Units}) = 0.129 $$
This shows a roughly 12.9% increase in value for each additional unit of housing that is part of the home.
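$$ \beta_{10} \ (\text{Metro}) $$
These betas depend on the METRO category and are measured relative to the baseline category (METRO = 0, metropolitan status indeterminable). Homes outside metropolitan areas have a negative coefficient (-0.069), while homes in metropolitan areas have positive coefficients: 0.523 in the central/principal city, 0.538 outside the central/principal city, and 0.467 where central-city status is indeterminable.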
In conclusion, the cost of housing in the United States varies widely by region, with supply and demand, population growth, job possibilities, and local economic conditions all having a role. Some areas, such as the West Coast and New York City, are notoriously expensive to live in, while others provide more affordable housing options. When deciding on housing, individuals and families must consider their budget and other financial factors. Tracking developments in house prices, on the other hand, is critical for investors, policymakers, and individuals alike. Understanding the trends in house prices can provide useful insights into the health of the housing market and the broader economy, as well as help advise key financial decisions. Rising or declining house prices can signal economic strength or weakness, and authorities may need to take action to avert a housing bubble or promote housing demand. Stakeholders may make informed decisions and capitalize on possibilities in the real estate industry by staying up to date on housing market trends. As a result, it is critical to continue monitoring and researching US housing price trends.
The purpose of the study is to determine the drivers of housing values. To achieve this goal, the approach taken was to build models using possible instruments/variables. The methods used in the study involved performing OLS regression upon possible explanatory variables and checking for increases in fit. The occurrence of data problems, particularly in the home value variable (VALUEH), which contains substantial outliers that may bias the results, is one weakness of the study. This could lead to mistakes in the findings and limit the study's generalizability. Furthermore, the study only considers a subset of variables and may not account for all factors influencing housing values. These limitations emphasize the need for additional research to address these difficulties and enhance understanding of the factors influencing home values.
The study finds that most of the independent variables are statistically significant in determining housing value, although the final model explains only about 20% of the variation in housing prices (R^2 of 0.208). From this cross section of data, we find many statistically significant value drivers in housing prices, and the full regression yields a statistically significant p-value on the F-test for joint significance.
The number of rooms and the number of families living in the house were found to have quadratic relationships with housing value: the linear terms are positive and the squared terms are negative, meaning that each raises the value of the house at a decreasing rate. This suggests that more occupants and rooms lead to more expensive housing, up to a point. These findings provide important insights into the drivers of housing values and can be useful for individuals, businesses, and policymakers in making informed decisions related to the housing market. However, it is important to note that the model has its limitations, and future work could be done to improve its accuracy and overcome any potential shortcomings.
One interesting result from the study is that household income was found to be an insignificant predictor of housing value once METRO is treated as a categorical variable. This suggests that the relationship between income and housing value is not straightforward and may be influenced by other factors. The use of heteroskedasticity-robust (HC3) standard errors highlights the importance of addressing data quality issues in statistical analysis. Overall, this finding underscores the complexity of the housing market and the need for careful consideration of multiple variables when trying to understand housing values.
In addition, the population of a city has a minuscule negative coefficient, and density has a minuscule positive one. While not very economically significant, these betas are still statistically significant in our model. It would be interesting to see if this causes predictive bias when applied to cities with massive populations outside of America (e.g. Beijing, Mumbai, Tokyo).
To address the issue of data inaccuracies and outliers in the VALUEH variable, further study could include gathering information on neighborhoods and affluence, as well as accounting for the influence of schooling and cost of living adjustments. This could increase the analyses' accuracy and provide a more thorough understanding of the factors influencing housing values.
Besbris, Max and Jacob William Faber. “Investigating the Relationship Between Real Estate Agents, Segregation, and House Prices: Steering and Upselling in New York State.” Sociological Forum, vol. 32, no. 4, December 2017
Mills, Edwin S. “Urban Land-Use Controls and the Subprime Mortgage Crisis.” The Independent Review, vol. 13, no. 4, Spring 2009
Swoboda, Aaron, Tsegaye Nega, and Maxwell Timm. “Hedonic Analysis over Time and Space: The Case of House Prices and Traffic Noise.” Journal of Regional Science, vol. 55, no. 4, pp. 644-670, 2015
Krings, Amy and Tania M. Schusler. “Equity in sustainable development: Community responses to environmental gentrification.” Int J Soc Welfare, vol. 29, pp. 321-334, 2020
Investopedia. “Why Housing Market Trends Matter.” Investopedia, 13 Jan. 2015, www.investopedia.com/articles/personal-finance/011315/why-housing-market-trends-matter.asp.
Protas, Marc. “The Importance of Tracking Home Price Trends.” Forbes, 24 July 2018, www.forbes.com/sites/marcprotas/2018/07/24/the-importance-of-tracking-home-price-trends/?sh=1cf112e435ec.
U.S. News & World Report. “Housing Market Trends: Why They Matter.” Real Estate | US News, U.S. News & World Report, realestate.usnews.com/real-estate/articles/housing-market-trends-why-they-matter.
Rocket Mortgage. “Why Keeping up With Housing Market Trends Matters.” Rocket Mortgage, Quicken Loans, 12 Oct. 2021, www.rocketmortgage.com/learn/housing-market-trends
National Association of Realtors (NAR). “Existing-Home Sales.” National Association of Realtors, www.nar.realtor/research-and-statistics/housing-statistics/existing-home-sales.
Zillow. “Monthly Housing Market Update.” Zillow Research, Zillow, www.zillow.com/research/data/.
Redfin. “Housing Market Reports.” Redfin, www.redfin.com/news/housing-market-news/.
Realtor.com. “Real Estate Market Data and Reports.” Realtor.com, Move Sales, Inc., www.realtor.com/research/data/