Tuesday, April 24, 2012

Housing Price Distribution

One of the issues behind the 2008 housing crash was the inability to correctly calculate the risk of mortgages (or, more specifically, mortgage-backed securities). This is obviously an extreme simplification, but from the books I've read, it appears that much of the risk-management mathematics was based on the assumption that housing price changes are normally distributed. In 2008, however, housing prices fell much further than that assumption would suggest.

So I downloaded the raw housing data for each quarter from 1970 to 2011 from http://www.jparsons.net/housingbubble/ and performed some very simplistic analysis using Matlab.

I calculated the change in inflation-adjusted housing prices from each quarter to the next, resulting in 167 values.
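
As a rough sketch of that step: the variable name real_price and the five numbers in it are made up for illustration, and I am assuming the change is expressed as a fraction of the previous quarter's price (which matches the -.0784 used for a 7.8% drop further down).

```matlab
% Hypothetical inflation-adjusted price index, one value per quarter.
% These five numbers are placeholders, not the real data.
real_price = [100; 102; 101; 105; 103];

% Fractional change from each quarter to the next:
% (price(t+1) - price(t)) / price(t)
quarterly_change = diff(real_price) ./ real_price(1:end-1);

% N prices give N-1 changes, so 168 quarters (1970 through 2011)
% would give 167 changes.
disp(numel(quarterly_change))
```

With the real series loaded into real_price, the same two lines would produce the quarterly_change vector used in the rest of the post.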

Here is a histogram of the change in housing prices from 1970 to 2007 with 15 bins (15 unique x-axis values):



Here is a histogram of the change in housing prices from 1970 to 2011 with 15 bins (15 unique x-axis values):



You can clearly see that once the quarters after 2007 are included, the distribution looks more skewed toward the negative side.
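
For anyone wanting to reproduce the two histograms, something along these lines should work. The data here is random placeholder data, not the real series, and the skewness call (Statistics Toolbox) is my own addition as one possible way to put a number on "more skewed".

```matlab
% Placeholder standing in for the 167 real quarterly changes.
rng(0);
quarterly_change = 0.005 + 0.015 * randn(167, 1);

% hist(data, 15) groups the data into 15 equally spaced bins.
figure;
subplot(2, 1, 1);
hist(quarterly_change(1:150), 15);
title('1970 to 2007');
subplot(2, 1, 2);
hist(quarterly_change(1:167), 15);
title('1970 to 2011');

% Sample skewness: a negative value means a longer tail on the negative side.
skew_2007 = skewness(quarterly_change(1:150));
skew_2011 = skewness(quarterly_change(1:167));
```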

I then performed a chi-squared goodness-of-fit test. The output of the test, "p", is not the probability that the distribution is normal. It is the probability of seeing a mismatch at least this large between the histogram and a fitted normal curve if the data really were normally distributed. So be careful here: you can only confidently say that the distribution is not normal when p < 0.05. Otherwise you can't really say much.
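
To make the meaning of p a little more concrete, here is a minimal sketch of what chi2gof reports, run on random placeholder data rather than the housing series. By default chi2gof tests against a normal distribution whose mean and standard deviation are estimated from the data, and its third output holds the observed and expected bin counts.

```matlab
% Placeholder data, not the real quarterly changes.
rng(0);
x = 0.005 + 0.015 * randn(150, 1);

% h = 1 means normality is rejected at the 5% level, h = 0 means it is not.
[h, p, stats] = chi2gof(x, 'nbins', 15);

% The test statistic compares observed (O) and expected (E) counts per bin
% (bins with small expected counts are pooled first).
chi2_by_hand = sum((stats.O - stats.E).^2 ./ stats.E);

% p is the upper-tail probability of that statistic under a chi-squared
% distribution with stats.df degrees of freedom.
p_by_hand = 1 - chi2cdf(stats.chi2stat, stats.df);
```

A large statistic (big mismatch between O and E) gives a small p, which is why only small values of p tell you anything definite.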

Here are the results, depending on the number of bins:

10 bins:
p (1970 to 2007): 0.2097
p (1970 to 2011): 0.0766


15 bins:
p (1970 to 2007): 0.5536
p (1970 to 2011): 0.0021


20 bins:
p (1970 to 2007): 0.2895
p (1970 to 2011): 0.1234


25 bins:
p (1970 to 2007): 0.6482
p (1970 to 2011): 0.2923


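So it wasn't unreasonable to assume a normal distribution of housing price changes: for 1970 to 2007, none of the bin counts rejects normality at the 0.05 level, and even with the data through 2011 only the 15-bin test does. The p-values swing widely with the number of bins, though, which is not surprising for such a limited dataset (only 167 values); ideally the data would be more granular.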

The code used to do this was:

nbins = 15;  % repeated with 10, 20 and 25
[~, p] = chi2gof( quarterly_change(1:150), 'nbins', nbins )  % 1970 to 2007
[~, p] = chi2gof( quarterly_change(1:167), 'nbins', nbins )  % 1970 to 2011
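
The same two calls can be wrapped in a loop to print all four bin counts in one go. This is only a sketch: quarterly_change is filled with random placeholder data here, so the printed p-values will not match the ones listed above.

```matlab
% Placeholder standing in for the 167 real quarterly changes.
rng(0);
quarterly_change = 0.005 + 0.015 * randn(167, 1);

bin_counts = [10 15 20 25];
for nbins = bin_counts
    [~, p_2007] = chi2gof(quarterly_change(1:150), 'nbins', nbins);
    [~, p_2011] = chi2gof(quarterly_change(1:167), 'nbins', nbins);
    fprintf('%d bins:\n', nbins);
    fprintf('p (1970 to 2007): %.4f\n', p_2007);
    fprintf('p (1970 to 2011): %.4f\n', p_2011);
end
```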

The lowest value was Q1 of 2008, in which housing prices dropped 7.8% in one quarter. Under a normal distribution fitted to the 1970-2007 data, a 7.8% drop in one quarter is about 5.5 standard deviations below the mean. Put another way, if we assumed a normal distribution, the chance of getting a drop in prices of 7.8% or greater in a given quarter was about .00000002.

  normcdf( -.0784,  mean(quarterly_change(1:150)),  std( quarterly_change(1:150) ) )
  ans = 1.9682e-008

By comparison, the chance of prices dropping between 0% and 7.8% was 39.8%.

  normcdf( 0,  mean(quarterly_change(1:150)),  std( quarterly_change(1:150) ) )  ...
  - normcdf( -.0784,  mean(quarterly_change(1:150)),  std( quarterly_change(1:150) ) )
  ans = 0.39804
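
The link between a tail probability and a number of standard deviations can be checked without the raw data, since normcdf(x, mu, sigma) is the same as normcdf((x - mu) / sigma). In the sketch below, p_tail is the normcdf output quoted above, while mu and sigma are made-up values used only to show the round trip.

```matlab
% Tail probability reported by normcdf for the Q1 2008 drop.
p_tail = 1.9682e-8;

% Standard-normal quantile: how many standard deviations below the mean
% a value must be to have this much probability to its left.
z = norminv(p_tail);

% Round trip with hypothetical mu and sigma (not the fitted values).
mu = 0.005;
sigma = 0.015;
x = mu + z * sigma;
p_check = normcdf(x, mu, sigma);  % recovers p_tail

fprintf('z = %.2f standard deviations\n', z);
```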