It’d be great if someone wanted to write some code to add local unemployment data to the GitHub repo. Unemployment data could be used as another predictive variable, though based on personal experience with subprime mortgage default models, I’d guess that the unemployment rate will not prove as predictive as home price-adjusted LTV.

You can convert a monthly default rate to an annualized default rate with the following formula:

annualized_rate = 1 - (1 - monthly_rate) ^ 12

The idea is that if a fraction monthly_rate of loans defaults every month, then a fraction (1 - monthly_rate) survives every month, so after 12 months you’d have (1 - monthly_rate) ^ 12 of your original population remaining, and 1 minus that quantity is the annualized default rate.
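
In code, a minimal Python sketch of that conversion might look like this (the function name annualize_default_rate is just illustrative, not something from the repo):

```python
def annualize_default_rate(monthly_rate):
    """Convert a monthly default (transition) rate to an annualized default rate.

    If a fraction monthly_rate of loans defaults each month, then
    (1 - monthly_rate) of them survive each month, (1 - monthly_rate) ** 12
    survive a full year, and 1 minus that is the annualized default rate.
    """
    return 1 - (1 - monthly_rate) ** 12
```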

Simple example: if a month started with 100 non-defaulted loans and 4 of them defaulted in that month, then that month’s transition rate was 4 / 100 = 4%.
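
To tie the two comments together, a short Python calculation (using only the illustrative numbers from the example above) annualizes that 4% monthly rate:

```python
# The illustration above: 4 of 100 non-defaulted loans default during the month
monthly_rate = 4 / 100                       # 0.04, i.e. a 4% monthly transition rate

# Annualize with the survival formula from the earlier comment
annualized_rate = 1 - (1 - monthly_rate) ** 12
print(round(annualized_rate, 3))             # 0.387, i.e. roughly a 38.7% annualized default rate
```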

The Phoenix and Las Vegas maps might look more interesting if Fannie and Freddie released detailed ZIP code data. Subprime mortgage data was typically available at the ZIP code level, and I remember from my days as a mortgage analyst that a map of Las Vegas ZIP codes showed concentric circles: the outermost exurban areas had the highest default rates, and default rates got progressively lower as you moved toward the city center.

Well, that’s not quite fair, because I can’t fit the entire dataset into RAM, which would make analyzing the data even faster. According to the website yourdatafitsinram.com, it would cost about $10,000 to buy a server with enough RAM to fit the whole dataset. $10k is certainly more than I’m going to spend, but it’s a drop in the bucket for any big company looking to analyze mortgage data.

“Big Data” has become a meaningless cliché, so much so that complaining about Big Data being a cliché is also a cliché… so what’s the hipster data scientist to do? Talk about “medium data”, I suppose!

I like this definition of data-bigness, and when I say “medium data” I’m thinking of a dataset that can be stored on a single machine, but is big enough that you have to think before executing queries. In this particular case, some of my queries took up to a few hours: not an insanely long time, but long enough that it’s best to avoid having to rerun things if possible.

Traditionally Fannie and Freddie have guaranteed the timely payment of principal and interest to MBS investors. This means that if an investor owns an agency mortgage bond, and some of the homeowners whose mortgages are part of the bond stop paying their loans, then Fannie and Freddie reimburse the mortgage investors for any losses.

In the new world of “potential risk-sharing initiatives”, investors themselves would take on at least some of the risk of loss should homeowners stop paying their mortgages. Before investors are willing to take that risk, though, it seems likely they would want to see historical data so that they can make some informed projections about future default rates for Fannie and Freddie loans.

What a pathetic excuse for a continent.

$1.2 billion in revenue last year… so if ISIS were a startup, it could get VC investment at around a $12 billion valuation (a 10x revenue multiple)?