Covid-19 Models – Quit Complaining!

Predictive modeling is not easy, nor is it for the faint of heart.

Chinese Coronavirus Crap

Alright, it is time for a rant, so here goes.

I am “friggin” fed up with all the websites and commenters across the Internet up in arms and claiming that the models used to predict Covid-19 cases and deaths were frauds and part of a much larger conspiracy. “Gateway Pundit, quit it now!”

Back during the housing crisis that began in 2008, I did a foreclosure analysis of over 5,000 loans. Across those loans, I noticed a “commonality” among the foreclosures that did not exist among the loans that continued to perform. Further research found that the commonality existed at loan origination for most borrowers who did not experience job loss. So I began to develop a model for determining which loans might default and when.

Part of my modeling led to the receipt of loan data sets from CoreLogic to further pursue my research. In all, there were over 100k loan data sets to review, fortunately in Excel worksheets. Then I built the model to check things out. Sure enough, I was correct... but little did I understand that the model was designed to confirm my “hypothesis” and not to test it against the “null hypothesis.” In other words, one must try to break the model to show that it holds up under all conditions. If it does not break, it is a valid model. No wonder that lenders wanted nothing to do with it.
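For anyone wondering what "testing against the null hypothesis" looks like in practice, here is a minimal sketch using a permutation test. It is not the model described in this post. The loan outcomes are made-up toy numbers, and the names `permutation_test`, `loans_with_factor` and `loans_without_factor` are hypothetical. The idea: if the "commonality" truly made no difference (the null hypothesis), then shuffling which loans carry it should produce a gap in default rates as large as the real one fairly often.

```python
import random


def default_rate(outcomes):
    """Share of loans that defaulted (1 = defaulted, 0 = still performing)."""
    return sum(outcomes) / len(outcomes)


def permutation_test(with_factor, without_factor, num_shuffles=2000, seed=0):
    """Estimate how often random relabeling matches the observed gap.

    Returns (observed gap in default rates, estimated p-value).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = default_rate(with_factor) - default_rate(without_factor)
    pooled = list(with_factor) + list(without_factor)
    n = len(with_factor)

    at_least_as_extreme = 0
    for _ in range(num_shuffles):
        rng.shuffle(pooled)  # pretend the factor label is meaningless
        gap = default_rate(pooled[:n]) - default_rate(pooled[n:])
        if abs(gap) >= abs(observed):
            at_least_as_extreme += 1

    # The +1 terms keep the estimate from ever being exactly zero.
    p_value = (at_least_as_extreme + 1) / (num_shuffles + 1)
    return observed, p_value


# Hypothetical toy data, NOT real loan records.
loans_with_factor = [1] * 40 + [0] * 60     # 40 of 100 defaulted
loans_without_factor = [1] * 5 + [0] * 95   # 5 of 100 defaulted

observed_gap, p_value = permutation_test(loans_with_factor, loans_without_factor)
print(f"gap in default rates: {observed_gap:.2f}, p-value: {p_value:.4f}")
```

A small p-value says the gap is hard to explain by chance alone. It does not say the model is finished, which is why the real work involves trying many such attacks from many angles.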

Fast forward two years. This time, I was involved in a project with two other main people and several experts to call upon as needed. Once again, this was foreclosure related.

One of my associates had a PhD in Applied Mathematics. In one meeting, the subject of my own work came up and he asked to see what I had done. After a couple of explanations of what I had done, he realized what I had stumbled onto. But he saw the problems with the models as well. So that began a process of developing a true model that worked.

It took one full year of work to come up with a predictive model that did not “break” when tested against the Null Hypothesis. The first part of the project was to determine what factors could affect or cause foreclosures. Each time, we ran the factors through the R statistical analysis software, individually and in combination, looking for patterns.

During this time, we analyzed over 8 million loan data sets representing Fannie loans, Freddie loans, FHA loans and then private-label mortgage loans. The work was extensively documented, with the results reviewed by the PhD partner. He tore the stuff apart and would then say “we now have to do this” or “this has no validity so on to the next.”

Eventually, we were able to reject the Null Hypothesis, which meant that we had a working model we could not break. And yes, it was what I had discovered.

This work took 18 months to accomplish. 18 months to prove that my discovery was correct. 18 months when we knew what we were looking to prove. 18 months without constantly changing data.

People are complaining about the predictive results of the Covid-19 modeling and how they change daily. Well, “Excuse Me!”

The Covid-19 modelers are working under unbelievably difficult conditions. Problems include:

  1. Data sets from China and Iran are totally suspect. They are most likely lies, so they cannot be relied upon.
  2. Italy's data sets reflect different demographics than those of the US, so they must be heavily adjusted before use, even if the data are correct.
  3. Mechanisms by which Covid-19 works are not fully understood. This introduces instability into the model.
  4. There has been no testing available that can fully assess the spread of Covid-19 and who has been exposed, had symptoms or worse, which introduces more uncertainty.
  5. We don’t even know when the outbreak started, so total cases cannot be adequately determined.
  6. Each state has its own set of demographics, so the states cannot simply be pooled into one all-inclusive database. Additionally, each state has developed different responses at different times to stop the spread of Covid-19.
  7. The data available for assessing and modeling the Covid-19 outbreak changes daily, with each change requiring an update of the model.

This is why the “predictions” from the Covid-19 modeling are changing on a daily basis. It is not part of a grand conspiracy! It is not incompetent people just guessing or trying to pull the wool over our eyes! Instead, it is the nature of predictive modeling in a fluid and changing environment that causes the ever-changing results.
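To see how much a forecast can move just because another few days of data arrived, here is a rough sketch. Everything in it is an assumption made for illustration. The case counts are synthetic (generated from a logistic curve with made-up parameters), and the naive exponential fit is a stand-in, not the method any actual Covid-19 modeling team uses. The names `synthetic_cases`, `fit_exponential` and `project` are hypothetical.

```python
import math


def synthetic_cases(num_days, ceiling=100_000, rate=0.25, midpoint=30):
    """Made-up cumulative case counts that follow a logistic curve."""
    return [ceiling / (1 + math.exp(-rate * (day - midpoint)))
            for day in range(num_days)]


def fit_exponential(counts):
    """Least-squares fit of log(count) = a + b * day. Returns (a, b)."""
    n = len(counts)
    days = list(range(n))
    logs = [math.log(c) for c in counts]
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def project(counts, target_day):
    """Extrapolate the fitted exponential curve out to target_day."""
    a, b = fit_exponential(counts)
    return math.exp(a + b * target_day)


cases = synthetic_cases(40)

# Refit using only the data "known" so far, then project the day-60 total.
projections = {n: project(cases[:n], 60) for n in range(10, 41, 5)}
for n, total in projections.items():
    print(f"data through day {n - 1}: projected day-60 total = {total:,.0f}")
```

Same code, same "virus," and the day-60 projection still shifts dramatically as the early explosive growth starts to slow in the newer data. Nobody changed the math. The data changed.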


Rant over.


Written by PatrickPu

Former Loan Officer and currently a Case Consultant and Expert Witness in Foreclosure and Lending Litigation cases. Avid follower of NCAA Football and Top 25 teams.


