I'm currently reading 'The Alignment Problem' by Brian Christian, a book about machine learning that draws on decades of research in the field.

He has a particular interest in the strengths of these models and the areas where we need to be cautious.

It's full of examples to make you think deeply.

One such example relates to a competition held in the mid-1990s by a Pittsburgh hospital to find the best model to predict whether patients with pneumonia should be treated as in-patients or out-patients.

There was only enough space to treat the sickest, but roughly 10% of patients with pneumonia died, so making the right decision was one of life or death.

The most successful model by far was a neural network, which looked at multiple factors and weighted them into a probability.

But how the model reached its conclusions was opaque, so despite the fact that it appeared to be a way to save more lives, the doctors were concerned that if the model had made incorrect assumptions, they would never know.

Some years later, the researcher who had created the winning model tried again, building an explainable model to replicate the results, and came to some startling conclusions.

The model believed that:
- Patients with asthma should be treated as low risk, because few of them died
- Similarly smokers
- Similarly those with heart problems

This would seem to fly in the face of conventional wisdom.

But the data was clear: these groups, as well as the overweight and those with other serious illnesses, all appeared to be lower risk than the general population.

The model was highly accurate on the historical data, so were these new insights?

The truth, you may have already guessed, was far simpler.

These groups were prioritised at the hospital and so received the greatest care: spaces in ICU, additional medical support...

The model hadn't controlled for any of these factors, and so it had assumed that these groups were operating under the same conditions as all other patients.

Following the model without any changes would have deprioritised high-risk patients and would have resulted in life-threateningly bad decisions.
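The confounding effect can be sketched with made-up numbers. Everything below — the asthma prevalence, the risk multipliers, the ICU policy — is an assumption for illustration, not a figure from the study; the point is only that a model seeing outcomes but not interventions learns the risk factor backwards.

```python
import numpy as np

# Hypothetical scenario: asthma raises the true risk of dying from
# pneumonia, but hospital policy sends asthmatic patients straight to
# the ICU, and ICU care cuts that risk dramatically. A model fitted
# only to the recorded outcomes never sees the intervention.
rng = np.random.default_rng(42)
n = 100_000

asthma = rng.random(n) < 0.15             # assume 15% of patients have asthma
icu = asthma                              # assumed policy: asthmatics go to ICU

risk = np.where(asthma, 0.30, 0.10)       # assume asthma triples underlying risk
risk = np.where(icu, risk * 0.2, risk)    # ...but ICU care cuts risk by 80%

died = rng.random(n) < risk

print(f"death rate with asthma:    {died[asthma].mean():.1%}")   # ~6%
print(f"death rate without asthma: {died[~asthma].mean():.1%}")  # ~10%
```

Under these assumed numbers, asthmatics die less often than everyone else, so the data alone says asthma is protective — exactly the inversion the explainable model surfaced.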

But it's not just a modelling problem, because even knowing that this is a factor doesn't bring perfect insight.

As soon as there's an intervention, the dataset is affected. What would have happened without the additional support? Similarly in treasury: what would have happened if we hadn't asked for a large debtor to be chased, or put them on stop? You can never know for sure.

And part of the reason for a forecast is to tell you if there's going to be a problem. No one is going to plunge into a liquidity crisis to avoid tainting the data. We borrow more, or we arrange high-level meetings with key debtors, we defer creditors...

And the data, if we get it right... it shows that the issue never happened.
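One way to see how acting on a forecast erases the evidence is a toy cash-balance loop. The weekly figures and the `minimum_buffer` policy below are invented for illustration:

```python
# Hypothetical sketch: the forecast flags a cash shortfall, treasury
# draws down a facility to cover it, and the recorded actuals never
# show the breach. A model trained only on actuals would conclude the
# risk doesn't exist.
forecast_balance = [500, 300, -200, -150, 100]   # assumed weekly projection (£k)
minimum_buffer = 0                                # assumed floor we must hold

actual_balance = []
borrowed = 0
for projected in forecast_balance:
    if projected + borrowed < minimum_buffer:
        # intervention: borrow just enough to stay above the buffer
        borrowed += minimum_buffer - (projected + borrowed)
    actual_balance.append(projected + borrowed)

print(actual_balance)                # → [500, 300, 0, 50, 300]
print(f"facility drawn: {borrowed}k")  # → facility drawn: 200k
```

The week that would have gone £200k negative shows up in the actuals as a flat zero — the intervention worked, and the history now "proves" the crisis never happens.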

Assuming the future is going to be the same as the past is not, and never will be, correct.

This is a forecasting issue, not a machine learning issue, because there's no such thing as a completely clean dataset...

So why do I like a healthy degree of paranoia in my models?

A good model provides insights and a version of the future.

In general, negative shocks are the things that concern treasury teams most, so this is where I would focus.

A model that identifies negative risk factors and flags them early is the ideal. That means using the existing data, but also highlighting the known risk factors, so that when issues are identified and acted on, they are not lost in the data but are still treated as problematic.

Therefore, a model that is oversensitive to negative risk factors should provide more information about what could happen, helping us as risk managers...