Dan Gardev on Analytics, Data and Everything: Do You Test Your Models Well Enough?

Numeric models and complex calculations tend to matter a great deal, which puts a lot of pressure on the developer to deliver good results. The question is: how do you know the model works properly? Considering the expense and the high impact, it surprises me how often the testing phase is skipped, or how little effort goes into making sure the model meets high standards. Neglecting this phase often results in dissatisfied and angry customers, extra time spent on the project after delivery, or bad and costly decisions based on a bad model.

Generally speaking, there are two parts to testing a model:
- Make sure the calculations are performed properly - challenge "the mechanics" of the model;
- Make sure the model is doing a good job - challenge "the brain" behind it.
Each of these parts can be described with a simple question:

The Mechanics: Am I Sure There is No Error In the Calculations?
This is the part where we make sure the calculations are done properly, that is, that the formulas are built and set up correctly. This includes making sure that:
- The correct files, ranges and cells are referenced;
- All the logical cases are covered in operations involving if-then statements;
- The right formulas are applied;
- The formulas work properly;
- The model handles all the extremes of the input well.
For example, we need to make sure that the market share calculation refers to the volume and size of the correct market, that the total for revenue includes all the revenue sources, and that the look-up function returns the correct values.
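
As a minimal sketch of what such mechanical checks can look like outside a spreadsheet, here is a small Python example. The function names (`market_share`, `total_revenue`, `look_up`) and all the numbers are made up for illustration; the idea is simply to compare each calculation against a value worked out by hand.

```python
import math


def market_share(company_volume, market_volume):
    """Share of the market held by the company, as a fraction of the market volume."""
    if market_volume <= 0:
        # An extreme input: an empty or negative market makes the ratio meaningless.
        raise ValueError("market_volume must be positive")
    return company_volume / market_volume


def total_revenue(revenue_by_source):
    """Sum every revenue source; the dictionary keys are hypothetical source names."""
    return sum(revenue_by_source.values())


def look_up(table, key):
    """Exact-match look-up that fails loudly instead of silently returning a default."""
    if key not in table:
        raise KeyError(f"no entry for {key!r}")
    return table[key]


# Tiny, made-up data set with results that are easy to verify by hand.
revenue = {"products": 120.0, "services": 30.0, "licensing": 50.0}
prices = {"A": 9.5, "B": 12.0}

assert math.isclose(market_share(25.0, 200.0), 0.125)
assert math.isclose(total_revenue(revenue), 200.0)
assert look_up(prices, "B") == 12.0
```

Making the look-up raise an error on a missing key is a design choice: a look-up that quietly falls back to a default is one of the easier ways for a wrong reference to go unnoticed.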

The Brain: Does The Model Do A Good Job?
This part involves reviewing the model to gain confidence in its quality. It means answering questions such as:
- How reasonable are the results (sanity check)? For example, if the growth rate of a market has been 20% for the last 5 years, how likely is the growth for the next 5 years to be what the model predicts? Does the model forecast extraordinary results?
- How does my model compare against similar external models? Sometimes other models, reports, or articles can serve as a reference point for testing the model outputs. A word of warning: be careful when comparing your model against other models from within your organization, or against your own previous models, because you could run into a systematic error. If you or the organization misunderstand some aspect of forecasting in general, this error could be reproduced over and over again.
- Do you have a reasonable explanation for the outcomes? Models often reveal unexpected behavior in the subjects we study, but quite often an unexpected outcome is just a bad outcome. If something does not make sense at all, it pays to investigate it carefully.
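
The sanity check on growth can be sketched in a few lines of Python. This is only an illustration under assumptions of mine: the helper names and the 10-percentage-point tolerance are hypothetical, and what counts as "plausible" has to be decided per project.

```python
def implied_annual_growth(start_value, end_value, years):
    """Compound annual growth rate implied by two values that are `years` apart."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


def is_plausible(forecast_growth, historical_growth, tolerance=0.10):
    """True when the forecast is within `tolerance` (absolute) of the historical rate.

    The default tolerance is an arbitrary assumption, not a recommended value.
    """
    return abs(forecast_growth - historical_growth) <= tolerance


historical = 0.20  # the market grew 20% a year over the last 5 years

# Suppose the model says the market quadruples over the next 5 years.
forecast = implied_annual_growth(100.0, 400.0, 5)

if not is_plausible(forecast, historical):
    print(f"Forecast growth of {forecast:.1%} a year needs an explanation")
```

A flag like this does not prove the model wrong. It only marks the results that deserve a closer look and a reasonable explanation.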

Which One Should Be Done First?
There is no rule of thumb about which one comes first. The approach has to be flexible and depends on the specifics of the project, the developer, etc. In my experience, we do the two parts at the same time.

Test Cases
Proper testing requires a sound understanding of the model - the key points in the calculation chain, the range of the input parameters and the logic of the model. The best way to achieve good test discipline and cover all the possible scenarios is to design a set of test cases. These test cases can be run during development or after everything is set up. The crucial step before we start designing the test cases is to identify all the points to monitor, the ranges of the input parameters, and other possible values outside those ranges - too big, too small, empty, zeroes, etc. A simple guideline for designing the test cases includes:
- Target all the identified values of the input parameters;
- Isolate the key calculation steps so that any problem that occurs is caught easily and clearly;
- Build the cases in increasing order of complexity - cover the simplest cases first to make sure the basics work well, then include more and more elements to detect any problems that come from two or more factors working together.
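
Here is one way the guideline above could be laid out in Python. It is a sketch, not a prescription: `project_revenue` is a hypothetical stand-in for a single isolated calculation step, and the cases and expected values are invented and computed by hand.

```python
import math


def project_revenue(volume, price, growth=0.0):
    """Hypothetical model step: next-period revenue from volume, price and growth."""
    for name, value in (("volume", volume), ("price", price), ("growth", growth)):
        if value is None:
            raise ValueError(f"{name} is empty")
    if volume < 0 or price < 0:
        raise ValueError("volume and price must be non-negative")
    return volume * price * (1.0 + growth)


# Valid cases, ordered from the simplest to the most complex.
valid_cases = [
    ("zero volume", (0, 10.0, 0.0), 0.0),
    ("volume only", (100, 1.0, 0.0), 100.0),
    ("volume and price", (100, 2.5, 0.0), 250.0),
    ("volume, price and growth", (100, 2.5, 0.10), 275.0),
    ("very large volume", (1e12, 2.5, 0.10), 2.75e12),
]

# Inputs outside the accepted range: the step should refuse them.
invalid_cases = [
    ("empty volume", (None, 10.0, 0.0)),
    ("negative price", (100, -1.0, 0.0)),
]


def run_cases():
    """Run every case and return the labels of the ones that failed."""
    failures = []
    for label, args, expected in valid_cases:
        result = project_revenue(*args)
        if not math.isclose(result, expected, rel_tol=1e-9, abs_tol=1e-12):
            failures.append(label)
    for label, args in invalid_cases:
        try:
            project_revenue(*args)
        except ValueError:
            continue  # rejected as expected
        failures.append(label)
    return failures


print(run_cases() or "all cases passed")
```

Because the cases go from simple to complex, the first label in the failure list usually points at the smallest combination of factors that breaks.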

It is a great help to build a designated output or a log to easily track what is going on and spot problems.
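
A log of intermediate values can be as simple as the sketch below, which uses Python's standard `logging` module. The function `market_size_forecast`, the logger name and the numbers are assumptions made for the example.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(message)s")
logger = logging.getLogger("model_trace")  # hypothetical logger name


def market_size_forecast(base_size, growth_rates):
    """Apply yearly growth rates in turn, logging every intermediate value."""
    size = base_size
    trace = [size]
    logger.info("year 0: size=%.2f", size)
    for year, rate in enumerate(growth_rates, start=1):
        size = size * (1.0 + rate)
        trace.append(size)
        logger.info("year %d: rate=%.1f%% size=%.2f", year, rate * 100, size)
    return size, trace


final_size, trace = market_size_forecast(1000.0, [0.20, 0.20, 0.15])
```

Returning the trace alongside the final value lets a test case check each step, not just the end result.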

Last Words
Flexibility is key to testing. Decide what best suits you, the model and the project. The take-away is to keep asking yourself the two simple questions about the model quality and not to neglect the testing phase. I promise it will pay off!

Mar 28, 2013
