Understanding and using diagnostics separates a good statistician from a hack.  It is not enough just to run the diagnostics; you must challenge your model with a critical eye.  Building a model is simple; assuring it is stable and accurate is a work of art.  There are a variety of tools and tests that can aid you in evaluating your model.  When doing diagnostics, never assume anything; always seek proof.


2. Tools for linear models



a. QQ plot



The QQ, or quantile-quantile, plot shows the quantiles of the model's residuals plotted against the quantiles of a theoretical distribution (usually the normal), with a 45-degree reference line. This allows you to see how well the model fits at both extremes. A model that fits poorly will appear to curl away from the line at the tails. Having a model that fits poorly at the extremes is not a good thing, but oftentimes it is not a showstopper. By setting maximum allowable values for the model, it can still be useful in segmenting cases. To correct for poorly fitting tails, look for new explanatory variables, or double-check to see if you missed any non-linearities that could be confusing the system.
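As a rough illustration, the Q-Q coordinates can be computed by hand. This sketch uses synthetic normal residuals; in practice you would substitute your model's residuals:

```python
# Sketch: Q-Q plot coordinates computed by hand (synthetic residuals).
# Sorted standardized residuals are paired with theoretical normal
# quantiles; points far off the 45-degree line flag poorly fitting tails.
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(0)
residuals = rng.normal(size=200)          # stand-in for model residuals

r = np.sort((residuals - residuals.mean()) / residuals.std())
n = len(r)
# Theoretical normal quantiles at plotting positions (i - 0.5) / n
theo = np.array([NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)])

# For well-behaved residuals the points hug the 45-degree line,
# so the correlation between the two coordinate sets is near 1.
print(np.corrcoef(theo, r)[0, 1])
```

Plotting `r` against `theo` gives the Q-Q plot itself; curling at either end of the point cloud is the poorly fitting tail described above.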



b. Residual Plots


    By observing the residual plots, much can be uncovered about how the model is performing.  A key thing to look for is any pattern in the plots. Since the residuals should be random, there should be no observable trend in them.  Any observable pattern indicates trouble with the model.



c. R-Squared


    R-Squared measures the proportion of the variation of the dependent variable explained (I am using that term very loosely) by the model. R-Squared has poor standing among statisticians but can be useful if it is not the only measure of the fitness of the model. It ranges from zero to one, with one being a perfect fit. One is only possible if you include the dependent variable as an explanatory variable, and therefore it is an indication of error. With the data I typically look at, a good model typically ranges from .1 to .3; however, I have seen models in production working well with an R-Squared as low as .07.

R^2 = 1 - ((y - Xb)'(y - Xb)) / sum((y_i - ybar)^2)
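A minimal sketch of that formula on synthetic data, fitting b by ordinary least squares:

```python
# Sketch: R-squared computed directly from the formula above,
# using synthetic data and an OLS fit for b.
import numpy as np

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(100), rng.normal(size=100)])
y = X @ np.array([2.0, 0.5]) + rng.normal(size=100)

b, *_ = np.linalg.lstsq(X, y, rcond=None)     # OLS coefficients
resid = y - X @ b                             # y - Xb
r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
print(round(r2, 3))
```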



d. MSE


    MSE, or Mean Squared Error, is useful in choosing between multiple models. It is simply the average of the squared errors.
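For example, comparing two candidate models by MSE (the predictions here are hypothetical):

```python
# Sketch: choosing between two models by MSE (hypothetical predictions).
import numpy as np

y      = np.array([3.0, 5.0, 7.0, 9.0])   # actual outcomes
pred_a = np.array([2.8, 5.1, 7.3, 8.9])   # model A's predictions
pred_b = np.array([2.0, 6.0, 6.0, 10.0])  # model B's predictions

mse_a = np.mean((y - pred_a) ** 2)
mse_b = np.mean((y - pred_b) ** 2)
print(mse_a, mse_b)   # the lower-MSE model is preferred
```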



e. 1. Partial Regression


    Partial regressions are an important tool for determining how the independent variables affect the model, as well as one another. A partial regression gives the net effect of an independent variable after correcting for the other regressors.



e. 2. Partial Residual Plots


    Partial residual plots are residuals plotted against each independent variable's value. This shows how the residuals of the model vary as the value of the independent variable changes. This will uncover situations such as a variable causing too great a variation in the model at high values, leading to high residuals. In that case you would cap the independent variable. What you want to see is an even cloud of data points with a zero slope, centered on zero.
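A common variant adds the variable's fitted component back into the residuals before plotting. This sketch, on synthetic data, shows that the resulting cloud has a slope equal to the variable's coefficient:

```python
# Sketch: partial residuals for x1 = residuals + b1 * x1.
# Plotted against x1, the cloud should form a line with slope b1;
# curvature would flag a missed non-linearity.
import numpy as np

rng = np.random.default_rng(3)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 0.5 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ b

partial = resid + b[1] * x1               # partial residuals for x1
slope = np.polyfit(x1, partial, 1)[0]     # slope of the cloud
print(round(slope, 2))                    # recovers x1's coefficient
```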



f. T-stats on Coefficients


    The t-statistics on the regressors test the null hypothesis that the coefficient is zero, that is, that it has no effect in the model.  If you cannot statistically justify a variable's inclusion in the model, it is preferred to remove it.  Reasons for a variable failing a t-test can range from it having no relation to the dependent variable, to non-linearities influencing the results, to other independent variables clouding the true relationship. If there are firm theoretical reasons for including the variable, investigate further.
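The t-statistics can be computed by hand from the fitted coefficients and the usual OLS variance estimate. This sketch uses synthetic data with one real regressor and one pure-noise regressor:

```python
# Sketch: manual t-statistics for OLS coefficients, t_j = b_j / se(b_j).
import numpy as np

rng = np.random.default_rng(4)
n = 200
x1 = rng.normal(size=n)                   # real effect on y
x2 = rng.normal(size=n)                   # pure noise regressor
y = 1.0 + 2.0 * x1 + rng.normal(size=n)   # x2 plays no role

X = np.column_stack([np.ones(n), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ b
k = X.shape[1]
s2 = resid @ resid / (n - k)              # residual variance estimate
se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
t = b / se
print(np.round(t, 2))  # x1's |t| is large; the noise regressor's is small
```

Here the noise regressor fails the t-test and would be a candidate for removal.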



g. Economic Significance of Coefficients


    An independent variable may be statistically significant but have little explanatory power.  By calculating the economic significance of a variable, you can roughly measure its contribution to the overall value of the dependent variable.  The economic significance of a coefficient is the coefficient times the standard deviation of the independent variable. There is no clear definition of whether a coefficient is economically significant; instead, a researcher has to look at the values and decide for herself whether a given coefficient has enough, well, oomph to be considered important. It is a powerful tool to rid yourself of those pesky statistically significant but unimportant variables.
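The computation itself is trivial. A sketch with hypothetical coefficients and standard deviations (the variable names and numbers are made up for illustration):

```python
# Sketch: economic significance = coefficient * std dev of the regressor,
# roughly the change in y from a one-standard-deviation move in x.
coefs = {"income": 0.002, "age": 0.50}    # hypothetical coefficients
stds  = {"income": 25000.0, "age": 12.0}  # hypothetical std deviations

econ_sig = {name: coefs[name] * stds[name] for name in coefs}
print(econ_sig)   # {'income': 50.0, 'age': 6.0}
```

Note how the tiny income coefficient still carries the larger economic effect, because income varies so much more than age.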




  h. Cook's Distance


    Cook's test is used to uncover outliers in the data. Using the Cook's distance value, you can target outliers for removal. It should be remembered that not all outliers should be removed. Some are representative of important behavior in the system. In modeling weather in Florida, hurricanes may look like an outlier in the data, but they are a critical feature to model.
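A sketch of the computation: Cook's distance combines the size of a point's residual with its leverage, so a deliberately planted outlier should stand out. The data here are synthetic:

```python
# Sketch: Cook's distance D_i = (e_i^2 / (p * s^2)) * h_ii / (1 - h_ii)^2,
# where h_ii is the point's leverage (diagonal of the hat matrix).
import numpy as np

rng = np.random.default_rng(5)
n = 50
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n) * 0.5
x[0], y[0] = 4.0, -5.0                    # planted high-leverage outlier

X = np.column_stack([np.ones(n), x])
H = X @ np.linalg.inv(X.T @ X) @ X.T      # hat matrix
h = np.diag(H)                            # leverages
b, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ b
p = X.shape[1]
s2 = e @ e / (n - p)
D = (e ** 2 / (p * s2)) * h / (1 - h) ** 2

print(int(np.argmax(D)))                  # the planted outlier, index 0
```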


  i. Chow


    The Chow test is used to test for structural or regime changes within the data. In monetary and other financial models it is an important test. If a structural change seldom occurs, modeling the change using dummy variables can be a good choice, but if structural changes occur often, you may need to model the underlying causes of those changes to have any chance of forecasting the process.
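A sketch of the mechanics, assuming a known break point: fit the pooled model and the two per-regime models, then compare residual sums of squares with an F statistic. The data are synthetic, with a slope jump planted at the midpoint:

```python
# Sketch of a Chow test with a known break point:
# F = ((RSS_pooled - RSS_1 - RSS_2) / k) / ((RSS_1 + RSS_2) / (n - 2k))
import numpy as np

def rss(X, y):
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    return e @ e

rng = np.random.default_rng(6)
n = 100
x = rng.normal(size=n)
# Structural break: the slope jumps from 1.0 to 3.0 at the midpoint
y = np.where(np.arange(n) < n // 2, 1.0 * x, 3.0 * x) + rng.normal(size=n) * 0.5

X = np.column_stack([np.ones(n), x])
k = X.shape[1]
m = n // 2
rss_p = rss(X, y)                         # pooled fit
rss_1 = rss(X[:m], y[:m])                 # regime 1 fit
rss_2 = rss(X[m:], y[m:])                 # regime 2 fit
F = ((rss_p - rss_1 - rss_2) / k) / ((rss_1 + rss_2) / (n - 2 * k))
print(round(F, 1))    # a large F rejects "no structural change"
```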


  j. Durbin-Watson


    Durbin-Watson (DW) is the standard test for serial correlation (autocorrelation). Remember, serial correlation violates the assumptions behind BLUE: the coefficient estimates remain unbiased but are no longer efficient, and the standard errors are unreliable. You can employ autoregressive models to correct for it.  When investigating time series data you always have to be conscious of the DW statistic.
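The statistic itself is easy to compute from the residuals. This sketch contrasts independent residuals with heavily autocorrelated ones (both synthetic):

```python
# Sketch: Durbin-Watson statistic. Values near 2 suggest no serial
# correlation; values near 0 suggest strong positive autocorrelation.
import numpy as np

def durbin_watson(e):
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

rng = np.random.default_rng(7)
white = rng.normal(size=500)              # independent residuals
ar1 = np.zeros(500)                       # AR(1) autocorrelated residuals
for t in range(1, 500):
    ar1[t] = 0.9 * ar1[t - 1] + white[t]

print(round(durbin_watson(white), 2))     # near 2
print(round(durbin_watson(ar1), 2))       # well below 2
```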


  k. Bag Plot


    Bag plots uncover outliers in the data and are a useful complement to Cook's test.


  l. White


    The White test is a standard test for heteroskedasticity. Heteroskedasticity leaves the coefficient estimates unbiased but biases the estimated standard errors, which can distort the t-tests and lead you to wrongly exclude relevant variables.
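A sketch of the mechanics: regress the squared residuals on the regressors, their squares, and (with more than one regressor) their cross products; under homoskedasticity the statistic LM = n * R-squared is chi-squared distributed. The data here are synthetic, with the noise scale deliberately growing with |x|:

```python
# Sketch of White's test with a single regressor: auxiliary regression of
# squared residuals on x and x^2, then LM = n * R^2 of that regression.
import numpy as np

rng = np.random.default_rng(8)
n = 500
x = rng.normal(size=n)
# Heteroskedastic errors: noise scale grows with |x|
y = 1.0 + 2.0 * x + rng.normal(size=n) * (0.5 + np.abs(x))

X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
e2 = (y - X @ b) ** 2                     # squared residuals

A = np.column_stack([np.ones(n), x, x ** 2])   # auxiliary regressors
g, *_ = np.linalg.lstsq(A, e2, rcond=None)
u = e2 - A @ g
r2 = 1 - u @ u / np.sum((e2 - e2.mean()) ** 2)
lm = n * r2
print(round(lm, 1))   # compare to chi-squared with 2 degrees of freedom
```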


3. Tools for Probabilistic and Categorical Models



a) Odds Ratios



The odds ratio for each independent variable indicates whether to keep that variable in the model. If the odds ratio is 1, that variable does not help the predictive power of the model, while an odds ratio statistically significantly greater than or less than one indicates the variable has predictive power.
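For a single binary variable, the odds ratio can be read straight off a 2x2 table. The counts below are hypothetical:

```python
# Sketch: odds ratio from a hypothetical 2x2 exposure/outcome table.
# An odds ratio of 1 means the variable carries no signal.
a, b = 40, 60   # exposed:   40 positive outcomes, 60 negative
c, d = 20, 80   # unexposed: 20 positive outcomes, 80 negative

odds_exposed   = a / b                    # 0.667
odds_unexposed = c / d                    # 0.25
odds_ratio = odds_exposed / odds_unexposed
print(round(odds_ratio, 2))               # 2.67: exposure raises the odds
```

In a fitted logistic regression, the same quantity is obtained by exponentiating a coefficient.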



b) Receiver Operating Characteristic (ROC) Curve



The ROC curve is used to graphically show the trade-off between sensitivity (the true positive rate) and specificity (the true negative rate), plotting the true positive rate against the false positive rate (1 - specificity). If the model has no predictive power, all the points will lie on a 45-degree line.  A greater area between the 45-degree line and the ROC curve indicates a more predictive model.
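The area under the curve can be computed without plotting, via the rank (Mann-Whitney) formulation: AUC is the probability that a randomly chosen positive scores higher than a randomly chosen negative. A sketch on tiny hypothetical score vectors (no tie handling):

```python
# Sketch: AUC via ranks. AUC = 1.0 for perfect separation,
# 0.5 for a model with no predictive power.
import numpy as np

def auc(scores, labels):
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)   # ranks (ignores ties)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

labels = np.array([1, 1, 1, 0, 0, 0])
good   = np.array([0.9, 0.8, 0.7, 0.3, 0.2, 0.1])   # separates perfectly
mixed  = np.array([0.9, 0.3, 0.2, 0.8, 0.7, 0.1])   # mostly scrambled

print(auc(good, labels), auc(mixed, labels))
```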



c) Lorenz Curve


        The Lorenz curve is a rotated ROC curve. In other words, it is a plot of the cumulative percentage of cases selected by the model against the cumulative percentage of actual positive events.  As with the ROC curve, the area between the curve and the 45-degree line is used to measure the fit of a model; twice that area is the Gini coefficient.  The higher the coefficient, the better fitting the model.



d) Confusion Matrix



The confusion matrix shows the actual versus forecast outcomes for a binary or categorical process.


                 Predicted
                 Yes    No
  Actual  Yes     a      b
          No      c      d

a: The number of times the model predicted Yes and the outcome was Yes.

b: The number of times the model predicted No and the outcome was Yes.

c: The number of times the model predicted Yes and the outcome was No.

d: The number of times the model predicted No and the outcome was No.
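The four cells are simple counts over actual/predicted label pairs. A sketch with hypothetical labels:

```python
# Sketch: building confusion matrix cells a, b, c, d from
# hypothetical actual/predicted binary labels.
actual    = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]

pairs = list(zip(actual, predicted))
a = sum(1 for y, p in pairs if y == 1 and p == 1)   # hits
b = sum(1 for y, p in pairs if y == 1 and p == 0)   # misses
c = sum(1 for y, p in pairs if y == 0 and p == 1)   # false alarms
d = sum(1 for y, p in pairs if y == 0 and p == 0)   # correct rejections

print(a, b, c, d)              # 3 1 1 3
print((a + d) / (a + b + c + d))   # accuracy = 0.75
```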



e) Profit curve


    Shows what the expected return would be if the model were used in production: profit plotted against model score.