White Paper

Applying Machine Learning to Optimize the Correlation of SecurityScorecard Scores with Relative Likelihood of Breach

This paper describes how SecurityScorecard applied machine learning to retune the weights of its 10 rating factors so overall scores better align with the relative likelihood of a publicly disclosed data breach. Using a backtest across 2017–2019, it analyzed 99,076 continuously scored organizations and 2,228 eligible breaches, estimating breach timing with a 90-day discovery offset and averaging factor scores around the estimated breach date. The study used logistic regression with regularization (to limit large score swings), applied PCA to reduce collinearity, and validated results through 100 sampling trials balanced across organization size cohorts. The ML-tuned weights increased the correlation between grades and breach likelihood by 37%, with organizations graded F about 7.7±0.9 time
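
The breach-timing step summarized above can be illustrated with a minimal sketch. It assumes that the 90-day discovery offset is subtracted from the public disclosure date, and that factor scores are averaged over a symmetric window around the estimated date. The 30-day window half-width, the function names, and the sample factor names and scores are all hypothetical and are not taken from the paper.

```python
from datetime import date, timedelta
from statistics import mean

DISCOVERY_OFFSET_DAYS = 90  # offset stated in the summary above
WINDOW_DAYS = 30            # hypothetical half-width; the summary gives no value


def estimate_breach_date(disclosure_date, offset_days=DISCOVERY_OFFSET_DAYS):
    """Assumption: the breach occurred offset_days before public disclosure."""
    return disclosure_date - timedelta(days=offset_days)


def average_factor_scores(history, center, window_days=WINDOW_DAYS):
    """Average each factor score over the days within window_days of center.

    history maps a date to a dict of {factor name: score}.
    Returns an empty dict if no observations fall inside the window.
    """
    lo = center - timedelta(days=window_days)
    hi = center + timedelta(days=window_days)
    in_window = [scores for day, scores in history.items() if lo <= day <= hi]
    if not in_window:
        return {}
    factors = in_window[0].keys()
    return {f: mean(s[f] for s in in_window) for f in factors}


# Illustrative data only: two factors, three observation dates.
history = {
    date(2018, 1, 1): {"network_security": 80, "patching_cadence": 70},
    date(2018, 1, 15): {"network_security": 90, "patching_cadence": 60},
    date(2018, 6, 1): {"network_security": 100, "patching_cadence": 100},
}

breach_date = estimate_breach_date(date(2018, 4, 11))
averaged = average_factor_scores(history, breach_date)
print(breach_date, averaged)
```

In this example only the two January observations fall inside the window, so the June scores do not affect the averages. The averaged factor scores would then serve as inputs to the regression step; that step is not sketched here.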