
In the rapidly evolving world of artificial intelligence (AI), maintaining the accuracy and reliability of predictive models is paramount. AI models are typically trained on extensive datasets that reflect the conditions and variables of a particular moment in time. However, as time progresses, the underlying data distributions can shift, causing model performance to degrade. One effective tool for monitoring and managing this shift is the Population Stability Index (PSI).

What is the Population Stability Index (PSI)?

The Population Stability Index (PSI) is a statistical metric that measures how much a variable’s distribution has changed between two samples, typically taken at two different points in time. It is most often used to compare the training data with more recent data (e.g. production data) to check whether the distribution has shifted.

Why is PSI Important?

To illustrate the importance of PSI, consider an AI model trained to predict loan defaults based on historical customer data. If the economic environment changes or the loan application process undergoes modifications, the distribution of factors that influence loan defaults may shift. In such cases, a model trained on outdated data might generate inaccurate predictions. PSI is crucial because it detects these distribution changes, known as “data drift” (or population shift), allowing data scientists to intervene before the model’s performance is significantly impacted. Note that PSI looks only at the distribution of a variable, so it does not by itself reveal “concept drift,” where the relationship between the inputs and the outcome changes.

How Does PSI Work?

PSI is closely related to relative entropy, or Kullback-Leibler (KL) divergence: it equals the sum of the KL divergences computed in both directions between the two binned distributions, which makes it symmetric. Here’s a detailed breakdown of the PSI calculation process:

Data Preparation:

  1. Select the Variable: Identify the variable to be monitored for stability.
  2. Data Binning: Divide both the reference data and current data into discrete intervals or bins to create a distribution.

Distribution Comparison:

  1. Calculate Percentages: Determine the percentage of data points in each bin for both the reference (training) dataset and the current (scoring) dataset, using the same bin boundaries for both.
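
The binning and percentage steps above can be sketched as follows. This is a minimal illustration, not a prescribed method: it assumes bin boundaries are cut at quantiles of the reference data (equal-width bins are another common choice), and the names `quantile_edges` and `bin_percentages` and the sample values are hypothetical.

```python
from bisect import bisect_right

def quantile_edges(reference, n_bins=10):
    """Interior cut points at evenly spaced quantiles of the reference sample."""
    ordered = sorted(reference)
    n = len(ordered)
    # Tied cut points are collapsed, so heavy ties can yield fewer bins.
    return sorted({ordered[(i * n) // n_bins] for i in range(1, n_bins)})

def bin_percentages(values, edges):
    """Fraction of values falling in each bin; the two outer bins are open-ended."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_right(edges, v)] += 1
    return [c / len(values) for c in counts]

# Synthetic stand-ins for the reference and current data.
reference = [float(x) for x in range(100)]
current = [float(x) + 20.0 for x in range(100)]  # same shape, shifted upward

edges = quantile_edges(reference, n_bins=4)
expected = bin_percentages(reference, edges)  # fractions for the reference data
actual = bin_percentages(current, edges)      # fractions for the current data
```

The key design choice is that the edges are derived from the reference data only and then reused for the current data, so both sets of percentages refer to the same bins.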

PSI Calculation:

  1. Apply the Formula: For each bin, multiply the difference between the actual and expected percentages by the natural logarithm of their ratio, then sum the results over all bins: PSI = Σ (Actual % − Expected %) × ln(Actual % / Expected %).
  • Expected Percentage: Percentage of data points in a bin for the training dataset.
  • Actual Percentage: Percentage of data points in the same bin for the scoring dataset.
  2. Interpret the PSI Score: The resulting PSI score quantifies the overall discrepancy between the training and scoring data distributions.
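
The calculation step can be sketched in a few lines of Python. The function name `psi` and the `eps` floor are assumptions made for illustration: empty bins would otherwise make the logarithm undefined, and flooring them at a small constant is one common workaround rather than part of the definition. The input fractions are made up.

```python
import math

def psi(expected, actual, eps=1e-6):
    """Sum over bins of (actual - expected) * ln(actual / expected)."""
    if len(expected) != len(actual):
        raise ValueError("both distributions must use the same bins")
    total = 0.0
    for e, a in zip(expected, actual):
        e = max(e, eps)  # assumption: floor empty bins so the log is defined
        a = max(a, eps)
        total += (a - e) * math.log(a / e)
    return total

# Illustrative per-bin fractions (each list sums to 1).
expected = [0.25, 0.25, 0.25, 0.25]
actual = [0.05, 0.25, 0.25, 0.45]

score = psi(expected, actual)  # roughly 0.439 for these fractions
```

Every term in the sum is non-negative, because the difference and the logarithm always share the same sign, so the score is zero only when the two distributions match bin for bin.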

Interpreting PSI Scores

Interpreting PSI scores relies on rule-of-thumb thresholds that indicate various levels of data drift:

  • Low PSI (less than 0.1): The distributions of the training and scoring data are similar, suggesting the model is still seeing the kind of data it was trained on.
  • Moderate PSI (0.1 to 0.25): There is potential data drift, warranting further investigation.
  • High PSI (above 0.25): Significant data drift is likely, suggesting the need for model retraining or adjustment.
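
These bands translate directly into a small helper. The sketch below is illustrative: the function name `interpret_psi` and its labels are hypothetical, and the cut-offs are exposed as parameters because they are conventions rather than fixed rules.

```python
def interpret_psi(score, moderate=0.1, high=0.25):
    """Map a PSI score to the three bands listed above."""
    if score < moderate:
        return "low"       # distributions are similar
    if score <= high:
        return "moderate"  # potential drift, worth investigating
    return "high"          # significant drift is likely

labels = [interpret_psi(s) for s in (0.02, 0.1, 0.25, 0.4)]
```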

Benefits of Using PSI

  1. Early Detection of Model Drift: PSI allows for the early identification of changes in data distributions, enabling proactive intervention before these changes adversely affect model performance.
  2. Improved Model Performance: By monitoring and addressing data drift promptly, models can maintain high accuracy and reliability.
  3. Enhanced Decision-Making: Understanding data drift helps businesses make informed decisions regarding model retraining and data collection strategies.
  4. Flexibility: One of the key advantages of PSI is its versatility: it can be applied to both categorical features, where each category acts as a bin, and numerical features, which are binned first.
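
For a categorical feature, each category can serve as its own bin, so no binning step is needed. The sketch below assumes that approach. The name `categorical_psi`, the `eps` floor for categories missing from one sample, and the employment-status values are all hypothetical.

```python
import math
from collections import Counter

def categorical_psi(reference, current, eps=1e-6):
    """PSI for a categorical variable, treating each category as one bin."""
    ref_counts = Counter(reference)
    cur_counts = Counter(current)
    total = 0.0
    for category in set(ref_counts) | set(cur_counts):
        # Counter returns 0 for unseen categories; floor so the log is defined.
        e = max(ref_counts[category] / len(reference), eps)
        a = max(cur_counts[category] / len(current), eps)
        total += (a - e) * math.log(a / e)
    return total

# Made-up samples of an employment-status feature.
reference = ["employed"] * 70 + ["self-employed"] * 20 + ["unemployed"] * 10
current = ["employed"] * 50 + ["self-employed"] * 20 + ["unemployed"] * 30

score = categorical_psi(reference, current)  # roughly 0.287 for these samples
```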

Limitations of PSI

While PSI is a valuable tool, it does have certain limitations:

  1. Threshold Dependence: There is no universally accepted threshold for a “good” PSI score. The appropriate threshold may vary depending on the specific model and its application.
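  2. Variable Selection: PSI analyzes one variable at a time, so it cannot detect changes in the relationships between variables when each individual distribution stays the same. A comprehensive monitoring approach should consider multiple variables to get a holistic view of model stability.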

Practical Considerations

When implementing PSI, consider the following best practices:

  • Regular Monitoring: Continuously monitor PSI scores to promptly detect and address data drift.
  • Threshold Customization: Customize PSI thresholds based on the specific context and requirements of your AI models.
  • Comprehensive Analysis: Use PSI in conjunction with other metrics and tools to gain a more complete understanding of model performance.
  • Selecting the Proper Number of Bins: The PSI score depends on how the data is binned. Too few bins can hide real changes in the distribution, while too many bins leave some bins sparse or empty and make the score noisy.

Case Study: PSI in Financial Services

In the financial services industry, AI models are extensively used for credit scoring, fraud detection, and risk management. Consider a bank that uses an AI model to predict loan defaults. The model was trained on historical data that includes variables such as credit score, income, and employment status.

Over time, economic conditions change, leading to shifts in the distributions of these variables. By regularly calculating PSI for each variable, the bank can detect when the data the model encounters in production deviates from the training data. If the PSI score for credit score distribution exceeds 0.25, the bank knows that the model’s predictions may no longer be reliable, prompting a review or retraining of the model with more recent data.
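
A per-variable check of this kind might look like the sketch below. Everything in it is illustrative: the data is synthetic (drawn from normal distributions with a fixed seed), the helper `psi_numeric` is hypothetical, and the bins are assumed to be cut at quantiles of the training data.

```python
import math
import random
from bisect import bisect_right

def psi_numeric(reference, current, n_bins=10, eps=1e-6):
    """PSI for one numeric variable, with bins cut at reference quantiles."""
    ordered = sorted(reference)
    edges = sorted({ordered[(i * len(ordered)) // n_bins] for i in range(1, n_bins)})

    def fractions(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            counts[bisect_right(edges, v)] += 1
        # Floor empty bins so the logarithm below is defined.
        return [max(c / len(values), eps) for c in counts]

    expected, actual = fractions(reference), fractions(current)
    return sum((a - e) * math.log(a / e) for e, a in zip(expected, actual))

rng = random.Random(0)  # fixed seed so the synthetic example is repeatable
training = {
    "credit_score": [rng.gauss(680, 50) for _ in range(5000)],
    "income": [rng.gauss(55000, 12000) for _ in range(5000)],
}
production = {
    "credit_score": [rng.gauss(640, 50) for _ in range(5000)],  # simulated shift
    "income": [rng.gauss(55000, 12000) for _ in range(5000)],   # no simulated shift
}

# One PSI score per variable, then flag those above the 0.25 threshold.
report = {name: psi_numeric(training[name], production[name]) for name in training}
flagged = [name for name, score in report.items() if score > 0.25]
```

In this synthetic setup only the credit score is shifted, so it is the only variable that ends up in `flagged`. In practice the report would be recomputed on a schedule against fresh production data.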

Conclusion

The Population Stability Index (PSI) is a powerful tool for AI practitioners and data scientists. By measuring the stability of data distributions over time, PSI provides an early warning about the health of the data that AI models depend on. As AI continues to play an integral role in various industries, maintaining model reliability through proactive monitoring of PSI becomes increasingly important. Used alongside other monitoring metrics, PSI helps keep AI models accurate, reliable, and robust in the face of evolving data landscapes.