
Data drift arises in dynamic, non-stationary environments, where data distributions can change over time. It refers to a change in the statistical properties of data such that the distributions observed at two different points in time differ. In the context of a machine learning model, if the distribution of the features changes without affecting that of the target, this is known as Virtual Drift.
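One common way to check whether the distributions at two time points differ is to compare their empirical cumulative distribution functions. The sketch below computes the two-sample Kolmogorov-Smirnov statistic in plain Python. The function name `ks_statistic` and the price samples are hypothetical, chosen only for illustration, and this is one possible drift measure among several, not a prescribed method.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Return the largest gap between the empirical CDFs of two samples.

    0.0 means the empirical distributions coincide; 1.0 means the
    samples do not overlap at all.
    """
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")

    max_gap = 0.0
    # The empirical CDFs only change at observed values, so it is
    # enough to evaluate the gap at every distinct observation.
    for x in set(a) | set(b):
        cdf_a = bisect_right(a, x) / len(a)  # share of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # share of b that is <= x
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


# Hypothetical subscription prices observed at two time points.
prices_t0 = [9.0, 9.5, 10.0, 10.0, 10.5, 11.0]
prices_t1 = [12.0, 12.5, 13.0, 13.0, 13.5, 14.0]

drift_score = ks_statistic(prices_t0, prices_t1)
print(f"KS statistic between the two periods: {drift_score:.2f}")
```

A score close to 0 suggests the feature is distributed similarly at both time points, while a score close to 1 suggests a strong shift. Which threshold counts as drift is an assumption that depends on the sample size and the application.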

A typical example of data drift is readers cancelling an online newspaper subscription because the price of the service keeps increasing. In this context, the price is a significant feature, and its variation may have an impact on the number of subscribers. If readers keep their subscriptions despite the increasing price, so that the number of subscribers is unaffected, this indicates a Virtual Drift.
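The subscription example can be sketched as a simple check: did the feature (price) shift, and did the target (cancellations) shift with it? The records, the helper names `summarize` and `classify_drift`, and the tolerance values are all assumptions made for illustration. Comparing means is a deliberately crude stand-in for a proper distribution test.

```python
from statistics import mean


def summarize(records):
    """Return (mean price, cancellation rate) for (price, cancelled) pairs."""
    prices = [price for price, _ in records]
    cancelled = [1 if was_cancelled else 0 for _, was_cancelled in records]
    return mean(prices), mean(cancelled)


def classify_drift(before, after, price_tol=0.5, rate_tol=0.05):
    """Label the change between two periods.

    price_tol and rate_tol are hypothetical tolerances: a change larger
    than the tolerance is treated as a shift.
    """
    price_before, rate_before = summarize(before)
    price_after, rate_after = summarize(after)

    feature_shifted = abs(price_after - price_before) > price_tol
    target_shifted = abs(rate_after - rate_before) > rate_tol

    if feature_shifted and not target_shifted:
        return "virtual drift"
    if feature_shifted and target_shifted:
        return "drift affecting the target"
    return "no drift in the feature"


# Hypothetical data: (monthly price, whether the reader cancelled).
before = [(10.0, False)] * 9 + [(10.0, True)]
after_loyal = [(13.0, False)] * 9 + [(13.0, True)]      # price up, same churn
after_churn = [(13.0, False)] * 6 + [(13.0, True)] * 4  # price up, more churn

print(classify_drift(before, after_loyal))
print(classify_drift(before, after_churn))
```

In the first comparison the price rises while the cancellation rate stays the same, which matches the virtual drift case described above. In the second, the price rise comes with more cancellations, so the drift affects the target as well.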