Question 18 of 24
What is our process for model drift monitoring?
Defining ownership and cadence for ongoing monitoring of deployed AI models to detect performance degradation, behavioral shifts, and emerging bias after deployment.
Compliance does not end at deployment
A model that passes all pre-deployment testing may behave differently in production as the world changes around it. Data drift occurs when the statistical properties of the inputs change over time. Concept drift occurs when the relationship between inputs and outputs changes, often because the underlying phenomena the model was trained to predict have evolved. Both can cause a previously compliant model to become inaccurate, biased, or harmful without any change to the model itself.
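Data drift of the kind described above is often quantified with a distribution-comparison statistic. As one illustration, here is a minimal, dependency-free sketch of the Population Stability Index (PSI), a widely used drift score; the bucket count, the 1e-4 floor for empty buckets, and the conventional interpretation bands (below 0.1 stable, 0.1 to 0.25 moderate shift, above 0.25 major shift) are illustrative choices, not values mandated by any framework cited here.

```python
import math

def psi(baseline, current, bins=10):
    """Population Stability Index between a baseline sample of a numeric
    feature (e.g. from training data) and a current production sample.
    Near 0 means the distributions match; larger values mean more drift."""
    lo, hi = min(baseline), max(baseline)
    # Equal-width bucket edges derived from the baseline range.
    edges = [lo + (hi - lo) * i / bins for i in range(1, bins)]

    def bucket_fractions(sample):
        counts = [0] * bins
        for x in sample:
            counts[sum(x > e for e in edges)] += 1
        # Small floor avoids log(0) when a bucket is empty.
        return [max(c / len(sample), 1e-4) for c in counts]

    b = bucket_fractions(baseline)
    c = bucket_fractions(current)
    return sum((ci - bi) * math.log(ci / bi) for bi, ci in zip(b, c))
```

Running this on an unchanged feature yields a score near zero, while a shifted production sample pushes the score well past common alert thresholds, which is where a monitoring cadence and escalation rule would take over.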
Regulatory frameworks including the EU AI Act and NIST AI RMF explicitly address post-deployment monitoring requirements. For high-risk systems, the EU AI Act requires post-market monitoring plans and logging of system operation. The regulatory expectation is that governance is ongoing, not a one-time pre-deployment exercise.
What to monitor and how often
Monitor prediction distribution: are the model's outputs shifting over time? A credit scoring model that is approving an increasing proportion of applications, or a hiring model that is rejecting more candidates in a particular category, may be exhibiting drift. Statistical process control methods can identify when output distributions move outside expected ranges.
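The statistical process control idea mentioned above can be sketched in a few lines: derive control limits from a baseline window of output rates, then flag any period whose rate falls outside them. The three-sigma width and the example approval rates below are illustrative assumptions, not prescribed values.

```python
import statistics

def control_limits(baseline_rates, sigmas=3.0):
    """Center a control band on the mean of a baseline window of output
    rates (e.g. weekly approval rates) with limits at mean +/- sigmas*sd."""
    mean = statistics.mean(baseline_rates)
    sd = statistics.stdev(baseline_rates)
    return (mean - sigmas * sd, mean + sigmas * sd)

def out_of_control(rate, limits):
    """True when a new period's rate falls outside the control band."""
    lo, hi = limits
    return rate < lo or rate > hi

# Illustrative: a credit model that historically approves ~30% of
# applications; a week at 45% would fall outside the band and be flagged.
limits = control_limits([0.29, 0.31, 0.30, 0.32, 0.28, 0.30])
```

In practice the baseline window would be chosen per system, and a flagged period would feed the escalation thresholds defined before deployment rather than triggering an ad hoc response.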
Monitor input data quality: are the features the model relies on behaving as expected? Missing values, out-of-range inputs, and distribution shifts in key features are early indicators of drift. Where ground truth becomes available, also monitor performance metrics, including accuracy, precision, and recall, and track fairness metrics by subgroup.
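Subgroup fairness monitoring of the kind described above reduces to computing a performance metric per group and watching the gap between groups. A minimal sketch, assuming labeled production records arrive as (group, y_true, y_pred) tuples and using recall (true positive rate) as the tracked metric; the record format and the gap statistic are illustrative choices.

```python
from collections import defaultdict

def recall_by_group(records):
    """Recall (true positive rate) per subgroup, computed from labeled
    production records of the form (group, y_true, y_pred)."""
    tp = defaultdict(int)
    pos = defaultdict(int)
    for group, y_true, y_pred in records:
        if y_true == 1:
            pos[group] += 1
            if y_pred == 1:
                tp[group] += 1
    return {g: tp[g] / pos[g] for g in pos}

def max_recall_gap(recalls):
    """Largest pairwise recall difference across subgroups; a widening
    gap over successive monitoring windows is a bias-drift signal."""
    values = list(recalls.values())
    return max(values) - min(values)
```

The same pattern applies to precision, approval rate, or any other per-group metric; what matters for governance is that the gap is computed on a defined cadence and compared against a pre-agreed threshold.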
Establish monitoring cadences proportionate to risk. High-risk systems in rapidly changing environments may require weekly or even daily monitoring. Lower-risk systems in stable environments may be adequately served by monthly or quarterly reviews.
Ownership and response protocols
Assign explicit ownership for post-deployment monitoring to a named role or team. Without ownership, monitoring activities are consistently deprioritized in favor of new deployments. The owner is responsible for running monitoring checks, reviewing results, escalating anomalies, and initiating retraining or retirement processes when drift is detected.
Define response thresholds in advance: at what level of detected drift does the system get flagged for review? At what level is it paused pending investigation? At what level is it retired? These thresholds should be calibrated to the risk level of the system and documented before deployment, not determined on the fly when something goes wrong.
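Pre-committed thresholds of this kind are naturally expressed as a small, declarative table that is reviewed and versioned before deployment. A minimal sketch, assuming a scalar drift score (such as the PSI) and three escalation levels; the numeric cut-offs and action names are illustrative placeholders to be calibrated to each system's risk level.

```python
# Response thresholds agreed before deployment, ordered most severe first.
# The drift score could be a PSI value or any other scalar drift metric.
THRESHOLDS = [
    (0.40, "retire"),            # severe, sustained drift: retire the model
    (0.25, "pause"),             # pause the system pending investigation
    (0.10, "flag_for_review"),   # flag for the monitoring owner's review
]

def response(drift_score):
    """Map a measured drift score to the pre-agreed response action."""
    for limit, action in THRESHOLDS:
        if drift_score >= limit:
            return action
    return "no_action"
```

Encoding the thresholds as data rather than scattered conditionals keeps the escalation policy auditable and makes it straightforward to show a regulator that responses were defined in advance rather than improvised.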
