This article helps users change the data drift default settings at the model level.
1. Sign in to your deployment.
2. Once logged in, select Model Monitor on the left side.
3. Model Monitor opens in a new tab. Select Settings.
4. On the right side, click TEST DEFAULTS. This takes you to Data Drift Test Defaults.
There are three statistical tests you can configure (see Test Defaults):
A. Kullback–Leibler Divergence (Recommended)
Kullback–Leibler divergence (also called relative entropy) is a measure of how one probability distribution is different from a second, reference probability distribution.
The divergence can range from zero to infinity. A value of zero means there is no difference between the data sets.
This is a robust test that works for different distributions and therefore is most commonly used to detect drift.
B. Chi-square Statistic
The chi-square test is another popular divergence test, well suited to categorical data.
It is a statistical hypothesis test of how two distributions of categorical variables are related to each other. Specifically, the chi-square statistic is a single number that quantifies the difference between the observed counts and the counts that would be expected if there were no relationship between the variables at all.
The statistic can range from zero to infinity. A value of zero means there is no difference between the data sets.
C. Population Stability Index
Population Stability Index (PSI) is a popular metric in the finance industry for measuring changes in distribution between two datasets. It produces less noise and has the advantage of a generally accepted threshold of 0.2-0.25, above which a shift in distribution is usually treated as significant.
5. Make sure to save the values you have set. The save option is at the bottom of the screen.
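To get a feel for the values these three tests produce before choosing thresholds, the statistics can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Model Monitor computes them: the inputs are hypothetical per-bin (or per-category) counts for a reference dataset and a current dataset, empty bins are floored at a small value (EPS) so the logarithms stay finite, the KL divergence is taken in the direction current-versus-reference, and the chi-square statistic is computed on the 2 x k table of dataset by category. The function names are made up for this example.

```python
import math

EPS = 1e-6  # floor for empty bins; an assumption of this sketch


def proportions(counts, eps=EPS):
    """Convert bin counts to proportions, flooring each at eps so logs stay finite."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    floored = [max(c / total, eps) for c in counts]
    norm = sum(floored)
    return [p / norm for p in floored]


def kl_divergence(reference, current):
    """KL divergence of current from reference, in nats; 0 when they match."""
    q = proportions(reference)
    p = proportions(current)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def chi_square_statistic(reference, current):
    """Chi-square statistic for the 2 x k table of (dataset, category) counts."""
    rows = [reference, current]
    row_totals = [sum(r) for r in rows]
    col_totals = [r + c for r, c in zip(reference, current)]
    grand = sum(row_totals)
    if grand <= 0:
        raise ValueError("counts must sum to a positive number")
    stat = 0.0
    for row, row_total in zip(rows, row_totals):
        for observed, col_total in zip(row, col_totals):
            # Count expected if dataset and category were unrelated.
            expected = row_total * col_total / grand
            if expected > 0:  # skip categories absent from both datasets
                stat += (observed - expected) ** 2 / expected
    return stat


def psi(reference, current):
    """Population Stability Index between two binned distributions."""
    q = proportions(reference)
    p = proportions(current)
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))


if __name__ == "__main__":
    reference = [50, 30, 20]  # hypothetical training-time counts per category
    current = [20, 30, 50]    # hypothetical prediction-time counts per category
    print("KL divergence:", round(kl_divergence(reference, current), 4))
    print("Chi-square:", round(chi_square_statistic(reference, current), 4))
    print("PSI:", round(psi(reference, current), 4))
```

With identical counts all three functions return zero, which matches the "no difference" reading described above. In the hypothetical example the PSI comes out well above the 0.2-0.25 range, so this shift would normally be flagged as drift.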