Feature drift occurs when the distribution or other statistical properties of a model's input features change relative to the training data. In this scenario, the expected temperature, one of the model's input features, is dropping below the range seen during training, which indicates a shift in that feature's distribution.
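A range check of this kind can be sketched in a few lines of Python. This is only a minimal illustration, not a full drift detector; the function name `fraction_outside_training_range` and the temperature values are hypothetical.

```python
def fraction_outside_training_range(train_values, new_values):
    """Return the share of new values that fall outside the training min/max."""
    low, high = min(train_values), max(train_values)
    outside = [v for v in new_values if v < low or v > high]
    return len(outside) / len(new_values)


# Hypothetical temperatures: the training data spans 10 to 30 degrees.
train_temps = [10.0, 15.0, 22.0, 30.0]
# Incoming data has started to drop beneath that range.
new_temps = [12.0, 9.0, 7.5, 5.0]

share = fraction_outside_training_range(train_temps, new_temps)
print(f"{share:.0%} of new values are outside the training range")
```

A rising share of out-of-range values is one simple signal that the feature's distribution has shifted.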
To remove a column from the data read from a Delta table, load the table into a DataFrame and call the drop method on it. The correct code snippet is spark.read.format('delta').load(path).drop('star_rating'), which reads the Delta table at the specified path and returns a new DataFrame without the 'star_rating' column. Because DataFrames are immutable, this does not modify the stored table itself; the result would have to be written back for the change to persist.
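The read-and-drop step can be wrapped as in the sketch below. It assumes an active SparkSession is passed in as `spark` and that a Delta table exists at `path`; the helper name `read_without_column` and the placeholder path are hypothetical.

```python
def read_without_column(spark, path, column):
    """Load a Delta table and return a DataFrame that omits `column`."""
    # DataFrame.drop returns a new DataFrame; the table on disk is unchanged.
    return spark.read.format("delta").load(path).drop(column)


# Usage on a cluster with Delta Lake available (the path is a placeholder):
# df = read_without_column(spark, "/path/to/delta_table", "star_rating")
# df.printSchema()  # 'star_rating' should no longer be listed
```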
To return a Spark DataFrame of a data set associated with a Feature Store table, the correct operation is fs.read_table. This function retrieves data from the Feature Store and returns it as a Spark DataFrame.
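As a rough sketch of how that call is used, the helper below takes an existing Feature Store client as `fs` and forwards the table name to `read_table`. The helper name `load_feature_table` and the table name in the usage comment are hypothetical, and the commented lines assume a Databricks ML runtime where `databricks.feature_store` is installed.

```python
def load_feature_table(fs, table_name):
    """Return the Feature Store table `table_name` as a Spark DataFrame."""
    return fs.read_table(name=table_name)


# Usage inside a Databricks ML runtime (the table name is a placeholder):
# from databricks.feature_store import FeatureStoreClient
# fs = FeatureStoreClient()
# features_df = load_feature_table(fs, "my_db.my_feature_table")
```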
After obtaining the observed (actual) label values, the next logical step is to compute the evaluation metric using the observed and predicted values. This allows the engineer to compare these values and assess the model's performance. If there are significant changes in the evaluation metric over time, it can indicate potential concept drift. Therefore, computing the evaluation metric is a crucial step before running a statistical test to determine if there are changes over time.
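The metric computation can be sketched as follows. The explanation does not name a specific metric, so root mean squared error (RMSE) is used here purely as an assumption, and the weekly observed and predicted values are made up for illustration.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical (observed, predicted) values grouped by week.
weekly_results = {
    "week_1": ([10.0, 12.0, 14.0], [10.0, 12.0, 14.0]),
    "week_2": ([10.0, 12.0, 14.0], [11.0, 13.0, 15.0]),
    "week_3": ([10.0, 12.0, 14.0], [13.0, 15.0, 17.0]),
}

# One metric value per period; a statistical test can then be run on this series.
weekly_rmse = {week: rmse(obs, pred) for week, (obs, pred) in weekly_results.items()}
for week, value in weekly_rmse.items():
    print(week, round(value, 2))
```

Computing the metric per period first gives the later statistical test a series of values to compare over time.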
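Jensen-Shannon (JS) distance produces a value between 0 and 1 (when computed with base-2 logarithms) that represents the divergence between two distributions: 0 means the distributions are identical, and 1 means they do not overlap at all. Because the scale is fixed, the value can be interpreted directly and doesn't require setting arbitrary thresholds or cutoffs. In contrast, the Kolmogorov-Smirnov (KS) test requires choosing a significance level and comparing the test statistic against the corresponding critical value.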
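To make the 0-to-1 range concrete, here is a minimal pure-Python sketch of JS distance for discrete (binned) distributions, using base-2 logarithms. The function name `js_distance` and the example distributions are hypothetical, and the inputs are assumed to be probability vectors over the same bins.

```python
from math import log2, sqrt


def js_distance(p, q):
    """Jensen-Shannon distance (base-2 logs) between two discrete distributions.

    `p` and `q` are sequences of probabilities over the same bins, each summing to 1.
    """
    # Mixture distribution: the bin-wise average of p and q.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence; bins where a == 0 contribute nothing.
        return sum(ai * log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    # Guard against tiny negative values from floating-point rounding.
    return sqrt(max(divergence, 0.0))


# Hypothetical binned feature distributions.
training = [0.25, 0.25, 0.25, 0.25]
identical = [0.25, 0.25, 0.25, 0.25]
disjoint_a = [0.5, 0.5, 0.0, 0.0]
disjoint_b = [0.0, 0.0, 0.5, 0.5]

print(js_distance(training, identical))     # identical distributions
print(js_distance(disjoint_a, disjoint_b))  # non-overlapping distributions
```

In practice, SciPy offers `scipy.spatial.distance.jensenshannon` (with a `base` argument) and `scipy.stats.ks_2samp` for the two-sample KS test, so a hand-written version like this is mainly useful for seeing how the bounds arise.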