Anomaly-detection code based on [[Z-score]] for [[2022__CCGrid__Generic and Robust Performance Diagnosis via Causal Inference for OLTP Database Systems|CauseRank]], generated by ChatGPT.

# me

Can you code the below algorithm in Python?

For a certain infrastructure metric group (denoted as $X$), CauseRank measures the fluctuation of each metric $x \in X$ compared with its normal state to evaluate how suspicious $X$ is. An effective and widely used method (known as n-sigma) is to model the normal state of $x$ with its mean $\mu_x$ and standard deviation $\sigma_x$. CauseRank estimates the anomaly degree of $x$ over the data points $x(t)$ around $t_\text{alert}$ with $s_x = \max_t |x(t) - \mu_x| / \sigma_x$, based on the Z-score. With $s_x$, the final anomaly score of $x$ (denoted as $\alpha_x$) is defined by (1). $\alpha_x$ treats an infrastructure metric satisfying $s_x < k$ as normal, where $k$ is determined by the fluctuation level of the system during normal operation and the DBAs' tolerance for anomalies. In this work, we take $k = 1$ to avoid missing root cause metrics with slight fluctuations. Moreover, $\alpha_x$ applies a logarithmic conversion to $s_x$, which limits how much a high $s_x$ can grow and thus avoids extreme values.

We use the data points in the time period $[t_\text{alert} - l_p - l_\text{train},\, t_\text{alert} - l_p]$ to learn $\mu_x$ and $\sigma_x$ for each infrastructure metric $x$, where $l_\text{train}$ denotes the training data length, which is required to be long enough to model the historical normal state accurately. We shift the period back by $l_p$ because the system may already be abnormal before the failure is observed at $t_\text{alert}$. For example, Fig. 2 shows the time delay between the occurrence of an actual failure observed in an Oracle database (indicated by the root cause metric) and the impact on the key metric, which are marked with two red lines. For the test data, we use the data points in the time period $[t_\text{alert} - l_p,\, t_\text{alert} + l_\text{test}]$, where $l_\text{test}$ denotes the test data length, which is decided by the real-time requirement of the system for root cause localization. However, $l_\text{test}$ should be large enough to prevent the localization result from being affected by noise.

# ai

Here is the Python code for the algorithm you described:

```python
import numpy as np

def calculate_cause_rank(data, t_alert, l_train, l_test, l_p, k=1):
    # Training period [t_alert - l_p - l_train, t_alert - l_p], shifted back
    # by l_p because the system may already be abnormal before t_alert.
    train_start = t_alert - l_p - l_train
    train_end = t_alert - l_p
    # Test period [t_alert - l_p, t_alert + l_test], as defined above.
    test_start = t_alert - l_p
    test_end = t_alert + l_test

    # Slice the training and test windows (rows are time steps).
    train_data = data[train_start:train_end]
    test_data = data[test_start:test_end]

    # Model the normal state of each metric x by its mean and standard deviation.
    mu_x = np.mean(train_data, axis=0)
    sigma_x = np.std(train_data, axis=0)
    # Avoid division by zero for metrics that are constant during training.
    sigma_x = np.where(sigma_x == 0, np.finfo(float).eps, sigma_x)

    # Z-score of every test data point: |x(t) - mu_x| / sigma_x.
    z_scores = np.abs((test_data - mu_x) / sigma_x)

    # Anomaly degree s_x = max over t of the Z-scores, per metric.
    s_x = np.max(z_scores, axis=0)

    # Anomaly score alpha_x: metrics with s_x < k are considered normal (score 0);
    # the logarithm keeps a very high s_x from dominating. The max() guard only
    # avoids log(0) for degenerate constant metrics.
    alpha_x = np.where(s_x < k, 0.0, np.log(np.maximum(s_x, np.finfo(float).eps)))

    return alpha_x
```

This function takes the input `data` along with `t_alert`, `l_train`, `l_test`, `l_p`, and `k`, and returns the anomaly score $\alpha_x$ for each metric $x$. Note that this code assumes the input `data` is a 2D NumPy array with rows representing time steps and columns representing the different infrastructure metrics, and that `t_alert` and the window lengths are given as row indices into that array.
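As a sanity check, here is a minimal usage sketch of the function above on synthetic data. The array shape, alert index, window lengths, and the injected spike are all assumptions chosen for illustration, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# 500 time steps x 3 infrastructure metrics of Gaussian "normal" behaviour
# (shape and distribution are assumptions for this sketch).
data = rng.normal(loc=0.0, scale=1.0, size=(500, 3))

# Hypothetical alert index and window lengths, expressed as row offsets.
t_alert, l_train, l_test, l_p = 450, 300, 30, 10

# Inject a spike into metric 0 shortly before the alert, inside the test
# window [t_alert - l_p, t_alert + l_test) = [440, 480).
data[445:460, 0] += 8.0

alpha_x = calculate_cause_rank(data, t_alert, l_train, l_test, l_p, k=1)
print(alpha_x)  # metric 0 should score clearly higher than metrics 1 and 2
```

Since the spike is about eight training standard deviations, metric 0's score should land near $\ln 8 \approx 2.1$, well above the purely Gaussian metrics, whose maximum Z-score over a 40-point test window typically stays around 2 to 3 (so $\alpha_x$ around 0.7 to 1.1).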