AC@K, a performance metric for fault localization - yuuk1's Digital Garden

An accuracy metric defined in [[2013__PER__Root Cause Detection in a Service-Oriented Architecture|MonitorRank]] and other works.

> To compare MonitorRank to the baseline methods, we require appropriate evaluation metrics. All the algorithms provide a rank of sensors with respect to each anomaly case. We refer to the rank of each sensor $v_i$ with respect to an anomaly $a$ as $r_a(i)$ and define the indicator variable $R_a(i)$ to represent whether sensor $i$ is the root cause of an anomaly $a$ or not (that is, either 0 or 1). To quantify the performance of each algorithm on a set of anomalies $A$, we use the following metrics:
>
> Precision at top K (PR@K) indicates the probability that top K sensors given by each algorithm actually are the root causes of each anomaly case. It is important that the algorithm captures the final root cause at a small value of K, thereby resulting in lesser number of sensors to investigate. Here we use K = 1, 3, 5. More formally, it is defined as
>
> $$ PR@K = \cfrac{1}{|A|}\sum_{a\in A}\cfrac{\sum_{i:r_a(i)\le K}R_a(i)}{\min\left(K,\sum_{i}R_a(i)\right)} $$

Some of the literature writes this metric as AC@K.

```python
def precision_at_k(rankings, root_causes, k_values=(1, 3, 5)):
    """
    Calculate precision at top K (PR@K) over a set of anomalies A.

    Args:
        rankings (dict): Ranked sensor lists (best candidate first), keyed by anomaly name.
        root_causes (dict): Lists of root cause sensors, keyed by anomaly name,
            using the same sensor identifiers as `rankings`.
        k_values (list or tuple): K values for which to compute PR@K.

    Returns:
        dict: The PR@K value for each K.
    """
    if not rankings:
        raise ValueError("rankings must contain at least one anomaly")

    pr_at_k = {k: 0.0 for k in k_values}
    num_anomalies = len(rankings)

    for anomaly, ranked_sensors in rankings.items():
        # A set makes the membership test below O(1).
        root_cause_sensors = set(root_causes[anomaly])
        num_root_cause_sensors = len(root_cause_sensors)
        if num_root_cause_sensors == 0:
            # The denominator min(K, sum_i R_a(i)) would be zero.
            raise ValueError(f"anomaly {anomaly!r} has no root cause sensors")

        for k in k_values:
            top_k_sensors = ranked_sensors[:k]
            num_correct = sum(1 for sensor in top_k_sensors if sensor in root_cause_sensors)
            pr_at_k[k] += num_correct / min(k, num_root_cause_sensors)

    # Average over all anomalies (the 1/|A| factor in the formula).
    for k in k_values:
        pr_at_k[k] /= num_anomalies

    return pr_at_k
```

This Python function calculates the precision at top K (PR@K) over a set of anomalies A. It takes two dictionaries as arguments: `rankings`, which holds the ranked sensor list for each anomaly, and `root_causes`, which holds the list of root cause sensors for each anomaly. The optional `k_values` argument sets the K values to evaluate and defaults to 1, 3, and 5. The function returns a dictionary that maps each K to its PR@K value, and it raises a `ValueError` when the metric is undefined (no anomalies, or an anomaly with no root cause). (Written by ChatGPT.)