An accuracy metric defined in [[2013__PER__Root Cause Detection in a Service-Oriented Architecture|MonitorRank]] and other work.
> To compare MonitorRank to the baseline methods, we require appropriate evaluation metrics. All the algorithms provide a rank of sensors with respect to each anomaly case. We refer to the rank of each sensor $v_i$ with respect to an anomaly $a$ as $r_a(i)$ and define the indicator variable $R_a(i)$ to represent whether sensor $i$ is the root cause of an anomaly $a$ or not (that is, either 0 or 1). To quantify the performance of each algorithm on a set of anomalies $A$, we use the following metrics:
> Precision at top K (PR@K) indicates the probability that top K sensors given by each algorithm actually are the root causes of each anomaly case. It is important that the algorithm captures the final root cause at a small value of K, thereby resulting in lesser number of sensors to investigate. Here we use K = 1, 3, 5. More formally, it is defined as
$$
PR@K = \cfrac{1}{|A|}\sum_{a\in A}\cfrac{\sum_{i:r_a(i)\le K}R_a(i)}{\min\left(K,\sum_{i}R_a(i)\right)}
$$
Some papers denote this metric as AC@K.
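To make the definition concrete, here is a small worked example. The two anomalies and their rankings are made up for illustration and do not come from the paper. Suppose $A = \{a_1, a_2\}$, where $a_1$ has a single root cause ranked 2nd, and $a_2$ has two root causes ranked 1st and 4th. Then:

$$
\begin{aligned}
PR@1 &= \frac{1}{2}\left(\frac{0}{\min(1,1)} + \frac{1}{\min(1,2)}\right) = 0.5 \\
PR@3 &= \frac{1}{2}\left(\frac{1}{\min(3,1)} + \frac{1}{\min(3,2)}\right) = 0.75 \\
PR@5 &= \frac{1}{2}\left(\frac{1}{\min(5,1)} + \frac{2}{\min(5,2)}\right) = 1
\end{aligned}
$$

The $\min$ in the denominator means that an anomaly with fewer than $K$ root causes can still reach a score of 1 once all of its root causes appear in the top $K$.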
```python
def precision_at_k(rankings, root_causes, k_values=(1, 3, 5)):
    """
    Calculate precision at top K (PR@K) over a set of anomalies A.

    Assumes every anomaly has at least one root cause and that
    `rankings` is non-empty; otherwise the metric is undefined.

    Args:
        rankings (dict): Ranked sensor lists (best first), keyed by anomaly name.
        root_causes (dict): Lists of root cause sensors, keyed by anomaly name.
        k_values (iterable): K values for which to compute PR@K.

    Returns:
        dict: The PR@K value for each K.
    """
    pr_at_k = {k: 0.0 for k in k_values}
    num_anomalies = len(rankings)
    for anomaly, ranked_sensors in rankings.items():
        # Use a set so membership tests are O(1).
        root_cause_sensors = set(root_causes[anomaly])
        num_root_cause_sensors = len(root_cause_sensors)
        for k in k_values:
            top_k_sensors = ranked_sensors[:k]
            num_correct = sum(1 for sensor in top_k_sensors if sensor in root_cause_sensors)
            # min(...) caps the denominator when there are fewer than K root causes.
            pr_at_k[k] += num_correct / min(k, num_root_cause_sensors)
    # Average the per-anomaly scores over all anomalies.
    for k in k_values:
        pr_at_k[k] /= num_anomalies
    return pr_at_k
```
This Python function computes precision at top K (PR@K) over a set of anomalies A. It takes two dictionaries keyed by anomaly name: `rankings`, which maps each anomaly to its ranked sensor list (best first), and `root_causes`, which maps each anomaly to the list of its root cause sensors. The optional `k_values` argument selects which K values to evaluate (1, 3, and 5 by default). The function returns a dictionary mapping each K to its PR@K value.
(written by ChatGPT)