Distributed Tracing - yuuk1's Digital Garden

[[Distributed Tracing - MOC]]

## Definition

> Distributed tracing [41] is a method that comes from traditional tracing, but applied to a distributed system at the work-flow level. Unlike simple logging, tracing must relate information from different parts of the system, to order events according to some order, like Lamport’s happens-before relation [23], serving multiple purposes, such as identifying the root-cause of anomalies or perform distributed profiling, and monitor applications, especially those built using microservice architectures and, in the end, it can be used to pinpoint failures and reason about their root cause.

[41] : [[2016__SoCC__Principled workflow-centric tracing of distributed systems]]

[23] : [[1978__Communications of the ACM__Time, clocks, and the ordering of events in a distributed system]]

Quoted from [[2021__JGC__Automated Analysis of Distributed Tracing - Challenges and Research Directions]].
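The two requirements in the definition, relating information from different parts of the system and ordering events consistently with happens-before, can be illustrated with a small sketch. It is not taken from the quoted papers: the `Service` and `Event` classes, the `trace_id` field, and the `reconstruct` helper are hypothetical names chosen for illustration, and Lamport logical clocks are used here as one simple way to obtain an order that respects happens-before.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    trace_id: str  # hypothetical identifier shared by all events of one workflow
    service: str
    name: str
    clock: int     # Lamport timestamp assigned when the event was recorded


@dataclass
class Service:
    name: str
    clock: int = 0
    log: list = field(default_factory=list)

    def record(self, trace_id, name):
        # Local event: advance the logical clock by one tick.
        self.clock += 1
        event = Event(trace_id, self.name, name, self.clock)
        self.log.append(event)
        return event

    def send(self, trace_id, name):
        # A send is a local event; its timestamp travels with the message.
        return self.record(trace_id, name)

    def receive(self, message, name):
        # Lamport's rule: move past the sender's timestamp before ticking.
        self.clock = max(self.clock, message.clock)
        return self.record(message.trace_id, name)


def reconstruct(trace_id, services):
    # Relate events from every service by trace_id, then order by Lamport time.
    events = [e for s in services for e in s.log if e.trace_id == trace_id]
    return sorted(events, key=lambda e: (e.clock, e.service))


frontend, backend = Service("frontend"), Service("backend")
frontend.record("t1", "request received")
call = frontend.send("t1", "call backend")
backend.receive(call, "handle call")
reply = backend.send("t1", "reply")
frontend.receive(reply, "response returned")

for e in reconstruct("t1", [frontend, backend]):
    print(e.clock, e.service, e.name)
```

Sorting by Lamport timestamp guarantees that if one event happened before another, it appears earlier in the result. The converse does not hold: concurrent events are also placed in some order, here with ties broken by service name. Production tracers generally record causality by propagating trace and parent identifiers with each request rather than by comparing logical clocks, so this sketch only illustrates the ordering idea.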