Map of Contents page for [[Telemetry]].

## Term

- [[Telemetry]]
- [[Software Telemetry]]
- [[Observability]]
- [[Observability Whitepaper]]
- [[MELT]]
- [[Monitoring VS Observability]]
- [[Monitoring]]
- [[テレメトリーシグナルの関係性]]
- [[Scaling Telemetry Workloads]]

## Standard

- [[OpenTelemetry]]
- [[OpenTracing]]
- [[OpenCensus]]
- [[OpenMetrics]]
- [[OpenCost]]

## Products

- [[Prometheus]]
- [[VictoriaMetrics]]
- [[SigNoz]]
- [[Apache SkyWalking]]
- [[OpenObserve]]

## SaaS

- [[Mackerel]]
- [[Datadog]]
- [[New Relic]]
- [[Splunk]]
- [[Logz.io]]
- [[Zebrium]]
- [[Sematext]]
- [[Honeycomb]]
- [[Cloudwise]]

## Books

- [[Observability Engineering]]

## [[MELT]]

### General

- [[2025__ACCESS__Observability in Microservices - An In Depth Exploration of Frameworks, Challenges, and Deployment Paradigms]]
- [[2025__CSUR__Public Datasets for Cloud Computing - A Comprehensive Survey]]
- [[2024__CloudNet__Using Observability to Detect Anti-patterns in Benchmarking Application with OpenTelemetry]]
- [[2024__CLOUD__Enabling Programmable Metric Flows]]
- [[2024__Thesis__Unified Application Observability in Heterogeneous Distributed Systems]]
- [[Observabilityデータの使い分け]]
- [[2024__IJISRT__Cost-Effective Scalability in Cloud Monitoring Systems - A Comparative Study]]
- Octopus: [[2024__CLOUD__Intent-Driven Multi-Engine Observability Dataflows for Heterogeneous Geo-Distributed Clouds]]
- [[2023__ACCESS__Towards the Observability of Cloud-native applications - The Overview of the State-of-the-Art]]
- [[2022__Empirical Software Engineering__Enjoy your observability - An Industrial Survey of Microservice Tracing and Analysis]]
    - Survey paper from [[FudanSELab]]
- [[2021__SIGMOD__Towards Observability Data Management at Scale]]
    - Paper from Slack
    - [[SlackのObservability論文]]
- [[2018__CLOUD__Reviewing Cloud Monitoring - Towards Cloud Resource Profiling]]
- [[2013__TC__Enhanced Monitoring-as-a-Service for Effective Cloud Management]]
- [[2013__Computer Networks __Cloud monitoring - A survey]]
- [[2013__ASE__Software Analytics for Incident Management of Online Services - An Experience Report]]

![[Distributed Tracing - MOC]]

![[Metrics - MOC]]

![[Logging - MOC]]

## Profiling

- [[2025__TOSEM__Towards On-The-Fly Code Performance Profiling]]
- [[2024__SOSP__FBDetect - Catching Tiny Performance Regressions at Hyperscale through In-Production Monitoring]]
- [[2023__SOSE__Formal and Empirical Study of Metadata - Based Profiling for Resource Management in the Computing Continuum]]
- [[2023__MPLR__Improving Garbage Collection Observability with Performance Tracing]]
- [[Grafana Pyroscope]]

## Network

- [[2024__SIGMOD__FineMon - An Innovative Adaptive Network Telemetry Scheme for Fine-Grained, Multi-Metric Data Monitoring with Dynamic Frequency Adjustment and Enhanced Data Recovery]]
- [[2024__Dissertation__Deep Generative Models for Network Data Synthesis and Monitoring]]

## Runtime

- [[2023__OSDI__Relational Debugging - Pinpointing Root Causes of Performance Problems]]
- [[2014__OOPSLA__Statistical debugging for real-world performance problems]]
- [[2011__OSDI__X-ray - Automating Root-Cause Diagnosis of Performance Anomalies in Production Software]]

## Workload Analysis

- [[2023__SOSP__A Cloud-Scale Characterization of Remote Procedure Calls]]
- [[2023__ATC__Lifting the veil on Meta's microservice architecture - Analyses of topology and request workflows]]
- [[2021__SoCC__Characterizing Microservice Dependency and Performance - Alibaba Trace Analysis]]
- [[2020__EuroSys__Borg - the next generation]]
- [[2018__NSDI__Performance Analysis of Cloud Applications]]

## Visualization

- [[2023__TVCG__A Qualitative Interview Study of Distributed Tracing Visualisation - A Characterisation of Challenges and Opportunities]]

## Error/Exception

- [[2024__ASE__Do not neglect what’s on your hands - localizing software faults with exception trigger stream]]

## for [[LLM]]

- [[AI Infra Telemetry - MOC]]

## HPC

- [[2020__CLUSTER__MonSTer - An Out-of-the-Box Monitoring Tool for High Performance Computing Systems]]

## Network

- [[2024__HotNets__Automatic Configuration Repair|Liu+, HotNets2024]]

## Visualization

- [[2021__IV__μViz - Visualization of Microservices]]

## Others

- [[Observability Considarations in Chaos]]
- [[SREの中にObservabilityを位置づける]]
- [[Observability User Stories]]