Map of Contents page for SRE.
## Terminology and Overview
- [[notes/sre/SRE]]
- [[信頼性|Reliability]]
- [[SREの信頼性の定義|SRE's Definition of Reliability]]
- [[SLO]]
- [[根本原因 - SRE|Root Cause - SRE]]
- [[Ironies of Automation]]
- [[DevOps]]
- [[State of DevOps Report]]
- [[SRE in the Third Age - SREcon19EMEA]]
- [[ソフトウェア異常の用語とプロセス|Software Anomaly Terminology and Processes]]
- [[SRE vs Platform Engineering]]
- [[ウェブオペレーション|Web Operations]]
## Papers
- [[2023__OSDI__Defcon - Preventing Overload with Graceful Feature Degradation]]
- [[2023__SIGCSE__Teaching Site Reliability Engineering as a Computer Science Elective]]
- [[2020__NSDI__Meaningful Availability]]
- [[2019__HotOS__Nines are Not Enough Meaningful Metrics for Clouds]]
- [[2010__SoCC__Characterizing Cloud Computing Hardware Reliability]]
## SLI/SLO
- [[The Art of SLOs]]
- [[Adopting SLOs]]
- [[Quality as an SLI]]
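- [[Mackerel SLO API Quolity|Mackerel SLO API Quality]]
- [[SLOの品質の改善|Improving SLO Quality]]
- [[SLOの起源|Origins of SLOs]]
- [[SLOによる本番投入や切り戻し基準の設定|Setting Production Release and Rollback Criteria with SLOs]]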
- [[2020__NSDI__Meaningful Availability]]
- [[メルカリのSLO運用|SLO Operations at Mercari]]
- [[はてなSLOモデル|Hatena SLO Model]]
- [[SLOツール|SLO Tools]]
## [[Observability]]
- [[Telemetry - MOC]]
## Incident Management
- [[Incident Management - MOC]]
## [[Infrastructure as Code]]
## Conferences
### SREcon
- [[SREcon25 Americasまとめ|SREcon25 Americas Summary]]
- [[SREcon24 Americas]]
- [[SREcon23 EMEA Watch List]]
- [[SREcon23 Americas Watch List]]
- [[SRECon22 America Watch List]]
- [[SRECon21 Watch List]]
- [[SRECon20 America]]
### [[SRE NEXT]]
- [[SRE NEXT 2024]]
- [[SRE NEXT 2023]]
- [[SRE NEXT 2022]]
## Software Reliability Engineering
- [[Software Reliability Engineering]]
## Reliability Engineering
- [[信頼性工学|Reliability Engineering]]
- [[レジリエンス|Resilience]]
- [[レジリエンス工学|Resilience Engineering]]
- [[Resilience Engineering - Learning to Embrace Failure]]
## Books
- [[Site Reliability Engineering - Google]]
- [[📘Site Reliability Workbook]]
- [[Seeking SRE]]
- [[SREの格言|SRE Maxims]]
- [[Reliable Machine Learning - Applying SRE Principles to ML in Production]]
## Case Studies
- [[はてなのSRE - MOC|SRE at Hatena - MOC]]
- [[SRE in Nikkei]]
## [[AIOps]]
- [[AIOps - MOC]]
## Others
- [[Principles of Software Engineering, Part 1]]
- [[2023__CLOSER__Semi-Automated Smell Resolution in Kubernetes-Deployed Microservices]]
- [[2021 SRE Report - Catchpoint]]
- [[The Morning Paper on Operability]]
- [[Platform Engineering]]
- [[Awesome Load Management]]
- [[Systems Empirical Study Papers]]
- [[A Conceptual Framework for System Fault Tolerance]]