SREcon23 EMEA Watch List - yuuk1's Digital Garden

[SREcon23 EMEA Conference Program | USENIX](https://www.usenix.org/conference/srecon23emea/program)

## General

- Implementing SRE in a Telco with Reliability Enhancing Procedures

## Machine Learning

- Symptom-based Alerting for Machine Learning
- What I Learned from Monitoring More than 30 Machine Learning Use Cases
- Reliable Data for Large ML Models: Principles and Practices
- Overcoming Challenges in Serving Large Language Models
- Artificial Intelligence: How Much Will It Cost You?

## SLI/SLO

- 9 Things You Should Do When Starting to Use SLOs

## Observability

- eBPF Superpowers for SRE
- Tracing the Journey into Distributed Tracing
- When Your Open Source Monitoring Tool Turns to the Dark Side
- Should I Use OTel (collectors), or Is Prometheus Good Enough?
- Implementing Open-source Observability within Maersk
- Journey from Fluent Bit, Fluentd and Prometheus to OpenTelemetry Collector - Lessons Learned
- Continuous Profiling in the Cloud-Native era
- How to Use Prometheus's Native Histograms
- Should an SRE Care About FinOps? Using Observability to Enable Resources Optimization

## Incident Response

- When One Line Took Thousands of Websites Offline
- The World Blew Up but We’re All Okay: How We Managed a Massive-scale Incident at Datadog
- Embracing the Multi-Party Dilemma: Incident Response Across Company Boundaries
- The Incident Is The Way: Using Your Incidents to Win Reliability Investment
- That Time I Accidentally DDoS'd My Company
- Deconstructing an Abstraction to Reconstruct an Outage

## IaC

- Scaling Chef Emotionally

## Networking

- Over, Under, Around, and Through: A Detailed Comparison of QUIC and HTTP/3 Application Mapping vs. Protocol Encapsulation
- Deploying and Debugging HTTP/3
- Speedrun through Splicing Sockets with Sockmap
- Cloud, Kubernetes, and Service Networking - Taming the Turtles
- Monoceros: Faster and Predictable Services through In-pod Load-balancing
- Level 7 Egress Control in Kubernetes: Current Solutions, Future Standards

## Security

- Just the Cryptography You Need to Know for TLS

## Container

- Sandboxing in Linux with Zero Lines of Code
- Leveraging Unikernels and Kubernetes to (Transparently) Double Cloud Workload Performance

## Systems Performance

- Cache Me If You Can: How Grafana Labs Scaled Up Their Memcached 42x & Cut Costs Too

## Automation

- From Exceptional Maintenance to Automated Routine Operation: A Story of the Datacenter Switchover for Wikipedia

## Data management

- Quash: Patterns for Data Lifecycle Management

## Cost

- When Clouds Stop Raining Discounts: Surviving the Drought
- Should an SRE Care About FinOps? Using Observability to Enable Resources Optimization

## Management & Culture

- From Sysadmins to (almost) Flying Unicorns
- Succeeding as the Lone SRE in a Small Team
- New Grads Becoming New SREs: Catalyzing a “Circle of Life” in Ireland
- How to Make Your Automation a Better Team Player
- Dark Matter and Deep State: The Unseen Majority of Everything