[Systems Innovation - Microsoft Research](https://www.microsoft.com/en-us/research/group/systems-innovation/)

Systems Innovation is a collaboration between Microsoft 365, MSR, and Azure. It draws on a deep understanding of our workloads and combines algorithmic research with AI/ML techniques and hardware innovation. The aim is to further improve operational efficiency and reliability and to deliver best-in-class productivity experiences while meeting our sustainability goals.

At Microsoft 365 we operate one of the largest productivity clouds, and we need to keep pace with paradigm shifts such as the sharp growth in AI workloads, the push for sustainability, the need for self-managing cloud environments, and the challenges posed by the end of Moore's law and Dennard scaling. We therefore believe that expanding our investment in systems research and innovation is essential to our long-term success.

## Projects

[Systems Innovation: Projects - Microsoft Research](https://www.microsoft.com/en-us/research/group/systems-innovation/projects/)

- [FLASH: A Reliable Workflow Automation Agent](https://www.microsoft.com/en-us/research/project/flash-a-reliable-workflow-automation-agent/)
- [AIOps](https://www.microsoft.com/en-us/research/project/aiops/)
- [Green Cloud Computing](https://www.microsoft.com/en-us/research/project/green-cloud-computing/)

## FLASH Papers

- [[2024__MSResearch__FLASH - A Workflow Automation Agent for Diagnosing Recurring Incidents]]
- [[2024__arXiv__Exploring LLM-based Agents for Root Cause Analysis]]

## AIOps Papers

[[MicrosoftのAIOpsやインシデント管理に関する論文|Papers on Microsoft's AIOps and incident management]]

- [[2024__ICSE__UniLog - Automatic Logging via LLM and In-Context Learning]]
- [[2024__ICSE__Xpert - Empowering Incident Management with Query Recommendations via Large Language Models]]
- [[2023__arXiv__Assess and Summarize - Improve Outage Understanding with Large Language Models]]
- [[2023__ESEC-FSE__STEAM - Observability-Preserving Trace Sampling]]
- [[2023__ICSE__Incident-aware Duplicate Ticket Aggregation for Cloud Systems]]
- [[2023__ICSE-SEIP__Aegis - Attribution of Control Plane Change Impact across Layers and Components for Cloud Systems]]
- [[2023__ICSE-SEIP__CONAN - Diagnosing Batch Failures for Cloud Systems]]
- [[2023__ICSE__Did We Miss Something Important? Studying and Exploring Variable-Aware Log Abstraction]]
- [[2023__ICSE-SEIP__TraceArk - Towards Actionable Performance Anomaly Alerting for Online Service Systems]]
- [[2022__ESEC-FSE__An Empirical Study of Log Analysis at Microsoft]]
- [[2022__SIGKDD__NENYA - Cascade Reinforcement Learning for Cost-Aware Failure Mitigation at Microsoft 365]]
- [[2022__ESEC-FSE__SPINE - A Scalable Log Parser with Feedback Guidance]]
- [[2022__OSR__An Intelligent Framework for Timely, Accurate, and Comprehensive Cloud Incident Detection]]
- [[2022__SIGKDD__Multi-task Hierarchical Classification for Disk Failure Prediction in Online Service Systems]]
- [[2022__arXiv__UniParser - A Unified Log Parser for Heterogeneous Log Data]]
- [[2022__ICSE__DeepTraLog - Trace-Log Combined Microservice Anomaly Detection through Graph-based Deep Learning]]
- [[2021__KDD__HALO - Hierarchy-aware Fault Localization for Cloud Systems]]
- [[2021__ISSRE__How Long Will it Take to Mitigate this Incident for Online Service Systems?]]
- [[2021__ATC__Fighting the Fog of War - Automated Incident Detection for Cloud Systems]]