Alexander Golubev - yuuk1's Digital Garden

# Alexander Golubev

Navigation: [[entities/_index]] | [[sources/_index]]

A researcher at [[Nebius AI]]. First author and corresponding author ([email protected]) of a paper on reinforcement learning training for multi-turn SWE agents (arXiv:2508.03501, 2025). Golubev developed a pipeline that combines rejection fine-tuning (RFT) with DAPO and raises the SWE-bench Verified Pass@1 of Qwen2.5-72B-Instruct from 11% to 39%. Around the same time, Golubev also co-authored work on building the SWE-rebench dataset (arXiv:2505.20411) and on guided search in non-serializable environments (arXiv:2505.13652), and is a central figure in Nebius AI's SWE agent research.

## Key works

- [[@2025__arXiv__Training Long-Context Multi-Turn SWE Agents with Reinforcement Learning]] (first author, corresponding author)
- SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents (arXiv:2505.20411, co-author)
- Guided Search Strategies in Non-Serializable Environments with Applications to Software Engineering Agents (arXiv:2505.13652, co-author)

## Sources

- (Source: [[@2025__arXiv__Training Long-Context Multi-Turn SWE Agents with Reinforcement Learning]])