- [How do you manage your Machine Learning Experiments? | by Hady Elsahar | Medium](https://hadyelsahar.medium.com/how-do-you-manage-your-machine-learning-experiments-ab87508348ac)
- [Thinking about experiment management (実験管理について考える) - Re:ゼロから始めるML生活](https://www.nogawanogawa.com/entry/experiment_management#%E5%90%84%E3%82%B5%E3%83%BC%E3%83%93%E3%82%B9%E3%82%92%E7%A2%BA%E8%AA%8D%E3%81%99%E3%82%8B)
## Knobs
- Code: model architecture, bug fixes, evaluation code, adding or fixing a hyperparameter.
- Datasets: changes to the dataset, preprocessing, manually fixing some examples.
- Debugging: those minor changes you always make to debug a certain model behaviour.
- Training: hyperparameter tuning, either manually or automatically with a hyperparameter optimization system.
- Meta: experiment name, tags, time, and a note on what you were doing at the time.
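
One way to keep these knobs together is to write them into a single record per run. The sketch below is only an illustration using the standard library: the `ExperimentRecord` class and all of its field names are hypothetical, not taken from either linked article or from any tracking tool. The fingerprint hashes only the knobs that change results (code, data, hyperparameters), so two runs with different names but identical knobs are easy to spot.

```python
import hashlib
import json
import time
from dataclasses import asdict, dataclass, field


@dataclass
class ExperimentRecord:
    # Hypothetical field names, chosen to mirror the knobs listed above.
    name: str                                   # Meta: experiment name
    tags: list = field(default_factory=list)    # Meta: tags
    note: str = ""                              # Meta: what you were doing
    code_version: str = "unknown"               # Code: e.g. a commit hash
    dataset_version: str = "unknown"            # Datasets: e.g. a file checksum
    hyperparams: dict = field(default_factory=dict)  # Training
    created_at: float = field(default_factory=time.time)  # Meta: time

    def fingerprint(self) -> str:
        """Hash only the knobs that affect results, not the meta fields."""
        payload = {
            "code_version": self.code_version,
            "dataset_version": self.dataset_version,
            "hyperparams": self.hyperparams,
        }
        blob = json.dumps(payload, sort_keys=True).encode("utf-8")
        return hashlib.sha256(blob).hexdigest()[:12]

    def to_json(self) -> str:
        """Serialize the full record, including the fingerprint."""
        data = asdict(self)
        data["fingerprint"] = self.fingerprint()
        return json.dumps(data, sort_keys=True, indent=2)
```

Usage: build one `ExperimentRecord` at the start of a run and save `to_json()` next to the model outputs. Comparing fingerprints then tells you whether two runs really differ in anything other than their name or note.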
## Watchlists
- Evaluation Metrics: accuracy, [[ROC]], [[BLEU]], [[ROUGE]], etc. Record not only which metric you use but also which implementation of that metric.
- Debugging and Intermediate Metrics: training and dev loss and accuracy, gradients per layer per epoch. System info such as hostname, GPU memory %, and GPU utilization %.
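
The watchlist items can be captured with something as small as an append-only JSON Lines file. This is a minimal sketch under stated assumptions: `MetricLogger`, `read_log`, and the `metric_impl` argument are hypothetical names invented for illustration, and GPU statistics are left out because collecting them needs a vendor-specific tool. Each row carries the hostname and the metric implementation alongside the values, so scores from different implementations are not compared by accident.

```python
import json
import socket
import time


class MetricLogger:
    """Append one JSON object per logging step to a .jsonl file (hypothetical helper)."""

    def __init__(self, path, metric_impl=None):
        self.path = path
        # Context repeated on every row: where the run happened and which
        # implementation produced each metric (e.g. {"bleu": "some-package 1.2"}).
        self.context = {
            "hostname": socket.gethostname(),
            "metric_impl": metric_impl or {},
        }

    def log(self, step, **metrics):
        """Write metrics such as train_loss=..., dev_acc=... for a given step."""
        row = {"step": step, "time": time.time(), **self.context, **metrics}
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(row) + "\n")
        return row


def read_log(path):
    """Load all logged rows back as a list of dicts."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Usage: call `logger.log(epoch, train_loss=..., dev_loss=..., dev_acc=...)` once per epoch, then use `read_log` to plot or compare runs. The tools listed below provide the same idea with a UI and remote storage.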
## Tools
- [[Neptune.ai]]
- [[MLFlow]]
- [[Comet.ml]]
- [[WandB]]