Performance Evaluation
CSI 5180 - Machine Learning for Bioinformatics
Important
I have now published the descriptions for both the presentation and the project on the course website. You can access them through the following links:
- Project Description (outline due: February 14, 2025)
- Assignment 1 (outline due: February 24, 2025)
Prepare
- Sokolova, M. & Lapalme, G. (2009). A systematic analysis of performance measures for classification tasks. Information Processing and Management, 45(4), 427–437.
- Whalen, S., Schreiber, J., Noble, W. S. & Pollard, K. S. (2022). Navigating the pitfalls of applying machine learning in genomics. Nature Reviews Genetics, 23(3), 169–181.
- Rafi, A. M., Kiyota, B., Yachie, N. & Boer, C. G. de. (2025). Detecting and avoiding homology-based data leakage in genome-trained sequence models.
- Walsh, I., Fishman, D., Garcia-Gasulla, D., Titma, T., Pollastri, G., ELIXIR Machine Learning Focus Group, Capriotti, E., Casadio, R., Capella-Gutierrez, S., Cirillo, D., Conte, A. D., Dimopoulos, A. C., Angel, V. D. D., Dopazo, J., Fariselli, P., Fernández, J. M., Huber, F., Kreshuk, A., Lenaerts, T., … Tosatto, S. C. E. (2021). DOME: recommendations for supervised machine learning validation in biology. Nature Methods, 18(10), 1122–1127.
- Olson, R. S., La Cava, W., Mustahsan, Z., Varik, A. & Moore, J. H. (2018). Data-driven advice for applying machine learning to bioinformatics problems. Pacific Symposium on Biocomputing, 23, 192–203.
Participate
Further Readings
Evaluation is a comprehensive and multifaceted topic. For an in-depth exploration, I recommend consulting the following textbook.
- Japkowicz, N. & Shah, M. (2011). Evaluating Learning Algorithms: A Classification Perspective. Cambridge University Press.
Please refer to the section titled Reproducibility for Machine Learning in Life Science on the course links webpage for further information.