Machine Learning for Life Sciences


 


The course provides an introduction to machine learning methods and workflows for life science research. It introduces the full end-to-end machine learning (ML) workflow, from data preprocessing and feature engineering to model training, evaluation, interpretation, and reproducible reporting, with a focus on the analysis of complex, high-dimensional biological data. Participants explore biological datasets using unsupervised methods such as dimensionality reduction and clustering, and build predictive models using supervised approaches including linear and tree-based models. Methods for multi-omics integration, including partial least squares (PLS), are introduced together with specialized modeling settings relevant to life sciences, such as mixed-effects models and survival analysis.

 

Next course

  • May 4th - 8th, 2026
  • Trippelrummet (E10:1307-9), Navet, BMC, Husargatan 3, 751 23 Uppsala

 

 Application
  • Applications will open soon. 
  • Fill in this form, and we'll notify you when we open the application.

 

 Important dates
  • Application deadline: March 29th, 2026
  • Confirmation to accepted students: April 1st, 2026
  • Course dates: May 4th - 8th, 2026

 

 Course content
  • Overview of the machine learning workflow
  • Dimensionality reduction methods such as PCA and UMAP
  • Unsupervised learning and clustering methods
  • Supervised learning models, including tree-based models
  • Partial least squares (PLS) for multi-omics integration
  • Mixed-effects models for analysis of repeated-measures and longitudinal data
  • Survival analysis methods for time-to-event data
  • Model training, evaluation and validation strategies
  • Model interpretation and explainable machine learning methods

 

 Learning outcomes
After completing the course, participants will be able to:

    • Explain the main components of the machine learning workflow and their role in life science research
    • Perform data preprocessing and exploratory analysis of high-dimensional biological datasets
    • Apply unsupervised learning methods to discover structure and generate biological hypotheses
    • Train, evaluate, and compare supervised learning models commonly used in life sciences
    • Apply specialized modeling approaches, including mixed-effects models for repeated measures and survival analysis for time-to-event data
    • Assess model performance using appropriate evaluation metrics and validation strategies
    • Interpret and communicate model results using explainable machine learning techniques
    • Apply basic principles of reproducible and FAIR machine learning workflows
    • Collaborate in interdisciplinary teams to design, implement, and present an ML-based data analysis

 

 Schedule

A preliminary course schedule can be found here.

 

Course format

The course is delivered through a combination of online and on-site teaching activities. It includes two preparatory online sessions (ca. 3h) held prior to the on-site module, an intensive five-day on-site meeting in Uppsala, and a concluding online session for project presentations and discussion. Teaching formats include lectures, live coding sessions, hands-on practical exercises, group discussions, and group-based mini-project work. Participants are expected to bring their own laptops for hands-on sessions.

 

 Assessment

The examination consists of active participation in course activities, completion of a group-based mini-project, and a presentation of the mini-project.

 

 Prerequisites
  • Basic programming skills in R or Python, including working with data frames and running scripts
  • Prior exposure to basic statistical concepts (e.g. descriptive statistics, linear regression)
  • Familiarity with data analysis environments such as RStudio or Jupyter Notebooks

No prior experience with machine learning is required.

 

More on R and Python skills: 

  • Basic syntax and arithmetic (using the language as a calculator) (R: 1 + 2; Python: 1 + 2)
  • Core data structures: vectors/arrays, matrices, and data frames, including subsetting and basic matrix operations (R: vectors, matrices, data frames; Python: NumPy arrays, pandas DataFrames)
  • Reading data and managing files (R: read_csv(), relative paths; Python: pandas.read_csv(), relative paths)
  • Inspecting and summarising data (R: head(), tail(), sum(), min(), max(); Python: head(), tail(), sum(), min(), max())
  • Handling missing values (R: NA, na.rm = TRUE; Python: NaN, isna())
  • Writing simple control flow and functions (R: if/else, loops, functions; Python: if/else, loops, functions)
  • Finding and using documentation (R: help(), ?; Python: help(), docstrings)
  • Installing and loading/importing external packages (R: install.packages(), library(); Python: pip / conda, import)
  • Data transformation and manipulation (filtering rows, selecting columns, creating new variables) (R: tidyverse; Python: pandas)
  • Creating and interpreting basic plots, including simple customisation (labels, titles) (R: plot(), ggplot2; Python: matplotlib, seaborn)
  • Basic familiarity with reproducible documents (R: R Markdown / Quarto; Python: Quarto / Jupyter)
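
As a rough self-check for the Python side of this list, the sketch below touches several of the skills above: reading a CSV, inspecting it, handling missing values, filtering rows, creating a new variable, and writing a small function. It is only an illustration and not part of the course material. The toy data, the column names, and the helper names `load_data` and `mean_by_group` are all made up for this example, and the CSV is read from an in-memory string so the snippet runs without any file on disk.

```python
import io

import pandas as pd

# Hypothetical toy data standing in for a CSV file on disk.
# The empty field for sample s2 is read by pandas as NaN (missing).
CSV_TEXT = """sample,group,expression
s1,control,2.5
s2,control,
s3,treated,4.0
s4,treated,5.5
"""


def load_data(source):
    """Read a CSV file (a path or a file-like object) into a DataFrame."""
    return pd.read_csv(source)


def mean_by_group(df, value_col, group_col):
    """Return the mean of value_col per group; NaN values are skipped."""
    return df.groupby(group_col)[value_col].mean()


# Reading data: with a real file this would be load_data("data/counts.csv")
df = load_data(io.StringIO(CSV_TEXT))

# Inspecting data: first rows of the table
print(df.head())

# Handling missing values: count NaN entries in one column
n_missing = int(df["expression"].isna().sum())

# Data manipulation: filter rows and create a new variable
treated = df[df["group"] == "treated"]
df["high"] = df["expression"] > 3.0

# Functions: summarise per group with the helper defined above
group_means = mean_by_group(df, "expression", "group")
print(group_means)
```

If each line here reads naturally to you, you likely meet the Python prerequisites; the R equivalents would use the tidyverse functions named in the list.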

 

 Fees
  • DDLS RS students: free
  • Academic participants: 3000 SEK
  • Non-academic participants: 15 000 SEK

The fee includes lunches and coffee.

Please note that NBIS cannot invoice individuals.

 

 Travel info

For travel information and hotel bookings, see the Travel Information page.

 

 Course credits
  • 3 credits

 

 Teaching team
  • Olga Dethlefsen «olga.dethlefsen@scilifelab.se»
  • Payam Emami «payam.emami@scilifelab.se»
  • Eva Freyhult «eva.freyhult@nbis.se»
  • Miguel Redondo «miguel.angel.redondo@nbis.se»
  • Julie Lorent «julie.lorent@nbis.se»
  • Mun-Gwan Hong «mungwan.hong@nbis.se»

 

 Contact us

For questions regarding the course, please contact the course leaders at edu.ml-biostats@nbis.se or olga.dethlefsen@scilifelab.se.

 

This course content is offered under a CC Attribution-ShareAlike license. Content in this course can be considered under this license unless otherwise noted.