Tools for reproducible research

Course overview

GitHub repository

20 - 24 November, 2023

SciLifeLab Stockholm, Sweden

One of the key principles of proper scientific procedure is the act of repeating an experiment or analysis and being able to reach similar conclusions. Published research based on computational analysis (e.g. bioinformatics or computational biology) has often suffered from incomplete method descriptions (e.g. missing lists of software versions); unavailable raw data; and incomplete, undocumented and/or unavailable code. This essentially prevents any possibility of reproducing the results of such studies. The term “reproducible research” has been used to describe the idea that a scientific publication should be distributed along with all the raw data and metadata used in the study, all the code and/or computational notebooks needed to produce results from the raw data, and the computational environment or a complete description thereof.

Reproducible research not only leads to proper scientific conduct, but also enables other researchers to build upon previous work. Most importantly, the person who organizes their work with reproducibility in mind will quickly realize the immediate personal benefits: an organized and structured way of working. The person that most often has to reproduce your own analysis is your future self!

Course content and learning outcomes

The following topics and tools are covered in the course:

  • Data management
  • Project organisation
  • Git
  • Conda
  • Snakemake
  • Nextflow
  • Quarto
  • Jupyter
  • Docker
  • Singularity

At the end of the course, students should be able to:

  • Use good practices for data analysis and management
  • Clearly organise their bioinformatic projects
  • Use the version control system Git to track and collaborate on code
  • Use the package and environment manager Conda
  • Use and develop workflows with Snakemake and Nextflow
  • Use Quarto and Jupyter Notebooks to document and generate automated reports for their analyses
  • Use Docker and Singularity to distribute containerized computational environments

Application

This is an NBIS / Elixir course. The course is open for PhD students, postdocs, group leaders and core facility staff. International applications are welcome, but we will give approximately half of the participant slots to applicants from Swedish universities, due to the national role NBIS plays in Sweden.

The only entry requirements for this course are a basic knowledge of Unix systems (i.e. being able to work on the command line) and at least a basic knowledge of either R or Python.

Due to limited space, the course can accommodate a maximum of 20 participants. If we receive more applications, participants will be selected based on several criteria, including fulfilment of the entry requirements, motivation to attend the course, and gender and geographical balance.

Please note that NBIS training events do not provide any formal university credits. The training content is estimated to correspond to a certain number of credits, but the estimated credits are only guidelines. If formal credits are crucial, the student needs to confer with their home department before submitting a course application to establish whether the course is valid for formal credits.

By accepting a place in the course, you agree to follow the NBIS Training Code of Conduct.

Course fee

A course fee of 3000 SEK will be invoiced to accepted participants. This includes lunches, coffee and snacks, and a course dinner. Please note that NBIS cannot invoice individuals.

Schedule

You can find the course schedule on this page.

Location

This course round is given on-site in Sweden at SciLifeLab Stockholm. Travel directions are as follows:

  1. Transport yourself to Stockholm (see e.g. SJ’s website for train travel within Sweden).
  2. Go to SciLifeLab in Solna (Tomtebodavägen 23A, 171 65 Solna). The closest bus stop is called Karolinska institutet Biomedicum (search for public transport options here).
  3. Enter the SciLifeLab/Karolinska Institutet Science Park building. After entering, turn immediately left and pass through the glass door to find the rooms Air and Fire, where the course will take place. (There is a reception where you can ask for help if you cannot find the rooms.)

Course material

The pre-course setup page lists all the information you need before the course starts. The most important part is the installation and setup of all the tools used in the course, so make sure you’ve gone through it all before the course starts. You can find the tutorials themselves (i.e. the content we will go through during the course) on the modules page.

All of the lectures used in this course are available on the lecture page, while the source code used to create the lectures is available under the lectures/ directory on GitHub.

Teachers

  • John Sundh (course responsible)
  • Erik Fasterius (course responsible)
  • Verena Kutschera (teacher)
  • Tomas Larsson (teacher)
  • Estelle Proux-Wéra (teacher)
  • Lokeshwaran Manoharan (teacher)

Contact

To contact us, please send an email to the following address: edu.trr@nbis.se.

This course content is offered under a CC attribution share alike license. Content in this course can be considered under this license unless otherwise noted.