Pre-course materials

The pre-course setup is given in: https://github.com/UPPMAX/programming_formalism/blob/main/setup.md

It is reposted here, except for the last Markdown part, which does not render in the static HTML version on Canvas.

 

Setup

Parts taken from https://nbis-reproducible-research.readthedocs.io/en/course_2104/setup/ and https://coderefinery.github.io/installation/

Shell and Git

Setup for Mac / Linux users

  • We will use a terminal to some extent.

  • Use whichever terminal you prefer, the built-in one or another.

  • Chances are that you already have git installed on your computer. You can check by running e.g. git --version.

  • If you don't have git, install it following the instructions here.

  • If you have a very old version of git you might want to update to a later version.
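
The check described in the list above can also be scripted. The following is a minimal sketch; the helper name `have_cmd` is made up for illustration and is not part of git.

```shell
# Hypothetical helper: succeed if the named command is available on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd git; then
    git --version
else
    echo "git is not installed; follow the installation instructions first"
fi
```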

Setup for Windows users

There are several different ways to run the course material on a Windows computer. None of them is perhaps optimal, and the material itself has not been adapted specifically for Windows. Nevertheless, in principle everything should be possible to run. A few ways you could set this up:

Running in the Linux Bash Shell on Windows 10

This will give you access to a full command-line bash shell based on Linux on your Windows 10 PC. For the difference between the Linux Bash Shell and the PowerShell on Windows 10, see e.g. this article.

Install Bash on Windows 10 (WSL), following the instructions at e.g. one of these resources:

Configure git

Follow these instructions: https://nbis-reproducible-research.readthedocs.io/en/course_2104/setup/#installing-git

GitHub

Sign up for a GitHub account: https://coderefinery.github.io/installation/github/

Git/GitHub connection through SSH keys (this may take a while to get working, but is worth it)

https://coderefinery.github.io/installation/ssh/
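
Before generating a new key pair, it can be worth checking whether one already exists. The following is a minimal sketch, assuming keys live in the default ~/.ssh directory; the function name `has_public_key` is hypothetical.

```shell
# Hypothetical helper: succeed if the given directory (default: ~/.ssh)
# holds at least one public key file.
has_public_key() {
    dir="${1:-$HOME/.ssh}"
    for f in "$dir"/*.pub; do
        [ -f "$f" ] && return 0
    done
    return 1
}

if has_public_key; then
    echo "Found an existing SSH public key in ~/.ssh"
else
    echo "No SSH public key found; generate one as described in the guide"
fi
```

Once the key has been added to your GitHub account as described in the guide, running `ssh -T git@github.com` should greet you by your GitHub username.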

Miniconda3

PlantUML

  • We will use the tool PlantUML to render UML code to graphical diagrams and flowcharts.
  • If you want PlantUML to render directly from a file on GitHub, please install a PlantUML viewer extension in your web browser.
    • Firefox: PlantUML viewer
    • Chrome: Pegmatite
    • Microsoft Edge: Markdown Diagrams
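
To get a feel for what PlantUML code looks like, the sketch below writes a tiny sequence diagram to a file. The file name `hello.puml` and the participants Alice and Bob are purely illustrative and are not taken from the course material.

```shell
# Write a minimal PlantUML sequence diagram to hello.puml (illustrative file name).
cat > hello.puml <<'EOF'
@startuml
Alice -> Bob: request
Bob --> Alice: response
@enduml
EOF

# Show the file that was written.
cat hello.puml
```

Everything between `@startuml` and `@enduml` is the diagram description; rendering it to an image requires either one of the browser extensions or a local PlantUML installation.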

Additional reading materials for days 3 and 4:

https://github.com/NBISweden/development-guidelines

Programming with Python
The course will be taught using both R and Python depending on the tools available. While you will be able to follow all lectures and exercises conceptually, it is easier if you get acquainted with programming in both languages:

Conda instructions
We will be using conda environments for each session to ensure reproducibility. Follow the installation instructions below:


In this workshop you will use conda environments to run the exercises. Conda environments give all students the same computing environment, i.e. the same package versions. This makes the material reproducible without the need to re-install or change your locally installed versions. See a graphical example below:

[Figure: conda_illustration.png]

A Conda environment is a self-contained directory that you can use to reproduce all your results.

Briefly, you need to:

  1. Install Conda and download the .yml file
  2. Create and activate the environment
  3. Deactivate the environment after running your analyses

Refer to this conda cheat sheet.

 

 

Install Conda and download the environment file


You should start by installing Conda. We suggest installing Miniconda3 (NOT Anaconda). After installing Conda, download the conda environment file and put it in your working folder.

On MacOSX

Download the MacOSX SDK compiler package: MacOSX10.9.sdk.tar.xz

Then extract it and copy it to the /opt/ folder. Conda will use it in that location by default.

#extract (tar detects the xz compression automatically)
cd ~/Downloads
tar -xf MacOSX10.9.sdk.tar.xz

#copy
sudo cp -r ~/Downloads/MacOSX10.9.sdk /opt/

#give everyone read and directory-traversal permissions
sudo chmod -R a+rX /opt/MacOSX10.9.sdk

Then you can install conda:

curl -o Miniconda3-latest-MacOSX-x86_64.sh https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
sh Miniconda3-latest-MacOSX-x86_64.sh

Follow the on-screen instructions, replying yes when necessary. Restart your terminal window to apply the changes.

On Ubuntu

Inside Ubuntu, open TERMINAL and type the commands below to install the X-server graphical packages that will be used to launch RStudio. See https://docs.anaconda.com/anaconda/install/linux/

sudo apt-get update
sudo apt-get install libgl1-mesa-glx libegl1-mesa libxrandr2 libxss1 libxcursor1 libxcomposite1 libasound2 libxi6 libxtst6

Then download Miniconda3 and install it.

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
sh Miniconda3-latest-Linux-x86_64.sh

Follow the on-screen instructions, replying yes when necessary. Restart your terminal window to apply the changes.
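
The download commands above hard-code the x86_64 installer. As a hedged sketch, the helper below builds an installer file name from the output of uname. `installer_name` is a hypothetical function, and the naming pattern is inferred from the two file names used above, so confirm that the resulting file actually exists on repo.anaconda.com before downloading it.

```shell
# Hypothetical helper: build a Miniconda installer file name from an OS name
# and a CPU architecture, as reported by `uname -s` and `uname -m`.
installer_name() {
    case "$1" in
        Darwin) platform="MacOSX" ;;
        Linux)  platform="Linux" ;;
        *)      echo "unsupported system: $1" >&2; return 1 ;;
    esac
    echo "Miniconda3-latest-${platform}-$2.sh"
}

# Print the name matching the machine this runs on.
installer_name "$(uname -s)" "$(uname -m)" || true
```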

On Windows 10

Several packages are not available for Windows. However, on Windows 10 we can run an Ubuntu subsystem to overcome this issue. Please follow the instructions in the Alternative option on Windows section below to install it.

 

Create a Conda environment from file


On the labs page, we provide links to the environment file for each hands-on session. To download an environment file such as env-merged_nets.yaml, use the following command in the terminal:

#Ubuntu
wget https://raw.githubusercontent.com/NBISweden/workshop_omics_integration/master/environments/env-merged_nets_linux.yaml
#MacOSX
curl -o env-merged_nets.yaml https://raw.githubusercontent.com/NBISweden/workshop_omics_integration/master/environments/env-merged_nets.yaml

After this, you should have a file named env-merged_nets.yaml (env-merged_nets_linux.yaml on Ubuntu) in your directory (it does not matter where; you can save it in the Downloads folder, for example). Next, type:

#Ubuntu
conda env create -n envnets -f env-merged_nets_linux.yaml
#MacOSX
conda env create -n envnets -f env-merged_nets.yaml

Several messages will show up on your screen and will tell you about the installation process. This may take a few minutes depending on how many packages are to be installed.

##Collecting package metadata: done
##Solving environment: done
##
##Downloading and Extracting Packages
##libcblas-3.8.0       | 6 KB      | ############################################################################# | 100%
##liblapack-3.8.0      | 6 KB      | ############################################################################# | 100%
##...
##Preparing transaction: done
##Verifying transaction: done
##Executing transaction: done
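
Once the messages finish, you may want to confirm that the environment exists. The following is a minimal sketch, assuming `conda env list` prints one environment per line with the name in the first column; `env_in_list` is a hypothetical helper name.

```shell
# Hypothetical helper: read `conda env list` output on stdin and succeed
# if an environment with the given name appears in the first column.
env_in_list() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

if command -v conda >/dev/null 2>&1 && conda env list | env_in_list envnets; then
    echo "envnets is available"
else
    echo "envnets not found (or conda is not on PATH)"
fi
```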

 

Activate the environment


Once the environment is created, we need to activate it in order to use the software and packages inside it. To activate an environment, type:

conda activate envnets

From this point on you can run anything installed in the environment. For instance, you can launch RStudio directly by typing rstudio, or Jupyter by typing jupyter-notebook. Add the & symbol at the end of the command so that it runs in the background and you can keep using the command line at the same time if needed. You can open other files from RStudio later as well.

rstudio ./labs/compiled/my_script.Rmd &
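
If you are unsure which environment is active, note that conda records its name in the CONDA_DEFAULT_ENV environment variable while an environment is activated. The following is a minimal sketch; the function name `active_env` is hypothetical.

```shell
# Hypothetical helper: print the active conda environment, or "none".
active_env() {
    echo "${CONDA_DEFAULT_ENV:-none}"
}

echo "Active conda environment: $(active_env)"
```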

 

Deactivate the environment


After you have run all your analyses, you can deactivate the environment by typing:

conda deactivate

Remember to occasionally clean the downloaded conda packages:

conda clean --all

 

 

Alternative option on Windows (WSL)


Unfortunately, not all packages available on conda are compatible with Windows machines. The good news is that this changed with Windows 10, which offers native Linux support via the Windows Subsystem for Linux (WSL2). This allows you to run Linux/bash commands from within Windows without the need for a virtual machine or a dual-boot setup (i.e. having two operating systems). However, WSL does not offer complete support for graphical interfaces (such as RStudio in our case), so we need some additional steps to make that happen.

  1. On Windows 10, install WSL if you don't have it. Follow the instructions here: https://docs.microsoft.com/en-us/windows/wsl/install-win10

  2. Once you have that installed, you can download and install MobaXterm (an enhanced terminal with graphical capabilities): https://mobaxterm.mobatek.net

  3. Inside MobaXterm, you will probably see that your WSL is already listed in the left panel as an available connection. Just double-click it and you will be accessing it via MobaXterm. If you don't see it there, close MobaXterm and go to the WSL terminal, because WSL is probably not allowing SSH connections. You can follow this link for instructions on how to enable them. You need to complete the steps up to Start or restart the SSH service; the further steps are optional, but might be useful.

  4. Inside MobaXterm, download Conda with the command:

    wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
    
  5. Inside MobaXterm, type the command below to install Conda. Follow the instructions for the installation there.

    sh Miniconda3-latest-Linux-x86_64.sh
    
  6. Inside MobaXterm, type the commands below to install the X-server graphical packages that will be used to launch RStudio. See https://docs.anaconda.com/anaconda/install/linux/

    sudo apt-get update
    sudo apt-get upgrade
    sudo apt-get install libgl1-mesa-glx libegl1-mesa libxrandr2 libxss1 libxcursor1 libxcomposite1 libasound2 libxi6 libxtst6
    
  7. Close and reopen the terminal to apply the Conda updates. Then you can create a course folder, download the environment file and create the environment:

    mkdir /mnt/c/Users/[your_username]/Desktop/course
    cd /mnt/c/Users/[your_username]/Desktop/course
    wget https://raw.githubusercontent.com/NBISweden/workshop_omics_integration/master/environments/env-merged_nets_linux.yaml
    conda env create -n envnets -f env-merged_nets_linux.yaml
    
  8. You can then follow the instructions above to activate/deactivate the environment and launch RStudio or Jupyter:

    conda activate envnets
    rstudio &
    #for jupyter:
    jupyter-notebook
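
Before launching RStudio it can help to confirm that you are inside WSL and that a graphical display is configured. The following is a minimal sketch, assuming the WSL kernel identifies itself with the word "microsoft" in /proc/version and that the X server (MobaXterm here) sets the DISPLAY variable; `is_wsl` and `has_display` are hypothetical helper names.

```shell
# Hypothetical helper: succeed if the given version file (default:
# /proc/version) mentions Microsoft, as WSL kernels typically do.
is_wsl() {
    grep -qi microsoft "${1:-/proc/version}" 2>/dev/null
}

# Hypothetical helper: succeed if an X display is configured.
has_display() {
    [ -n "${DISPLAY:-}" ]
}

if is_wsl; then echo "Running inside WSL"; else echo "Not running inside WSL"; fi
if has_display; then echo "DISPLAY is set to $DISPLAY"; else echo "DISPLAY is not set"; fi
```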
    

 

 

Alternative option (VIRTUALBOX)


If for any reason the installations are not working as they should on your computer, you can try to create a virtual machine to run UBUNTU and install everything there.

  1. Download and install VIRTUALBOX on your machine: https://www.virtualbox.org

  2. Download the ISO disk image of UBUNTU: https://ubuntu.com/download/desktop

  3. On VIRTUALBOX, click on Settings (yellow engine) > General > Advanced and make sure that both settings Shared Clipboard and Drag'n'Drop are set to Bidirectional.

  4. Completely close VIRTUALBOX and start it again to apply changes.

  5. On VIRTUALBOX, create a machine called Ubuntu and add the image above.

  6. Set the memory to the maximum allowed in the GREEN bar.
  7. Set the hard disk to be dynamically allocated.
  8. All other settings can be left at their defaults.

  9. Proceed with the Ubuntu installation as recommended. You can choose "Minimal Installation" and disable downloading updates during installation.

  1. Inside Ubuntu, open TERMINAL and type the commands below to install the X-server graphical packages that will be used to launch RStudio. See https://docs.anaconda.com/anaconda/install/linux/

    sudo apt-get update
    sudo apt-get install libgl1-mesa-glx libegl1-mesa libxrandr2 libxss1 libxcursor1 libxcomposite1 libasound2 libxi6 libxtst6
    
  2. Inside UBUNTU, open the TERMINAL and download Conda into the Downloads folder:

    cd ~/Downloads
    wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
    
  3. Still in the TERMINAL, type the commands below to install Conda. Follow the instructions for the installation there.

    cd ~/Downloads
    sh Miniconda3-latest-Linux-x86_64.sh
    
  4. Close and reopen the terminal to apply the Conda updates. Then you can create a course folder, download the environment file and create the environment.

    mkdir ~/Desktop/course
    cd ~/Desktop/course
    wget https://raw.githubusercontent.com/NBISweden/workshop_omics_integration/master/environments/env-merged_nets_linux.yaml
    conda env create -n envnets -f env-merged_nets_linux.yaml
    
  5. You can then follow the instructions above to activate/deactivate the environment.

    conda activate envnets
    rstudio &