Machine Learning and Statistical Modeling for Multi-Omics

January 2022

Coordinated by the University of Turku and attended by members of the FindingPheno consortium together with close research collaborators. The target audience included advanced students and applied researchers who wished to develop their skills in multi-omics analysis.

Learning Objectives 

The workshop offers an overview of analytical tools for multi-omics studies in R. A particular focus is on multi-omics tools and techniques required to process microbial community data in combination with other omics. After the workshop, participants should be able to (i) preprocess and manipulate data, (ii) perform simple visualizations and statistical analyses, (iii) apply unsupervised and supervised machine learning, and (iv) produce robust and reproducible results.

Day 1 - Lectures

1. Welcome and Introduction - Leo Lahti, Associate Professor (UTU)
    Download ppt

2. Introduction to Metagenomics - Katariina Pärnänen, Postdoctoral Researcher (UTU)
    Download ppt

3. Completing the picture of microbiome study through the lens of Metabolomics -
    Pande Putu Erawijantari, Postdoctoral Researcher (UTU)
    Download ppt

4. Introduction to Multi-omics - Leo Lahti, Associate Professor (UTU)
    Download ppt

Day 1 - Practical Exercises

Led by: Tuomas Borman and Chouaib Benchraka, Research Assistants (UTU)

(a) Preparation: Instructions for installing R, RStudio and Rtools, and for installing and
      loading the required packages.

(b) Data import and structure: Data is stored in the MultiAssayExperiment (MAE) container,
      which provides an organized way to bind several different data structures together in a
      single object. Instructions cover how to import a practice data set in this format.

(c) Microbiome data exploration: Investigate how the taxonomic profiling data is organized in R,
      including aggregation and transformation.

(d) Visualization: Data can be graphed using miaViz (instructions) to give a visual overview.

(e) Beta diversity: This measures the dissimilarity between samples by quantifying differences in
      their overall taxonomic composition. This section gives methods for measuring
      and visualizing the beta diversity present within the data.
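
To make the beta diversity idea concrete, the following is a minimal sketch of one common dissimilarity measure, Bray-Curtis, computed between every pair of samples. It is not taken from the workshop materials, which are R-based; plain Python is used here only to show the arithmetic. The sample names, the counts and the helper functions `relative_abundance` and `bray_curtis` are hypothetical and chosen for illustration.

```python
from itertools import combinations


def relative_abundance(counts):
    """Scale raw taxon counts of one sample so that they sum to 1."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample contains no counts")
    return [c / total for c in counts]


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors.

    0 means identical composition, 1 means no shared taxa.
    """
    if len(u) != len(v):
        raise ValueError("samples must cover the same taxa")
    denominator = sum(u) + sum(v)
    if denominator == 0:
        raise ValueError("both samples are empty")
    return sum(abs(a - b) for a, b in zip(u, v)) / denominator


# Hypothetical toy data: counts of four taxa in three samples.
samples = {
    "S1": [10, 0, 30, 60],
    "S2": [12, 0, 28, 60],
    "S3": [0, 80, 20, 0],
}

# Transform to relative abundances, then compare every pair of samples.
profiles = {name: relative_abundance(c) for name, c in samples.items()}
distances = {
    (a, b): bray_curtis(profiles[a], profiles[b])
    for a, b in combinations(sorted(profiles), 2)
}

for pair, d in distances.items():
    print(pair, round(d, 3))
```

In this toy data, S1 and S2 have nearly the same composition and therefore a small dissimilarity, while S3 is dominated by a taxon absent from the other two and lies far from both. A pairwise table like `distances` is the usual input for ordination plots of beta diversity.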

Day 2 - Lectures 

1. Unsupervised and Supervised Machine Learning - Matti Ruuskanen, Postdoctoral
    Researcher (UTU)
    Download ppt

2. Introduction to Individual-based Modelling: Spatio-temporal models - Gergely
    Boza, Research Fellow (CER)
    Download ppt

3. Data Integration Methods - Leo Lahti, Associate Professor (UTU)
    Download ppt

Day 2 - Practical Exercises

Led by: Tuomas Borman, Matti Ruuskanen and Chouaib Benchraka (UTU)

(a) Cross-correlation analysis: This analyzes associations between variables, e.g. whether a
      higher abundance of a specific taxon is associated with higher levels of a biomolecule.

(b) Unsupervised machine learning: Unsupervised learning tries to find structure in unlabelled
      data. Examples given here are biclustering, a method which clusters rows and columns
      simultaneously, and MOFA, a factor analysis model that provides a general framework for
      integrating multi-omic data sets in an unsupervised fashion.

(c) Supervised machine learning: These models learn a function that predicts values of the
      dependent variable from labeled data. Examples given here use random forests and the caret
      package to train regression and classification models to predict butyrate concentration
      based on microbiome composition.
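
The cross-correlation idea in exercise (a) can be illustrated with a small sketch that computes a Spearman rank correlation between one taxon and one biomolecule across samples. This is an assumption about how such an association might be measured, not the workshop's own R code; the data values and the helper functions `ranks`, `pearson` and `spearman` are hypothetical. A rank-based measure is used here because abundance data are often skewed.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(ranks(x), ranks(y))


# Hypothetical toy data: one taxon and one metabolite in six samples.
taxon_abundance = [0.05, 0.10, 0.20, 0.15, 0.40, 0.30]
metabolite_level = [1.1, 1.9, 3.2, 2.5, 7.8, 4.0]

rho = spearman(taxon_abundance, metabolite_level)
print(round(rho, 3))
```

In a full cross-correlation analysis the same calculation would be repeated for every taxon and biomolecule pair, with a correction for multiple testing before any association is interpreted.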