Bioinformatics

5 credits

30.0 h + 30.0 h

Teacher(s)

Dupont Pierre; Ghislain Michel

Language

English

Prerequisites

Students are expected to master the following skills:

implement and test a solution in the form of a software prototype and/or a numerical model,
demonstrate a good understanding of the basic concepts and the methodology of programming,
make a relevant choice between several data representations and algorithms to process them,
analyse a problem to provide an IT solution and implement it in a high level programming language,
understand and know how to apply, in various situations, the basic concepts of probability and statistical inference,
use a scientific approach to extract reliable information from a data sample,

as covered within the courses LFSAB1401, LFSAB1402, LFSAB1105.
The following skills are also useful. They are briefly reviewed at the beginning of the LGBIO2010 course:

explain the functions that take place in the cells of a living organism,
describe the basic concepts of molecular genetics,
define the different classes of biomolecules and their links within the cell processes and structures,

as covered within the courses LGBIO1111 and LBIR1220A

Main themes

Bioinformatics refers to the set of concepts and tools required to analyse biological data and interpret the results. After a review of molecular biology basics and recent technologies for genome analysis, the course focuses on molecular biology databases (DNA and protein sequences), sequence comparison algorithms, identification of protein structural features (motifs), hidden Markov models, selection of transcriptional markers, inference of transcriptional regulatory networks, and prediction of evolutionary relationships.

Aims

With respect to the AA (learning outcomes) reference framework defined for the Master in Biomedical Engineering, the course contributes to the development, mastery and assessment of the following skills:

AA1.1, AA1.2, AA1.3
AA2.2, AA2.4
AA4.3
AA5.3

At the end of this course, students will be able:

- to master the basic concepts of molecular biology for appropriate use of bioinformatics tools,

- to design and develop tools or methods for database management, information extraction and data mining,

- to formulate informed decisions between the many computational methods that are available for solving biological questions,

- to carry out a collaborative project aimed at solving a bioinformatics problem and benefiting from the students' complementary education and expertise,

- to use the information available in major sequence databases (GenBank, UniProt) critically and with discernment,

- to master a software environment (EMBOSS, R, Bioconductor).

The contribution of this Teaching Unit to the development and command of the skills and learning outcomes of the programme(s) can be accessed at the end of this sheet, in the section entitled “Programmes/courses offering this Teaching Unit”.

Content

Overview of basic concepts in biochemistry and molecular biology
Major sequence and structure repositories and associated search tools
Sequence comparison
Sequence statistics
Pairwise sequence alignment
Database search for homology
Hidden Markov models
Multiple sequence alignment and profiles
Transcriptome profiling
Gene expression analysis
Gene regulatory networks
Molecular phylogeny

Teaching methods

The theoretical part consists of ex cathedra lectures in a classroom (30h). The training sessions (30h) consist of a set of problems to be solved (mini-projects) and tutorials. The mini-projects are based on the algorithms discussed in the lectures. Teams of up to two students work on statistical and algorithmic aspects to solve biological problems, using a programming language of their choice (typically among R, Matlab, Python, or Perl). The tutorials introduce students to the methodology followed for protein function prediction, using the EMBOSS open software suite. The importance of the choice of the method and the analysis parameters is illustrated for common biological cases.

Evaluation methods

The first part of the written examination, in closed-book format, focuses on algorithmic and statistical aspects and accounts for 50% of the final mark. The second part, in open-book format, presents a sequence to be analysed using the computer programs discussed in class and accounts for another 30%. The mini-projects account for the remaining 20%. Students who fail the examination are not allowed to retake the mini-projects.

Other information

Tutorials on protein function prediction will be held in the Cérès or Ulysse computer room (Faculty of Bioscience Engineering).

Online resources

Moodle
http://moodleucl.uclouvain.be/course/view.php?id=8915

Bibliography

The required course materials consist of all the documents (lecture slides, assignment statements for the practical sessions, supplements, ...) available on the course Moodle site.
The following books are recommended as complementary resources:
- Bioinformatics: Sequence and Genome Analysis, D. Mount, Cold Spring Harbor Laboratory Press, 2nd ed., 2004.
- Introduction to Computational Genomics: A Case Studies Approach, N. Cristianini and M. Hahn, Cambridge University Press, 2007.
- Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, R. Durbin et al., Cambridge University Press, 1998.
- Inferring Phylogenies, J. Felsenstein, Sinauer Associates, 2nd ed., 2003.

Faculty or entity

GBIO

Programmes/courses offering this Teaching Unit

Title of the programme

Acronym

Credits

Prerequisites

Aims

Master [120] in Data Science Engineering

DATE2M

Master [120] in Biomedical Engineering

GBIO2M

Master [120] in Mathematical Engineering

MAP2M

Master [120] in Computer Science and Engineering

INFO2M

Master [120] in Statistics: Biostatistics

BSTA2M

Master [120] in Computer Science

SINF2M

Master [120] in Data Science: Information Technology

DATI2M