Bioinformatics: DNA and protein sequences

4.0 credits

30.0 h + 15.0 h

Teacher(s)

Ghislain Michel (coordinator); Mahillon Jacques

Language

French

Online resources

iCampus

Prerequisites

Introductory courses in biochemistry and molecular biology

Main themes

Bioinformatics refers to the set of concepts and tools required to analyse biological data and interpret the results. This introductory course focuses on molecular biology databases (DNA and protein sequences), the algorithmic bases of sequence analysis programs, and alignment score statistics. The course highlights the many pitfalls of data interpretation through a critical appraisal of the software used for sequence analysis.

Aims

a. Contribution of this activity to the programme's learning outcomes (LO) framework

Consistency of the course learning outcomes with those of the programme

1.1, 1.2, 1.3

3.1, 3.2, 3.4, 3.5, 3.6

b. Specific formulation of the programme learning outcomes for this activity (maximum 10)

At the end of this course, students will be able to perform a comprehensive sequence analysis using appropriate computational tools and internet resources. This ability requires:

- The understanding of the algorithmic bases of the computational programs

- The description of the various molecular databases, with emphasis on the strengths and weaknesses of their data structures and search tools

- The discussion of the prediction results and, where appropriate, the proposal of a more suitable analysis method

- A strategy for predicting protein function

The contribution of this Teaching Unit to the development and command of the skills and learning outcomes of the programme(s) can be accessed at the end of this sheet, in the section entitled “Programmes/courses offering this Teaching Unit”.

Evaluation methods

Written open-book examination, including theory questions and a sequence to be analysed using the computer programs discussed in class. The assessment criteria are:

- the understanding of the algorithmic bases of the sequence analysis programs

- the use of the most appropriate program and database

- the explanation of the statistical bases of prediction scores

Biological background knowledge, conciseness and clarity of the analysis are also required.

Teaching methods

The theoretical part consists of ex cathedra lectures (30 h). The practical sessions (15 h) consist of a set of problems to be solved individually or in pairs, using free sequence analysis programs.

Content

1. Overview of basic concepts in biochemistry and molecular biology

2. Sequence and 3-D structure databases, protein motif and family databases

3. Database search tools

4. Sequence comparison: dot plot, global and local alignment based on dynamic programming and score matrices

5. Database searching for similar sequences (word-matching method), score statistics

6. Multiple sequence alignment, motif discovery (patterns, profiles, hidden Markov models)

7. Analysis of protein hydropathy and prediction of RNA secondary structure

8. Phylogenetic inference using phenetic and cladistic methods
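
As an illustration of item 4, global alignment by dynamic programming can be sketched as follows. This is a minimal sketch, not course material: the function name `needleman_wunsch` and the match, mismatch and gap scores are assumptions chosen for illustration, and the simple match/mismatch rule stands in for a real score matrix.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of sequences a and b with a linear gap penalty.

    The scoring values are illustrative assumptions, not a published matrix.
    Returns (best score, aligned a, aligned b).
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    # Fill the table: each cell extends a diagonal, vertical or horizontal move.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + s,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,    # gap in b
                score[i][j - 1] + gap,    # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + s:
                top.append(a[i - 1])
                bottom.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    best, row_a, row_b = needleman_wunsch("ACGT", "AGT")
    print(best)
    print(row_a)
    print(row_b)
```

Local alignment follows the same table-filling scheme, except that cell scores are not allowed to drop below zero and the traceback starts from the highest-scoring cell rather than the corner.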

Bibliography

The syllabus, slides and a set of problems will be available via iCampus. The course is based on the reference book 'Bioinformatics: Sequence and Genome Analysis' by D.W. Mount (CSHL Press). However, purchasing this book is not required.

Other information

This course can be given in English.

Faculty or entity

AGRO

Programmes/courses offering this Teaching Unit

Program title

Acronym

Credits

Prerequisites

Aims

Master [120] in Statistics: Biostatistics

bsta2m

Master [120] in Biochemistry and Molecular and Cell Biology

bbmc2m

Master [120] in Chemistry and Bioindustries

birc2m

Master [120] in Agricultural Bioengineering

bira2m

Master [60] in Biology

biol2m1