STAT 892
Computational Molecular Biology
Instructor: Istvan (Steve) Ladunga, Ph.D.
Course Information
Credit hours: 3
1 hour 15 minutes lecture, 1 hour 50 minutes computer laboratory.
Term Offered: I (Fall 2008).
Rooms:
Lecture: Mondays, 3-4:15 pm. N172 Beadle
Computer Laboratory: Wednesdays, 2:30-5:20 pm, Rm. 14, Center for Business Administration Bldg.
Prerequisites
Undergraduates need permission. Any of the following three courses:
- BIOS 101 General Biology OR
- BIOS 206 General Genetics OR
- BIOC 321 Elements of Biochemistry
- STAT 218 Introduction to Statistics.
- BIOC 432/832 Gene Expression and Replication, and
- BIOS 477/877 Bioinformatics and Molecular Evolution
Scope
Computational molecular biology helps us do biology on computers. In ten years the term itself will make no sense, since the trend is obvious: almost all biology will be computational. Among many other things, computational molecular biology enables us to:
- Think and argue in terms of probability and statistical significance
- Search over enormous databases in minutes or seconds
- Compare whole genomes and display parts of them in a genome browser
- Identify genes that are up- or downregulated in disease or stress conditions, as indicated by gene expression microarray experiments.
- Understand the complex networks of transcriptional regulation.
- Discover chromatin structure
- Compare your experiments to those of others
- Understand the signal-to-noise ratio of your experiment and estimate the statistical significance of the observations
- Perform (semi)automatic parsing of the literature over millions of publications.
- Use ontology terms, pathways, and gene set enrichment analysis.
- Design short interfering RNA molecules
Who would benefit from taking this course?
This course is designed primarily for biology, agronomy and statistics students. However, computer science, mathematics, physics and chemistry majors may also find it beneficial. It is intended to help both computational and experimental biologists understand the principles of analyzing biological data, building models and testing hypotheses on computers. The course does not depend on any graduate course.
Necessary background
- Some knowledge of molecular biology is necessary.
- You should be able to navigate the Internet and use Microsoft Office, including Word, PowerPoint and Excel.
- No programming or database skills are required.
Lecture Topics/Course Outline and Assessment Plan
I believe that one of the most critical but somewhat overlooked skills is reading, understanding and presenting scientific publications that are reasonably challenging and matched to your background (e.g., biology, statistics, or computer science). Each student will be assigned a book chapter and/or a journal article to present during the computer labs, using PowerPoint or similar presentation software.
Homework consists of both written and web-based analysis assignments. Your final grade will be based on the following scale:
| Minimum Percent Score | Grade |
| 97 | A+ |
| 93 | A |
| 90 | A- |
| 87 | B+ |
| 83 | B |
| 80 | B- |
| 77 | C+ |
| 73 | C |
| 70 | C- |
| 0 | F |
Your percent score will be composed of:
- 35 percent based on the final project
- 25 percent on publication presentations
- 25 percent on test assignments
- 15 percent on class participation
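To see how the weights and the grade scale combine, here is a minimal Python sketch. It assumes that each component is scored on a 0-100 scale and that no rounding is applied before the scale is looked up; neither detail is stated in this syllabus. The names `WEIGHTS`, `GRADE_SCALE`, `final_percent` and `letter_grade`, as well as the example scores, are hypothetical and used for illustration only.

```python
# Component weights from the syllabus (percent of the final score).
WEIGHTS = {
    "final_project": 35,
    "publication_presentations": 25,
    "test_assignments": 25,
    "class_participation": 15,
}

# Minimum percent score for each letter grade, highest first.
GRADE_SCALE = [
    (97, "A+"), (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (0, "F"),
]


def final_percent(scores):
    """Weighted average of component scores (assumed to be on a 0-100 scale)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(percent):
    """Return the first grade whose minimum percent score is met."""
    for minimum, grade in GRADE_SCALE:
        if percent >= minimum:
            return grade
    raise ValueError("percent score must be non-negative")


# Hypothetical example scores, not real student data.
example = {
    "final_project": 90,
    "publication_presentations": 85,
    "test_assignments": 80,
    "class_participation": 100,
}

percent = final_percent(example)
print(percent, letter_grade(percent))  # 87.75 B+
```

With these example scores the weighted total is (35·90 + 25·85 + 25·80 + 15·100) / 100 = 87.75, which falls in the B+ band.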
Methods
Lecture (75 minutes) and computer laboratory (110 minutes). Lectures will be delivered at the Beadle Center; laboratory work will be performed in Rm. 14, Center for Business Administration. Projects and homework will rely on the servers and node computers of the Bioinformatics Core Research Facility.
References/Textbooks
We do not have any mandatory textbooks. Most of the Course will be taught on the basis of recent scientific review publications.
Elective Textbook: Baxevanis, A.D. and Ouellette, B.F.F. (eds.) Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins. Third Edition (2005). Wiley Interscience. ISBN 0-471-47878-4. 540 pages.
We will also use selected chapters from the series Current Protocols in Bioinformatics (Wiley Interscience). This series provides both theoretical foundations and practical instructions for the most important bioinformatics algorithms and tools.
For literature searches, I recommend: Jensen, L.J. (2006) Biological literature mining – from information retrieval to biological discovery, Tutorial, International Society for Computational Biology.
A VERY preliminary schedule of classes
The order and subject of the classes may change.
| Lecture 1 | Aug. 25 | Major trends in computational and experimental biology, roadmap to the Course, Science 2020 |
| Lecture 2 | Sep. 1 | Biological databases and search methods. |
| Lecture 3 | Sep. 8 | Hidden Markov Models and Bayesian Inference |
| Lecture 4 | Sep. 15 | Biostatistical Analysis of Gene Expression Microarrays, Part I. |
| Lecture 5 | Sep. 22 | Biostatistical Analysis of Gene Expression Microarrays, Part II. |
| Lecture 6 | Sep. 29 | The Gene Ontology and its Use with Gene Expression Experiments. |
| Lecture 7 | Oct. 6 | Pathway Analysis and Databases, Gene Set Enrichment Analysis, Kyoto Encyclopedia of Genes and Genomes, Computational Biology Pipelines. |
| Lecture 8 | Oct. 13 | Artificial Intelligence. Support Vector Machines and Their Application in Biology. |
| Lecture 9 | Oct. 27 | Literature Parsing in Biology. |
| Lecture 10 | Nov. 3 | Chromatin Immunoprecipitation, Next-generation sequencing, Computational Prediction of Transcription Factor Binding Sites. |
| Lecture 11 | Nov. 10 | Comparative Genomics. |
| Lecture 12 | Nov. 17 | Proteomics |
| Lecture 13 | Nov. 24 | Systems Biology: Integration of Genomic, Gene Expression, Proteomics, and Regulatory Observations. |
| Lectures 14 and 15 | Dec. 1-8 | Student project presentations |