Biological Data Analysis

Academic unit or major
Undergraduate major in Computer Science
Instructor(s)
Masahito Ohue / Shogo Hamada / Masahiro Takinoue
Class Format
Lecture (Face-to-face)
Media-enhanced courses
-
Day of week/Period (Classrooms)
7-8 Tue (W3-301(W331)) / 7-8 Fri (W3-301(W331))
Class
-
Course Code
CSC.T353
Number of credits
2-0-0
Course offered
2026
Offered quarter
2Q
Syllabus updated
Mar 5, 2026
Language
Japanese

Syllabus

Course overview and goals

This course focuses on data representation methods as well as comparative analysis and knowledge extraction algorithms for massive biological data. It also covers numerical simulations and nonlinear system analysis for dynamic biological systems. Topics include pairwise sequence alignment, dynamic programming, multiple sequence alignment, phylogenetic tree estimation, approximation methods, sequence motif representation, rapid homology search techniques for large-scale databases, protein structure modeling and prediction, biological system modeling, and numerical simulations of nonlinear differential equations.
No prior knowledge of biology or biochemistry is required. Basic biological concepts are introduced within the course, and students are expected to consider the topics from the perspectives of computational algorithms and their complexity.
Biological information analysis has become increasingly important in the 21st century for improving quality of life, the environment, and safety. This course therefore aims to provide students with a fundamental understanding of the nature of biological data and commonly used algorithms in this field, such as dynamic programming and differential equation-based system simulations. Many of the methods introduced in the course are also applicable to a wide range of engineering problems. The course is designed to serve as an illustrative example of how computer science techniques can be applied to real-world problems.

Course description and aims

Upon successful completion of this course, students will be able to:
1) Explain several data representations for sequence analysis (e.g. regular expressions, profile matrices, HMMs),
2) Explain the concept and implementation of dynamic programming, as well as several of its applications in bioinformatics,
3) Explain, in terms of computational complexity, the important role of approximate methods in multiple sequence alignment and phylogenetic tree estimation,
4) Explain the notions of E-value and P-value in homology search against a large database, and compute these values,
5) Explain several algorithmic techniques for speeding up homology search against a large database, and
6) Explain analog and digital simulation approaches for modeling the behavior of living cells.
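
As an illustration of aim 2, the following is a minimal sketch of dynamic programming for pairwise alignment scores. It is not taken from the course materials: the function names and the scoring values (match +1, mismatch -1, linear gap penalty -2) are assumptions chosen only for this example. The global version follows the Needleman-Wunsch recurrence, and the local version follows the Smith-Waterman recurrence.

```python
def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Best global alignment score of a and b (Needleman-Wunsch recurrence)."""
    n, m = len(a), len(b)
    # dp[i][j] holds the best score for aligning a[:i] with b[:j].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap          # a[:i] aligned against gaps only
    for j in range(1, m + 1):
        dp[0][j] = j * gap          # b[:j] aligned against gaps only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = dp[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            dp[i][j] = max(diag, dp[i - 1][j] + gap, dp[i][j - 1] + gap)
    return dp[n][m]


def local_alignment_score(a: str, b: str, match: int = 1,
                          mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score of a and b (Smith-Waterman recurrence)."""
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = dp[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below 0, so an alignment may start anywhere.
            dp[i][j] = max(0, diag, dp[i - 1][j] + gap, dp[i][j - 1] + gap)
            best = max(best, dp[i][j])
    return best
```

Both functions fill an (n+1) x (m+1) table, so time and memory grow as O(nm) for sequences of lengths n and m. Recovering the alignment itself, rather than only its score, would additionally require a traceback through the table.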

Keywords

biological information, sequence analysis, dynamic programming, hidden Markov model, analog simulation, digital simulation

Competencies

  • Specialist skills
  • Intercultural skills
  • Communication skills
  • Critical thinking skills
  • Practical and/or problem-solving skills

Class flow

Each class begins with an explanation of a new topic (concepts, examples, systems, practical importance, etc.). At the end of class, students are given exercise problems on that day's lecture to solve.

Course schedule/Objectives

Class 1
  Course schedule: Basics of simulation of living cells
  Objectives: Brief introduction to molecular biology and mathematical biology

Class 2
  Course schedule: Sequence alignment
  Objectives: Compute global and local sequence alignments by dynamic programming, and compute multiple sequence alignments

Class 3
  Course schedule: Digital simulation of living cells (1)
  Objectives: Stochastic simulation 1

Class 4
  Course schedule: Phylogenetic tree estimation
  Objectives: Estimate phylogenetic trees with the UPGMA or NJ method; distance matrix methods, character-state methods, and bootstrap evaluation

Class 5
  Course schedule: Digital simulation of living cells (2)
  Objectives: Stochastic simulation 2

Class 6
  Course schedule: Homology search against database
  Objectives: Calculate the E-value and P-value for a hit in homology search

Class 7
  Course schedule: Analog simulation of living cells (1)
  Objectives: Nonlinear differential equations and MATLAB

Class 8
  Course schedule: Faster methods for sequence homology search
  Objectives: Build a k-mer index table for faster similarity search (FASTA, BLAST, PSI-BLAST)

Class 9
  Course schedule: Analog simulation of living cells (2)
  Objectives: Nonlinear differential equations and nonlinear system analysis

Class 10
  Course schedule: Protein structure analysis
  Objectives: Understand protein secondary/tertiary structures and analysis methods

Class 11
  Course schedule: Analog simulation of living cells (3)
  Objectives: Typical examples of nonlinear systems

Class 12
  Course schedule: Biomolecular design
  Objectives: Understand protein design and drug design

Class 13
  Course schedule: Advanced topics in bioinformatics
  Objectives: Exploration of representative bioinformatics technologies in recent years

Class 14
  Course schedule: Examination
  Objectives: Comprehensive coverage of the topics in this course
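
The k-mer index table named in the Class 8 objective can be sketched as follows. This is only an illustration under stated assumptions, not the course's implementation and not the actual FASTA or BLAST code: the function names `build_kmer_index` and `find_seeds`, the toy database, and the choice k = 3 are all hypothetical. The sketch covers exact-match seeding only; real tools go on to extend and score the seeds.

```python
from collections import defaultdict


def build_kmer_index(sequences: dict[str, str], k: int) -> dict[str, list[tuple[str, int]]]:
    """Map every k-mer to the (sequence id, position) pairs where it occurs."""
    index = defaultdict(list)
    for seq_id, seq in sequences.items():
        for pos in range(len(seq) - k + 1):
            index[seq[pos:pos + k]].append((seq_id, pos))
    return dict(index)


def find_seeds(query: str, index: dict[str, list[tuple[str, int]]],
               k: int) -> list[tuple[int, str, int]]:
    """Return exact k-mer matches as (query position, sequence id, database position)."""
    hits = []
    for qpos in range(len(query) - k + 1):
        # One dictionary lookup per query k-mer replaces a scan of the database.
        for seq_id, dpos in index.get(query[qpos:qpos + k], []):
            hits.append((qpos, seq_id, dpos))
    return hits


# Toy database (hypothetical sequences, for illustration only).
database = {"s1": "ACGTAC", "s2": "GTACG"}
index = build_kmer_index(database, k=3)
seeds = find_seeds("TACG", index, k=3)
```

The index is built once per database, so each query pays only for its own k-mer lookups. This is the basic reason index-based search is faster than running a full dynamic programming alignment against every database sequence.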

Study advice (preparation and review)

To enhance effective learning, students are encouraged to spend approximately 100 minutes preparing for class and another 100 minutes reviewing class content afterwards (including assignments) for each class.
They should do so by referring to textbooks and other course material.

Textbook(s)

Original slides are provided

Reference books, course materials, etc.

Japanese Society for Bioinformatics (ed.). Introduction to Bioinformatics, 2nd edition. Keio University Press. ISBN: 978-4-7664-2791-2. (In Japanese)

Evaluation methods and criteria

Students' knowledge of data representations, algorithms, and applications in biological information analysis, and their ability to apply them to problems will be assessed.
Final exam: 100%

Related courses

  • CSC.T362 : Numerical Analysis
  • ART.T543 : Bioinformatics
  • ART.T546 : Design Theory in Biological Systems
  • ART.T545 : Molecular Simulation
  • ART.T553 : Medical and Health Informatics
  • CSC.T242 : Probability Theory and Statistics
  • CSC.T254 : Machine Learning
  • CSC.T352 : Pattern Recognition
  • ART.T458 : Advanced Machine Learning

Prerequisites

None