Advanced Data Engineering

Academic unit or major
Graduate major in Computer Science
Instructor(s)
Haruo Yokota
Class Format
Lecture
Media-enhanced courses
-
Day of week/Period
(Classrooms)
5-6 Mon / 5-6 Thu
Class
-
Course Code
CSC.T523
Number of credits
200
Course offered
2021
Offered quarter
2Q
Syllabus updated
Jul 10, 2025
Language
English

Syllabus

Course overview and goals

Data engineering is an active research area that focuses on the sophisticated processing of large amounts of diverse data in computer systems, for example in advanced database systems.
This course aims to teach students advanced methodologies and mechanisms for manipulating large amounts of data efficiently. It does so by covering contemporary data engineering technologies, including application examples, data structures, indexing, processing algorithms, and parallel processing methods for highly functional, high-speed processing of large amounts of data.

Course description and aims

By the end of this course, students will be able to:
1) Understand the basic concepts of data engineering and its foundations: relational databases and transaction processing
2) Understand data warehousing technologies as a typical application of data engineering
3) Understand the data structures and algorithms of OLAP and data mining executed in a data warehouse
4) Understand the implementation algorithms and costs of relational database operations for a data warehouse
5) Understand parallelization approaches for high-speed relational database operations
6) Understand skew handling methods for parallel database operations
7) Understand distributed database processing, including databases in the cloud
8) Understand recent trends in XML/RDF databases

Keywords

Data Warehousing, OLAP, Data Mining, Indexing Methods, Parallel Database Operations, Data Placement, Skew Handling, Cloud Database, XML/RDF databases

Competencies

  • Specialist skills
  • Intercultural skills
  • Communication skills
  • Critical thinking skills
  • Practical and/or problem-solving skills

Class flow

Standard Lecture

Course schedule/Objectives

Course schedule Objectives
Class 1

Basic Concept and Background of Data Engineering

Understand the basic concept of data engineering

Class 2

Relational Database and Transaction Processing

Understand relational databases and transaction processing

Class 3

Data Warehouse, OLAP, and Data Mining

Understand Data Warehouse, OLAP and Data mining

Class 4

Storing Data

Understand How Data Is Stored

Class 5

Indexing

Understand Indexing

Class 6

Estimate Cost of Relational Algebra Operations 1: Selection, Projection

Understand Algorithms and Cost for Selection and Projection Operations

Class 7

Estimate Cost of Relational Algebra Operations 2: Join, Aggregate Functions

Understand Algorithms and Costs for Join Operation and Aggregate Functions

Class 8

Classification of Parallel Database Operations and Data Partitioning

Understand Classification of Parallel Database Processing and Data Distribution

Class 9

Parallel Join Operations: Sort-Merge Join, Hash Join

Understand Algorithms and Costs of Parallel Sort-Merge Join and Hash Join

Class 10

Parallel Aggregate Functions, Skew Handling

Understand Algorithms and Costs of Parallel Aggregate Functions and Skew Handling

Class 11

Distributed Database Processing and Blockchain

Understand Distributed Database Processing and Blockchain

Class 12

Cloud and Databases

Understand Database Processing in Cloud Environment

Class 13

XML Databases

Understand XML Databases and RDF Databases

Class 14

Privacy and Security of Database

Understand Privacy and Security
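
As a rough illustration of the topics in Classes 8 to 10 (data partitioning, parallel hash join, and skew), the following is a minimal sketch and is not taken from the course materials. It simulates the nodes sequentially in one process, and all function names and sample data are hypothetical. Both relations are hash-partitioned on the join key, each simulated node joins its own partitions, and the per-node load shows how a frequent key value causes skew.

```python
from collections import defaultdict


def hash_partition(rows, key, n_nodes):
    """Assign each row to a simulated node by hashing its join key."""
    parts = [[] for _ in range(n_nodes)]
    for row in rows:
        parts[hash(row[key]) % n_nodes].append(row)
    return parts


def local_hash_join(r_part, s_part, key):
    """Build a hash table on r_part, then probe it with s_part."""
    table = defaultdict(list)
    for r in r_part:
        table[r[key]].append(r)
    return [{**r, **s} for s in s_part for r in table.get(s[key], [])]


def parallel_hash_join(r_rows, s_rows, key, n_nodes):
    """Partition both inputs on the key and join each partition pair.

    Returns the joined rows and the number of input rows per node.
    """
    r_parts = hash_partition(r_rows, key, n_nodes)
    s_parts = hash_partition(s_rows, key, n_nodes)
    joined = []
    loads = []
    for r_part, s_part in zip(r_parts, s_parts):
        joined.extend(local_hash_join(r_part, s_part, key))
        loads.append(len(r_part) + len(s_part))
    return joined, loads


def skew_ratio(loads):
    """Maximum load divided by mean load; 1.0 means perfectly balanced."""
    return max(loads) / (sum(loads) / len(loads))


# Hypothetical sample data: customer 0 places most of the orders.
customers = [{"cust": i, "name": f"c{i}"} for i in range(8)]
orders = [{"cust": 0, "order": j} for j in range(12)]
orders += [{"cust": i, "order": 100 + i} for i in range(1, 5)]

joined, loads = parallel_hash_join(customers, orders, "cust", 4)
print(len(joined), loads, skew_ratio(loads))
```

With integer keys, the node holding customer 0 receives far more input rows than the others, so it would finish last on a real parallel system. Skew handling methods aim to bring this ratio closer to 1.0, for example by redistributing or replicating the rows of frequent keys.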

Study advice (preparation and review)

To enhance effective learning, students are encouraged to spend approximately 100 minutes preparing for class and another 100 minutes reviewing class content afterwards (including assignments) for each class.
They should do so by referring to textbooks and other course material.

Textbook(s)

Lecture notes are distributed through OCW/OCW-i.

Reference books, course materials, etc.

Jim Gray and Andreas Reuter, "Transaction Processing: Concepts and Techniques," Morgan Kaufmann Publishers.

Evaluation methods and criteria

Assignments in Lectures (60%) and Final Report (40%)

Related courses

  • CSC.T343 : Databases

Prerequisites

Basic knowledge of databases and computer architecture

Contact information (e-mail and phone) Notice: Please replace "[at]" with "@" (half-width character).

yokota[at]cs.titech.ac.jp