
Robot Audition and Scene Analysis

Academic unit or major: Graduate major in Systems and Control Engineering
Instructor(s): Kazuhiro Nakadai
Class Format: Lecture (Face-to-face)
Media-enhanced courses: -
Day of week/Period (Classrooms): 3-4 Tue (W9-325(W934))
Class: -
Course Code: SCE.I434
Number of credits: 1-0-0 (lecture-exercise-experiment)
Course offered: 2025
Offered quarter: 1Q
Syllabus updated: Apr 1, 2025
Language: English

Syllabus

Course overview and goals

In this course, students study robot audition, the technology that gives robots ears capable of listening to several simultaneous sounds, and scene understanding, which analyzes and interprets the surrounding environment, including its sounds. Both fields are examined from multiple perspectives: their underlying technologies, their evolution, current progress, and future prospects. Specifically, the course covers sound source localization, sound source separation, sound source classification, and automatic speech recognition, based on auditory processing, acoustic signal processing, and machine learning.

Course description and aims

By taking this course, students will acquire the following knowledge and skills:
 Understand and correctly explain multiple aspects of the research areas related to “sound” such as robot audition and scene understanding.
 Understand and explain sound-related technologies, including sound source localization, sound source separation, sound source classification, and automatic speech recognition.

Student learning outcomes

Relationship between practical experience and course content (or practical educational content)

The class will be given by a professor who established this research field in 2000 and has led it since.

Keywords

robot audition, scene analysis, acoustic signal processing, machine learning, deep learning, sound source localization, sound source tracking, sound source separation, sound source classification, automatic speech recognition

Competencies

Specialist skills
Intercultural skills
Communication skills
Critical thinking skills
Practical and/or problem-solving skills

Class flow

The topics of each class are explained according to the planned structure. Group discussions may be conducted. Written reports related to the class content may be assigned.

Course schedule/Objectives

	Course schedule	Objectives
Class 1	Overview and evolution of robot audition and scene analysis	Understand the overview and evolution of the robot audition and scene analysis research areas and how they relate to each other.
Class 2	Auditory scene analysis and computational auditory scene analysis	Review acoustic signal processing as a basis for robot audition and understand computational auditory scene analysis as a precursor research area of robot audition.
Class 3	Binaural robot audition	Understand the technology to listen to simultaneous sounds with two ears/microphones as humans and animals do.
Class 4	Microphone array-based robot audition	Understand sound source localization and sound source separation techniques using a microphone array consisting of multiple microphones.
Class 5	Robot audition in extreme environments – extreme audition	Understand the challenges that arise when robot audition technology is applied to extreme environments and the approaches used to address them.
Class 6	Robot audition using deep learning	Understand the overview, mechanisms, and technology trends of sound source localization, sound source separation, sound source classification, and automatic speech recognition using deep learning with neural networks.
Class 7	Software platforms and the future of robot audition	Understand the design concept, advantages, and challenges of HARK, an open source software platform for robot audition, from the perspective of applying it to real-world problems. Also discuss the future of robot audition and scene analysis technology.

Study advice (preparation and review)

To enhance effective learning, students are encouraged to spend approximately 100 minutes preparing for class and another 100 minutes reviewing class content afterwards (including assignments) for each class.
They should do so by referring to textbooks and other course material.

Textbook(s)

Unspecified.

Reference books, course materials, etc.

Unspecified.

Evaluation methods and criteria

Comprehension and consideration of lecture content will be evaluated. Grading will be based on the assignments given in each class and the final report.

Related courses

SCE.I406 ： Machine Learning Framework
SCE.I501 ： Image Recognition
SCE.I203 ： Digital Signal Processing
SCE.I201 ： Introduction to Measurement Engineering

Prerequisites

Students should have basic knowledge of digital signal processing and machine learning at the undergraduate level.

Contact information (e-mail and phone) Notice: Please replace “[at]” with “@” (half-width character).

nakadai[at]ra.sc.e.titech.ac.jp