Advanced Topics in Computer Vision

Academic unit or major: Graduate major in Artificial Intelligence
Instructor(s): Asako Kanezaki / Ikuro Sato / Satoshi Ikehata / Yusuke Sekikawa
Class Format: Lecture
Media-enhanced courses: -
Day of week/Period (Classrooms)
Class: -
Course Code: ART.T476
Number of credits: 200
Course offered: 2025
Offered quarter: 3Q
Syllabus updated: Apr 2, 2025
Language: English

Syllabus

Course overview and goals

Computer vision is a field that leverages the power of computers to extract meaningful information from visual data captured by optical sensors. This course introduces techniques ranging from 3D reconstruction to point-cloud processing. In particular, various deep learning-based methods are covered through lectures and exercises.

Course description and aims

At the end of this course, students should be able to:
- explain the basic concepts of 3D reconstruction and point-cloud processing.
- apply computer-vision models appropriately using existing libraries.

Student learning outcomes

Relationship between practical work experience and course content (or practical educational content)

One of the instructors (Dr. Sato) has conducted research and development in computer vision technologies, including object tracking, image retrieval, image recognition, image segmentation, and 3D reconstruction, particularly in the automotive industry.

Keywords

3D Reconstruction, Point-Cloud Processing, Geometric Transformation, Deep Learning

Competencies

Specialist skills
Intercultural skills
Communication skills
Critical thinking skills
Practical and/or problem-solving skills
Students will understand advanced topics in computer vision, including their implementation.

Class flow

Slides and sample programs are used in the lecture.

Course schedule/Objectives

	Course schedule	Objectives
Class 1	3D Reconstruction (1/2)	Optical Flow, Epipolar Geometry, Singular Value Decomposition
Class 2	3D Reconstruction (2/2)	Rectification, Bundle Adjustment, Robust Estimation
Class 3	Exercise: 3D Reconstruction	Implementation of 3D Reconstruction Algorithms with MATLAB
Class 4	Input/output and rendering of 3D data, geometric transformation	Fundamentals of 3D data processing using Python libraries
Class 5	Sampling and Normal estimation	Sampling of 3D data and estimation of object normal vectors
Class 6	Keypoints and Features	Keypoint Detection and Feature Extraction
Class 7	Point Cloud Registration (Basics)	Understanding the k-d tree data structure, nearest neighbor search, and the RANSAC and ICP algorithms
Class 8	Point Cloud Registration (Practice)	Implementation of point cloud registration algorithms
Class 9	Pose estimation, primitive detection, segmentation	Object pose estimation, primitive detection, and segmentation using point cloud data
Class 10	Point Cloud Processing with Deep Models	PointNet and other deep learning models for point cloud processing, point cloud convolution
Class 11	RGBD, Voxel data, Mesh, Multi-view images, and Implicit functions	Various data formats other than 3D point clouds and implicit functions such as NeRF
Class 12	Vision for Autonomous Driving	Driving Environment Recognition, Path Planning
Class 13	Physics-Based Vision	Optical Properties, Photometric Stereo
Class 14	Event-Based Vision	Odometry Estimation, SLAM
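
As an illustration of the kind of material listed for Class 7, the following is a minimal Python sketch of a k-d tree with nearest neighbor search, the building block that registration algorithms such as ICP typically rely on for finding correspondences. It is not taken from the course materials: the names `Node`, `build_kdtree`, and `nearest_neighbor` and the sample points are hypothetical, and it uses only the standard library rather than the Python libraries used in class.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, ...]


class Node:
    """One k-d tree node: a splitting point plus left/right subtrees."""

    def __init__(self, point: Point, axis: int,
                 left: Optional["Node"], right: Optional["Node"]) -> None:
        self.point = point
        self.axis = axis
        self.left = left
        self.right = right


def build_kdtree(points: List[Point], depth: int = 0) -> Optional[Node]:
    """Recursively split the points at the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Node(ordered[mid], axis,
                build_kdtree(ordered[:mid], depth + 1),
                build_kdtree(ordered[mid + 1:], depth + 1))


def nearest_neighbor(root: Optional[Node],
                     query: Point) -> Tuple[Optional[Point], float]:
    """Return the stored point closest to `query` and its Euclidean distance."""
    best_point: Optional[Point] = None
    best_dist = math.inf

    def search(node: Optional[Node]) -> None:
        nonlocal best_point, best_dist
        if node is None:
            return
        dist = math.dist(node.point, query)
        if dist < best_dist:
            best_point, best_dist = node.point, dist
        # Signed distance from the query to this node's splitting plane.
        diff = query[node.axis] - node.point[node.axis]
        near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
        search(near)
        # The far side can only hold a better match if the splitting
        # plane is closer to the query than the best match found so far.
        if abs(diff) < best_dist:
            search(far)

    search(root)
    return best_point, best_dist


# Hypothetical sample point cloud (assumption, for illustration only).
cloud = [(2.0, 3.0, 1.0), (5.0, 4.0, 0.0), (9.0, 6.0, 2.0),
         (4.0, 7.0, 3.0), (8.0, 1.0, 5.0), (7.0, 2.0, 4.0)]
tree = build_kdtree(cloud)
point, dist = nearest_neighbor(tree, (9.0, 2.0, 4.5))
print(point, dist)
```

The pruning test in `search` is what separates this from a brute-force scan: a subtree is skipped whenever its splitting plane lies farther from the query than the current best match. Building the tree by sorting at every level keeps the sketch short; a practical implementation would likely use a library-provided k-d tree instead.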

Study advice (preparation and review)

To enhance effective learning, students are encouraged to spend approximately 100 minutes preparing for class and another 100 minutes reviewing class content afterwards (including assignments) for each class. They should do so by referring to textbooks and other course material.

Textbook(s)

None required.

Reference books, course materials, etc.

R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, 2nd ed., Cambridge University Press, 2004.
Translated materials from the reference book by A. Kanezaki et al. (see the Japanese syllabus) will be discussed.

Evaluation methods and criteria

Presentation video (70%) and attendance (30%)

Related courses

XCO.T489 ： Fundamentals of artificial intelligence
ART.T547 ： Multimedia Information Processing
ART.T463 ： Computer Graphics
ART.T465 ： Sparse Signal Processing and Optimization
ART.T475 ： Fundamentals of Computer Vision

Prerequisites

Students are required to have undergraduate-level knowledge of computer science, linear algebra, calculus, probability, and statistics. Students should be able to carry out practical exercises using programming languages such as Python.