2025 (Current Year) Faculty Courses School of Computing Department of Mathematical and Computing Science Graduate major in Artificial Intelligence
Advanced Topics in Computer Vision
- Academic unit or major
- Graduate major in Artificial Intelligence
- Instructor(s)
- Asako Kanezaki / Ikuro Sato / Satoshi Ikehata / Yusuke Sekikawa
- Class Format
- Lecture
- Media-enhanced courses
- -
- Day of week/Period (Classrooms)
- -
- Class
- -
- Course Code
- ART.T476
- Number of credits
- 2-0-0
- Course offered
- 2025
- Offered quarter
- 3Q
- Syllabus updated
- Apr 2, 2025
- Language
- English
Syllabus
Course overview and goals
Computer vision is a field that leverages the power of computers to extract meaningful information from visual data captured by optical sensors. This course provides an introduction to techniques ranging from 3D reconstruction methods to point-cloud processing methods. In particular, various deep learning-based methods are covered through lectures and exercises.
Course description and aims
At the end of this course, students should be able to
- explain basic concepts of 3D reconstruction and point-cloud processing.
- use computer-vision models appropriately with libraries.
Student learning outcomes
Relationship between practical work experience and course content (or practical educational content)
One of the instructors (Dr. Sato) has conducted research and development in computer vision technologies, including object tracking, image retrieval, image recognition, image segmentation, and 3D reconstruction, particularly in the automotive industry.
Keywords
3D Reconstruction, Point-Cloud Processing, Geometric Transformation, Deep Learning
Competencies
- Specialist skills
- Intercultural skills
- Communication skills
- Critical thinking skills
- Practical and/or problem-solving skills
- Students will understand advanced topics in computer vision, including implementation.
Class flow
Slides and sample programs are used in the lecture.
Course schedule/Objectives
Class | Course schedule | Objectives |
---|---|---|
Class 1 | 3D Reconstruction (1/2) | Optical Flow, Epipolar Geometry, Singular Value Decomposition |
Class 2 | 3D Reconstruction (2/2) | Rectification, Bundle Adjustment, Robust Estimation |
Class 3 | Exercise: 3D Reconstruction | Implementation of 3D Reconstruction Algorithms with MATLAB |
Class 4 | Input/output and rendering of 3D data, geometric transformation | Fundamentals of 3D data processing using Python libraries |
Class 5 | Sampling and Normal estimation | Sampling of 3D data and estimation of object normal vectors |
Class 6 | Keypoints and Features | Keypoint Detection and Feature Extraction |
Class 7 | Point Cloud Registration (Basics) | Understanding k-d tree data structures and the nearest neighbor search, RANSAC, and ICP algorithms |
Class 8 | Point Cloud Registration (Practice) | Implementation of point cloud registration algorithms |
Class 9 | Pose estimation, primitive detection, segmentation | Object pose estimation, primitive detection, and segmentation using point cloud data |
Class 10 | Point Cloud Processing with Deep Models | PointNet and other deep learning models for point cloud processing, point cloud convolution |
Class 11 | RGBD, Voxel data, Mesh, Multi-view images, and Implicit functions | Various data formats other than 3D point clouds and implicit functions such as NeRF |
Class 12 | Vision for Autonomous Driving | Driving Environment Recognition, Path Planning |
Class 13 | Physics-Based Vision | Optical Properties, Photometric Stereo |
Class 14 | Event-Based Vision | Odometry Estimation, SLAM |
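As an illustration of the k-d tree and nearest neighbor search listed for Class 7, the following is a minimal Python sketch. It is not taken from the course materials. The function names `build_kdtree` and `nearest` and the sample point cloud are hypothetical, chosen only for this example. It uses the standard library alone, whereas the classes may rely on dedicated 3D libraries.

```python
from math import dist  # Euclidean distance (Python 3.8+)


def build_kdtree(points, depth=0):
    """Recursively build a k-d tree as nested (point, left, right) tuples."""
    if not points:
        return None
    axis = depth % len(points[0])           # cycle through x, y, z
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2                  # the median point splits the set
    return (
        points[mid],
        build_kdtree(points[:mid], depth + 1),
        build_kdtree(points[mid + 1:], depth + 1),
    )


def nearest(node, query, depth=0, best=None):
    """Return the stored point closest to `query`."""
    if node is None:
        return best
    point, left, right = node
    if best is None or dist(query, point) < dist(query, best):
        best = point
    axis = depth % len(query)
    diff = query[axis] - point[axis]
    # Descend first into the side of the splitting plane that contains the query.
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, query, depth + 1, best)
    # Visit the other side only if the plane is closer than the best match so far.
    if abs(diff) < dist(query, best):
        best = nearest(far, query, depth + 1, best)
    return best


# Hypothetical toy point cloud, for illustration only.
cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0),
         (1.0, 1.0, 1.0), (3.0, 3.0, 3.0)]
tree = build_kdtree(cloud)
match = nearest(tree, (0.9, 0.9, 0.8))
```

In an ICP-style registration loop, a query of this kind would be issued for every source point to find its closest target point. The pruning test is what lets the search skip whole subtrees that a brute-force scan would have to visit.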
Study advice (preparation and review)
To enhance effective learning, students are encouraged to spend approximately 100 minutes preparing for class and another 100 minutes reviewing class content afterwards (including assignments) for each class. They should do so by referring to textbooks and other course material.
Textbook(s)
None required.
Reference books, course materials, etc.
R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, 2nd ed., Cambridge University Press, 2004.
Translated materials from the reference book by A. Kanezaki et al. (see Japanese syllabus) will be discussed.
Evaluation methods and criteria
Presentation video (70%) and attendance (30%)
Related courses
- XCO.T489 : Fundamentals of artificial intelligence
- ART.T547 : Multimedia Information Processing
- ART.T463 : Computer Graphics
- ART.T465 : Sparse Signal Processing and Optimization
- ART.T475 : Fundamentals of Computer Vision
Prerequisites
Students are required to have undergraduate-level knowledge of computer science, linear algebra, calculus, probability, and statistics. Students should be able to carry out practical exercises using programming languages such as Python.