2024 Faculty Courses School of Computing Department of Computer Science Graduate major in Artificial Intelligence
Advanced Machine Learning
- Academic unit or major
- Graduate major in Artificial Intelligence
- Instructor(s)
- Naoaki Okazaki / Masamichi Shimosaka
- Class Format
- Lecture (HyFlex)
- Media-enhanced courses
- -
- Day of week/Period (Classrooms)
- 3-4 Tue / 3-4 Fri
- Class
- -
- Course Code
- ART.T458
- Number of credits
- 200
- Course offered
- 2024
- Offered quarter
- 2Q
- Syllabus updated
- Mar 14, 2025
- Language
- Japanese
Syllabus
Course overview and goals
This course introduces the fundamentals of machine learning and deep learning.
Course description and aims
[Goal]
- Understand basic concepts (e.g., classification, convex optimization) and methods (e.g., stochastic gradient descent, backpropagation) for discriminative models of machine learning.
- Implement machine learning methods using toolkits and programming.
[Theme] The first half of this course covers the basic concepts of machine learning through linear models and optimization. The second half presents the fundamentals and practice of deep learning.
Keywords
Machine learning, regression, classification, optimization, linear model, neural network, deep learning
Competencies
- Specialist skills
- Intercultural skills
- Communication skills
- Critical thinking skills
- Practical and/or problem-solving skills
Class flow
This course combines explanations of machine learning methods with exercises using machine learning toolkits.
Course schedule/Objectives
Class | Course schedule | Objectives |
---|---|---|
Class 1 | Introduction | Basic concepts of machine learning |
Class 2 | Linear Model 1 | Loss functions, empirical loss minimization, overfitting, regularization, bias and variance, linear model (linear regression) |
Class 3 | Linear Model 2 & Optimization 1 | Linear model (classification), logistic regression, concept of optimization, gradient methods |
Class 4 | Linear Model 3 & Optimization 2 | Support vector machines, constrained & convex optimization, duality |
Class 5 | Linear Model 4 | L1 regularization, sparse learning, Lasso |
Class 6 | Optimization 3 | Smoothness, Proximal gradient |
Class 7 | Scalable Learning | Stochastic gradient, accelerated gradients, momentum, mini-batch, distributed parallel training |
Class 8 | Introduction to Deep Learning | Real-world applications |
Class 9 | Feedforward Neural Network (I) | binary classification, Threshold Logic Units (TLUs), Single-layer Perceptron (SLP), Perceptron algorithm, sigmoid function, Stochastic Gradient Descent (SGD), Multi-layer Perceptron (MLP), Backpropagation, Computation Graph, Automatic Differentiation, Universal Approximation Theorem |
Class 10 | Feedforward Neural Network (II) | multi-class classification, linear multi-class classifier, softmax function, Stochastic Gradient Descent (SGD), mini-batch training, loss functions, activation functions, dropout |
Class 11 | Convolutional Neural Network | convolution, image filter, pooling, convolutional neural network, ImageNet, AlexNet, ResNet |
Class 12 | Word embeddings | word embeddings, distributed representation, distributional hypothesis, pointwise mutual information, singular value decomposition, word2vec, word analogy, GloVe, fastText |
Class 13 | DNN for structural data | Recurrent Neural Networks (RNNs), Gradient vanishing and exploding, Long Short-Term Memory (LSTM), Gated Recurrent Units (GRUs), Recursive Neural Network, Tree-structured LSTM |
Class 14 | Encoder Decoder Modeling | language modeling, Recurrent Neural Network Language Model (RNNLM), encoder-decoder models, sequence-to-sequence models, attention mechanism, Convolutional Sequence to Sequence (ConvS2S), Transformer, ELMo, BERT |
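
To give a sense of the programming exercises implied by the schedule, the following is a minimal sketch of topics from Classes 3 and 7: logistic regression trained with mini-batch stochastic gradient descent. It is not taken from the course materials. The function names (`train_logistic_sgd`, `predict`), the toy data, and the hyperparameters (`lr`, `epochs`, `batch_size`, `l2`) are illustrative assumptions, and only the Python standard library is used.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_sgd(xs, ys, lr=0.5, epochs=200, batch_size=4, l2=0.0, seed=0):
    """Fit weights w and bias b by mini-batch SGD on the mean log loss.

    All parameter names and defaults are hypothetical choices for this sketch.
    """
    rng = random.Random(seed)
    dim = len(xs[0])
    w = [0.0] * dim
    b = 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # new random mini-batches every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_w = [0.0] * dim
            grad_b = 0.0
            for i in batch:
                logit = sum(wj * xj for wj, xj in zip(w, xs[i])) + b
                err = sigmoid(logit) - ys[i]  # d(log loss)/d(logit)
                for j in range(dim):
                    grad_w[j] += err * xs[i][j]
                grad_b += err
            n = len(batch)
            for j in range(dim):
                # Gradient step with an optional L2 penalty on the weights.
                w[j] -= lr * (grad_w[j] / n + l2 * w[j])
            b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return the class label (0 or 1) with the higher predicted probability."""
    logit = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if sigmoid(logit) >= 0.5 else 0


# Toy, linearly separable data made up for illustration:
# points with a large coordinate sum are labelled 1.
xs = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.4], [0.3, 0.3],
      [0.9, 0.8], [1.0, 0.6], [0.7, 0.9], [0.8, 1.0]]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = train_logistic_sgd(xs, ys)
predictions = [predict(w, b, x) for x in xs]
```

In practice a toolkit would replace the hand-written gradient loop. The explicit version is shown only to make the update rule visible.
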
Study advice (preparation and review)
To enhance effective learning, students are encouraged to spend approximately 100 minutes preparing for class and another 100 minutes reviewing class content afterwards (including assignments) for each class.
They should do so by referring to textbooks and other course material.
Textbook(s)
Handouts will be given when necessary.
Reference books, course materials, etc.
- Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press. 2016.
- Christopher Bishop. Pattern Recognition and Machine Learning (Information Science and Statistics). Springer. 2010.
Evaluation methods and criteria
Course marks are based on assignments (70%) and exercises (30%).
Related courses
- MCS.T507 : Theory of Statistical Mathematics
- MCS.T403 : Statistical Learning Theory
- CSC.T352 : Pattern Recognition
- CSC.T272 : Artificial Intelligence
- CSC.T242 : Probability Theory and Statistics
Prerequisites
None
Other
None.