UT Fall 23 | VITA Group@UT Austin

Fall 23 - Advanced Topics in Computer Vision (ECE 381V/CS 395T)

Course title:

ECE 381V/CS 395T: Advanced Topics in Computer Vision

Term:

Fall 2023

Meeting times and location:

MW 1:30pm - 3:00pm (ECJ 1.318)

After-class platform:

Slack (link sent to registered students)

Video recording:

Available on Canvas

Course Description and Prerequisites

This is a research-oriented advanced class focused on the latest frontier of computer vision. It covers computer vision algorithms that make sense of photographs, video, and other imagery. Applications include robotics, content creation, entertainment, medical image analysis, smart homes, security, and human-computer interaction (HCI), among many others. Throughout the course, students will consolidate and practice their knowledge and skills through open in-class discussions, and will gain in-depth experience with a particular research topic through a final project.

Students should have taken the following courses or their equivalents: Introduction to Computer Vision (379K), Convex Optimization (381K-18), and Probability & Stochastic Processes I (381J).

Prior exposure to the following courses is helpful but not required: Digital Video (381K-16), Statistical Machine Learning (381V), Data Mining (381L-10), or Cross-Layer Machine Learning HW/SW Design (382V).

Coding experience with Python is required and assumed. Prior knowledge of C/C++, MATLAB, or TensorFlow is very helpful but not required.

Instructor Information

Name:

Dr. Zhangyang (Atlas) Wang

Telephone number:

512-471-1866

Email address:

atlaswang@utexas.edu

Office hour time:

Thursday 2:00pm - 3:00pm

Office hour location:

EER 6.886 (instructor office)

TA Information

TA 1 Name:

Zhangheng Li

Email address:

zoharli@utexas.edu

Office hour time:

Tuesday 11:00am - 12:00pm

Office hour location:

EER 3.854

Textbook and/or Resource Material

This course does not follow any textbook closely. Recommended readings include:

Pattern Recognition and Machine Learning, Christopher M. Bishop (2006).
Computer Vision: Algorithms and Applications, Richard Szeliski (2010).
Deep Learning, Ian Goodfellow, Yoshua Bengio and Aaron Courville (2016).
Dive into Deep Learning, Aston Zhang, Zack Lipton, Mu Li and Alex Smola (2019).

Grading Policies

Grading will be based on class participation (10%), one mid-term exam (15%), and one final project (75%), which breaks down into milestone 1 progress report (15%), milestone 2 progress report (15%), presentation (20%), final report (15%), and code review (10%). There will be no final exam.

One project will receive the Best Project Award, voted on by all class members (+5%).
Projects in novel, interdisciplinary domains (some examples: 5G/6G telecommunication, brain-computer interface, economics & markets, COVID-19, etc.) are eligible for a bonus, as judged by the instructor (+2%).
For late submissions, each additional late day will incur a 10% penalty.

Course Topics

8/21 Monday

8/23 Wednesday

8/28 Monday

8/30 Wednesday

9/04 Monday

9/06 Wednesday

9/11 Monday

9/13 Wednesday

9/18 Monday

9/20 Wednesday

9/25 Monday

9/27 Wednesday

10/02 Monday

10/04 Wednesday

10/09 Monday

10/11 Wednesday

10/16 Monday

10/18 Wednesday

10/23 Monday

10/25 Wednesday

10/30 Monday

11/01 Wednesday

11/06 Monday

11/08 Wednesday

11/13 Monday

11/15 Wednesday

11/20 Monday

11/22 Wednesday

11/27 Monday

11/29 Wednesday

12/04 Monday

Topic I: Deep Vision Backbones (1): Building Blocks

Topic I: Deep Vision Backbones (2): Convolutional Neural Networks

Topic I: Deep Vision Backbones (3): More Advanced Architectures - Part i

Topic I: Deep Vision Backbones (4): More Advanced Architectures - Part ii

- No Class (Labor Day) -

Topic II: Label-Efficient Learning (1): Semi-Supervised Learning

Topic II: Label-Efficient Learning (2): Few-Shot & Active Learning

Topic II: Label-Efficient Learning (3): Transfer & Self-Supervised Learning

Topic III: Resource-Efficient Learning (1): Basic Model Compression

Topic III: Resource-Efficient Learning (2): Sparse Neural Networks - Part i

Topic III: Resource-Efficient Learning (3): Sparse Neural Networks - Part ii

Topic IV: Neural Radiance Fields (1): Single Scene Fitting
[guest lecture by Peihao Wang]

Topic IV: Neural Radiance Fields (2): Scene-Generalizable Fitting
[guest lecture by Dejia Xu]

Topic V: Robustness in Vision (1): Image Enhancement
[guest lecture by Dejia Xu]

Topic V: Robustness in Vision (2): Uncertainty and Domain Generalization

Topic V: Robustness in Vision (3): Adversarial Robustness

Topic VI: AutoML and Meta Learning (1)

Topic VI: AutoML and Meta Learning (2)

Midterm Exam

Topic VII: Good Old Days of Generative AI (1): VAEs and GANs - Part i

Topic VII: Good Old Days of Generative AI (2): VAEs and GANs - Part ii

Topic VII: Good Old Days of Generative AI (3): VAEs and GANs - Part iii

Topic VIII: New Age of Generative AI (1): Introduction to Diffusion Models

Topic VIII: New Age of Generative AI (2): Deeper Dive into Diffusion Models

Topic VIII: New Age of Generative AI (3): Beyond Image: Video, 3D, and Multimodal

Topic VIII: New Age of Generative AI (4): Connecting Vision and Other "Foundation Models"

- No Class (Thanksgiving Break) -

Topic VIII: New Age of Generative AI (5): the Good, the Bad, and the Future

Class Project Presentation (1)

Class Project Presentation (2)

Acknowledgement

Many materials in this course are adapted from existing teaching or tutorial slides created by colleagues at CMU, Stanford, UIUC, UC Berkeley, GaTech, Microsoft, Google, Meta, DeepMind, NVIDIA, and more. The instructor thanks them for generously sharing those materials publicly.