
Fall 22 - Introduction to Computer Vision
Course title: EE 379K: Introduction to Computer Vision
Term: Fall 2022
Meeting times and location: TR 5:00-6:30pm (ETC 2.136)
After-class platform: Slack (link sent to registered students)
Video recording: No (fully in-person)
Course Description and Prerequisites
Computer vision (CV) is the discipline of “teaching machines how to see”: it makes sense of photographs, video, and other imagery. Applications include analysis of medical images, automated quality inspection, entertainment, vehicle safety, security, and HCI, among many others. This course offers a gentle introduction to computer vision, including image formation, camera imaging geometry, feature detection and matching, stereo, motion estimation and tracking, image classification and scene understanding. Both classical and the latest deep learning approaches will be covered.
Students will consolidate and practice their knowledge and skills through homework and a midterm exam. They will also gain in-depth experience with a particular topic through a final project. There will be no final exam.
Students should have taken the following courses or their equivalents: Algorithms (EE 360C or CS 314/314H), Linear Systems and Signals (EE 313 or BME 343), and Probability and Random Processes (EE 351K or BME 335 or MATH 362K). Solid knowledge of linear algebra will be instrumental in this course.
Coding experience with Python is assumed. Previous knowledge of C/C++, MATLAB, or PyTorch/TensorFlow is very helpful, but not necessary.
Instructor Information
Name: Dr. Zhangyang (Atlas) Wang
Telephone number: 512-471-1866
Email address:
Office hour time: Tuesday 10:00am - 11:00am
Office hour location: EER 6.886 (instructor office)
TA Information
TA 1 Name:
Email address:
Office hour time: Monday 4:00pm - 5:00pm
Office hour location: outside EER O’s Campus Café, outdoor seating area
TA 2 Name:
Email address:
Office hour time: Friday 4:00pm - 5:00pm
Office hour location: outside EER O’s Campus Café, outdoor seating area
Textbook and/or Resource Material
This course does not follow any textbook closely. Recommended readings include:
- Computer Vision: Algorithms and Applications, Richard Szeliski (2010). 【Most Recommended for CV beginners】
- First Principles of Computer Vision (YouTube Lecture), Shree Nayar (2021). 【Classical CV topics, especially non-ML】
- Pattern Recognition and Machine Learning, Christopher M. Bishop (2006). 【Classical ML】
- Deep Learning, Ian Goodfellow, Yoshua Bengio and Aaron Courville (2016).
- Dive into Deep Learning, Aston Zhang, Zack Lipton, Mu Li and Alex Smola (2019).
Grading Policies
Grading will be based on homework (20%; there will be 4 assignments), one midterm exam (30%), and one final project (50%) (proposal 10% + mid-report 10% + presentation 5% + final report 15% + code review 10%).
- One project will receive the Best Project Award, voted on by all class members. (+5%)
- Projects in novel, interdisciplinary domains (some examples: 5G/6G telecommunication, brain-computer interface, economics & markets, COVID-19, etc.), judged by the instructor. (+2%)
- For late submissions, each additional late day will incur a 10% penalty.
- Requests for regrading an assignment must be made in writing within one (1) week of the graded assignment being made available to the class.
Course Topics
8/23 Tuesday: Class Logistics and Fundamental Vision Theory [Slides 8/23]
    (Extended Materials: MIT lecture on "Marr’s Levels of Analysis")
8/25 Thursday: Image Representation (1): From Our Brain to the Digital World
8/30 Tuesday: Image Representation (2): Gaussian and Laplacian Image Pyramids
9/01 Thursday: TA Lecture: Q&A on Course Projects & Cracking the Coding! [Slides 9/01] [Jupyter Notebook]
9/06 Tuesday: Image Representation (3): Taking A Frequency Domain View [Slides 8/25 + 8/30 + 9/06]
    (Extended Materials: Review of Sampling, Aliasing, and Fourier Analysis Methods)
9/08 Thursday: Image Filtering (1): Pointwise, Convolution, and Beyond [Slides 9/08]
9/13 Tuesday: Image Filtering (2): Edge Detection, from Sobel to Canny [Slides 9/13]
9/15 Thursday: Cross-Image Matching (1): Detecting Key Points
9/20 Tuesday: Cross-Image Matching (2): Extracting Feature Descriptors from Key Points
9/22 Thursday: Cross-Image Matching (3): Robust Matching of Descriptors [Slides 9/15 + 9/20 + 9/22]
    (Extended Materials: Review of Linear Algebra, especially EVD, SVD and PCA)
9/27 Tuesday: Mapping 3D World to Image (1): Pinhole and Lens Cameras
9/29 Thursday: Mapping 3D World to Image (2): Developing the Pinhole Camera Model
10/04 Tuesday: Mapping 3D World to Image (3): Geometric Camera Calibration [Slides 9/27 + 9/29 + 10/04]
    (Extended Materials i: Solving Least Squares using SVD)
    (Extended Materials ii: Geometric Camera Calibration in Action: An OpenCV Example)
10/06 Thursday: Stereo Vision (1): Two-Camera Models and Triangulation
10/11 Tuesday: Stereo Vision (2): Epipolar Geometry
10/13 Thursday: Stereo Vision (3): Essential and Fundamental Matrices
10/18 Tuesday: Stereo Vision (4): Depth Estimation [Slides 10/06 + 10/11 + 10/13 + 10/18]
10/20 Thursday: Video and Optical Flow (1)
10/25 Tuesday: Video and Optical Flow (2) [Slides 10/20 + 10/25]
10/27 Thursday: Classical Machine Learning (1)
11/01 Tuesday: Classical Machine Learning (2) [Slides 10/27 + 11/01]
11/03 Thursday: Image Classification: Bag-of-Words [Slides 11/03]
11/08 Tuesday: Object Detection and Segmentation (1)
    (Extended Materials: The Viola-Jones Algorithm Explained in Detail)
11/10 Thursday: Object Detection and Segmentation (2) [Slides 11/08 + 11/10]
11/15 Tuesday: Deep Learning in Computer Vision (1)
11/17 Thursday: Deep Learning in Computer Vision (2)
11/22 Tuesday: - No Class (Thanksgiving Break) -
11/24 Thursday: - No Class (Thanksgiving Break) -
11/29 Tuesday: Deep Learning in Computer Vision (3)
12/01 Thursday: Deep Learning in Computer Vision (4) [Slides 11/15 + 11/17 + 11/29 + 12/01]
Acknowledgement
Many materials in this course are adapted from existing teaching or tutorial slides created by colleagues at CMU, Stanford, UIUC, UC Berkeley, GaTech, Brown, and elsewhere. The instructor is grateful for their generosity in sharing those materials publicly.