Course: COS 424
Instructor: Barbara Engelhardt
S 2018
Description of Course Goals and Curriculum
I believe that this course is focused on developing you as an independent thinker and researcher, not just a student who can complete structured problem sets. This class wants you to be able to work with a real-world dataset, formulate your own plan of attack, and adapt your plan as obstacles arise. This aspect of the class was also its most challenging because it was unlike any other COS or engineering class I had ever taken. It required me to take the initiative to teach myself many aspects of probability theory, statistics, Python (the language in which you will complete the code portion of the assignments), and even some linear algebra. Though this process was demanding, it was extremely rewarding and taught me independent research skills that I think will be invaluable to future work and projects in CS. I think this course also deemphasizes the need to "get the right answer", as there isn't any single best approach or end result. That fact can be overwhelming and hard to work towards, but it also allows for more creativity and freedom. The course material covered in lectures began with fairly basic classification and regression models and gradually built towards more complex models and methods. Each successive project demanded more complex techniques and closer consideration of the unique features and demands of the dataset, and therefore more research and time.
Learning From Classroom Instruction
If you do not have a strong background in statistics or probability theory, the lectures can be overwhelming and very fast paced. A lot of fairly involved content is covered quickly, and you may need to review it independently if you want to keep up with lectures. However, understanding all of this theory is not crucial to doing well in the class and on the assignments, or to getting a lot out of this class. The assignments (discussed below) will require you to research the relevant methods and understand them well enough to explain them in the paper and justify their use in your specific application. Weekly quizzes encourage you to keep up with the assigned readings, which are then addressed in more detail in the second lecture of the week. These readings provide an additional resource for completing the projects, but again, they are quite dense and are not the main focus of this course. Precepts are focused on reviewing the weekly quiz and going over topics specific to the current assignment.
Learning For and From Assignments
This course's workload and learning consist almost entirely of the four project-assignments you will complete over the semester (3 assignments + 1 final project of your own choosing). There are short weekly quizzes that encourage you to keep up with the reading material, but they are not very demanding and are a small percentage of your final grade. Assignments in this class are very application-based and extremely unlike those of many entry-level engineering and COS classes. There is little predefined structure or path. Instead, you are given a dataset, some guidelines and deliverables, and knowledge of what was covered in class, which is helpful in guiding your approach to the assignment. These assignments really challenge you to do more than just the minimum and encourage you to explore extensions (through more complex classifiers/regressors, feature selection, cross-validation, etc.) that you think will be fruitful. This isn't a class that is focused on "getting the right answer". These projects encourage you to focus on process and think about why you choose the methods you do in approaching a problem. If you are used to having a clear plan of action to complete assignments, these projects may be a little uncomfortable at first, but the course staff are very willing to give you more direction if you need it. I found that 99% of my learning in this class happened through doing: pushing myself to research more, understand extensions, and problem-solve through my plan of action. Finally, this is definitely a class that you will get more from if you start the assignments early (do NOT attempt to start these assignments the night before!) and take time to think through the real-world demands and connections of the datasets.
External Resources
Many of the topics covered in this course are well documented by machine learning, statistics, probability, and computer science courses and departments around the world. A quick Google search should reveal a number of different explanations for every topic area. Take the time to parse through these to find the explanation that makes the most sense to you. Learning to read research papers also struck me as an important part of this course. It's quite fascinating to read through the original documentation of the algorithms you'll implement, and doing so is necessary for effectively completing the paper for each assignment. Don't be afraid to pore over old papers. Research papers may take a couple of read-throughs to make sense, but don't let that scare you. Follow the citation trail of a paper to find more information on extensions and adaptations of the topic in question.
What Students Should Know About This Course For Purposes Of Course Selection
This course can be used to fulfill one Applications requirement for COS concentrators. MAT 202, ORF 245, and COS 126 are listed as prerequisites for this class. The deeper the background in statistics and probability theory you enter with, the more I believe you will get out of this class. Knowledge of linear algebra is very useful as well, as ultimately, machine learning is not just about being able to write code using packages that do the work for you, but also about being able to understand the math behind how these algorithms work. However, even with just the basic prerequisites, it is possible to get a lot out of this class as long as you are willing and able to put time and effort into understanding the theory behind the code. There are only a handful of ML courses offered at Princeton and, to my understanding, COS 424 is one of the more hands-on, application-based options.
Fundamentals of Machine Learning