CSCA 5502: Data Mining Pipeline
Preview this course in the non-credit experience today!
Start working toward program admission and requirements right away. Work you complete in the non-credit experience will transfer to the for-credit experience when you upgrade and pay tuition. See How It Works for details.
Cross-listed with DTSA 5504
Course Type: Computer Science Elective
Specialization: Data Mining Foundations and Practice
Instructor: Dr. Qin (Christine) Lv, Associate Professor of Computer Science
Prior knowledge needed:
- Programming languages: Basic to intermediate experience with Python and Jupyter Notebook
- Math: Basic experience with probability and statistics, and linear algebra
- Technical requirements: Windows, Mac, or Linux; Jupyter Notebook
Learning Outcomes
- Identify the key components of the data mining pipeline and describe how they're related.
- Apply techniques to address challenges in each component of the data mining pipeline.
- Identify particular challenges presented by each component of the data mining pipeline.
Course Grading Policy
| Assessment | Percentage of Grade | AI Usage Policy | 
|---|---|---|
| Peer Review: Data Mining Example | 10% | Limited |
| Peer Review: Data Mining Issues | 10% | Limited |
| Programming Assignment: Data Understanding | 20% | Limited |
| Programming Assignment: Data Preprocessing | 20% | Limited |
| Programming Assignment: Data Warehousing | 20% | Limited |
| CSCA 5502 Data Mining Pipeline Final Exam | 20% | No AI Usage | 
Course Content
Duration: 7 hours
This week introduces the Data Mining Specialization and this course, Data Mining Pipeline. You will be introduced to the four views of data mining and the key components of the data mining pipeline.
Duration: 5.5 hours
This week covers data understanding by identifying key data properties and applying techniques to characterize different datasets.
Duration: 5.25 hours
This week explains why data preprocessing is needed and what techniques can be used to preprocess data.
Duration: 5 hours
This week covers the key characteristics of data warehousing and the techniques to support data warehousing.
Duration: 1 hour
Final Exam Format: Proctored exam administered through ProctorU
This module contains materials for the final exam, which is proctored and administered through ProctorU.
- You will need to arrange for a time to take the proctored exam.
- It is a one-hour exam.
- You may submit your answers only once.
- The exam contains only multiple-choice questions.
- There are no programming questions in the exam.
- You are not allowed to use any notes or access other websites during the exam.
- The exam tests conceptual understanding of the course materials. There is no need to memorize formulas.
Notes
- Cross-listed Courses: Courses that are offered under two or more programs. Considered equivalent when evaluating progress toward degree requirements. You may not earn credit for more than one version of a cross-listed course.
- Page Updates: This page is periodically updated. Course information on the Coursera platform supersedes the information on this page. Click the View on Coursera button above for the most up-to-date information.