CSCA 5832: Fundamentals of Natural Language Processing

Get a head start on program admission

Preview this course in the non-credit experience today!
Start working toward program admission and requirements right away. Work you complete in the non-credit experience will transfer to the for-credit experience when you upgrade and pay tuition. See How It Works for details.

  • Course Type: Elective
  • Specialization: Natural Language Processing: Deep Learning Meets Linguistics
  • Instructors: Dr. James Martin
  • Prior knowledge needed: Students should assess their background in Java and, if needed, begin tutorial study at a level sufficient to use the language in course projects (suggested resources are provided).

Course Description

The field of natural language processing (NLP) aims to get computers to perform useful and interesting tasks with human language. This course introduces students to the three pillars underlying modern NLP: probabilistic language models, simple neural networks with a focus on gradient-based learning, and vector-based meaning representations in the form of word embeddings. At the end of the course, students will be able to implement and analyze probabilistic language models based on N-grams, text classifiers using logistic regression and gradient-based learning, and vector-based approaches to word meaning and text classification.

Learning Outcomes

  • Analyze a complex computing problem and apply principles of computing and other relevant disciplines to identify solutions.
  • Design, implement, and evaluate a computing-based solution to meet a given set of computing requirements in the context of the program discipline.
  • Communicate effectively in a variety of professional contexts.
  • Recognize professional responsibilities and make informed judgments in computing practice based on legal and ethical principles.
  • Function effectively as a member or leader of a team engaged in activities appropriate to the program discipline.
  • Apply computer science theory and software development fundamentals to produce computing-based solutions.

Course Grading Policy

Assessment                   Percentage of Grade   AI Usage Policy
4 Auto-Graded Quizzes        20% (5% each)         Limited
3 Programming Assignments    60% (20% each)        Limited
Final Exam                   20%                   Limited

Course Content

Duration: 4 hours, 16 minutes

This first week introduces the fundamental concepts of natural language processing (NLP), focusing on how computers process and analyze human language. You will explore key linguistic structures, including words and morphology, and learn essential techniques for text normalization and tokenization.
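
As a rough illustration of text normalization and tokenization, here is a minimal Python sketch. It is not taken from the course materials: Python is used only for brevity, and the `tokenize` function, its regular expression, and the choice to lowercase everything are assumptions made for this example.

```python
import re

def tokenize(text):
    """Lowercase the text, then split it into word and punctuation tokens.

    Hypothetical helper: words may carry one apostrophe suffix (e.g. "don't"),
    and every other non-space character becomes its own token.
    """
    normalized = text.lower()  # case folding as a simple normalization step
    return re.findall(r"[a-z0-9]+(?:'[a-z]+)?|[^\sa-z0-9]", normalized)

tokens = tokenize("Don't panic, NLP is fun!")
print(tokens)  # ["don't", 'panic', ',', 'nlp', 'is', 'fun', '!']
```

Real tokenizers handle many more cases (URLs, hyphenation, non-English scripts); this sketch only shows the basic idea.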

Duration: 5 hours, 51 minutes

This week explores foundational language modeling techniques, focusing on n-gram models and their role in statistical Natural Language Processing. You will learn how n-gram language models are constructed, smoothed, and evaluated for effectiveness.
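
To make "constructed, smoothed, and evaluated" concrete, the following Python sketch builds a bigram model with add-one (Laplace) smoothing and scores sentences by perplexity. It is an illustration under stated assumptions, not the course's implementation: the function names, the `<s>` and `</s>` boundary markers, whitespace tokenization, and the two-sentence toy corpus are all chosen for this example.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count bigrams and their left contexts over whitespace-split sentences."""
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = {"</s>"}  # tokens the model can predict
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens[1:])
        context_counts.update(tokens[:-1])
        bigram_counts.update(zip(tokens, tokens[1:]))
    return context_counts, bigram_counts, vocab

def bigram_prob(model, prev, word):
    """Add-one smoothed estimate of P(word | prev)."""
    context_counts, bigram_counts, vocab = model
    return (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + len(vocab))

def perplexity(model, sentence):
    """Perplexity of one sentence; assumes its words are in the vocabulary."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    log_prob = sum(math.log(bigram_prob(model, p, w))
                   for p, w in zip(tokens, tokens[1:]))
    return math.exp(-log_prob / (len(tokens) - 1))

# Toy corpus, invented for illustration only.
model = train_bigram(["i like nlp", "i like models"])
print(bigram_prob(model, "i", "like"))   # (2 + 1) / (2 + 5)
print(perplexity(model, "i like nlp"))   # word order seen in training
print(perplexity(model, "nlp like i"))   # unseen order, higher perplexity
```

Lower perplexity means the model finds the sentence less surprising, which is why the sentence that matches the training data scores better than the scrambled one.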

Duration: 7 hours, 6 minutes

This week introduces text classification and explores logistic regression as a powerful classification technique. You will learn how logistic regression models work, including key mathematical concepts such as the logit function, gradients, and stochastic gradient descent. The week also covers evaluation metrics for assessing classifier performance.
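
The pieces named above (the sigmoid that inverts the logit, the gradient of the cross-entropy loss, and stochastic gradient descent) fit together as in the following Python sketch. It is a minimal example under assumptions: the two features (counts of positive and negative words), the toy data, and the hyperparameters `lr`, `epochs`, and `seed` are hypothetical and not drawn from the course.

```python
import math
import random

def sigmoid(z):
    """Inverse of the logit function: maps a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Estimated probability that feature vector x belongs to class 1."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def train_sgd(examples, n_features, lr=0.5, epochs=200, seed=0):
    """Fit weights by stochastic gradient descent on cross-entropy loss."""
    rng = random.Random(seed)
    w = [0.0] * n_features
    b = 0.0
    data = list(examples)
    for _ in range(epochs):
        rng.shuffle(data)                # visit examples in random order
        for x, y in data:
            error = predict(w, b, x) - y  # d(loss)/d(score) for one example
            for i, xi in enumerate(x):
                w[i] -= lr * error * xi   # gradient step for each weight
            b -= lr * error               # gradient step for the bias
    return w, b

# Hypothetical features: [count of positive words, count of negative words].
DATA = [([2, 0], 1), ([1, 0], 1), ([0, 2], 0), ([0, 1], 0)]
w, b = train_sgd(DATA, n_features=2)
for x, y in DATA:
    print(x, y, round(predict(w, b, x), 3))
```

Each update uses a single example, which is what makes the descent "stochastic"; batch gradient descent would average the gradient over all examples before stepping.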

Duration: 7 hours, 18 minutes

This final week explores how words can be represented as vectors in a high-dimensional space, allowing computational models to capture semantic relationships between words. You will learn about both sparse and dense vector representations, including TF-IDF, Pointwise Mutual Information (PMI), Latent Semantic Analysis (LSA), and Word2Vec. The module also covers techniques for evaluating and applying word embeddings.
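
As one small example of a sparse vector representation, the Python sketch below weights word-context co-occurrence counts by positive PMI and compares words with cosine similarity. The counts are invented for illustration, and the helper names (`ppmi`, `vector`, `cosine`) are assumptions, not part of the course materials.

```python
import math
from collections import Counter

def ppmi(cooc):
    """Positive PMI for each (word, context) pair with a nonzero count."""
    total = sum(cooc.values())
    word_totals, context_totals = Counter(), Counter()
    for (word, context), n in cooc.items():
        word_totals[word] += n
        context_totals[context] += n
    return {
        (word, context): max(
            0.0,
            math.log2(n * total / (word_totals[word] * context_totals[context])),
        )
        for (word, context), n in cooc.items()
    }

def vector(scores, word):
    """Sparse vector for a word: context -> PPMI weight."""
    return {c: v for (w, c), v in scores.items() if w == word}

def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Invented co-occurrence counts: (word, context word) -> count.
COOC = {
    ("cherry", "pie"): 4,
    ("strawberry", "pie"): 3,
    ("digital", "data"): 5,
    ("digital", "pie"): 1,
    ("information", "data"): 4,
}
scores = ppmi(COOC)
print(cosine(vector(scores, "cherry"), vector(scores, "strawberry")))
print(cosine(vector(scores, "cherry"), vector(scores, "digital")))
```

Words that share informative contexts end up with similar vectors. Dense methods such as LSA and Word2Vec pursue the same goal with far fewer dimensions.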

Duration: 1 hour, 12 minutes

This module contains materials for the final exam. If you've upgraded to the for-credit version of this course, please make sure you review the additional for-credit materials in the Introductory module and anywhere else they may be found.

This exam is similar to the quizzes throughout the course:

  • The exam is not proctored.
  • It is a one-hour exam.
  • You may submit your exam only once.
  • The exam contains only multiple choice questions.
  • There are no programming questions in the exam.
  • You are not allowed to use any notes or access other websites when you take your exam.

Notes

  • Page Updates: This page is periodically updated. Course information on the Coursera platform supersedes the information on this page. Click the View on Coursera button above for the most up-to-date information.