DSC 234: Data-Centric AI and AI Engineering (officially “Data Systems for ML”); 4 units.
This is a research-based course on the data-centric aspects of the AI lifecycle, spanning the development, deployment, and maintenance of AI applications. It sits at the intersection of ML/AI, data management, and software systems. AI has long been ubiquitous in domains such as enterprise analytics, recommendation systems, social media analytics, and domain sciences. The rise of LLMs has made AI chatbots, RAG, and agentic applications pervasive for consumers as well. Students will learn about the landscape and evolution of such systems, the latest research, and some major open questions. This is a lecture-driven course; learning is evaluated through written quizzes and exams, peer discussions of cutting-edge research papers, and hands-on AI programming assignments. The course is aimed primarily at MS students interested in building real-world AI applications, as well as PhD students interested in research in this space.
Prerequisites:
1) A course on ML algorithms is absolutely required. It may have been taken at UCSD or elsewhere.
2) Python programming know-how is also required. Any course or module on Python programming, taken anywhere, suffices.
Alternatively, industry experience or substantial project experience with applied ML/AI and Python programming suffices in place of the above courses.
An introductory course on databases/data management and a course specifically on NLP or LLMs are also highly recommended but NOT strictly required.
Restricted to DS75 and DS76 students.