COLUMBIA UNIVERSITY COMS 6113

Overview

Data management systems are the cornerstone of modern applications, businesses, and science (including data science). If you were excited by the topics in 4111, this graduate-level course is a deep dive into classic and modern database systems research. Topics will range from classic database system design and modern optimizations in single-machine and multi-machine settings to data cleaning and quality and application-oriented databases. This semester’s theme will look at how machine learning (ML) has affected many classic data management systems challenges, and also at how data management systems support and extend ML needs.

See the FAQ for the differences between 6113 and the other database courses.

Information

Grading

Recent Announcements

Tentative Schedule

Date | Topic | Notes
C1: Thu 01-19 | Intro + Classic Systems Overview. Preview of systems and topics of the semester + relevance of classic systems |
C2: Thu 01-26 | Indexes |
C3: Thu 02-02 | Joins |
C4: Thu 02-09 | Query Optimization |
C5: Thu 02-16 | Cost Estimation |
C6: Thu 02-23 | Main Memory Query Execution (Vectorization) |
C7: Thu 03-02 | Main Memory Query Execution (Compilation) | Plan+Team
C8: Thu 03-09 | Dataflow Engines |
C9: Thu 03-16 | Spring recess! |
C10: Thu 03-23 | Incremental Materialized Views |
C11: Thu 03-30 | In-DBMS ML | Status Update
C12: Thu 04-06 | App-DBMS interop + UDFs. Guest Speaker: Sujay Jayakar from Convex |
C13: Thu 04-13 | Learning over Joins |
C14: Thu 04-20 | Data Markets. Guest Speaker: Zachary Huang |
C15: Thu 04-27 | Projects |

Course design inspired by