COLUMBIA UNIVERSITY COMS 6113

Overview

Data management systems are the cornerstone of modern applications, businesses, and science (including data). If you were excited by the topics in 4111, this graduate-level course will be a deep dive into classic and modern database systems research. Topics will range from classic database system design and modern optimizations in single-machine and multi-machine settings to data cleaning and quality and application-oriented databases.

The class places a heavy emphasis on paper reading and discussion. The point is to practice reading papers critically, discussing facets of the papers, implementing ideas in research papers, and conducting research. As such, students will be expected to read papers in depth and conduct a semester-long research project.

Ideally, you will be comfortable with reading code that is not yours, open to trying different software systems, and willing to actively participate in and lead discussions.

Course Expectations

Participation

Participation is mandatory. See the first class’s slides.

Slack

Students are given a role to focus on in the next class’s discussion.

Project (semester long)

You will pursue a semester-long research project related to this course. The project is a significant part of the course grade.

FAQ

What’s the relationship between the different database courses? (solid lines are prereqs, blue dashed lines are recommended).

What’s the difference from 6111 Advanced Database Systems?

What’s the difference from 4112 Database System Implementation?

Can I take both 6111 and 6113 for credit?

Collaboration/Copying Policy

Refer to Columbia’s academic honesty policy if you are at all unsure.

You must write all the code you hand in for the programming assignments, except for code that we give you as part of the assignment. You are not allowed to look at anyone else’s solution, at solutions from previous years, or at solutions from other universities. You may discuss the assignments with other students, but you may not look at or use each other’s code. The same rule holds for the question assignments: you must write all answers yourself and not look at others’ answers, but you can discuss the questions with others at a high level. You are also not allowed to look for or at solutions to the assignments on the Internet. You can search for small pieces of code that solve small parts of your assignments, and you may use tutorials to learn. However, if you copy any code from anywhere, we request that you identify the origin in a comment in the code.

Your reviews must be your own original writing, based on your own understanding of and thoughts about the reading. Copying or paraphrasing content written by others is not allowed.

Be advised that we will be running all assignments through the MOSS code similarity tool, which is very accurate even after a significant amount of obfuscation, so we will identify and report anyone who attempts to breach this rule. We will include in our tests solutions from previous years, both from Columbia and elsewhere. Both copy-ers and copy-ees will be punished. You are responsible for protecting your code and homework from others and for not leaving them lying around in publicly open directories.

Finally, you may discuss the questions for each assignment with other students, but you may not look at other students’ answers. You must write your answers yourself.