If you are interested in taking this course, please fill out THIS SHORT FORM. Due to the small class size, we will use the answers to balance student backgrounds and expertise. To ensure commitment, we are not currently accepting audits.
LLMs have opened new possibilities for automated agents that plan and complete tasks on the user’s behalf. Such agents have the potential to usher in a new industrial revolution by automating organizational processes. However, agents are currently limited to soft-edge tasks that tolerate large errors, and they are too unreliable for hard-edge tasks, such as those in healthcare or enterprise settings, where accuracy and reliability are paramount. In short, what does it take for agents to be used in enterprises?
This graduate-level course will cut across the technology stack to examine the research questions that need to be answered for agents to be viable for real tasks that matter. Each session will review 1-3 papers or systems and discuss research opportunities that arise from the gap between existing research and enterprise requirements. Topics will span systems (data systems and ML systems), AI (LLMs, agent-based planning), HCI, and theory (reinforcement learning, markets).
Broad questions include
1/21: Introduction & a quick history of agents - Eugene & Kostis
1/23: Tutorial: Agents Overview - Xiao Yu, Columbia
1/28: Tutorial: Agent Planning - Xiao Yu, Columbia
1/30: Now: SWEBench - John Yang, Stanford
2/4: Now: Agents at Google - Fatma Ozcan, Google Research
2/6: Use Case: Bureaucracy - Jeffrey Schlegelmilch, National Center for Disaster Preparedness
2/13: Use Case: Agents in Systems Optimization - Shreya Shankar PhD, UC Berkeley
2/18: Now: Agent Frameworks - Phil Calçado, Outropy
2/20: Now: Simulation for embodied agents - Yunzhu
2/25: Now: Servings - Kostis Kaffes
2/27: Use Case: TBA
3/4: TBA
3/6: Models: Neurosymbolic training - Baishakhi Ray, Columbia
3/11: HAI: Hand-offs with humans and context - Lydia Chilton, Columbia
3/13: Use Case: TBA
3/25: Models: Planning - Shipra Agrawal, Columbia
3/27: Systems: Lineage and Data-flow policies - Eugene Wu, Columbia
4/1: Use Case: Coding (AutoCodeRover) - Yuntong Zhang, NUS
4/3: HAI: Evaluating agent outcomes - TBA
4/8: HAI: Schema and Process Induction - TBA
4/10: Systems: ML for systems configuration - TBA
4/15: Systems: Performance Hints - TBA
4/17: Models: Long context LLM - Kuntai Du, UChicago
4/22: Systems: Monitoring - TBA
4/24: TBA
4/29: Presentations
5/1: Presentations