
Solving Real-World Tasks with AI Agents

Shuyan Zhou, CMU Language Technologies Institute
Wednesday, February 07, 2024
12:00 pm - 1:00 pm
Duke Computer Science Colloquium

Lunch will be served at 11:45 AM.

For years, my dream has been to create autonomous AI agents capable of carrying out tedious procedural tasks (e.g., arranging conference travel), allowing me to focus on more creative and exciting work. Modern AI models, especially large language models (LLMs) like ChatGPT, have suddenly brought us much closer to achieving such AI agents. But has my dream already come true? In this talk, I will answer this question by delving into my systematic evaluation of AI agents on realistic tasks. The evaluation uncovers many critical limitations of AI agents in areas such as accurate grounding, long-term planning, and tool use. It suggests that LLMs are crucial yet early steps toward AI autonomy. To address these challenges, I will introduce my research on a more suitable "language" for AIs, which overcomes the inherent limitations of using natural language for task solving. Finally, I will discuss my work on teaching AI agents to learn new tools by reading the tool documentation rather than from direct demonstrations.

Shuyan Zhou is a final-year PhD student at the Language Technologies Institute at CMU, advised by Graham Neubig. Her research in NLP and AI focuses on creating AI agents for real-world tasks, such as using computers and generating code. Her work has been recognized at top natural language processing and machine learning conferences and journals such as ICLR, ICML, ACL, EMNLP, and TACL.