[Sds-seminars] [Sds-announce] S&DS Seminar: Zhuoran Yang, 03/31/25, 4pm-5pm, KT 13th Floor, Rm. 1327, "Unveiling In-Context Learning: Provable Training Dynamics and Feature Learning in Transformers"
Torres, Elizavette
elizavette.torres at yale.edu
Mon Mar 31 14:03:43 EDT 2025
Department of Statistics and Data Science<https://statistics.yale.edu/>
Zhuoran Yang, Yale University
Date: Monday, March 31, 2025
Time: 4:00PM to 5:00PM
Location: Kline Tower, 13th Floor, Rm. 1327 See map<http://maps.google.com/?q=219+Prospect+Street%2C+New+Haven%2C+CT%2C+06511%2C+us>
219 Prospect Street
New Haven, CT 06511
Webcast option: https://yale.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=23d16765-e107-4f8d-992a-b233012bcdb3
Title: Unveiling In-Context Learning: Provable Training Dynamics and Feature Learning in Transformers
Information and Abstract:
In-context learning (ICL) is a cornerstone of large language model (LLM) functionality, yet its theoretical foundations remain elusive due to the complexity of transformer architectures. In particular, most existing work only theoretically explains how the attention mechanism facilitates ICL under certain data models. It remains unclear how the other building blocks of the transformer contribute to ICL. To address this question, we study how a simple softmax transformer is trained to perform ICL on two synthetic tasks: (multi-task) linear regression and n-gram Markov chains. We show that the transformer successfully learns these tasks in-context. More importantly, we will interpret the estimator represented by the learned transformer, show how transformers are trained by gradient-based dynamics, and explain how features emerge during training. Our theory is further validated by experiments.
This is joint work with Siyu Chen, Jianliang He, Xintian Pan, Heejune Sheen, and Tianhao Wang.
3:30pm - Pre-talk meet and greet teatime - 219 Prospect Street, 13th floor. Light snacks and beverages will be available in the kitchen area.
For more details and upcoming events visit our website at https://statistics.yale.edu/calendar.