<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<!--[if !mso]><style>v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style><![endif]--><style><!--
/* Font Definitions */
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Aptos;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        font-size:12.0pt;
        font-family:"Aptos",sans-serif;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
span.EmailStyle20
        {mso-style-type:personal-reply;
        font-family:"Aptos",sans-serif;
        color:windowtext;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;
        mso-ligatures:none;}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple" style="word-wrap:break-word">
<div class="WordSection1">
<div>
<p class="MsoNormal"><a href="https://statistics.yale.edu/" title=""Department of Statistics and Data Science""><span style="font-size:22.0pt;font-family:"Arial",sans-serif;color:#286DC0;text-decoration:none"><img border="0" width="150" height="49" style="width:1.5625in;height:.5104in" id="Picture_x0020_1" src="cid:0eb57de6-a2ae-4099-be34-ac97e0a7ec7d" alt="Department of Statistics and Data Science"></span></a><span style="font-size:11.0pt;font-family:"Arial",sans-serif;color:black">  
</span><b><span style="font-size:22.0pt;font-family:"Arial",sans-serif;color:#286DC0"><a href="https://statistics.yale.edu/" title="Home"><span style="color:#286DC0">Department of Statistics and Data Science</span></a></span></b><span style="font-family:"Arial",sans-serif;color:black"><o:p></o:p></span></p>
</div>
<div id="Signature">
<p><span style="font-size:11.0pt;font-family:"Arial",sans-serif"> </span><o:p></o:p></p>
<div>
<p class="MsoNormal"><span style="font-size:14.0pt;font-family:"Arial",sans-serif;color:black">Zhuoran Yang, Yale University<o:p></o:p></span></p>
</div>
<div style="margin-top:6.0pt;margin-bottom:12.0pt">
<p class="MsoNormal"><span style="font-size:14.0pt;font-family:"Arial",sans-serif;color:black"><img border="0" width="115" height="138" style="width:1.1979in;height:1.4375in" id="_x0000_i1025" src="https://statistics.yale.edu/sites/default/files/styles/user_picture_node/public/picture-2273-1693233160.jpg?itok=azUwtJuO"></span><span style="font-size:14.0pt;font-family:"Arial",sans-serif;color:black"><o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:14.0pt;font-family:"Arial",sans-serif;color:black">Date: Monday, March 31, 2025<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:14.0pt;font-family:"Arial",sans-serif;color:black">Time: 4:00PM to 5:00PM<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:14.0pt;font-family:"Arial",sans-serif;color:black">Location: Kline Tower, 13th Floor, Rm. 1327
</span><u><span style="font-size:14.0pt;font-family:"Arial",sans-serif;color:#286DC0"><a href="http://maps.google.com/?q=219+Prospect+Street%2C+New+Haven%2C+CT%2C+06511%2C+us"><span style="color:#286DC0">See map</span></a></span></u><span style="font-size:14.0pt;font-family:"Arial",sans-serif;color:black"> <o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:14.0pt;font-family:"Arial",sans-serif;color:black">219 Prospect Street<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:14.0pt;font-family:"Arial",sans-serif;color:black">New Haven, CT 06511<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:14.0pt;font-family:"Arial",sans-serif;color:black">Webcast option:
<a href="https://yale.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=23d16765-e107-4f8d-992a-b233012bcdb3">
https://yale.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=23d16765-e107-4f8d-992a-b233012bcdb3</a><o:p></o:p></span></p>
</div>
<div>
<div style="margin-top:6.0pt;margin-bottom:12.0pt">
<p class="MsoNormal" style="background:white"><span style="font-size:14.0pt;font-family:"Arial",sans-serif;color:black">Title: Unveiling In-Context Learning: Provable Training Dynamics and Feature Learning in Transformers<o:p></o:p></span></p>
</div>
<p class="MsoNormal" style="background:white"><span style="font-size:14.0pt;font-family:"Arial",sans-serif;color:black">Information and Abstract: <o:p></o:p></span></p>
<div style="margin-bottom:12.0pt">
<p class="MsoNormal"><span style="font-size:14.0pt;font-family:"Arial",sans-serif;color:black">In-context learning (ICL) is a cornerstone of large language model (LLM) functionality, yet its theoretical foundations remain elusive due to the complexity of transformer
 architectures. In particular, most existing theoretical work explains only how the attention mechanism facilitates ICL under certain data models. It remains unclear how the other building blocks of the transformer contribute to ICL. To address this question,
 we study how a simple softmax transformer is trained to perform ICL on two synthetic tasks: (multi-task) linear regression and n-gram Markov chains. We show that the transformer successfully learns these tasks in-context. More importantly, we interpret the
 estimator represented by the learned transformer, characterize how transformers are trained by gradient-based dynamics, and show how features emerge during training. Our theory is further validated by experiments.<o:p></o:p></span></p>
</div>
<div style="margin-bottom:12.0pt">
<p class="MsoNormal"><span style="font-size:14.0pt;font-family:"Arial",sans-serif">This is joint work with Siyu Chen, Jianliang He, Xintian Pan, Heejune Sheen, and Tianhao Wang.<o:p></o:p></span></p>
</div>
<div style="margin-bottom:12.0pt">
<p class="MsoNormal"><span style="font-size:14.0pt;font-family:"Arial",sans-serif;color:black">3:30pm - Pre-talk meet and greet teatime - 219 Prospect Street, 13th floor. Light snacks and beverages will be available in the kitchen area.<o:p></o:p></span></p>
</div>
</div>
<p><span style="font-family:"Arial",sans-serif;color:black">For more details and upcoming events visit our website at
</span><span style="font-family:"Arial",sans-serif;color:#467886"><a href="https://statistics.yale.edu/calendar"><span style="color:#467886">https://statistics.yale.edu/calendar</span></a></span><span style="font-family:"Arial",sans-serif">.</span><o:p></o:p></p>
<p><span style="font-size:11.0pt;font-family:"Arial",sans-serif"> </span><o:p></o:p></p>
<p><span style="font-size:18.0pt;font-family:"Arial",sans-serif">Department of Statistics and Data Science</span><o:p></o:p></p>
<p><span style="font-size:9.0pt;font-family:"Arial",sans-serif;color:black">Yale University<br>
Kline Tower</span><o:p></o:p></p>
<p><span style="font-size:9.0pt;font-family:"Arial",sans-serif;color:black">219 Prospect Street<br>
New Haven, CT 06511</span><o:p></o:p></p>
<p><span style="font-size:11.0pt;color:#467886"><a href="https://statistics.yale.edu/"><span style="color:#467886">https://statistics.yale.edu/</span></a></span><o:p></o:p></p>
<p> <o:p></o:p></p>
</div>
</div>
</body>
</html>