Mathematics & Machine Learning Seminar
Now the dominant architecture in machine learning, Transformer models have achieved remarkable advances and continue to reshape the landscape of natural language processing. This talk will serve as a brief introduction to the fundamentals of Transformers. We will begin with the basics of sequence-to-sequence (Seq2Seq) models and then delve into the attention mechanism on which Transformers rely heavily. Finally, we will review one of the most impactful papers in the field, "Language Models are Few-Shot Learners" by Tom B. Brown et al., with emphasis on the design and performance of GPT-3.
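For background, the attention mechanism the talk refers to is, in its standard scaled dot-product form from "Attention Is All You Need" (Vaswani et al., 2017),

\[
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^\top}{\sqrt{d_k}}\right) V,
\]

where \(Q\), \(K\), and \(V\) are the query, key, and value matrices and \(d_k\) is the key dimension. Intuitively, each query scores all keys, the softmax turns those scores into weights, and the output is the corresponding weighted average of the values.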
For more information, please contact the Math Department by phone at 626-395-4335 or by email at [email protected].