Natural Language Processing

March 23, 2023
Kaleigh O'Merry

Structured Transformer Models for NLP

The field of natural language processing has recently unlocked a wide range of new capabilities through the use of large language models, such as GPT-4. The growing use of these models motivates a more thorough understanding of how and why they work, as well as further improvements in both quality and efficiency.

In this talk, I will present my work on analyzing and improving the Transformer architecture underlying today’s language models through the study of how information is routed between multiple words in an input. I will show that such models can predict the syntactic structure of text in a variety of languages, and discuss how syntax can inform our understanding of how the networks operate. I will also present my work on structuring information flow to build radically more efficient models, including models that can process text of up to one million words, which enables new possibilities for NLP with book-length text.
