This is a tutorial talk, taking us from sequence-based neural language models to Transformers (including the GPT models) as they are applied to natural language. We will cover what they are used for, how they are trained, and how they are evaluated, and point to some resources for experimenting yourself. It is aimed at all levels of understanding, though some prior knowledge of probability and information theory is recommended.
Julian Hough: How neural (large) language models work as applied to natural language: from sequence models to transformers
014, Computational Foundry