Guessing the next word is pure gold, or how language models work and how to build your own ChatGPT

Edition: 18th SFI Academic IT Festival

Date: March 30, 2023, 3:30 p.m.

Type: Lectures

Category: AI & data

Language: Polish

Speaker

Aleksander Smywiński-Pohl

Abstract

The lecture presents the most important methods and techniques used to train large language models. It begins with the key mechanism behind neural networks that analyze and generate text: language modeling, in both its causal and masked variants. Next comes the attention mechanism, the core of transformer neural networks, which will be compared with other neural architectures, recurrent networks in particular. The following part covers reinforcement learning from human feedback (RLHF), the mechanism behind the spectacular success of ChatGPT. The speaker will also point out the main obstacles to building language models "at home". The lecture concludes with a discussion of legal issues surrounding the creation of artificial intelligence models, copyright in particular.

Duration

60 min