Exercise 2: Language Modelling with LSTMs

Lab session on Language Modelling using LSTMs

In this lab we look at how language models have traditionally been implemented using Recurrent Neural Networks. The lab walks through the data preparation process of turning raw text into encoded token sequences and shows how a trained language model can be used to generate text.
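As a concrete illustration of that pipeline, here is a minimal sketch of going from raw text to padded integer sequences, assuming the classic Keras `Tokenizer` API (available in TensorFlow 2.x, as on Colab); the toy corpus and variable names are illustrative, not the lab's actual data.

```python
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Toy corpus standing in for the lab's text data.
corpus = [
    "the cat sat on the mat",
    "the dog lay on the rug",
]

# Build the vocabulary: every word gets an integer id (0 is reserved for padding).
tokenizer = Tokenizer()
tokenizer.fit_on_texts(corpus)
vocab_size = len(tokenizer.word_index) + 1

# For language modelling, every prefix of a sentence predicts its next word,
# so each line is expanded into all of its n-gram prefixes.
sequences = []
for line in corpus:
    ids = tokenizer.texts_to_sequences([line])[0]
    for i in range(1, len(ids)):
        sequences.append(ids[: i + 1])

# Pad to a common length; the last token of each row becomes the target.
padded = pad_sequences(sequences)
X, y = padded[:, :-1], padded[:, -1]
```

Pre-padding (the `pad_sequences` default) is the usual choice here, since it keeps the most recent words closest to the prediction position.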

The lab is mainly intended to be run on Google Colab; you can open it by following this link.

Intended learning outcomes

  • Build a Keras model based on the LSTM architecture (a sketch follows this list)
  • Understand how to go from text data to encoded token sequences suitable for feeding into a recurrent network
  • Understand how a language model can be used to generate text continuations based on a prompt (see the generation sketch below).
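For orientation, below is a minimal sketch of what such a Keras LSTM language model could look like, reusing the `X`, `y`, and `vocab_size` variables from the tokenization sketch above; the layer sizes and epoch count are illustrative choices, not the lab's prescribed values.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

model = Sequential([
    # Learn a dense vector representation for each token id.
    Embedding(input_dim=vocab_size, output_dim=64),
    # Summarise the prefix into a single hidden state.
    LSTM(128),
    # Predict a probability distribution over the next token.
    Dense(vocab_size, activation="softmax"),
])

# Integer targets pair with the sparse cross-entropy loss.
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
model.fit(X, y, epochs=50, verbose=0)
```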
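To generate a continuation, the model is applied repeatedly: encode the prompt, predict the next word, append it, and feed the extended text back in. This greedy-decoding sketch reuses the `tokenizer` and `model` from above; the `generate` helper is hypothetical.

```python
import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences

def generate(prompt, n_words, max_len):
    text = prompt
    for _ in range(n_words):
        # Encode the running text and pad it to the training input length.
        ids = tokenizer.texts_to_sequences([text])[0]
        ids = pad_sequences([ids], maxlen=max_len)
        # Pick the most probable next word (greedy decoding).
        probs = model.predict(ids, verbose=0)[0]
        next_id = int(np.argmax(probs))
        text += " " + tokenizer.index_word.get(next_id, "")
    return text

print(generate("the cat", n_words=3, max_len=X.shape[1]))
```

Greedy argmax always produces the same continuation for a given prompt; sampling from the softmax distribution instead is a common way to get more varied output.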