Three Out-Of-The-Box Transformer Models
Photo by Enrique Guzmán Egas on Unsplash
How many lines of code does it take to use one of Google AI’s top-performing language models and apply it to your own text summarization project?
Or how about intelligent language generation? You start writing, and OpenAI’s GPT-2 finishes — how many lines of code?
Must be a lot? No — both can be built in just seven lines of Python.
In my opinion, that is absolutely ridiculous. Seven lines of code to apply models that are the culmination of decades of work, produced by some of the most intelligent people on Earth, using millions of dollars of research funds.
Already, my mind is blown. It’s incredible, but it’s true. This article is a list of cheat codes to apply three of these spectacular models to your own work — with near-zero overhead.
> Text Summaries with T5
> Sentiment Analysis with BERT
> Language Generation with GPT-2
Enjoy!
Summarization with T5
Our first model covers text summarization. This consists of taking a large amount of text and compressing it into a smaller body of text while preserving the most important details.
This seems like an astonishingly complex task, one we would be unlikely to pull off without plenty of prior experience.
But, that is not the case.
In-Full
The model is incredibly easy to set up and use. There are three key steps:
- Tokenization of text to input IDs
- Processing by the T5 model
- Decoding of the model outputs back to text
All of this in code is:
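A minimal sketch using Hugging Face’s transformers library (the t5-base checkpoint and these generation settings are illustrative choices):

```python
# pip install transformers torch sentencepiece
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

text = "..."  # the passage to summarize goes here

# (1) tokenize the text to input IDs; T5 expects a "summarize:" task prefix
inputs = tokenizer.encode("summarize: " + text, return_tensors="pt",
                          max_length=512, truncation=True)

# (2) process with the T5 model to generate summary token IDs
summary_ids = model.generate(inputs, max_length=150, min_length=40,
                             num_beams=4, early_stopping=True)

# (3) decode the model output back to text
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)
```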
That’s all it takes! Let’s take a chunk of text from the Wikipedia page about Winston Churchill and see what happens when we feed it into our summarizer:

T5 summarizer output.
With just a few lines of code, the model has brilliantly summarized what I would perceive as some of the most important parts of the text.
A drawback, in this case, is that the model focused solely on the second paragraph, missing important information present in the third. Nonetheless, for an out-of-the-box solution, it’s pretty impressive!
Text Classification with BERT
Text classification is simple but incredibly useful. One of the most common use-cases for natural language classification is sentiment analysis.
We will use sentiment analysis in our example — however, we can use this same approach for any text classification task.
Unlike the other two use-cases, we do need to train our model — so that it understands what we are asking it to classify. The steps are slightly different because of this:
- Preprocessing (including tokenization) of the data
- Training of the model
- Classification of test/real data using the model
Model
The model itself is incredibly easy to define and follows the same structure as the other models.
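A minimal sketch in TensorFlow, assuming Hugging Face’s TFAutoModel, a 512-token input, and five sentiment classes (all illustrative choices):

```python
import tensorflow as tf
from transformers import TFAutoModel

# pre-trained BERT body; "bert-base-cased" is an illustrative checkpoint
bert = TFAutoModel.from_pretrained("bert-base-cased")

# two input layers: token IDs and the attention mask
input_ids = tf.keras.layers.Input(shape=(512,), name="input_ids", dtype="int32")
mask = tf.keras.layers.Input(shape=(512,), name="attention_mask", dtype="int32")

# pass both through BERT; index [1] is the pooled [CLS] representation
embeddings = bert(input_ids, attention_mask=mask)[1]

# a small classification head on top of BERT
x = tf.keras.layers.Dense(1024, activation="relu")(embeddings)
y = tf.keras.layers.Dense(5, activation="softmax", name="outputs")(x)

model = tf.keras.Model(inputs=[input_ids, mask], outputs=y)
```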
Pre-Processing
It’s incredibly easy to build the model; there’s almost nothing to it. However, we need to make sure we are feeding the right data into our model.
We will use the movie review sentiment dataset from here, using its train.tsv file to train our model.
The model expects tokenized IDs and attention mask tensors as input, and one-hot encoded target labels.
Both input tensors are built using the `encode_plus` method. The IDs are returned by default, and we add `return_attention_mask=True` to return the attention mask tensor too:
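A sketch of the tokenization loop, assuming pandas and a Phrase text column in train.tsv:

```python
import numpy as np
import pandas as pd
from transformers import BertTokenizer

df = pd.read_csv("train.tsv", sep="\t")
tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
seq_len = 512  # must match the model's input length

# pre-allocate arrays for the token IDs and attention masks
Xids = np.zeros((len(df), seq_len), dtype="int32")
Xmask = np.zeros((len(df), seq_len), dtype="int32")

for i, phrase in enumerate(df["Phrase"]):
    tokens = tokenizer.encode_plus(
        phrase,
        max_length=seq_len,
        truncation=True,
        padding="max_length",
        add_special_tokens=True,
        return_attention_mask=True,  # we want the mask tensor too
    )
    Xids[i, :] = tokens["input_ids"]
    Xmask[i, :] = tokens["attention_mask"]
```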
One-hot encoding of our target labels is handled easily with nothing more than Numpy:
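For instance, assuming the labels live in a Sentiment column:

```python
# map each integer label to a one-hot row, e.g. 3 -> [0, 0, 0, 1, 0]
labels = df["Sentiment"].values
y = np.zeros((labels.size, labels.max() + 1))
y[np.arange(labels.size), labels] = 1
```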
Because we’re using TensorFlow — we can easily format, shuffle, and batch our data using the built-in Dataset object:
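A sketch, with an illustrative batch size of 32 and a 90/10 train/validation split:

```python
dataset = tf.data.Dataset.from_tensor_slices((Xids, Xmask, y))

def map_func(ids, mask, labels):
    # pack the two inputs into the dict format the model's named inputs expect
    return {"input_ids": ids, "attention_mask": mask}, labels

dataset = dataset.map(map_func)
dataset = dataset.shuffle(100_000).batch(32, drop_remainder=True)

split = int((len(df) / 32) * 0.9)
train_ds = dataset.take(split)
val_ds = dataset.skip(split)
```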
Train and Predict
We need to train our model this time too. First, we compile our model, then begin training with `model.fit`:
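For example, with illustrative (not tuned) optimizer settings:

```python
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-5)  # low LR for fine-tuning
loss = tf.keras.losses.CategoricalCrossentropy()
acc = tf.keras.metrics.CategoricalAccuracy("accuracy")

model.compile(optimizer=optimizer, loss=loss, metrics=[acc])

history = model.fit(train_ds, validation_data=val_ds, epochs=2)
```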
Now we have our text classification model, built with Google’s BERT! We can make predictions with:
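Something like this, where the example review is just a placeholder:

```python
# tokenize one piece of text, then feed it to the trained model
tokens = tokenizer.encode_plus(
    "an astonishing, beautifully shot film",
    max_length=seq_len, truncation=True, padding="max_length",
    add_special_tokens=True, return_tensors="tf")

probs = model.predict({"input_ids": tokens["input_ids"],
                       "attention_mask": tokens["attention_mask"]})
print(probs.argmax(axis=-1))  # index of the predicted sentiment class
```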
This model isn’t as clean-cut due to the need to train our classifier, but other than this, there is nothing more we need to do — and the level of accuracy possible on language classification tasks using BERT is simply phenomenal.
I have written about this use-case in more depth here.
Language Generation with GPT-2
Last but not least is language generation with GPT-2. The little brother of GPT-3, with 1.5B parameters, GPT-2 is hardly small, and only seems so when compared to GPT-3’s 175B parameters.
We cannot directly use GPT-3, but GPT-2 is widely available, and we can use it with incredible ease.
We will be building a language generator with it, feeding a sentence or paragraph into the model to generate rational, coherent language.
Model
We take a very similar approach as with the other models in this article:
- Tokenization of text to input IDs
- Generation of language by the GPT-2 model
- Decoding of the model output back to text
All of this is done with just:
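A minimal sketch, again with Hugging Face’s transformers library (the gpt2 checkpoint and sampling settings are illustrative choices):

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "..."  # your opening sentence or paragraph goes here

# (1) tokenize the prompt to input IDs
inputs = tokenizer.encode(prompt, return_tensors="pt")

# (2) generate a continuation with the GPT-2 model
outputs = model.generate(inputs, max_length=200, do_sample=True,
                         temperature=0.8, top_k=50)

# (3) decode the model output back to text
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```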
When we apply this code to a short extract, again from Churchill’s Wikipedia page — we get this:

The output of GPT-2 language generation produced from the highlighted text (the input).
We can tune the randomness and coherence of the output using `temperature` and `top_k`, which occasionally offers us some pretty entertaining outputs.

An alternative history where Chamberlain’s mother took over as prime minister after his premature death in 1918 — and Sir Richard Branson served on Churchill’s International Energy and Scientific Research Council in 1958.
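For example, raising both values loosens the sampling (the numbers here are illustrative):

```python
# higher temperature flattens the next-token distribution, and a larger
# top_k widens the candidate pool, producing more surprising text
outputs = model.generate(inputs, max_length=200, do_sample=True,
                         temperature=1.2, top_k=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```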
That’s it for these three incredibly easy-to-use transformer models, applying some of the most powerful language models on the planet in just a few lines of code.
More importantly, we have covered just three models. There are a huge number of models out there covering a wide range of applications, such as question-answering, translation, and named-entity recognition, to name a few.
I hope you’ve enjoyed this article. If you have any questions or feedback — feel free to reach out on Twitter or in the comments below.
Thanks for reading!