GPT-3 Learning Model Decoded

By Dhnesh on Nov. 3, 2020, 10:29 a.m.

Generative Pre-trained Transformer 3 (GPT-3) is the third-generation model in the GPT series. It uses deep learning to produce human-like text and is, at its core, a language prediction model.

Previous Models:
Previous models were trained on a large corpus of text. Humans can understand and perform a new task from a few examples or simple instructions, but NLP systems still find this hard. Earlier language models were not task-agnostic: to perform well on a specific task they had to be fine-tuned with state-of-the-art approaches on a vast amount of annotated data. GPT-2 and its predecessors were therefore not few-shot learners, and they could not generalise to tasks other than the ones they had been trained for.
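
For contrast, here is a minimal, hypothetical sketch (PyTorch-style; the model, dataset, and hyperparameters are placeholders, not taken from any published setup) of what that task-specific fine-tuning looks like: every new task needs its own annotated dataset and its own round of gradient updates.

```python
# Hypothetical sketch of task-specific fine-tuning, the approach GPT-2-era
# models relied on. The model, dataset and hyperparameters are illustrative.
import torch
from torch.utils.data import DataLoader

def fine_tune(model, labeled_dataset, epochs=3, lr=3e-5):
    """Update ALL model parameters on thousands of annotated examples
    for ONE specific task (e.g. sentiment classification)."""
    loader = DataLoader(labeled_dataset, batch_size=16, shuffle=True)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for inputs, labels in loader:           # task-specific annotations
            optimizer.zero_grad()
            logits = model(inputs)              # forward pass
            loss = loss_fn(logits, labels)
            loss.backward()                     # gradient / parameter updates
            optimizer.step()                    # tie the weights to this task
    return model
```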


GPT-3 Architecture
GPT-2 underfitted the WebText dataset, and training for longer could have reduced its perplexity even further. OpenAI built GPT-3 with 175 billion parameters: roughly ten times more than Microsoft's ground-breaking Turing-NLG language model and about a hundred times more than GPT-2. GPT-3 is trained on a huge unlabeled text dataset: the model repeatedly has to predict the next word in a passage, using only the preceding words as context. GPT-3 itself is based on the transformer neural-network architecture. It performed better than any other language model at a variety of tasks, including summarizing texts and answering questions.
The largest version, GPT-3 175B (the model usually just called “GPT-3”), has 175 billion parameters, 96 attention layers, and a batch size of 3.2 million tokens.
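
As a sanity check, the published hyperparameters roughly reproduce the 175 billion figure. The sketch below uses the common approximation that a transformer's non-embedding parameter count is about 12 × n_layers × d_model² (attention plus feed-forward blocks); the hidden size, vocabulary size, and context length are the values reported in the GPT-3 paper.

```python
# Back-of-the-envelope parameter count for GPT-3 175B from its published
# hyperparameters. This is an approximation, not an exact count.
n_layers   = 96        # decoder blocks ("attention layers")
d_model    = 12288     # hidden size
vocab_size = 50257     # BPE vocabulary (same tokenizer family as GPT-2)
n_ctx      = 2048      # context window in tokens

# Each block ~ 4*d_model^2 (attention) + 8*d_model^2 (feed-forward, 4x wide)
block_params     = 12 * d_model ** 2
embedding_params = (vocab_size + n_ctx) * d_model

total = n_layers * block_params + embedding_params
print(f"approx. parameters: {total / 1e9:.0f} B")   # ~175 B
```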


GPT-3 Output:
With GPT-3, many of the NLP tasks discussed earlier can be done without any fine-tuning, gradient, or parameter updates, which makes the model task-agnostic. OpenAI was keen to spread the hype and show off impressive samples of cool applications. The model also proved its competence when a college student created a fake blog with GPT-3 and convinced readers that a human had written it. As a result of its humongous size, GPT-3 can do what no other model can do (well): perform specific tasks without any special tuning. You can ask GPT-3 to be a translator, a programmer, a poet, or a famous author, and it can do it with its user (you) providing fewer than 10 training examples. OpenAI announced that users could request access to its user-friendly GPT-3 API (a "machine learning toolset") to help OpenAI "explore the strengths and limits" of this new technology.
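
A rough sketch of what "fewer than 10 training examples" means in practice: the examples are simply written into the prompt, and no weights are updated. The call below assumes the openai Python client and the Completion endpoint as exposed in the 2020 API beta; the engine name, sampling parameters, and translation task are illustrative, not a definitive recipe.

```python
# Few-shot prompting: the "training examples" are just text in the prompt.
# No fine-tuning, no gradient or parameter updates.
# Assumes the openai Python client and Completion endpoint from the 2020 beta;
# engine name and sampling parameters are illustrative.
import openai

few_shot_prompt = (
    "Translate English to French.\n"
    "English: Where is the library?\n"
    "French: Où est la bibliothèque ?\n"
    "English: I would like a coffee.\n"
    "French: Je voudrais un café.\n"
    "English: The weather is nice today.\n"
    "French:"
)

response = openai.Completion.create(
    engine="davinci",          # GPT-3 base model in the API beta
    prompt=few_shot_prompt,
    max_tokens=32,
    temperature=0,
    stop="\n",                 # stop at the end of the translated line
)
print(response.choices[0].text.strip())
```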


While the sky is the limit, I would say AI doesn't have to be evil to destroy humanity.