GPT-4 — a shift from ‘what it can do’ to ‘what it augurs’
March 31, 2023

Context

  • Recently, Microsoft-backed OpenAI launched its artificial intelligence (AI) model GPT-4, an upgrade from GPT-3.5.
  • The article highlights the new features embedded in the GPT-4 model, the challenges associated with it, and what it augurs for the future.

What is the Meaning of Generative Pre-Trained Transformer (GPT)?

  • GPTs are machine learning algorithms that respond to input with human-like text. They have the following characteristics:
    • Generative: They generate new information.
    • Pre-trained: They first undergo unsupervised pre-training on a large corpus of data, and then supervised fine-tuning on specific tasks to guide the model.
    • Transformers: They use a deep learning architecture (the transformer) that learns context by tracking relationships in sequential data. Specifically, GPTs track the words or tokens in a sentence and predict the next word or token.
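
To make "predict the next word or token" concrete, the sketch below runs a single prediction step in Python. Since GPT-4's weights are not public, it uses the openly available GPT-2 (an earlier model in the same family) via the Hugging Face transformers library; the prompt and the greedy argmax decoding are illustrative choices, not OpenAI's production setup.

    # One next-token prediction step with GPT-2, an open model in the GPT family.
    # Illustrative only: GPT-4's own weights and pipeline are not public.
    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    prompt = "The Eiffel Tower is located in"
    inputs = tokenizer(prompt, return_tensors="pt")  # text -> token IDs

    with torch.no_grad():
        logits = model(**inputs).logits  # one score per vocabulary token, per position

    # The next-token prediction is read off the last position; argmax is
    # greedy decoding (production systems usually sample instead).
    next_token_id = int(logits[0, -1].argmax())
    print(tokenizer.decode(next_token_id))  # e.g. " Paris"

Generating longer text is simply this step in a loop: append the predicted token to the input and predict again.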

About GPT-4

  • It is OpenAI's large multimodal language model that generates text from textual and visual input.
  • It can understand and produce language that is creative and meaningful, and will power an advanced version of the company’s sensational chatbot, ChatGPT.

Significance of GPT-4

  • It is more conversational and creative and is a remarkable improvement over its predecessor, GPT-3.5, which first powered ChatGPT.
    • While GPT-3.5 could not handle large prompts well, GPT-4 can keep up to 25,000 words in context, an improvement of more than 8x over GPT-3.5's roughly 3,000 words.
  • Its biggest innovation is that it can accept text and image input simultaneously and consider both while drafting a reply (a hedged API sketch follows this list).
    • For example, given an image of ingredients and the question, “What can we make from these?”, GPT-4 returns a list of dish suggestions and recipes.
  • GPT-4 was also evaluated on several examinations designed for humans and performed far better than average.
    • For instance, in a simulated bar examination, it scored around the 90th percentile, whereas its predecessor scored in the bottom 10%.
    • GPT-4 also sailed through advanced placement (AP) examinations in environmental science, statistics, art history, biology, and economics.
  • Its performance in language comprehension (in English and 25 other languages, including Punjabi and Marathi) also surpasses other high-performing language models.
  • It can also purportedly understand human emotion and humour, for instance explaining why a given picture is funny.
  • Its ability to describe images is beneficial for the visually impaired.
  • It can also do a lot of white-collar work, especially programming and writing tasks.
  • Wider use of language models like these will have effects on economies and public policy.
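
The ingredients example above amounts to a single request carrying both text and an image. Below is a hedged sketch using the 1.x client of OpenAI's openai Python library; the model name, image URL, and request shape are illustrative assumptions, not details from the article.

    # A hedged sketch of the ingredients example as one multimodal request.
    # The model name and image URL are placeholders.
    from openai import OpenAI

    client = OpenAI()  # reads the OPENAI_API_KEY environment variable

    response = client.chat.completions.create(
        model="gpt-4-vision-preview",  # illustrative vision-capable model name
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "What can we make from these?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/ingredients.jpg"}},
            ],
        }],
    )
    print(response.choices[0].message.content)  # e.g. dish suggestions and recipes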

Limitations of GPT-4

  • It failed to do well in advanced English Language and English Literature examinations, scoring 40% in both.
  • ChatGPT-generated text infiltrated school essays and college assignments almost immediately after release; GPT-4's prowess now threatens examination systems as well.
  • It leaves manufacturing and scientific jobs relatively untouched.
  • Like its predecessor, GPT-4 is still prone to flaws: its output may not always be factually correct.
    • OpenAI refers to this trait as “hallucination”.
    • While much better at sticking to facts than GPT-3.5, GPT-4 may still introduce fictitious information subtly.
  • OpenAI has also not been transparent about the inner workings of GPT-4, citing both the competitive landscape and the safety implications of large-scale models like GPT-4.
    • Thus, the GPT-4 technical report contains no further details about its architecture (including model size), hardware, training compute, dataset construction, training method, or similar.
  • Both ethical concerns and environmental costs have been cited as harms of large language models.
    • There is also an opportunity cost: a race for ever-bigger models trained on larger datasets distracts from smarter approaches that look for meaning and train on curated datasets.

New Avenues Ahead

  • The advent of GPT-4 upgrades the question from what it can do to what it augurs.
  • Microsoft Research mentioned observing “sparks” of artificial general intelligence in GPT-4.
    • This implies a system that excels at several task types and can comprehend and combine concepts such as writing code to create a painting or expressing a mathematical proof in the form of a Shakespearean play.
  • Moreover, if intelligence is defined as a mental capability involving the ability to reason, plan, solve problems, think abstractly, comprehend complex ideas, learn quickly, and learn from experience, GPT-4 already succeeds at four of these seven criteria.
    • It is yet to master planning and learning.

Making an All-Inclusive GPT-4

  • GPT-4 has been trained on data scraped from the internet that contains several harmful biases and stereotypes.
    • The internet overrepresents people from economically developed countries who are young and male, a skew OpenAI intends to fix.
    • OpenAI's policy for patching up these biases thus far has been to create another model to moderate the responses, since it finds curating the training set infeasible (a sketch of this pattern follows this list).
  • However, potential holes in this approach include the possibility that the moderator model is trained to detect only the biases we are aware of, and mostly in the English language.
    • This model may be ignorant of stereotypes prevalent in non-western cultures, such as those rooted in caste.
    • As such, there is vast potential for GPT-4 to be misused as a propaganda and disinformation engine.
  • OpenAI has, though, assured that it has worked extensively to make GPT-4 safer to use, for instance by refusing to produce results that are obviously objectionable.
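
The "second model as moderator" approach mentioned in the list above can be sketched minimally as follows. OpenAI's public moderation endpoint (again via the 1.x openai client) stands in for the internal safety layer, which is not public; this shows the general filtering idea, not OpenAI's actual pipeline.

    # The "moderator model" pattern: generate a reply, then let a second
    # model decide whether to release it. Illustrative only.
    from openai import OpenAI

    client = OpenAI()

    def moderated_reply(prompt: str) -> str:
        # 1. Draft a candidate response with the language model.
        completion = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}],
        )
        reply = completion.choices[0].message.content

        # 2. A separate model classifies the draft; flagged drafts are withheld.
        verdict = client.moderations.create(input=reply)
        if verdict.results[0].flagged:
            return "Sorry, I can't help with that."
        return reply

Note that the weakness described above applies directly: such a filter can only catch the categories of harm it was trained to recognise.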

Other Language-models Underway

  • Apart from OpenAI’s models, AI company Anthropic has introduced a ChatGPT competitor named Claude.
  • Google recently announced PaLM, a model trained with more degrees of freedom (parameters) than GPT-3.

Conclusion

  • Global attempts are underway to create models with a trillion degrees of freedom (parameters).
  • However, these will be truly enormous language models that raise concerns about what they cannot do.
