What is Jais?

Sept. 1, 2023

A unit of Abu Dhabi AI company G42 recently released 'Jais', which it describes as the world's most advanced Arabic large language model.

About Jais:

  • It is a bilingual Arabic-English model that has been trained on a massive dataset of text and code.
  • It can be used for a variety of tasks, such as machine translation, text summarisation, and question-answering.
  • It was trained on the Condor Galaxy, the world's largest AI supercomputer, using 116 billion Arabic tokens and 279 billion English tokens.
  • It is also open-source, which means that anyone can use it or contribute to its development.
  • It is available to download on the Hugging Face machine learning platform.
  • The release of Jais is a significant step forward for the development of AI in the Arabic-speaking world.
  • Applications
    • Potential applications of Jais include machine translation, which can be used to translate text from Arabic to English and vice versa.
    • This could improve the accessibility of information for Arabic speakers and facilitate communication between Arabic speakers and speakers of other languages.
    • It can distil long texts, from news articles to research papers, into succinct summaries, making the content easier to access and understand.
    • It also responds to queries about text, enabling educational tools like responsive chatbots for students or robust customer service applications for client inquiries.
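
To put the training-data figures quoted above in proportion, the short sketch below adds the two reported token counts and works out each language's share. The function name `token_mix` is a hypothetical helper introduced only for this illustration, and the calculation covers just the Arabic and English figures given in this article, not any other data the model may have been trained on.

```python
# Token counts quoted in the article, in billions.
ARABIC_TOKENS_B = 116
ENGLISH_TOKENS_B = 279


def token_mix(arabic_b, english_b):
    """Return the combined count (billions) and each language's share in percent.

    Hypothetical helper for illustration; it only combines the two figures passed in.
    """
    total = arabic_b + english_b
    return {
        "total_b": total,
        "arabic_pct": round(100 * arabic_b / total, 1),
        "english_pct": round(100 * english_b / total, 1),
    }


mix = token_mix(ARABIC_TOKENS_B, ENGLISH_TOKENS_B)
print(mix)
```

On these figures, Arabic makes up a little under a third of the combined Arabic and English tokens, which fits the description of Jais as a bilingual model.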

What are Large Language Models?

  • Large language models (LLMs) are deep learning algorithms that can recognise, summarise, translate, predict, and generate content based on knowledge gained from very large datasets.
  • The popular ChatGPT AI chatbot is one application of a large language model. It can be used for a myriad of natural language processing tasks.
  • Other applications of LLMs include:
    • Retailers and other service providers can use large language models to provide improved customer experiences through dynamic chatbots, AI assistants and more.
    • Search engines can use large language models to provide more direct, human-like answers.
    • Life science researchers can train large language models to understand proteins, molecules, DNA and RNA.