Generative AI models

Written by Ramakrishnan Jonnagadla

Preamble:

In the realm of artificial intelligence, generative AI has emerged as a transformative force, reshaping many aspects of our lives and prompting each of us to pause and rethink how things are done. At the heart of this technological revolution lie large language models (LLMs), which harness the power of deep learning and massive datasets to generate human-quality text.

With over 325,000 models available on Hugging Face and countless more in development, the question arises: why should one consider using open source LLMs?

Open source or Proprietary LLMs:

Generative AI models can be broadly categorized into two types: proprietary and open source.

Proprietary (or closed-source) LLMs,

  • Are owned by a company that controls their usage
  • May include a license that restricts how the LLM can be used
  • Examples: OpenAI’s GPT, Google’s Bard, Anthropic’s Claude 2

Open source LLMs,

  • Are free and available for anyone to access, fostering collaboration & innovation
  • Developers and researchers are free to use, improve or otherwise modify the generative AI model
  • Examples: Meta’s LLaMA 2, Databricks’ Dolly, TII’s Falcon (many of which are hosted on Hugging Face)

It’s not true in every instance, but proprietary LLMs are generally far larger than open source models, specifically in terms of parameter count. Some of the leading proprietary LLMs probably extend to thousands of billions of parameters; we don’t know for certain, because those models and their parameter counts are proprietary. But bigger isn’t necessarily better.

However, the open source generative AI model ecosystem is showing promise in challenging proprietary LLM business models in many use cases.

Tech Note: Most frequently referenced terminologies in LLM space – Parameters & Tokens.
Parameters are a machine learning term for the variables a model learns during training and uses to generate new content. (When training LLMs, the more parameters, the more computational resources required.)

Tokens are the discrete units into which text is divided. Tokens can be as short as individual characters or as long as entire words. (When training LLMs, the more tokens, the greater the memory needs.)
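To make the two terms concrete, here is a toy sketch in Python. The token count uses a plain whitespace split (real LLMs use subword tokenizers such as BPE, which would count differently), and the parameter count is for a tiny hypothetical dense network, not any real model:

```python
# Toy illustration of tokens and parameters -- not a real LLM tokenizer
# or architecture, just the counting ideas behind the two terms.

def count_tokens(text: str) -> int:
    """Rough token count via whitespace splitting (illustrative only)."""
    return len(text.split())

def count_parameters(layer_sizes: list[int]) -> int:
    """Parameters of a dense feed-forward net: weights plus biases per layer."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus bias vector
    return total

prompt = "Open source models foster collaboration and innovation"
print(count_tokens(prompt))                # 7 whitespace-delimited tokens
print(count_parameters([512, 2048, 512]))  # parameters in a tiny 2-layer block
```

Scale the layer sizes up by a few orders of magnitude and stack many such blocks, and you arrive at the billions of parameters quoted for modern LLMs.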

Some pertinent questions on open source LLMs:

a. Why should one consider open source LLMs?
b. Who are the early adopters of open source LLMs?
c. What are some of the leading open source models available today?
d. What are the common risks associated with using them?

a. Why open source LLMs?

Transparency: Open source LLMs provide insights into their inner workings, allowing users to understand how they generate text. This transparency fosters trust and enables developers to identify and address potential biases.

Fine-tuning: Open source LLMs can be fine-tuned to specific tasks and domains, tailoring them to unique use cases. This makes them versatile tools for diverse applications.

Community Collaboration: Open source LLMs benefit from the collective knowledge and expertise of a global community of developers and researchers. This collaborative environment drives innovation and accelerates model development.

b. Early Adopters:

Healthcare: Open source LLMs are being used to develop diagnostic tools and optimize treatment plans, improving healthcare outcomes.

Finance: FinGPT, an Open source LLM specifically tailored for the financial industry, is assisting with financial modeling and risk assessment.

Aerospace: NASA has developed an Open source LLM trained on geospatial data, aiding in satellite imagery analysis and mission planning.
And many more industries are rapidly adopting LLMs to solve unique business problems.

c. Leading Models:

Companies like Hugging Face maintain an open LLM leaderboard that tracks, ranks, and evaluates open source LLMs on various benchmarks, such as TruthfulQA, which measures whether a language model is truthful in generating answers to questions.
The top spots on these leaderboards change frequently, and it’s quite fun to watch the progress these generative AI models are making. Some examples of top models include:

— Llama 2: Developed by Meta AI, Llama 2 encompasses a range of generative text models, from 7 billion to 70 billion parameters, offering flexibility for different applications.

— Vicuna: Built upon the Llama model, it is specifically fine-tuned to follow instructions, making it ideal for task-oriented applications.

— Bloom: Created by BigScience, it is a multilingual LLM developed collaboratively by over 1,000 AI researchers, demonstrating the power of community-driven innovation.

d. Risks that need focus:

Although LLM outputs often sound fluent and authoritative, they can be confidently wrong.

Hallucinations: LLMs can generate false or misleading information, especially when trained on incomplete or inaccurate data.

Bias: LLMs can reflect biases present in their training data, leading to discriminatory or unfair outcomes.

Security Concerns: LLMs can be misused for malicious purposes, such as leaking sensitive information or generating phishing scams.

Summary:

As Open source LLMs continue to mature and gain traction, it is evident that they are poised to play a transformative role in the future of AI. Their potential to democratize AI access, foster innovation, and solve real-world problems is immense. However, it is crucial to acknowledge and mitigate the associated risks to ensure responsible and ethical development and deployment of these powerful tools.