Anthropic Releases Claude 2.1, Which Exceeds GPT-4 Turbo In Context Length

Anthropic, an AI safety and research company that aims to build trustworthy and aligned artificial intelligence, has announced the release of its latest large language model (LLM), Claude 2.1. The new model offers a 200K-token context window, larger than the 128K window of OpenAI’s GPT-4 Turbo, so it can take in longer documents in a single prompt. This article explores the features and impact of Claude 2.1 and how it compares to its competitors.

Anthropic’s Latest LLM: Claude 2.1

Exceeding GPT-4 Turbo in Context Length

Our new model Claude 2.1 offers an industry-leading 200K token context window, a 2x decrease in hallucination rates, system prompts, tool use, and updated pricing.

Claude 2.1 is available over API in our Console, and is powering our https://t.co/uLbS2JNczH chat experience. pic.twitter.com/T1XdQreluH
— Anthropic (@AnthropicAI) November 21, 2023

One of the main challenges of LLMs is to maintain consistency and relevance across long texts, such as documents, books, or essays. The context length of a model refers to the amount of text, measured in tokens, that it can consider at a time, counting both input and output. A longer context length lets the model work with larger inputs in one request, for example summarizing a full report or answering questions about an entire book.

OpenAI’s GPT-4 Turbo, which was announced in November 2023, introduced a 128K context window, which OpenAI describes as equivalent to more than 300 pages of text in a single prompt. This was a significant increase over the original GPT-4, which was offered with 8K and 32K context windows. Anthropic’s Claude 2.1 goes further with a 200K context window, which Anthropic describes as roughly 150,000 words, or over 500 pages of text. A larger window lets Claude 2.1 accept longer inputs than GPT-4 Turbo in a single request; it does not by itself guarantee more coherent or accurate output.
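The page equivalents above can be reproduced with rough arithmetic. The sketch below assumes about 0.75 English words per token and 300 words per page; both ratios are illustrative rules of thumb, not figures published by Anthropic or OpenAI, and real tokenizers vary with the text.

```python
# Rough conversion from a context window (in tokens) to printed pages.
# WORDS_PER_TOKEN and WORDS_PER_PAGE are assumed rules of thumb, not vendor figures.
WORDS_PER_TOKEN = 0.75
WORDS_PER_PAGE = 300


def tokens_to_pages(tokens: int) -> float:
    """Estimate how many pages of English prose fit in `tokens` tokens."""
    words = tokens * WORDS_PER_TOKEN
    return words / WORDS_PER_PAGE


windows = {"GPT-4 Turbo": 128_000, "Claude 2.1": 200_000}
for name, size in windows.items():
    print(f"{name}: {size:,} tokens ~ {tokens_to_pages(size):.0f} pages")

# Relative size of the two windows.
ratio = windows["Claude 2.1"] / windows["GPT-4 Turbo"]
print(f"Claude 2.1 window is {ratio:.2f}x the GPT-4 Turbo window")
```

Under these assumptions the two windows come out at about 320 and 500 pages, in line with the "more than 300" and "over 500" page descriptions.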

Improved Capabilities

Claude 2.1 builds on its predecessor, Claude 2, which had already posted strong exam results. Claude 2 scored 76.5% on the multiple-choice section of the Bar exam, up from 73.0% with Claude 1.3. It also scored above the 90th percentile on the GRE reading and writing exams, and similarly to the median applicant on quantitative reasoning. For Claude 2.1, Anthropic’s announcement emphasizes gains in honesty and in comprehension of long documents.

Claude 2.1 is a text-only model: it accepts text input, including long uploaded documents, and produces text output. It does not generate or edit images in the manner of OpenAI’s DALL·E, nor does it convert between text and speech in the manner of OpenAI’s TTS and Whisper. Within text, Claude 2.1 can follow natural language instructions to help users with various tasks, such as writing, coding, researching, or learning.

Increased Accuracy

Claude 2.1 is also more accurate and reliable than its predecessor, Claude 2. Anthropic reports a 2x decrease in hallucination rates, meaning the model makes fewer false statements and is more likely to decline to answer than to provide incorrect information. Anthropic also works to make its models resistant to jailbreaks, which are attempts to make the model produce harmful or offensive outputs; its published safety techniques include Constitutional AI and reinforcement learning from human feedback. The previous generation had already improved on this front: in an internal red-teaming evaluation, Claude 2 was 2x better at giving harmless responses compared to Claude 1.3.

The Impact on the AI Industry

Competing With OpenAI

Anthropic’s Claude 2.1 is a direct competitor to OpenAI’s GPT-4 Turbo, as both models are state-of-the-art LLMs that can perform a wide range of tasks. Claude 2.1 has an edge over GPT-4 Turbo in context length, which lets it accept longer inputs in a single request. Anthropic also positions Claude 2.1 as trustworthy and aligned: it is designed to avoid harmful or biased outputs, and to respect the values and preferences of its users.

Anthropic and OpenAI have different structures and stated goals for developing and deploying LLMs. Anthropic is a public benefit corporation that focuses on building trustworthy and aligned AI, and it is funded by venture and corporate investment as well as revenue from its commercial products. OpenAI began as a nonprofit research organization whose stated mission is to ensure that AI benefits humanity, and it now operates a capped-profit arm funded by investment, notably from Microsoft, and by commercial products. Both companies offer their models through paid APIs and consumer chat products, and both publish research while keeping their model weights proprietary.

Raising Billions in Funding

Anthropic’s Claude 2.1 is also a testament to the company’s success and growth in the AI industry. Anthropic was founded in 2021 by a group of former OpenAI researchers, led by Dario Amodei and Daniela Amodei. Its co-founders also include Tom Brown, Jared Kaplan, Chris Olah, Sam McCandlish, and Jack Clark. The company has raised billions of dollars in funding: in 2023, Amazon announced an investment of up to $4 billion and Google reportedly committed up to $2 billion, and earlier backers include Dustin Moskovitz.

Anthropic’s Claude 2.1 is the result of the company’s ambitious and rigorous research agenda, which aims to advance the frontiers of AI while ensuring its safety and alignment. The company has published papers and blog posts on topics such as Constitutional AI, reinforcement learning from human feedback, red teaming, and interpretability. It has also reported results on public benchmarks; for Claude 2, these included Codex HumanEval, a Python coding test, and GSM8k, a set of grade-school math problems.

Features of Claude 2.1

200K Context Window

Claude 2.1 has a 200K context window, which means that it can consider up to 200,000 tokens of text at a time, counting both the prompt and the generated reply. Anthropic describes this as roughly 150,000 words, or over 500 pages of material. This lets users supply entire codebases, financial statements, or long literary works in a single prompt and ask Claude 2.1 to summarize them, answer questions about them, or compare documents.
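In practice the window is a budget shared by the prompt and the reply. A minimal sketch of a pre-flight check follows; the four-characters-per-token estimate is a crude assumption (a real tokenizer is needed for exact counts), the reply reserve is an arbitrary illustrative value, and all function and constant names are hypothetical.

```python
CONTEXT_WINDOW = 200_000   # tokens, per the Claude 2.1 announcement
CHARS_PER_TOKEN = 4        # assumed rough average for English text


def estimate_tokens(text: str) -> int:
    """Crude token estimate from character count (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_window(document: str, reserved_for_output: int = 4_000) -> bool:
    """Return True if the document plus a reserved reply budget fits."""
    return estimate_tokens(document) + reserved_for_output <= CONTEXT_WINDOW


short_doc = "word " * 1_000      # 5,000 characters
long_doc = "word " * 200_000     # 1,000,000 characters
print(fits_in_window(short_doc))
print(fits_in_window(long_doc))
```

A document that fails the check would need to be shortened or split before being sent.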

API Tools for Developers

Claude 2.1 is accessible via an API, which allows developers to integrate the model into their applications and products. The API provides various tools and options for developers, such as:

Request parameters: API requests are JSON objects in which developers set generation options, such as the maximum number of output tokens, the temperature, the top-p, and the stop sequences. OpenAI’s GPT-4 Turbo, by comparison, adds a dedicated JSON mode that constrains the model’s output to valid JSON.
System prompts: Custom instructions, written by the developer, that give the model a role, goals, and context, so that it behaves more consistently on tasks such as summarizing, translating, writing, or answering questions.
Tool use: A beta feature that lets the model call developer-defined functions and APIs, choosing which tool a task requires and acting on the result. Parallel function calling, in which several functions are called in a single response, is a feature OpenAI announced for GPT-4 Turbo.
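As a rough illustration of how options such as temperature, top-p, and stop sequences appear in a request, the sketch below builds a JSON body in the style of Anthropic's Text Completions API as documented around the Claude 2.1 release. The field names and the model identifier are recalled from that documentation and should be checked against the current API reference; `build_request` is a hypothetical helper, and no network call is made.

```python
import json


def build_request(system: str, user_message: str, temperature: float = 0.7) -> str:
    """Assemble a JSON request body (hypothetical helper; nothing is sent)."""
    # Assumed convention: system prompt text precedes the first Human turn.
    prompt = f"{system}\n\nHuman: {user_message}\n\nAssistant:"
    body = {
        "model": "claude-2.1",            # assumed model identifier
        "prompt": prompt,
        "max_tokens_to_sample": 1024,     # cap on generated tokens
        "temperature": temperature,       # randomness of sampling
        "top_p": 0.9,                     # nucleus sampling cutoff
        "stop_sequences": ["\n\nHuman:"], # stop before a new Human turn
    }
    return json.dumps(body, indent=2)


print(build_request("You are a concise technical editor.", "Summarize this report."))
```

Keeping the body construction separate from the HTTP call makes the parameters easy to inspect or log before anything is sent.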

Enhanced System Prompts

Claude 2.1 adds support for system prompts, which are natural language instructions, supplied by the developer, that tell the model what role to take and what to do before the conversation starts. System prompts are written in plain language rather than chosen from a fixed menu, and they can cover a wide range of tasks, such as:

Summarize: A prompt that tells the model to summarize a given text in a few sentences or paragraphs.
Translate: A prompt that tells the model to translate a given text from one language to another.
Write: A prompt that tells the model to write a text of a given genre, style, or topic, such as a poem, a story, a piece of code, or an essay.
Answer: A prompt that tells the model to answer a given question in natural language or in numerical form.

The system prompts can also be customized and combined to create more complex and specific tasks, such as:

Summarize and translate: A prompt that tells the model to summarize a given text in one language and then translate it to another language.
Write and answer: A prompt that tells the model to write a text of a given genre, style, or topic, and then answer a question about it.
Answer and write: A prompt that tells the model to answer a given question in natural language or in numerical form, and then write a text of a given genre, style, or topic based on the answer.
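Combining instructions amounts to writing one system prompt that lists the steps in order. A minimal sketch, in which the task names and their wording are hypothetical examples and not prompts shipped by Anthropic:

```python
# Hypothetical instruction snippets a developer might maintain.
TASKS = {
    "summarize": "Summarize the user's text in three sentences.",
    "translate": "Translate the result into {language}.",
    "answer": "Answer the user's question directly and concisely.",
}


def compose_system_prompt(steps: list[str], **params: str) -> str:
    """Join the chosen instructions into one numbered system prompt."""
    lines = [
        f"{i}. {TASKS[step].format(**params)}"
        for i, step in enumerate(steps, start=1)
    ]
    return "Follow these steps in order:\n" + "\n".join(lines)


print(compose_system_prompt(["summarize", "translate"], language="French"))
```

Numbering the steps makes the intended order explicit, which matters for combinations like "summarize and translate" where the second step operates on the output of the first.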

Conclusion

Anthropic’s Claude 2.1 surpasses OpenAI’s GPT-4 Turbo in context length, with a 200K window against 128K, allowing it to accept longer documents in a single prompt. Anthropic also reports a 2x decrease in hallucination rates and has added system prompts and tool use, making the model more reliable and easier to direct. The release reflects Anthropic’s growth in the AI industry, where it competes with OpenAI and has raised billions in funding. Claude 2.1 is accessible via an API, which allows developers to integrate the model into their applications and products, and to perform a wide range of tasks with natural language instructions.