Gemini 1.5 Pro with 1 million tokens surpasses GPT-4 Turbo: What does that mean?


OpenAI’s Sora text-to-video AI model may be spreading across the internet like wildfire, but it is not just OpenAI that has announced a major development in AI. Days after rebranding its AI chatbot as Google Gemini, Alphabet Inc. is back with a major announcement. Google has unveiled its latest next-generation AI model, the Gemini 1.5 Pro. The new model is built on a Mixture-of-Experts (MoE) architecture and is claimed to be far more advanced than its contemporaries.

With Gemini 1.5 Pro, Google seems to have brought out a model that is well ahead of its predecessors. Gemini 1.5 Pro is the first model in the Gemini 1.5 line that the company is releasing for early testing. The 1.5 Pro is a mid-size multimodal model that has been optimised for scaling across a wide range of tasks. Here, we try to understand what’s new with Gemini 1.5 Pro.

What is the Gemini 1.5 Pro?

What stands out about the Gemini 1.5 Pro is its long-context understanding across modalities. Google claims that the Gemini 1.5 Pro achieves results comparable to those of the recently launched Gemini 1.0 Ultra while using much less computing power. Its most notable feature is the ability to consistently process up to one million tokens of information. According to Google, this is the longest context window of any large-scale foundation model yet. To put that into perspective, the Gemini 1.0 models have a context window of up to 32,000 tokens, GPT-4 Turbo has 128,000 tokens and Claude 2.1 has 200,000 tokens.
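To make the comparison concrete, the short Python sketch below works out how many times larger a one-million-token window is than each of the windows quoted above. The figures are the ones cited in this article; the dictionary and function names are illustrative only.

```python
# Context window sizes (in tokens) as quoted in this article.
CONTEXT_WINDOWS = {
    "Gemini 1.0": 32_000,
    "GPT-4 Turbo": 128_000,
    "Claude 2.1": 200_000,
}

GEMINI_15_PRO_MAX = 1_000_000  # the one-million-token window


def window_ratios(target=GEMINI_15_PRO_MAX, windows=CONTEXT_WINDOWS):
    """Return how many times larger `target` is than each quoted window."""
    return {name: target / size for name, size in windows.items()}


if __name__ == "__main__":
    for name, ratio in window_ratios().items():
        print(f"1M tokens is {ratio:.1f}x the {name} window")
```

By this arithmetic, one million tokens is roughly 31 times the Gemini 1.0 window, about 8 times the GPT-4 Turbo window and 5 times the Claude 2.1 window.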

While the model comes with a standard 128,000-token context window, Google is allowing a limited number of developers and enterprise customers to try it with a context window of up to one million tokens. The Gemini 1.5 Pro is currently in preview, and developers can test the model using Google’s AI Studio and Vertex AI.

Google says that since launching Gemini 1.0 it has been consistently testing, refining, and enhancing the model’s capabilities, and 1.5 Pro is an outcome of those efforts. When it comes to the underlying technology, the 1.5 Pro is built upon a Mixture-of-Experts (MoE) architecture. The MoE architecture can be understood as a collective approach in which the whole problem is divided into numerous sub-tasks and a separate expert network is trained on each one. In essence, an MoE model handles different kinds of input data with different learners, or experts.
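The routing idea can be sketched in a few lines of Python. This is a toy illustration of a generic Mixture-of-Experts layer, not Google’s implementation: the two "experts", the gate weights and the function names are all made up for the example, and a real model would learn both the experts and the gate during training.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def moe_forward(x, gate_weights, experts, top_k=1):
    """Route input x to the top_k highest-scoring experts and mix their outputs.

    gate_weights: one weight vector per expert (hypothetical, hand-picked here).
    experts: one callable per expert, each mapping a vector to a number.
    """
    probs = softmax([dot(w, x) for w in gate_weights])
    # Keep only the top_k experts; the others are never evaluated.
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    output = sum(probs[i] / norm * experts[i](x) for i in chosen)
    return output, chosen


# Two toy "experts": one sums the input, the other takes its maximum.
EXPERTS = [lambda v: sum(v), lambda v: max(v)]
# Toy gate: expert 0 favours a large first feature, expert 1 a large second one.
GATE = [[1.0, 0.0], [0.0, 1.0]]

if __name__ == "__main__":
    print(moe_forward([3.0, 0.5], GATE, EXPERTS))  # handled by expert 0
    print(moe_forward([0.5, 3.0], GATE, EXPERTS))  # handled by expert 1
```

Because only the selected experts run for a given input, most of the model stays idle on any single request, which is the general reason MoE designs are described as cheaper to train and serve than a dense model of similar size.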

Google describes this as a step change in its approach, one that builds upon research and engineering innovations across nearly every part of its foundation model development and infrastructure. Google claims the new MoE architecture makes the Gemini 1.5 Pro more efficient to train and serve.

What are the use cases of the Gemini 1.5 Pro?

The Gemini 1.5 Pro can reportedly ingest up to 700,000 words or about 30,000 lines of code. This is 35 times more than what Gemini 1.0 Pro can take in. Besides, the Gemini 1.5 Pro can process up to 11 hours of audio and 1 hour of video in a wide range of languages. One of the demo videos posted on Google’s official YouTube channel showed the model’s long-context understanding using a 402-page PDF. The demo showed a live interaction with the model with the PDF file as the prompt, which came to 326,658 tokens plus 256 tokens’ worth of images. The demo used a total of 327,309 tokens.
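As a quick check on those figures, the sketch below adds up the token counts reported for the PDF demo and compares the total with the one-million-token limit. The numbers are the ones given in this article; the constant and function names are illustrative, and the article does not say what the small unattributed remainder consists of.

```python
# Token counts reported for the PDF demo in this article.
PDF_TOKENS = 326_658
IMAGE_TOKENS = 256
TOTAL_TOKENS = 327_309
CONTEXT_LIMIT = 1_000_000


def demo_budget():
    """Summarise how the reported demo total relates to the context limit."""
    accounted = PDF_TOKENS + IMAGE_TOKENS
    return {
        "accounted": accounted,
        "unattributed": TOTAL_TOKENS - accounted,  # not broken down in the article
        "remaining": CONTEXT_LIMIT - TOTAL_TOKENS,
        "share_used": TOTAL_TOKENS / CONTEXT_LIMIT,
    }


if __name__ == "__main__":
    for key, value in demo_budget().items():
        print(key, value)
```

On these numbers the 402-page document fills roughly a third of the one-million-token window, leaving more than 670,000 tokens free.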

Another demo showed the Gemini 1.5 Pro working with a 44-minute video, a recording of the silent film Sherlock Jr., along with a host of multimodal prompts. The video accounted for 696,161 tokens and the images for 256 tokens. In the demo, a user is seen asking the model to find specific moments and related information in the video. The model responds with the timestamps of those moments and details as they appear in the video.
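The video figures also give a rough sense of how quickly footage consumes the context window. The sketch below derives a tokens-per-second rate from the numbers reported for this demo and projects it to other lengths. It assumes token use grows linearly with running time, which the article does not state, so treat the projection as a back-of-the-envelope estimate; the names are illustrative.

```python
# Figures reported for the Sherlock Jr. demo in this article.
VIDEO_TOKENS = 696_161
VIDEO_MINUTES = 44
CONTEXT_LIMIT = 1_000_000


def tokens_per_second(tokens=VIDEO_TOKENS, minutes=VIDEO_MINUTES):
    """Average tokens consumed per second of video in the demo."""
    return tokens / (minutes * 60)


def projected_tokens(minutes, rate=None):
    """Estimate tokens for a video of the given length.

    Assumes a constant rate (an assumption, not a documented property).
    """
    rate = tokens_per_second() if rate is None else rate
    return rate * minutes * 60


if __name__ == "__main__":
    print(f"{tokens_per_second():.1f} tokens per second")
    print(f"{projected_tokens(60):,.0f} tokens for a one-hour video")
```

At a little under 264 tokens per second, a one-hour video would come to roughly 949,000 tokens, which is consistent with the one-hour video capacity quoted for the one-million-token window.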

Meanwhile, another demo showcased how the model worked with 100,633 lines of code through a series of multimodal prompts.

What is the pricing and when will it be available?

Google has reportedly said that, during the preview, the Gemini 1.5 Pro with a 1 million-token context window will be free to use. Google may introduce pricing tiers in the future that start at the standard 128,000-token context window and scale up to 1 million tokens.

Gemini 1.5 Pro marks a new frontier in Google’s AI development. In December last year, Google introduced Gemini 1.0, which it called its most flexible AI model, in three sizes: Gemini Ultra, Gemini Pro, and Gemini Nano. At the time of launch, Google claimed that Gemini 1.0 surpassed state-of-the-art performance on a range of benchmarks, including coding and text. The Gemini series is known for its next-generation capabilities and sophisticated reasoning, and all Gemini sizes are multimodal, meaning they can understand text, images, audio and more.
