OpenAI’s Sora text-to-video AI model may be spreading across the internet like wildfire, but it’s just not OpenAI that has announced a major development in AI. Days after rebranding its AI chatbot as Google Gemini, Alphabet Inc. is back with a major announcement. Google has unveiled its latest next-generation AI model, the Gemini 1.5 Pro. The new model is built on MoE architecture and is claimed to be far more advanced than its contemporaries.
When it comes to Gemini 1.5 Pro, Google seems to have brought out a model that is superior and remarkably ahead of its predecessors. Gemini 1.5 Pro is the first model in the Gemini 1.5 line that the company is releasing for early testing. The 1.5 Pro is a mid-size multimodal model that has been optimised for scaling across a wide range of tasks. Here, we try to understand what’s new with Gemini 1.5 Pro.
What is the Gemini 1.5 Pro?
What stands out about the Gemini 1.5 Pro is its long-context understanding across modalities. Google claims that the Gemini 1.5 Pro is capable of achieving similar results as the recently launched Gemini 1.0 Ultra, albeit with much less computing power. And, the most outstanding aspect of the Gemini 1.5 Pro is its ability to process the amount of information by up to one million tokens consistently. This is certainly the longest context window for any large-scale foundation model developed yet. To put into perspective, the Gemini 1.0 models have a context window of up to 32,000 tokens, GPT-4 Turbo has 1,28,000 tokens and Claude 2.1 has 2,00,000 tokens.
While the model comes with a standard 1,28,000 token context window, Google is allowing a limited number of developers and enterprise customers to try it with a context window of up to one million tokens. The Gemini 1.5 Pro is currently in preview mode and developers can test the model using Google’s AI Studio and Vertex AI.
Google has claimed that since it launched the Gemini 1.0, they have been consistently testing, refining, and enhancing its capabilities — and 1.5 Pro is an outcome of its efforts. When it comes to the underlying technology, the 1.5 Pro is built upon Mixture-of-Experts (MoE) architecture. The MoE architecture can be understood as a collective approach where the whole problem is divided into numerous sub-tasks that are later trained by a cluster of experts on each sub-task. In essence, the MoE model covers different input data with different learners or experts.
This is a step change in Google’s approach which builds upon research and engineering innovations across nearly every part of its foundational model development and infrastructure. Google claims the new MoE architecture makes the Gemini 1.5 Pro more efficient to train and serve.
What are the use cases of the Gemini 1.5 Pro?
The Gemini 1.5 Pro can reportedly ingest up to 7,00,000 words or about 30,000 lines of code. This is 35 times more than what Gemini 1.0 Pro can take in. Besides, the Gemini 1.5 Pro can process up to 11 hours of audio and 1 hour of video in a wide range of languages. The demo videos posted on Google’s official YouTube channel showed the long context understanding of the model by using a 402-page-long PDF. The demo also showed a live interaction with the model based on the PDF file as prompt, which was 3,26,658 tokens and had 256 tokens worth of images. The demo used a total of 3,27,309 tokens.
Another demo showed the Gemini 1.5 Pro using a 44-minute video, a recording of the silent film Sherlock Jr. along with a host of multimodal prompts. The total tokens stood at 6,96,161 for the video, and images stood at 256 tokens. In the demo, a user is seen asking the model to show specific moments and related information in the video. The model responds with the timestamps of the moment and details as they appear in the video.
Meanwhile, another demo showcased how the model interacted with 100,633 lines of code with a series of multimodal prompts.
What is the pricing and when will it be available?
Reportedly, in a preview, Google said that the Gemini 1.5 Pro with a 1 million-token context window will be free to use. Google may introduce pricing tiers in the future on the model that starts at 1,28,000 context windows and will scale up to 1 million tokens.
Gemini 1.5 Pro is a new frontier in Google’s AI developments. In December last year, Google introduced its most flexible AI model Gemini 1.0 in three different sizes, including Gemini Ultra, Gemini Pro, and Gemini Nano. At the time of launch, Google claimed that its Gemini 1.0 surpassed several state-of-the-art performances on a range of benchmarks including coding and text. The Gemini series has been known for its next-generation capabilities and sophisticated reasoning. All Gemini sizes have been known for their multimodality — the ability to understand text, images, audio and more.
For the latest news from across India, Political updates, Explainers, Sports News, Opinion, Entertainment Updates and more Top News, visit Indian Express. Subscribe to our award-winning Newsletter Download our App here Android & iOS
News Related-
Anurag Kashyap unveils teaser of ‘Kastoori’
-
Shehar Lakhot: Meet The Intriguing Characters Of The Upcoming Noir Crime Drama
-
Watch: 'My name is VVS Laxman...': When Ishan Kishan gave wrong answers to right questions
-
Tennis-Sabalenka, Rybakina to open new season in Brisbane
-
Sikandar Raza Makes History For Zimbabwe With Hattrick A Day After Punjab Kings Retain Him- WATCH
-
Delayed Barapullah work yet to begin despite land transfer
-
Army called in to help in tunnel rescue operation
-
FIR against Redbird aviation school for non-cooperation, obstructing DGCA officials in probe
-
IPL 2024 Auction: Why Gujarat Titans allowed Hardik Pandya to join Mumbai Indians? GT explain
-
From puff sleeves to sustainable designs: Top 5 bridal fashion trends redefining elegance and style for brides-to-be
-
The Judge behind China's financial reckoning
-
Arshdeep Singh & Axar Patel Out, Avesh Khan & Washington Sundar IN? India's Likely Playing XI For 3rd T20I
-
Horoscope Today, November 28, 2023: Check here Astrological prediction for all zodiac signs
-
'Gurdwaras are...': US Sikh body on Indian envoy's heckling by Khalistani backers