Better Siri is coming: what Apple’s research says about its AI plans

It would be easy to think that Apple is late to the game on AI. Since late 2022, when ChatGPT took the world by storm, most of Apple’s competitors have fallen over themselves to catch up. While Apple has certainly talked about AI and even released some products with AI in mind, it seemed to be dipping a toe in rather than diving in headfirst.

But over the last few months, rumors and reports have suggested that Apple has, in fact, just been biding its time, waiting to make its move. There have been reports in recent weeks that Apple is talking to both OpenAI and Google about powering some of its AI features, and the company has also been working on its own model, called Ajax.

If you look through Apple’s published AI research, a picture starts to develop of how Apple’s approach to AI might come to life. Now, obviously, making product assumptions based on research papers is a deeply inexact science — the line from research to store shelves is windy and full of potholes. But you can at least get a sense of what the company is thinking about — and how its AI features might work when Apple starts to talk about them at its annual developer conference, WWDC, in June.

Smaller, more efficient models

I suspect you and I are hoping for the same thing here: Better Siri. And it looks very much like Better Siri is coming! There’s an assumption in a lot of Apple’s research (and in a lot of the tech industry, the world, and everywhere) that large language models will immediately make virtual assistants better and smarter. For Apple, getting to Better Siri means making those models as fast as possible — and making sure they’re everywhere.

In iOS 18, Apple plans to have all its AI features running on an on-device, fully offline model, Bloomberg recently reported. It’s tough to build a good multipurpose model even when you have a network of data centers and thousands of state-of-the-art GPUs — it’s drastically harder to do it with only the guts inside your smartphone. So Apple’s having to get creative.

In a paper called “LLM in a flash: Efficient Large Language Model Inference with Limited Memory” (all these papers have really boring titles but are really interesting, I promise!), researchers devised a system for storing a model’s data, which is usually stored on your device’s RAM, on the SSD instead. “We have demonstrated the ability to run LLMs up to twice the size of available DRAM [on the SSD],” the researchers wrote, “achieving an acceleration in inference speed by 4-5x compared to traditional loading methods in CPU, and 20-25x in GPU.” By taking advantage of the most inexpensive and available storage on your device, they found, the models can run faster and more efficiently.
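To make the basic idea concrete, here is a minimal Python sketch of keeping weights on storage and pulling in only the pieces you need. It is not the paper's method, just an illustration of the general principle under simplifying assumptions: the weights are a toy table of float32 rows, and the names `write_weights`, `read_rows`, `ROW_LEN`, and `demo` are all hypothetical.

```python
import mmap
import os
import struct
import tempfile

ROW_LEN = 4      # floats per weight row (hypothetical toy size)
FLOAT_SIZE = 4   # bytes in one float32


def write_weights(path, rows):
    """Store every weight row on disk as packed little-endian float32 values."""
    with open(path, "wb") as f:
        for row in rows:
            f.write(struct.pack(f"<{ROW_LEN}f", *row))


def read_rows(path, wanted):
    """Pull only the requested rows into memory; the rest stays on disk."""
    row_bytes = ROW_LEN * FLOAT_SIZE
    loaded = {}
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        for i in wanted:
            chunk = mm[i * row_bytes:(i + 1) * row_bytes]
            loaded[i] = struct.unpack(f"<{ROW_LEN}f", chunk)
    return loaded


def demo():
    """Write 1,000 toy rows to a temp file, then load just two of them."""
    rows = [[float(r * ROW_LEN + c) for c in range(ROW_LEN)] for r in range(1000)]
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_weights(path, rows)
        return read_rows(path, [3, 998])
    finally:
        os.remove(path)
```

Calling `demo()` returns a dictionary holding only rows 3 and 998. The other 998 rows are never unpacked into Python objects, which is the trade the researchers are exploiting at a far larger scale and with far more sophistication.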

Apple’s researchers also created a system called EELBERT that can essentially compress an LLM into a much smaller size without making it meaningfully worse. Their compressed take on Google’s BERT model was 15 times smaller — only 1.2 megabytes — and saw only a 4 percent reduction in quality. It did come with some latency tradeoffs, though.

In general, Apple is pushing to solve a core tension in the model world: the bigger a model gets, the better and more useful it can be, but also the more unwieldy, power-hungry, and slow it can become. Like so many others, the company is trying to find the right balance between all those things while also looking for a way to have it all.

Siri, but good

A lot of what we talk about when we talk about AI products is virtual assistants — assistants that know things, that can remind us of things, that can answer questions, and get stuff done on our behalf. So it’s not exactly shocking that a lot of Apple’s AI research boils down to a single question: what if Siri was really, really, really good?

A group of Apple researchers has been working on a way to use Siri without needing to use a wake word at all; instead of listening for “Hey Siri” or “Siri,” the device might be able to simply intuit whether you’re talking to it. “This problem is significantly more challenging than voice trigger detection,” the researchers did acknowledge, “since there might not be a leading trigger phrase that marks the beginning of a voice command.” That might be why another group of researchers developed a system to more accurately detect wake words. Another paper trained a model to better understand rare words, which are often not well understood by assistants.

In both cases, the appeal of an LLM is that it can, in theory, process much more information much more quickly. In the wake-word paper, for instance, the researchers found that by not trying to discard all unnecessary sound but, instead, feeding it all to the model and letting it process what does and doesn’t matter, the wake word worked far more reliably.

Once Siri hears you, Apple’s doing a bunch of work to make sure it understands and communicates better. In one paper, it developed a system called STEER (which stands for Semantic Turn Extension-Expansion Recognition, so we’ll go with STEER) that aims to improve your back-and-forth communication with an assistant by trying to figure out when you’re asking a follow-up question and when you’re asking a new one. In another, it uses LLMs to better understand “ambiguous queries” to figure out what you mean no matter how you say it. “In uncertain circumstances,” they wrote, “intelligent conversational agents may need to take the initiative to reduce their uncertainty by asking good questions proactively, thereby solving problems more effectively.” Another paper aims to help with that, too: researchers used LLMs to make assistants less verbose and more understandable when they’re generating answers.

Pretty soon, you might be able to edit your pictures just by asking for the changes.

AI in health, in image editors, in your Memojis

Whenever Apple does talk publicly about AI, it tends to focus less on raw technological might and more on the day-to-day stuff AI can actually do for you. So, while there’s a lot of focus on Siri — especially as Apple looks to compete with devices like the Humane AI Pin, the Rabbit R1, and Google’s ongoing smashing of Gemini into all of Android — there are plenty of other ways Apple seems to see AI being useful.

One obvious place for Apple to focus is on health: LLMs could, in theory, help wade through the oceans of biometric data collected by your various devices and help you make sense of it all. So, Apple has been researching how to collect and collate all of your motion data, how to use gait recognition and your headphones to identify you, and how to track and understand your heart rate data. Apple also created and released “the largest multi-device multi-location sensor-based human activity dataset” available after collecting data from 50 participants with multiple on-body sensors.

Apple also seems to imagine AI as a creative tool. For one paper, researchers interviewed a bunch of animators, designers, and engineers and built a system called Keyframer that “enable[s] users to iteratively construct and refine generated designs.” Instead of typing in a prompt and getting an image, then typing another prompt to get another image, you start with a prompt but then get a toolkit to tweak and refine parts of the image to your liking. You could imagine this kind of back-and-forth artistic process showing up anywhere from the Memoji creator to some of Apple’s more professional artistic tools.

In another paper, Apple describes a tool called MGIE that lets you edit an image just by describing the edits you want to make. (“Make the sky more blue,” “make my face less weird,” “add some rocks,” that sort of thing.) “Instead of brief but ambiguous guidance, MGIE derives explicit visual-aware intention and leads to reasonable image editing,” the researchers wrote. Its initial experiments weren’t perfect, but they were impressive.

We might even get some AI in Apple Music: for a paper called “Resource-constrained Stereo Singing Voice Cancellation,” researchers explored ways to separate voices from instruments in songs — which could come in handy if Apple wants to give people tools to, say, remix songs the way you can on TikTok or Instagram.

In the future, Siri might be able to understand and use your phone for you.

Over time, I’d bet this is the kind of stuff you’ll see Apple lean into, especially on iOS. Some of it Apple will build into its own apps; some it will offer to third-party developers as APIs. (The recent Journaling Suggestions feature is probably a good guide to how that might work.) Apple has always trumpeted its hardware capabilities, particularly compared to your average Android device; pairing all that horsepower with on-device, privacy-focused AI could be a big differentiator.

But if you want to see the biggest, most ambitious AI thing going at Apple, you need to know about Ferret. Ferret is a multi-modal large language model that can take instructions, focus on something specific you’ve circled or otherwise selected, and understand the world around it. It’s designed for the now-normal AI use case of asking a device about the world around you, but it might also be able to understand what’s on your screen. In the Ferret paper, researchers show that it could help you navigate apps, answer questions about App Store ratings, describe what you’re looking at, and more. This has really exciting implications for accessibility but could also completely change the way you use your phone — and your Vision Pro and / or smart glasses someday.

We’re getting way ahead of ourselves here, but you can imagine how this would work with some of the other stuff Apple is working on. A Siri that can understand what you want, paired with a device that can see and understand everything that’s happening on your display, is a phone that can literally use itself. Apple wouldn’t need deep integrations with everything; it could simply run the apps and tap the right buttons automatically.

Again, all this is just research, and for all of it to work well starting this spring would be a legitimately unheard-of technical achievement. (I mean, you’ve tried chatbots — you know they’re not great.) But I’d bet you anything we’re going to get some big AI announcements at WWDC. Apple CEO Tim Cook even teased as much in February, and basically promised it on this week’s earnings call. And two things are very clear: Apple is very much in the AI race, and it might amount to a total overhaul of the iPhone. Heck, you might even start willingly using Siri! And that would be quite the accomplishment.
