At Google's Mountain View headquarters this week, a man dressed in a rainbow-hued dressing gown emerged from a giant coffee cup to give a vibrant, if somewhat surreal, demonstration of the company's latest achievements in generative AI.

Iran Press/America: At the I/O event, electronic musician and YouTuber Marc Rebillet tinkered with an AI music tool that can generate synced tracks based on prompts like “viola” and “808 hip-hop beat”.

The AI, he told developers, came up with ways to “fill in the sparser elements of my loops . . . It’s like having this weird friend that’s just like ‘try this, try that’.” What Rebillet was describing is an AI assistant, a personalised bot that is supposed to help you work, create or communicate better, and interface with the digital world on your behalf, Financial Times reported.

This new class of products has stolen the limelight this week amid a flurry of new AI developments from Google and its AI division DeepMind, as well as Microsoft-backed OpenAI. The companies simultaneously announced a series of upgraded AI tools that are “multimodal”, meaning they can interpret voice, video, images and code in a single interface, and can also carry out complex tasks such as live translation or planning a family holiday.

In a video demonstration, Google’s prototype AI assistant Astra, powered by its Gemini model, responded to voice commands based on what it saw through a phone camera or a pair of smart glasses. It successfully identified sequences of code, suggested improvements to electrical circuit diagrams, recognised the King’s Cross area of London through the camera lens, and reminded the user where they had left their glasses.

Meanwhile, at OpenAI’s product launch on Monday, chief technology officer Mira Murati and her colleagues demonstrated how their new AI model, GPT-4o, can perform voice translation in live conversation, and similarly interact with the user in an anthropomorphised tone and voice while parsing text, images, video and code.

“This is incredibly important because we’re looking at the future of interaction between ourselves and the machines,” Murati told the FT. While AI-powered smart assistants have been around for nearly a decade, these latest advances allow for smoother and more rapid voice interactions, and superior levels of understanding thanks to the large language models (LLMs) that power them.

Now, a fresh scramble is under way among tech groups to bring so-called AI agents out to consumers. These are best understood as “intelligent systems”, said Google chief executive Sundar Pichai this week, “that show reasoning, planning and memory, are able to ‘think’ multiple steps ahead, and work across software and systems, all to get something done on your behalf”.

As well as Google and OpenAI, Apple is expected to be a major player in this race. Industry insiders anticipate that a significant upgrade to Apple’s voice assistant, Siri, is on the horizon, as the company rolls out new AI chips, designed in-house and capable of powering generative models on-device.

Meta, meanwhile, launched an AI assistant on its platforms Facebook, Instagram and WhatsApp across more than a dozen countries in April. Start-ups such as Rabbit and Humane are also attempting to enter the space by designing products that act as standalone AI helpers. Although analysts point out that this week’s big announcements remained largely “vapourware” — concepts rather than real products — it is clear to industry watchers that AI assistants or agents will be key to bringing the latest AI technology to the masses.
