AI News – April 8.
Google’s AI Mode rolls out to more users, Amazon launches a new speech-to-speech model, and Microsoft trains its gaming platform on Quake II. Here’s today’s AI news. Google launched ‘AI Mode’ as a Google Labs experiment around a month ago, and today it announced that the feature is rolling out to even more US users.
[00:17.8]
AI Mode lets users switch into an interface that blends search with AI-generated output. Google has also added new multimodal understanding, so you can take a photo or upload an image, ask a question about it, and get a rich, comprehensive response with links to dive deeper.
[00:34.2]
With Gemini generating the AI results on top of Google’s search pedigree, this feels like a very powerful combination, and it seems to represent Google’s vision for the near future of search. Amazon has launched a new speech-to-speech model called ‘Nova Sonic’. It’s designed to understand not just what people are saying but how they’re saying it.
[00:51.1]
It works with tone, style, and conversation flow, including pauses and interruptions. The model lets developers build voice applications that maintain important context and nuance, and it’s aimed predominantly at creating a range of more natural-sounding customer service tools.
[01:08.3]
And finally, Microsoft has made some major advances with its Muse AI gaming platform, specifically with its World and Human Action Model, or WHAM. WHAM has received several enhancements. It’s faster: it can now produce output at 10 frames per second.
[01:23.5]
Previously it managed only around a single frame per second. It’s also much quicker to train, thanks to some clever curation of the data: this latest version of the model was trained on Quake II with only seven days of training data, compared to the seven years the previous model required. And finally, the resolution has roughly doubled.
[01:39.0]
It now outputs visuals at 640×360, compared to the previous 300×180. The whole update represents a remarkable improvement in a very short amount of time. In the video, you’re seeing an entire game world and its mechanics being generated in real time, and it’s getting very close to a playable state.