25 Dec 2024 23:56

Google I/O 2023: Making AI more helpful for everyone

Editor’s Note: Here is a summary of what we announced at Google I/O 2023. See all the announcements in our collection.

Seven years into our journey as an AI-first company, we’re at an exciting inflection point. We have an opportunity to make AI even more helpful for people, for businesses, for communities, for everyone.

We’ve been applying AI to make our products radically more helpful for a while. With generative AI, we’re taking the next step. With a bold and responsible approach, we’re reimagining all our core products, including Search.

AI in our products

“Help me write” in Gmail

There are some great examples of how generative AI is helping to evolve our products, starting with Gmail. In 2017, we launched Smart Reply, short responses you could select with just one click. Next came Smart Compose, which offered writing suggestions as you type. Smart Compose led to more advanced writing features powered by AI. They’ve been used in Workspace over 180 billion times in the past year alone. And now, with a much more powerful generative model, we’re taking the next step in Gmail with “Help me write.”

Let’s say you got an email that your flight was canceled. The airline has sent a voucher, but what you really want is a full refund. You could reply, and use “Help me write.”

Just type in the prompt of what you want — an email that asks for a full refund — hit create, and a full draft appears. It conveniently pulls in flight details from the previous email. It looks pretty close to what you want to send, but maybe you want to refine it further. In this case, a more elaborate email might increase the chances of getting the refund. “Help me write” will start rolling out as part of our Workspace updates. And just like with Smart Compose, you’ll see it get better over time.

New Immersive View for routes in Maps

Since the early days of Street View, AI has stitched together billions of panoramic images, so people can explore the world from their device. At last year’s I/O we introduced Immersive View, which uses AI to create a high-fidelity representation of a place, so you can experience it before you visit.

Now, we’re expanding that same technology to do what Maps does best: help you get where you want to go. Google Maps provides 20 billion kilometers of directions, every day — that’s a lot of trips. Now imagine if you could see your whole trip in advance. With Immersive View for routes you can, whether you’re walking, cycling or driving.

Say you’re in New York City and you want to go on a bike ride. Maps has given you a couple of options close to where you are. The one on the waterfront looks scenic, but you want to get a feel for it first, so you click on Immersive View for routes. It’s an entirely new way to look at your journey. You can zoom in to get an incredible bird’s eye view of the ride.

There’s more information available too. You can check air quality, traffic and weather, and see how they might change.

Immersive View for routes will begin to roll out over the summer, and launch in 15 cities by the end of the year, including London, New York, Tokyo and San Francisco.

A New York City bike ride in the new Immersive View for routes

A new Magic Editor experience in Photos

Another product made better by AI is Google Photos. We introduced it at I/O in 2015, and it was one of our first AI-native products. Breakthroughs in machine learning made it possible to search your photos for things like people, sunsets or waterfalls.

Of course, we want you to do more than just search photos — we also want to help you make them better. In fact, every month, 1.7 billion images are edited in Google Photos. AI advancements give us more powerful ways to do this. For example, Magic Eraser, launched first on Pixel, uses AI-powered computational photography to remove unwanted distractions. And later this year, using a combination of semantic understanding and generative AI, you can do much more with a new experience called Magic Editor.

Here’s an example: This is a great photo, but as a parent, you probably want your kid at the center of it all. And it looks like the balloons got cut off in this one, so you can go ahead and reposition the birthday boy. Magic Editor automatically recreates parts of the bench and balloons that were not captured in the original shot. As a finishing touch, you can punch up the sky. This also changes the lighting in the rest of the photo so the edit feels consistent. It’s truly magical. We’re excited to roll out Magic Editor in Google Photos later this year.

An animation of a toddler-aged boy on a bench on a rooftop, holding a bunch of colorful balloons, with a city skyline in the background. In the shot on the right the boy is centered, you see more of the bench and balloons, and the sky is clearer.
Caption: A photo transformed by Magic Editor in Google Photos

Making AI more helpful for everyone

From Gmail and Photos to Maps, these are just a few examples of how AI can help you in moments that matter. And there’s so much more we can do to deliver the full potential of AI across the products you know and love.

Today, we have 15 products that each serve more than half a billion people and businesses. And six of those products serve over 2 billion users each. This gives us so many opportunities to deliver on our mission — to organize the world’s information and make it universally accessible and useful.

It’s a timeless mission that feels more relevant with each passing year. And looking ahead, making AI helpful for everyone is the most profound way we’ll advance our mission.

We’re doing this in four important ways:

First, by improving your knowledge and learning, and deepening your understanding of the world.
Second, by boosting creativity and productivity, so you can express yourself and get things done.
Third, by enabling developers and businesses to build their own transformative products and services.
And finally, by building and deploying AI responsibly, so that everyone can benefit equally.

PaLM 2 and Gemini

We are so excited by the opportunities ahead. Our ability to make AI helpful for everyone relies on continuously advancing our foundation models. So I want to take a moment to share how we’re approaching them.

Last year you heard us talk about PaLM, which led to many improvements across our products. Today, we’re ready to announce our latest PaLM model in production: PaLM 2.

PaLM 2 builds on our fundamental research and our latest infrastructure. It’s highly capable at a wide range of tasks and easy to deploy. We are announcing more than 25 products and features powered by PaLM 2 today.

PaLM 2 models deliver excellent foundational capabilities across a wide range of sizes. We’ve affectionately named them Gecko, Otter, Bison, and Unicorn. Gecko is so lightweight that it can work on mobile devices: fast enough for great interactive applications on-device, even when offline. PaLM 2 models are stronger in logic and reasoning thanks to broad training on scientific and mathematical topics. They’re also trained on multilingual text spanning more than 100 languages, so they understand and generate nuanced results.

Combined with powerful coding capabilities, PaLM 2 can also help developers collaborating around the world. Let’s say you’re working with a colleague in Seoul and you’re debugging code. You can ask it to fix a bug and help out your teammate by adding comments in Korean to the code. It first recognizes the code is recursive, then suggests a fix. It explains the reasoning behind the fix, and it adds Korean comments like you asked.
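To make that scenario concrete, here is a minimal, hypothetical sketch of the kind of fix described: a recursive function with a bug, and a corrected version annotated with Korean comments (English translations in parentheses). The function names and the bug itself are assumptions chosen for illustration. This is not the code shown at I/O, and it is not output from PaLM 2.

```python
def sum_digits_buggy(n: int) -> int:
    # Hypothetical bug: there is no base case, so the function keeps
    # calling itself (eventually with n == 0) until Python raises
    # RecursionError.
    return n % 10 + sum_digits_buggy(n // 10)


def sum_digits(n: int) -> int:
    """Return the sum of the decimal digits of a non-negative integer."""
    # 기저 조건: 한 자리 수는 그대로 반환합니다.
    # (Base case: a single-digit number is returned as-is.)
    if n < 10:
        return n
    # 재귀 단계: 마지막 자릿수를 더하고 나머지 자릿수로 재귀 호출합니다.
    # (Recursive step: add the last digit, then recurse on the remaining digits.)
    return n % 10 + sum_digits(n // 10)
```

The fix is the added base case; the comments explain the reasoning in the teammate's language without changing the code's behavior.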

Introducing PaLM 2

Today at I/O 2023, Google introduced PaLM 2, a new language model with improved multilingual, reasoning, and coding capabilities.

While PaLM 2 is highly capable, it really shines when fine-tuned on domain-specific knowledge. We recently released Sec-PaLM, fine-tuned for security use cases. It uses AI to better detect malicious scripts, and it can help security experts understand and resolve threats.

Another example is Med-PaLM 2. In this case, it’s fine-tuned on medical knowledge. This fine-tuning achieved a 9X reduction in inaccurate reasoning when compared to the base model, approaching the performance of clinician experts who answered the same set of questions. In fact, Med-PaLM 2 was the first language model to perform at “expert” level on medical licensing exam-style questions, and is currently the state of the art.

We’re also working to add capabilities to Med-PaLM 2, so that it can synthesize information from medical imaging like plain films and mammograms. You can imagine an AI collaborator that helps radiologists interpret images and communicate the results. These are some examples of PaLM 2 being used in specialized domains. We can’t wait to see it used in more, which is why I’m pleased to announce that PaLM 2 is now available in preview.

PaLM 2 is the latest step in our decade-long journey to bring AI in responsible ways to billions of people. It builds on progress made by two world-class research teams, the Brain Team and DeepMind.

Looking back at the defining AI breakthroughs over the last decade, these teams have contributed to a significant number of them: AlphaGo, Transformers, sequence-to-sequence models, and so on. All this helped set the stage for the inflection point we’re at today.

We recently brought these two teams together into a single unit, Google DeepMind. Using the computational resources of Google, they’re focused on building more capable systems, safely and responsibly.

This includes our next-generation foundation model, Gemini, which is still in training. Gemini was created from the ground up to be multimodal, highly efficient at tool and API integrations and built to enable future innovations, like memory and planning. While still early, we’re already seeing impressive multimodal capabilities not seen in prior models.

Once fine-tuned and rigorously tested for safety, Gemini will be available at various sizes and capabilities, just like PaLM 2.

AI responsibility: Tools to identify generated content

As we invest in more capable models, we are also deeply investing in AI responsibility. That includes having the tools to identify synthetically generated content whenever you encounter it.

Two important approaches are watermarking and metadata. Watermarking embeds information directly into content in ways that are maintained even through modest image editing. Moving forward, we’re building our models to include watermarking and other techniques from the start. If you look at a synthetic image, it’s impressive how real it looks, so you can imagine how important this is going to be in the future.

Metadata allows content creators to associate additional context with original files, giving you more information whenever you encounter an image. We’ll ensure every one of our AI-generated images has that metadata.
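As a rough illustration of the metadata idea, the sketch below builds a small provenance record for an image and later checks whether a file still matches it. Everything here is an assumption made for the example: the field names, the use of a SHA-256 digest, and the JSON layout are hypothetical and do not describe the format Google uses.

```python
import hashlib
import json


def build_provenance_record(image_bytes: bytes, generator: str) -> str:
    """Return a JSON record describing where an image came from (hypothetical schema)."""
    record = {
        "ai_generated": True,  # assumed flag name
        "generator": generator,  # assumed field: which tool produced the image
        "sha256": hashlib.sha256(image_bytes).hexdigest(),  # ties the record to these exact bytes
    }
    return json.dumps(record, sort_keys=True)


def record_matches(image_bytes: bytes, record_json: str) -> bool:
    """Check that the record was built from exactly these image bytes."""
    record = json.loads(record_json)
    return record["sha256"] == hashlib.sha256(image_bytes).hexdigest()
```

A record like this breaks as soon as the file's bytes change, which is one reason metadata is paired with watermarking that survives modest edits.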

Updates to Bard and Workspace

As models get better and more capable, one of the most exciting opportunities is making them available for people to engage with directly.

That’s the opportunity we have with Bard, our experiment for conversational AI, which we launched in March. We’ve been rapidly evolving Bard. It now supports a wide range of programming capabilities, and it’s gotten much smarter at reasoning and math prompts. And, as of today, it is fully running on PaLM 2.

Source: Google
