
Google Garners Criticism for Demo After Long Awaited ‘Gemini’ Release


Shortly after news spread that Google was delaying the release of its long-awaited AI model called Gemini, Google announced its launch.


Alongside the launch, Google shared a demo showcasing Gemini's impressive, downright amazing, capabilities. Well, you know what they say about things being too good to be true.

Let’s explore what went wrong with the demo and how Gemini stacks up against OpenAI’s GPT-4.


What is Google Gemini?

Rivaling OpenAI’s GPT-4, Gemini is a multimodal AI model, meaning it can process text, image, audio, and code input.

(ChatGPT was unimodal for a long time, processing only text, until OpenAI added multimodal capabilities this year.)

Gemini comes in three versions:

  • Nano: The least powerful version, designed to run on mobile devices like phones and tablets. It’s best for simple, everyday tasks like compressing an audio file or writing email copy.
  • Pro: This version handles more complex tasks such as language translation and designing marketing campaigns, and it now powers Google’s AI tools like Bard and Google Assistant.
  • Ultra: The largest and most powerful version of Gemini, with access to large data sets and the processing power to perform tasks such as solving scientific problems and creating advanced AI applications.

Ultra isn’t yet available to consumers; its rollout is scheduled for early 2024 while Google conducts final tests to ensure it’s safe for commercial use. Gemini Nano will power Google’s Pixel 8 Pro phone, which has built-in AI features.

On the other hand, Gemini Pro will run Google tools like Bard starting today and is available via API through Google AI Studio and Google Cloud Vertex AI.
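Since Gemini Pro is exposed via API, here is a minimal sketch of what a call might look like from Python. The `google-generativeai` package, the `"gemini-pro"` model id, and the method names are assumptions based on Google's launch announcement, not a verified walkthrough of the final API; check the official docs before relying on any of it.

```python
# Hedged sketch: calling Gemini Pro through Google's generative AI SDK.
# Package name, model id ("gemini-pro"), and method signatures are
# assumptions from the launch announcement.
import os


def build_prompt(task: str, context: str = "") -> str:
    """Combine optional context and a task into one text prompt."""
    if context:
        return f"{context}\n\nTask: {task}"
    return f"Task: {task}"


# Only attempt a live call if an API key is configured.
if os.environ.get("GOOGLE_API_KEY"):
    import google.generativeai as genai  # pip install google-generativeai

    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    model = genai.GenerativeModel("gemini-pro")
    response = model.generate_content(
        build_prompt("Summarize Gemini's three model sizes in one sentence each.")
    )
    print(response.text)
```

The key-guard at the bottom keeps the sketch runnable even without credentials; only the prompt-building helper executes unconditionally.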

Was Google’s Gemini demo a hoax?

Google released a six-minute YouTube demo of Gemini’s skills in language, game creation, logic and spatial thinking, cultural understanding and more.

Watch the video and it’s easy to be amazed.

Gemini recognizes a duck from a simple drawing, catches a sleight-of-hand trick, and completes visual puzzles, to name just a few tasks.

However, after the video earned more than 2 million views, a Bloomberg report revealed that it was cut and spliced in ways that inflated Gemini’s performance.

Google shared a disclaimer at the start of the video: “For the purposes of this demonstration, latency has been reduced and Gemini outputs have been truncated for brevity.”

However, Bloomberg points out that the disclaimer leaves out several important details:

  • The video isn’t done in real time or via voice output, which suggests that conversations won’t be as smooth as shown in the demo.
  • The model used in the video is the Gemini Ultra, which is not yet available to the public.

In reality, Gemini handled the demo’s inputs through still photos and written instructions.

It’s like showing everyone a video of your dog’s best trick.

Share the clip over text and everyone is impressed. See it in person, though, and it turns out the trick takes a whole bunch of treats, plenty of petting, and a hundred patient repetitions.

Let’s make a side-by-side comparison.

In one 8-second clip, we see a person gesturing with his hand as if playing the game used to settle all friendly disputes. Gemini replies, “I know what you’re doing. You’re playing rock-paper-scissors.”

[Image: Gemini demo (image source)]

But what actually happened behind the scenes involved a lot more spoon-feeding.

In the actual interaction, the user submitted each hand gesture individually and asked Gemini to describe what it saw.

[Image: Google Gemini demo (image source)]

From there, the user combined all three images, asked Gemini again, and included a big hint.

[Image: Google Gemini demo (image source)]

While it’s still impressive how well Gemini can process images and understand context, the video downplays how much control is required for Gemini to generate the right response.
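The combine-the-images-and-add-a-hint step described above can be sketched as a multimodal prompt. As before, the `google-generativeai` SDK calls and the `"gemini-pro-vision"` model name are assumptions from Google's launch materials, and the image filenames and hint text are hypothetical stand-ins.

```python
# Hedged sketch of the demo's real workflow: several still images plus a
# text hint sent as one multimodal prompt. SDK calls and the
# "gemini-pro-vision" model name are assumed from launch materials.
import os


def build_prompt_parts(images, hint):
    """Assemble the prompt: gesture images first, then the text nudge."""
    return list(images) + [hint]


# Only attempt a live call if an API key is configured.
if os.environ.get("GOOGLE_API_KEY"):
    import google.generativeai as genai  # pip install google-generativeai
    from PIL import Image

    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    model = genai.GenerativeModel("gemini-pro-vision")
    parts = build_prompt_parts(
        [Image.open(f"gesture_{i}.jpg") for i in range(1, 4)],  # hypothetical files
        "Hint: it's a game.",  # the "big hint" the polished video never mentions
    )
    print(model.generate_content(parts).text)
```

The point of the sketch is the ordering: the model sees all three gestures together plus an explicit nudge, which is considerably more hand-holding than the edited video implies.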

While this has brought Google plenty of criticism, some point out that it’s not uncommon for companies to use editing to create more seamless, idealistic use cases in their demos.

Gemini vs. GPT-4

Until now, GPT-4, created by OpenAI, has been the most powerful AI model on the market, and Google and other AI players have been working hard to build a model that can beat it.

Google first teased Gemini in September, suggesting it would beat GPT-4, and technically, it did.

Gemini outperforms GPT-4 in a number of benchmarks set by artificial intelligence researchers.

[Image: Gemini vs. GPT-4 benchmark comparison (image source)]

However, the Bloomberg article points out something important.

For a model that took so long to release, the fact that it’s only marginally better than GPT-4 isn’t the big win Google was aiming for.

OpenAI released GPT-4 back in March. Nine months later, Google’s Gemini is better, but only by a few percentage points.

So how long will it take for OpenAI to release an even bigger and better version? Judging by last year, it probably won’t be long.

For now, Gemini seems to be the better option, but that won’t be clear until Ultra arrives in early 2024.



