Project Astra is here to trade blows with ChatGPT

Google I/O is underway, and the company has announced a slate of AI features and capabilities designed to further cement Gemini as one of the most capable AI models on the planet. One of the most impressive announcements was Project Astra, an AI assistant that can see and reason about the world through your phone's camera.

Google also announced that it is rolling out Gemini 1.5 Pro, its most advanced and powerful model, to more people. The model has a massive 1 million-token context window, which makes it one of the most capable models to date. Gemini 1.5 Pro is now available to developers globally, as well as to Gemini Advanced subscribers.
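For developers, that access comes through the Gemini API. As a rough illustration, here is a minimal sketch of calling Gemini 1.5 Pro with Google's google-generativeai Python SDK; the prompt and the API key placeholder are assumptions for the example, not anything Google showed on stage.

```python
# Minimal sketch: calling Gemini 1.5 Pro via Google's
# google-generativeai SDK (pip install google-generativeai).
# The prompt and "YOUR_API_KEY" placeholder are illustrative.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# "gemini-1.5-pro-latest" points at the current 1.5 Pro release.
model = genai.GenerativeModel("gemini-1.5-pro-latest")

response = model.generate_content(
    "Summarize the biggest AI announcements from Google I/O 2024."
)
print(response.text)
```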

Google introduced Project Astra

Just one day before Google I/O, OpenAI held its Spring Update event, where it announced GPT-4o. This is OpenAI's most advanced model to date, and it has a neat feature called Vision. With Vision, you can point the viewfinder in the ChatGPT app at the world around you and have ChatGPT respond in real time, answering questions about what it sees.

Well, Google just announced its own version of that, called Project Astra. With it, your phone's viewfinder becomes a set of eyes for Gemini. In the demo, the presenter asked Gemini which of their devices makes noise, then pointed the viewfinder at a speaker. Gemini recognized the object and pointed out that it was the device that made noise.
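Project Astra itself isn't something you can call yet, but the underlying idea, feeding Gemini an image alongside a question, is already possible through the public Gemini API. Here's a hedged sketch using the same google-generativeai SDK as above; the image file name and the question are made up for illustration, not taken from the demo.

```python
# Sketch of multimodal prompting: pass a captured frame plus a
# question to Gemini 1.5 Pro. The file "speaker.jpg" and the
# question are illustrative assumptions.
import google.generativeai as genai
import PIL.Image

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-pro-latest")

frame = PIL.Image.open("speaker.jpg")  # a single camera frame
response = model.generate_content(
    [frame, "Which device in this photo makes sound, and why?"]
)
print(response.text)
```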

Not only that, but it was able to look at a screen displaying code and explain what the code was for, come up with a band name based on a stuffed tiger sitting next to a dog, and even remember where the presenter's glasses were.

Along with that, you can draw on the screen to call attention to specific things. For example, the presenter drew an arrow pointing to the tweeter on the aforementioned speaker, and Gemini was able to identify what it was.

Obviously, this is a feature that will draw comparisons to GPT-4o. Project Astra isn't available yet, but Google says some of its capabilities will come to the Gemini app later this year.

AI glasses

During the presentation, we also got a sneak peek at a pair of AI glasses that Google may be experimenting with. Wearing them, the presenter was able to do much the same thing she did with the Gemini app: look at the world and ask questions about what the glasses saw. At this point, we have no idea when, or even if, Google will launch these glasses.