Google's new Gemini 3 "vibe-codes" responses and comes with its own agent

Nov-18-2025, 16:00:07 GMT–MIT Technology Review

Google today unveiled Gemini 3, a major upgrade to its flagship multimodal model. The firm says the new model is better at reasoning, has more fluid multimodal capabilities (the ability to work across voice, text, or images), and will work like an agent. The previous model, Gemini 2.5, supports multimodal input: users can feed it images, handwriting, or voice. But it usually requires explicit instructions about the format the user wants back, and it otherwise defaults to plain text. Gemini 3 introduces what Google calls "generative interfaces," which allow the model to choose what kind of output best fits the prompt, assembling visual layouts and dynamic views instead of returning a block of text. Ask for travel recommendations and it may spin up a website-like interface inside the app, complete with modules, images, and follow-up prompts such as "How many days are you traveling?" or "What kinds of activities do you enjoy?" It also presents clickable options based on what you might want next. When asked to explain a concept, Gemini 3 may sketch a diagram or generate a simple animation unprompted if it judges a visual to be more effective.


