gboard
Google's Gemini AI is coming to Android
Google is bringing Gemini, the new large language model it just introduced, to Android, beginning with the Pixel 8 Pro. The company's flagship smartphone will run Gemini Nano, a version of the model built specifically to run locally on smaller devices, Google announced in a blog post. The Pixel 8 Pro is powered by the Google Tensor G3 chip, which is designed to speed up AI performance. This lets the Pixel 8 Pro make several existing features smarter. The phone's Recorder app, for instance, has a Summarize feature that currently needs a network connection to give you a summary of recorded conversations, interviews, and presentations.
Why isn't there more training on the edge? « Pete Warden's blog
One of the most frequent questions I get asked by people exploring machine learning beyond cloud and desktop machines is "What about training?". If you look around at the popular frameworks and use cases of edge ML, most of them seem focused on inference. It isn't obvious why this is the case though, so I decided to collect my notes in a post here, so I can have something to refer to when this comes up (and organize my own thoughts too!). I think the biggest reason that there's not more training on the edge is that most models need to be trained through supervised learning, that is, each sample used for training needs a ground truth label. If you're running on a phone or embedded system, there's not likely to be an easy way to attach a label to incoming data, other than running an existing model and guessing.
Google may integrate AI text-to-image generator to Gboard for Android
Google is expected to introduce a host of AI features for its products in the near future, and among them, Gboard for Android is working to integrate the Imagen text-to-image generator, the media reported. A recent APK (Android Package Kit) teardown conducted by 9to5Google found that the latest beta version of Gboard contains lines of code that mention an "Imagen Keyboard". This Imagen feature will appear in the shortcuts strip/page, like Clipboard, Translate, and One-handed. For people who are unfamiliar with Imagen, it is similar to the popular text-to-image generator DALL-E 2, which is owned by ChatGPT creator OpenAI. It is capable of creating images based on the request users submit to it, according to the report. Google's research found that more people preferred Imagen's results over DALL-E's.
Collaborative Learning: Next great frontiers in AI
The field of machine learning is constantly evolving, sometimes slowly, and at other times we experience the tech equivalent of the Cambrian Explosion, with rapid advances that give a good many data scientists a serious case of imposter syndrome. It has only been 8 years since the modern era of deep learning began at the 2012 ImageNet competition. Which novel AI approaches will unlock currently unimaginable possibilities in technology and business? This article highlights emerging areas within AI that are poised to redefine the field -- and society -- in the years ahead. Unsupervised learning more closely mirrors the way that humans learn about the world: through open-ended exploration and inference, without a need for the "training wheels" of supervised learning.
How Google's Android Keyboard Keeps 'Smart Replies' Private
Google has infused its so-called Smart Reply feature, which uses machine learning to suggest words and sentences you may want to type next, into various email products for the last several years. But with Android 11, those contextual nudges--including emojis and stickers--are built directly into Gboard, Google's popular keyboard app. They can follow you everywhere you type. The challenge is figuring out how to keep the AI that powers all of this from becoming a privacy nightmare. Google has been adamant for years that Gboard doesn't retain or send any data about your keystrokes.
Google squeezed an offline dictation AI into its keyboard app
Google has updated its Gboard keyboard app for Android with AI-powered dictation that works offline. The company says it's effectively miniaturized a cloud-based neural network system for speech recognition into an 80MB mobile app update, and that it'll allow for faster and more reliable dictation on the go. That's big, because it means you don't need your phone connecting to a server to deliver high-quality speech recognition results – and you also don't need to have access to a high-speed Wi-Fi network to use the feature. The new system has been in the works since 2014, and it eschews the traditional three-step process for speech recognition for a single-step solution. Typically, speech recognition software first maps spoken words into smaller segments of audio called phonemes, then connects these phonemes to form indexed words, and finally turns those into text.
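The traditional three-step process described above can be pictured with a toy sketch. Everything in it is an assumption made for illustration: the segment names, the lookup tables, and the function names are hypothetical, and a real recognizer uses learned acoustic, pronunciation, and language models rather than dictionaries. It does not represent Google's system, which replaces these stages with a single model.

```python
from typing import Dict, List, Tuple

# Hypothetical lookup tables standing in for learned models.
SEGMENT_TO_PHONEME: Dict[str, str] = {
    "seg_h": "HH", "seg_ay": "AY",
    "seg_dh": "DH", "seg_eh": "EH", "seg_r": "R",
}
PHONEMES_TO_WORD: Dict[Tuple[str, ...], str] = {
    ("HH", "AY"): "hi",
    ("DH", "EH", "R"): "there",
}


def audio_to_phonemes(segments: List[str]) -> List[str]:
    """Step 1: map each audio segment to a phoneme."""
    return [SEGMENT_TO_PHONEME[s] for s in segments]


def phonemes_to_words(phonemes: List[str]) -> List[str]:
    """Step 2: connect phonemes into words (greedy longest match)."""
    words: List[str] = []
    i = 0
    while i < len(phonemes):
        for j in range(len(phonemes), i, -1):
            word = PHONEMES_TO_WORD.get(tuple(phonemes[i:j]))
            if word is not None:
                words.append(word)
                i = j
                break
        else:
            raise ValueError(f"no word matches phonemes starting at index {i}")
    return words


def words_to_text(words: List[str]) -> str:
    """Step 3: turn the word sequence into output text."""
    return " ".join(words)


def transcribe(segments: List[str]) -> str:
    """Chain the three steps; an end-to-end model would do this in one."""
    return words_to_text(phonemes_to_words(audio_to_phonemes(segments)))


result = transcribe(["seg_h", "seg_ay", "seg_dh", "seg_eh", "seg_r"])
```

Each stage here can fail or lag independently, which is one way to see why collapsing them into a single step is attractive for an on-device recognizer.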
Google's real-time speech recognition AI can run offline on Pixel
You can now dictate your texts with Google's Gboard keyboard even when you're offline, at least if you use a Pixel. Google's AI team announced that it updated Gboard's speech recognizer to recognize characters one-by-one as they're spoken, and it is now hosted directly on the device. By no longer having to send data over the internet, Gboard's voice typing should now be faster and more reliable. Google explained in a blog post that it wanted to create a speech recognizer that was "compact enough to reside on a phone" and wouldn't be derailed by unreliable WiFi or mobile networks. Voice recognition traditionally works by breaking apart the words you speak into smaller parts known as phonemes, according to Science Line.
Gboard on Pixel phones now uses an on-device neural network for speech recognition
On-device machine learning algorithms afford plenty of advantages, namely low latency and availability -- because processing is performed locally as opposed to remotely on a server, connectivity has no bearing on performance. Google sees the wisdom in this: It announced today that Gboard, its cross-platform virtual keyboard app, now uses an end-to-end recognizer to power American English speech input on Pixel smartphones. "This means no more network latency or spottiness -- the new recognizer is always available, even when you are offline," Johan Schalkwyk, a fellow on Google's Speech Team, wrote in a blog post. "The model works at the character level, so that as you speak, it outputs words character-by-character, just as if someone was typing out what you say in real-time, and exactly as you'd expect from a keyboard dictation system." It's more complicated than it sounds.
Gboard AI May Now Be Able To Read Even Your Doctor's Handwriting
Developers responsible for the machine learning in Google's Gboard application have created a new model that may be able to discern even some of the worst handwriting around, based on a recently reported explanation from the company. The AI-driven keyboard mode has improved substantially since its initial launch, but the latest changes may be the biggest yet. Revisions to the machine learning behind the handwriting feature have produced new model architectures and training methodologies that allow an improvement of 20 to 40 percent over the previous iteration. The latest method is described in some detail in a new paper published by the company and seems exceptionally complex at first glance but may actually be much more intuitive. It's based on the introduction of recognition for touch points, Bézier curves, and recurrent neural networks (RNN) -- specifically quasi-recurrent neural networks (QRNN).
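To give a sense of why Bézier curves suit handwriting, here is a minimal sketch of how one stroke can be described by a few control points instead of every raw touch point. It assumes a cubic curve with four control points, and the coordinates and function names are hypothetical. It shows only the standard curve formula, not how Gboard fits curves to touch data or feeds them to its network.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cubic_bezier(p0: Point, p1: Point, p2: Point, p3: Point, t: float) -> Point:
    """Evaluate a cubic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    # Bernstein basis weights for the four control points.
    b0, b1, b2, b3 = u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3
    x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
    y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
    return (x, y)


def sample_stroke(control_points: List[Point], n: int = 5) -> List[Point]:
    """Sample n evenly spaced parameter values along one curve segment."""
    if n < 2:
        raise ValueError("n must be at least 2")
    p0, p1, p2, p3 = control_points
    return [cubic_bezier(p0, p1, p2, p3, i / (n - 1)) for i in range(n)]


# Hypothetical control points for a single arch-shaped stroke.
stroke = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]
points = sample_stroke(stroke, n=5)
```

The curve starts at the first control point and ends at the last, while the two middle points shape the arc, so four coordinate pairs can stand in for a much longer list of sampled finger positions.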