Referring to Screen Texts with Voice Assistants

Bhargava, Shruti, Dhoot, Anand, Jonsson, Ing-Marie, Nguyen, Hoang Long, Patel, Alkesh, Yu, Hong, Renkens, Vincent

arXiv.org Artificial Intelligence

Voice assistants help users make phone calls, send messages, create events, navigate, and do a lot more. However, assistants have limited capacity to understand their users' context. In this work, we aim to take a step toward closing that gap. Our work explores a new experience that lets users refer to phone numbers, addresses, email addresses, URLs, and dates on their phone screens. Our focus lies in reference understanding, which becomes particularly interesting when multiple similar texts are present on screen, similar to visual grounding. We collect a dataset and propose a lightweight general-purpose model for this novel experience. Due to the high cost of consuming pixels directly, our system is designed to rely on the extracted text from the UI. Our model is modular, thus offering flexibility, improved interpretability, and efficient runtime memory utilization.


AI may issue harsher punishments, severe judgments than humans: Study

FOX News

Chris Winfield, founder of Understanding A.I., tells 'Fox & Friends Weekend' host Will Cain about a study showing patients preferred medical answers from artificial intelligence over doctors. Artificial intelligence fails to match humans in judgment calls and is more prone to issue harsher penalties and punishments for rule breakers, according to a new study from MIT researchers. The finding could have real world implications if AI systems are used to predict the likelihood of a criminal reoffending, which could lead to longer jail sentences or setting bail at a higher price tag, the study said. Researchers at the Massachusetts university, as well as Canadian universities and nonprofits, studied machine-learning models and found that when AI is not trained properly, it makes more severe judgment calls than humans. Human participants then labeled the photos or text, with their responses used to train AI systems.