AITopics | blind user

EgoBlind: Towards Egocentric Visual Assistance for the Blind

Neural Information Processing Systems | Jun-15-2026, 17:51:34 GMT

We present EgoBlind, the first egocentric VideoQA dataset collected from blind individuals to evaluate the assistive capabilities of contemporary multimodal large language models (MLLMs). EgoBlind comprises 1,392 first-person videos from the daily lives of blind and visually impaired individuals. It also features 5,311 questions directly posed or verified by the blind to reflect their in-situation needs for visual assistance. Each question has an average of 3 manually annotated reference answers to reduce subjectiveness. Using EgoBlind, we comprehensively evaluate 16 advanced MLLMs and find that all models struggle. The best performers achieve an accuracy near 60%, which is far behind human performance of 87.4%. To guide future advancements, we identify and summarize major limitations of existing MLLMs in egocentric visual assistance for the blind and explore heuristic solutions for improvement. With these efforts, we hope that EgoBlind will serve as a foundation for developing effective AI assistants to enhance the independence of the blind and visually impaired. Data and code are available at https://github.
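The abstract reports accuracy against roughly three reference answers per question but does not say how a model answer is matched to a reference. As a minimal sketch, assuming a prediction counts as correct when it matches any one of its references, the score could be computed as below. The function names and the normalized exact-match rule are hypothetical stand-ins for illustration, not the EgoBlind evaluation protocol; the `is_match` parameter lets a different judge be swapped in.

```python
from typing import Callable, Sequence


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def exact_match(prediction: str, reference: str) -> bool:
    """Hypothetical matching rule: normalized strings must be identical."""
    return normalize(prediction) == normalize(reference)


def multi_reference_accuracy(
    predictions: Sequence[str],
    references: Sequence[Sequence[str]],
    is_match: Callable[[str, str], bool] = exact_match,
) -> float:
    """Fraction of predictions that match at least one of their references."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not predictions:
        return 0.0
    correct = sum(
        any(is_match(pred, ref) for ref in refs)
        for pred, refs in zip(predictions, references)
    )
    return correct / len(predictions)
```

Accepting any of several references is one way to soften the subjectivity the abstract mentions, since differently worded but equally valid answers can each be credited.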

large language model, machine learning, natural language

Country: Asia > China (1.00)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

EgoBlind: Towards Egocentric Visual Assistance for the Blind People

Xiao, Junbin, Huang, Nanxin, Qiu, Hao, Tao, Zhulin, Yang, Xun, Hong, Richang, Wang, Meng, Yao, Angela

arXiv.org Artificial Intelligence | Mar-11-2025

We present EgoBlind, the first egocentric VideoQA dataset collected from blind individuals to evaluate the assistive capabilities of contemporary multimodal large language models (MLLMs). EgoBlind comprises 1,210 videos that record the daily lives of real blind users from a first-person perspective. It also features 4,927 questions directly posed or generated and verified by blind individuals to reflect their needs for visual assistance under various scenarios. We provide each question with an average of 3 reference answers to reduce subjectivity in evaluation. Using EgoBlind, we comprehensively evaluate 15 leading MLLMs and find that all models struggle, with the best performers achieving accuracy around 56%, far behind human performance of 87.4%. To guide future advancements, we identify and summarize major limitations of existing MLLMs in egocentric visual assistance for the blind and provide heuristic suggestions for improvement. With these efforts, we hope EgoBlind can serve as a valuable foundation for developing more effective AI assistants to enhance the independence of blind individuals.

blind people, question type, video

arXiv: 2503.08221

Country:

North America > United States (0.04)
Asia > Singapore (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report (1.00)

Industry:

Education (0.67)
Health & Medicine > Consumer Health (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Understanding How Blind Users Handle Object Recognition Errors: Strategies and Challenges

Hong, Jonggi, Kacorri, Hernisa

arXiv.org Artificial Intelligence | Aug-6-2024

Object recognition technologies hold the potential to support blind and low-vision people in navigating the world around them. However, the gap between benchmark performances and practical usability remains a significant challenge. This paper presents a study aimed at understanding blind users' interaction with object recognition systems for identifying and avoiding errors. Leveraging a pre-existing object recognition system, URCam, fine-tuned for our experiment, we conducted a user study involving 12 blind and low-vision participants. Through in-depth interviews and hands-on error identification tasks, we gained insights into users' experiences, challenges, and strategies for identifying errors in camera-based assistive technologies and object recognition systems. During interviews, many participants preferred independent error review, while expressing apprehension toward misrecognitions. In the error identification task, participants varied viewpoints, backgrounds, and object sizes in their images to avoid and overcome errors. Even after repeating the task, participants identified only half of the errors, and the proportion of errors identified did not significantly differ from their first attempts. Based on these insights, we offer implications for designing accessible interfaces tailored to the needs of blind and low-vision users in identifying object recognition errors.

participant, proceedings, recognition

doi: 10.1145/3663548.3675635

arXiv: 2408.03303

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
North America > Canada > Newfoundland and Labrador > Newfoundland > St. John's (0.05)
North America > United States > New York > New York County > New York City (0.04)

Genre:

Questionnaire & Opinion Survey (1.00)
Personal > Interview (1.00)
Research Report > New Finding (0.93)

Industry:

Health & Medicine (1.00)
Automobiles & Trucks (0.92)
Information Technology > Security & Privacy (0.67)
Transportation > Ground > Road (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Toucha11y: Making Inaccessible Public Touchscreens Accessible

Li, Jiasheng, Yan, Zeyu, Shah, Arush, Lazar, Jonathan, Peng, Huaishu

arXiv.org Artificial Intelligence | May-6-2023

Despite their growing popularity, many public kiosks with touchscreens are inaccessible to blind people. Toucha11y is a working prototype that allows blind users to use existing inaccessible touchscreen kiosks independently and with little effort. Toucha11y consists of a mechanical bot that a blind user can attach to an arbitrary touchscreen kiosk and a companion app on their smartphone. The bot, once attached to a touchscreen, will recognize its content, retrieve the corresponding information from a database, and render it on the user's smartphone. As a result, a blind person can use the smartphone's built-in accessibility features to access content and make selections. The mechanical bot will detect and activate the corresponding touchscreen interface. We present the system design of Toucha11y along with a series of technical evaluations. Through a user study, we found that Toucha11y could help blind users operate inaccessible touchscreen devices.

artificial intelligence, interface, toucha11y

doi: 10.1145/3544548.3581254

arXiv: 2305.04097

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > Germany > Hamburg (0.05)

Genre: Questionnaire & Opinion Survey (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.69)
Consumer Products & Services > Food, Beverage, Tobacco & Cannabis (0.68)
Law (0.68)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)

AI and Accessibility

Communications of the ACM | May-24-2020, 01:10:43 GMT

According to the World Health Organization, more than one billion people worldwide have disabilities. The field of disability studies defines disability through a social lens; people are disabled to the extent that society creates accessibility barriers. AI technologies offer the possibility of removing many accessibility barriers; for example, computer vision might help people who are blind better sense the visual world, speech recognition and translation technologies might offer real-time captioning for people who are hard of hearing, and new robotic systems might augment the capabilities of people with limited mobility. Considering the needs of users with disabilities can help technologists identify high-impact challenges whose solutions can advance the state of AI for all users; however, ethical challenges such as inclusivity, bias, privacy, error, expectation setting, simulated data, and social acceptability must be considered. The inclusivity of AI systems refers to whether they are effective for diverse user populations.

artificial intelligence, disability, natural language

AI-Alerts: 2020 > 2020-05 > AAAI AI-Alert for May 26, 2020 (1.00)

Country: North America > United States > Washington > King County > Seattle (0.14)

Genre: Research Report > New Finding (0.47)

Industry: Health & Medicine > Therapeutic Area (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Communications > Social Media (0.95)
Information Technology > Artificial Intelligence > Applied AI (0.69)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.35)

Google's Lookout app says what it sees for blind users in the US

Engadget | Mar-13-2019, 10:56:08 GMT

Google's Lookout is now finally available for download, though it's only compatible with Pixel devices in the US set to English at the moment. The application was first announced at Google's annual I/O Conference in 2018 and was designed to help the blind and visually impaired navigate their surroundings. It comes with three modes: Explore, Shopping and Quick Read. Explore, its default mode, gives users audio cues about their environment, telling them if there's a chair or a cute dog blocking the way, for instance. Shopping can read barcodes and currency, giving users a way to, say, make sure they're truly holding a $5 bill.

artificial intelligence, google, image understanding

Country: North America > United States (0.64)

Technology: Information Technology > Artificial Intelligence > Vision > Image Understanding (0.40)

Blind users can now explore photos by touch with Microsoft's Seeing AI

#artificialintelligence | Mar-13-2019, 07:01:47 GMT

Microsoft's Seeing AI is an app that lets blind and limited-vision folks convert visual data into audio feedback, and it just got a useful new feature. Users can now use touch to explore the objects and people in photos. It's powered by machine learning, of course, specifically object and scene recognition. All you need to do is take a photo or open one up in the viewer and tap anywhere on it. "This new feature enables users to tap their finger to an image on a touch-screen to hear a description of objects within an image and the spatial relationship between them," wrote Seeing AI lead Saqib Shaikh in a blog post. "The app can even describe the physical appearance of people and predict their mood."

artificial intelligence, explore photo, microsoft

Technology: Information Technology > Artificial Intelligence (0.80)

Personalized Dynamics Models for Adaptive Assistive Navigation Interfaces

Ohn-Bar, Eshed, Kitani, Kris, Asakawa, Chieko

arXiv.org Machine Learning | Apr-11-2018

We explore the role of personalization for assistive navigational systems (e.g., service robot, wearable system or smartphone app) that guide visually impaired users through speech, sound and haptic-based instructional guidance. Based on our analysis of real-world users, we show that the dynamics of blind users cannot be accounted for by a single universal model but instead must be learned on an individual basis. To learn personalized instructional interfaces, we propose PING (Personalized INstruction Generation agent), a model-based reinforcement learning framework which aims to quickly adapt its state transition dynamics model to match the reactions of the user using a novel end-to-end learned weighted majority-based regression algorithm. In our experiments, we show that PING learns dynamics models significantly faster compared to baseline transfer learning approaches on real-world data. We find that through better reasoning over personal mobility nuances, interaction with surrounding obstacles, and the current navigation task, PING is able to improve the performance of instructional assistive navigation at the most crucial junctions such as turns or veering paths. To enable sufficient planning time over user responses, we emphasize prediction of human motion for long horizons. Specifically, the learned dynamics models are shown to consistently improve long-term position prediction by over 1 meter on average (nearly the width of a hallway) compared to baseline approaches even when considering a prediction horizon of 20 seconds into the future.
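To make the idea of weighting several candidate dynamics models concrete, here is a minimal sketch of a classic exponentially weighted average forecaster for regression. It is a generic textbook technique swapped in for illustration, not the authors' end-to-end learned PING algorithm; the class name, the one-dimensional state, the squared-error loss, and the learning rate `eta` are all assumptions.

```python
import math
from typing import Callable, List, Sequence

# A candidate dynamics model maps the current position to a predicted next position.
Expert = Callable[[float], float]


class WeightedMajorityRegressor:
    """Exponentially weighted average of expert predictions (hypothetical sketch)."""

    def __init__(self, experts: Sequence[Expert], eta: float = 1.0) -> None:
        if not experts:
            raise ValueError("at least one expert is required")
        self.experts: List[Expert] = list(experts)
        self.eta = eta
        # Start from uniform weights: no expert is preferred before any data arrives.
        self.weights: List[float] = [1.0 / len(self.experts)] * len(self.experts)

    def predict(self, x: float) -> float:
        """Weighted average of the experts' predictions for input x."""
        return sum(w * e(x) for w, e in zip(self.weights, self.experts))

    def update(self, x: float, y: float) -> None:
        """Down-weight each expert according to its squared error on (x, y)."""
        losses = [(e(x) - y) ** 2 for e in self.experts]
        best = min(losses)  # subtracting the minimum loss avoids underflow in exp
        scaled = [
            w * math.exp(-self.eta * (loss - best))
            for w, loss in zip(self.weights, losses)
        ]
        total = sum(scaled)
        self.weights = [s / total for s in scaled]
```

For example, with one expert that assumes a slow walker (`lambda x: x + 0.5`) and one that assumes a fast walker (`lambda x: x + 1.5`), repeatedly calling `update` with observations from a user who advances about 1.4 units per step shifts most of the weight to the fast-walker model, which is the per-user adaptation the abstract argues a single universal model cannot provide.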

dynamic model, machine learning, reinforcement learning

arXiv: 1804.04118

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology (1.00)
Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Microsoft's AI will describe images in Word and PowerPoint for blind users

#artificialintelligence | Dec-3-2016, 15:00:10 GMT

Artificial intelligence may be making small and steady advances in general-purpose situations like digital assistants. But it's the more subtle AI accessibility features that have a more substantial impact today, especially for users with disabilities. For instance, an upcoming feature for Office apps like Microsoft Word and PowerPoint will automatically suggest image and slide deck captions, called alt-text, using AI algorithms. That way, when those files are presented to blind users, computer tools designed to translate the information onscreen into audio have text descriptions to work with. Microsoft is accomplishing this feat with its Computer Vision Cognitive Service, which uses neural networks trained with deep learning techniques to better understand and describe the contents of images.

artificial intelligence, machine learning, word and powerpoint

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.60)

Is It Harmful When Advisors Only Pretend to Be Honest?

Wang, Dongxia (Nanyang Technological University) | Muller, Tim (Nanyang Technological University) | Zhang, Jie (Nanyang Technological University) | Liu, Yang (Nanyang Technological University)

AAAI Conferences | Apr-19-2016

In trust systems, unfair rating attacks, where advisors provide ratings dishonestly, influence the accuracy of trust evaluation. A secure trust system should function properly under all possible unfair rating attacks, including dynamic attacks. In the literature, camouflage attacks are the most studied dynamic attacks. But an open question is whether more harmful dynamic attacks exist. We propose random processes to model and measure dynamic attacks. The harm of an attack is influenced by a user's ability to learn from the past. We consider three types of users: blind users, aware users, and general users. We found that for all three types, camouflage attacks are far from the most harmful. We identified the most harmful attacks, under which we found the ratings may still be useful to users.

artificial intelligence, attacker, information leakage

Thirtieth AAAI Conference on Artificial Intelligence

Industry:

Information Technology > Security & Privacy (0.68)
Government (0.68)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Security & Privacy (0.68)
