visual exploration
Brain Diffusion for Visual Exploration: Cortical Discovery using Large Scale Generative Models
A long-standing goal in neuroscience has been to elucidate the functional organization of the brain. Within higher visual cortex, functional accounts have remained relatively coarse, focusing on regions of interest (ROIs) and taking the form of selectivity for broad categories such as faces, places, bodies, food, or words. Because the identification of such ROIs has typically relied on manually assembled stimulus sets consisting of isolated objects in non-ecological contexts, exploring functional organization without robust a priori hypotheses has been challenging. To overcome these limitations, we introduce a data-driven approach in which we synthesize images predicted to activate a given brain region using paired natural images and fMRI recordings, bypassing the need for category-specific stimuli. Our approach -- Brain Diffusion for Visual Exploration (BrainDiVE) -- builds on recent generative methods by combining large-scale diffusion models with brain-guided image synthesis. Validating our method, we demonstrate the ability to synthesize preferred images with appropriate semantic specificity for well-characterized category-selective ROIs. We then show that BrainDiVE can characterize differences between ROIs selective for the same high-level category. Finally, we identify novel functional subdivisions within these ROIs, validated with behavioral data. These results advance our understanding of the fine-grained functional organization of human visual cortex, and provide well-specified constraints for further examination of cortical organization using hypothesis-driven methods.
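The core loop of brain-guided synthesis can be illustrated with a toy sketch. Everything below is a hypothetical stand-in, not the paper's implementation: `roi_activation` is a linear readout in place of BrainDiVE's deep fMRI encoder, and `guided_step` replaces the diffusion denoiser with plain noise injection. The point is only the mechanism: each update nudges the sample along the gradient of the predicted ROI activation while a stochastic term keeps the process diffusion-like.

```python
import numpy as np

# Toy stand-in for an fMRI encoder: predicted ROI activation is a linear
# readout of a 64-dim "image feature" vector. The weights w are random and
# hypothetical; a real encoder is a deep network fit to paired images and
# fMRI responses.
rng = np.random.default_rng(0)
d = 64
w = rng.normal(size=d)

def roi_activation(x):
    """Predicted activation of the target ROI for feature vector x."""
    return float(w @ x)

def guided_step(x, noise_scale=0.1, guidance=0.5):
    """One brain-guided update: a gradient nudge toward higher predicted
    ROI activation, plus a stochastic term standing in for the diffusion
    sampler's noise."""
    grad = w                                   # d(activation)/dx for a linear readout
    x = x + guidance * grad                    # brain guidance
    x = x + noise_scale * rng.normal(size=d)   # stochastic diffusion part
    return x

x = rng.normal(size=d)
before = roi_activation(x)
for _ in range(50):
    x = guided_step(x)
after = roi_activation(x)
```

Running the loop drives the predicted ROI activation up, which is the same objective BrainDiVE pursues in the image space of a diffusion model.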
Automatic Screening of Parkinson's Disease from Visual Explorations
Alcala-Durand, Maria F., Puerta-Acevedo, J. Camilo, Arias-Londoño, Julián D., Godino-Llorente, Juan I.
Eye movements can reveal early signs of neurodegeneration, including those associated with Parkinson's Disease (PD). This work investigates the utility of a set of gaze-based features for the automatic screening of PD from different visual exploration tasks. For this purpose, a novel methodology is introduced, combining classic fixation/saccade oculomotor features (e.g., saccade count, fixation duration, scanned area) with features derived from gaze clusters (i.e., regions with a considerable accumulation of fixations). These features are automatically extracted from six exploration tests and evaluated using different machine learning classifiers. A Mixture of Experts ensemble is used to integrate outputs across tests and both eyes. Results show that ensemble models outperform individual classifiers, achieving an Area Under the Receiver Operating Characteristic Curve (AUC) of 0.95 on a held-out test set. The findings support visual exploration as a non-invasive tool for early automatic screening of PD.
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.92)
- Health & Medicine > Therapeutic Area > Neurology > Parkinson's Disease (0.88)
- Health & Medicine > Therapeutic Area > Musculoskeletal (0.88)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
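The ensemble idea above can be sketched in a few lines. The data and the softmax gate below are hypothetical: the paper's Mixture of Experts learns its gating from training data, and its real features come from fixations, saccades, and gaze clusters rather than random numbers. The sketch treats each per-test, per-eye classifier as an expert, combines their probability outputs with skill-based weights, and scores the result with the AUC, computed here from the rank-sum (Mann-Whitney) statistic.

```python
import numpy as np

def auc(y_true, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) statistic:
    the fraction of (positive, negative) pairs ranked correctly, ties 0.5."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0))
                 / (len(pos) * len(neg)))

# Hypothetical expert scores: 6 exploration tests x 2 eyes = 12 experts,
# each column one expert's PD probability for a subject (0 = control, 1 = PD).
rng = np.random.default_rng(1)
n = 40
y = np.repeat([0, 1], n // 2)
experts = np.clip(y[:, None] * 0.4 + rng.normal(0.3, 0.25, size=(n, 12)), 0, 1)

# Simple mixture: weight each expert by its own AUC through a softmax gate
# (a simplification of a learned gating network).
weights = np.array([auc(y, experts[:, j]) for j in range(12)])
weights = np.exp(4 * weights) / np.sum(np.exp(4 * weights))
combined = experts @ weights
```

Averaging the twelve noisy experts suppresses their independent errors, which is why the combined score separates the two groups better than any single test tends to.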
Effect of Haptic Feedback on Avoidance Behavior and Visual Exploration in Dynamic VR Pedestrian Environment
Ishibashi, Kyosuke, Saito, Atsushi, Tun, Zin Y., Ray, Lucas, Coram, Megan C., Sakurai, Akihiro, Okamura, Allison M., Yamamoto, Ko
Human crowd simulation in virtual reality (VR) is a powerful tool with potential applications including emergency evacuation training and assessment of building layout. While haptic feedback in VR enhances the immersive experience, its effect on walking behavior in dense and dynamic pedestrian flows is unknown. Through a user study, we investigated how haptic feedback changes user walking motion in crowded pedestrian flows in VR. The results indicate that haptic feedback changed users' collision avoidance movements, as measured by increased walking trajectory length and change in pelvis angle. The displacements of users' lateral position and pelvis angle were also increased in the instantaneous response to a collision with a non-player character (NPC), even when the NPC was inside the field of view. Haptic feedback also enhanced users' awareness and visual exploration when an NPC approached from the side or back. Furthermore, variation in walking speed was increased by the haptic feedback. These results suggest that haptic feedback enhanced users' sensitivity to collisions in the VR environment.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
- North America > United States > California > Santa Clara County > Stanford (0.04)
- Europe > Germany (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.91)
- Leisure & Entertainment > Games > Computer Games (0.34)
- Information Technology (0.34)
Cluster Exploration using Informative Manifold Projections
Gerolymatos, Stavros, Evangelopoulos, Xenophon, Gusev, Vladimir, Goulermas, John Y.
Data exploration focuses on identifying informative patterns to discover new insight and knowledge about a collection of data. The often high-dimensional nature of such data renders the visual exploration process intractable for the human eye, and therefore specialized manipulation of the original samples is essential in practice. Dimensionality reduction methods have been at the forefront of this challenge (Bishop [2006]), aiming to recover lower-dimensional embeddings of the original data that facilitate the identification of underlying data cohorts and help better understand the problem at hand. Perhaps the most well-known dimensionality reduction approach is principal component analysis (PCA) (Hotelling [1933]), an efficient linear method that maximizes the variance along the projection vectors, which in practice often proves insufficient for a meaningful separation of cohorts. A variety of non-linear methods have also been proposed that instead focus on locally preserving the structure of the data, such as Isomap (Tenenbaum et al. [2000]), LLE (Roweis and Saul [2001]), t-SNE (van der Maaten and Hinton [2008]), UMAP (McInnes and Healy [2018]), TriMap (Amid and Warmuth [2019]), and LargeVis (Tang et al. [2016]). Projection pursuit (PP) (Friedman and Tukey [1974], Caussinus and Ruiz-Gazen [2010]) defines a family of dimensionality reduction methods that can produce various embedding effects depending on a suitably selected criterion. The kurtosis index (Chiang et al. [2001]) is one specific PP example that specializes in identifying "interesting" projections. Its minimization particularly penalizes normality of the data distribution, thus promoting more meaningful separability when searching for clusters. The above approaches nevertheless share the same limitation of offering a single static projection that does not consider any prior knowledge a practitioner may have regarding the high-dimensional latent structure.
Such projections can be uninformative as they tend to illustrate the most evident features which are often already known by the reader.
- Europe > United Kingdom > England > Merseyside > Liverpool (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Spain > Andalusia > Granada Province > Granada (0.04)
- Asia > India > West Bengal > Kolkata (0.04)
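The kurtosis index described above is concrete enough to sketch. The two-cluster data is synthetic, and the random-search optimizer is a crude stand-in for the gradient-based procedures used in practice; the point is the criterion itself: minimizing the standardized fourth moment penalizes Gaussian-looking projections (kurtosis near 3) and favors bimodal, cluster-revealing ones (kurtosis below 3).

```python
import numpy as np

rng = np.random.default_rng(2)

def kurtosis(z):
    """Standardized fourth moment: 3 for a Gaussian, below 3 for bimodal
    (cluster-like) projections."""
    z = (z - z.mean()) / z.std()
    return float(np.mean(z ** 4))

# Two well-separated clusters along the first axis, pure noise elsewhere.
X = rng.normal(size=(400, 10))
X[:200, 0] += 4.0

# Projection pursuit by random search: minimize the kurtosis index over
# unit directions (in practice this would be done with gradient methods).
best_dir, best_k = None, np.inf
for _ in range(2000):
    v = rng.normal(size=10)
    v /= np.linalg.norm(v)
    k = kurtosis(X @ v)
    if k < best_k:
        best_k, best_dir = k, v
```

The winning direction lines up with the cluster axis, and its projection is markedly sub-Gaussian, whereas PCA on the same data would happily report any high-variance direction regardless of cluster structure.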
MultiCaM-Vis: Visual Exploration of Multi-Classification Model with High Number of Classes
Dilawer, Syed Ahsan Ali, Humayoun, Shah Rukh
Visual exploration of multi-classification models with a large number of classes would help machine learning experts identify the root cause of problems that occur during the learning phase, such as misclassification of instances. Most of the previous visual analytics solutions targeted only a few classes. In this paper, we present our interactive visual analytics tool, called MultiCaM-Vis, which provides overview+detail style parallel coordinate views and a Chord diagram for exploration and inspection of class-level misclassification of instances. We also present results of a preliminary user study with 12 participants.
Contrastive Language-Image Pretrained Models are Zero-Shot Human Scanpath Predictors
Zanca, Dario, Zugarini, Andrea, Dietz, Simon, Altstidl, Thomas R., Ndjeuha, Mark A. Turban, Schwinn, Leo, Eskofier, Bjoern
Understanding the mechanisms underlying human attention is a fundamental challenge for both vision science and artificial intelligence. While numerous computational models of free-viewing have been proposed, less is known about the mechanisms underlying task-driven image exploration. To address this gap, we present CapMIT1003, a database of captions and click-contingent image explorations collected during captioning tasks. CapMIT1003 is based on the same stimuli from the well-known MIT1003 benchmark, for which eye-tracking data under free-viewing conditions is available, offering a promising opportunity to concurrently study human attention under both tasks. We make this dataset publicly available to facilitate future research in this field. In addition, we introduce NevaClip, a novel zero-shot method for predicting visual scanpaths that combines contrastive language-image pretrained (CLIP) models with biologically-inspired neural visual attention (NeVA) algorithms. NevaClip simulates human scanpaths by aligning the representation of the foveated visual stimulus and the representation of the associated caption, employing gradient-driven visual exploration to generate scanpaths. Our experimental results demonstrate that NevaClip outperforms existing unsupervised computational models of human visual attention in terms of scanpath plausibility, for both captioning and free-viewing tasks. Furthermore, we show that conditioning NevaClip with incorrect or misleading captions leads to random behavior, highlighting the significant impact of caption guidance in the decision-making process. These findings contribute to a better understanding of mechanisms that guide human attention and pave the way for more sophisticated computational approaches to scanpath prediction that can integrate direct top-down guidance of downstream tasks.
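The caption-conditioned exploration loop reduces to an idea that can be sketched with toy vectors. The embeddings below are random stand-ins, not CLIP outputs, the "matching" location index 7 is arbitrary, and the greedy argmax replaces NevaClip's gradient-driven foveation step: at each step, fixate the unvisited location whose foveated-crop embedding best matches the caption embedding.

```python
import numpy as np

rng = np.random.default_rng(3)

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy stand-ins for CLIP embeddings: one vector per candidate fixation
# (a foveated crop around that location) and one for the caption, built
# so that the caption "matches" location 7.
n_locs, d = 25, 32
patch_emb = rng.normal(size=(n_locs, d))
caption_emb = patch_emb[7] + 0.1 * rng.normal(size=d)

def scanpath(patch_emb, caption_emb, steps=5):
    """Greedy caption-conditioned exploration with inhibition of return:
    at each step, fixate the unvisited location whose embedding is most
    similar to the caption embedding."""
    visited = []
    for _ in range(steps):
        sims = [(-np.inf if i in visited else cosine(e, caption_emb))
                for i, e in enumerate(patch_emb)]
        visited.append(int(np.argmax(sims)))
    return visited

path = scanpath(patch_emb, caption_emb)
```

As in the paper's misleading-caption experiment, swapping `caption_emb` for an unrelated vector makes the first fixation land on an arbitrary location rather than the matching one.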
Cognitive Analytics Answers the Question: What's Interesting in Your Data?
Dimensionality reduction is a critical component of any solution dealing with massive data collections. Being able to sift through a mountain of data efficiently in order to find the key descriptive, predictive, and explanatory features of the collection is a fundamental capability for coping with the avalanche of data that all organizations are collecting. Identifying the most interesting dimensions of data is especially critical when integrating high-dimensional (high-variety) complex data sources, then attempting to visualize the most insightful patterns in the data, and then telling the data's story to your stakeholders. There is a "good news, bad news" angle here. First, the bad news: the human capacity for visualizing multiple dimensions is very limited: 3 or 4 dimensions are manageable; 5 or 6 dimensions are possible (e.g., with colors and symbols); but more dimensions are difficult-to-impossible to assimilate. Now for the good news: the human cognitive ability to detect patterns, anomalies, changes, or other "features" in a large complex "scene" surpasses most computer algorithms for speed and effectiveness.
A Survey of Embodied AI: From Simulators to Research Tasks
Duan, Jiafei, Yu, Samson, Tan, Hui Li, Zhu, Hongyuan, Tan, Cheston
There has been an emerging paradigm shift from the era of "internet AI" to "embodied AI", whereby AI algorithms and agents no longer simply learn from datasets of images, videos or text curated primarily from the internet. Instead, they learn through embodied physical interactions with their environments, whether real or simulated. Consequently, there has been substantial growth in the demand for embodied AI simulators to support a diversity of embodied AI research tasks. This growing interest in embodied AI is beneficial to the greater pursuit of artificial general intelligence, but there is no contemporary and comprehensive survey of this field. This paper comprehensively surveys state-of-the-art embodied AI simulators and research, mapping connections between these. By benchmarking nine state-of-the-art embodied AI simulators in terms of seven features, this paper aims to understand the simulators in their provision for use in embodied AI research. Finally, based upon the simulators and a pyramidal hierarchy of embodied AI research tasks, this paper surveys the main research tasks in embodied AI -- visual exploration, visual navigation and embodied question answering (QA), covering the state-of-the-art approaches, evaluation and datasets.
- North America > United States > New York (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Asia > Singapore > Central Region > Singapore (0.04)
- Overview (1.00)
- Research Report (0.83)
- Leisure & Entertainment > Games (1.00)
- Education (0.67)