AITopics

Generating novel crystalline materials has the potential to lead to advancements in fields such as electronics, energy storage, and catalysis. The defining characteristic of crystals is their symmetry, which plays a central role in determining their physical properties. However, existing crystal generation methods either fail to generate materials that display the symmetries of real-world crystals, or simply replicate the symmetry information from examples in a database. To address this limitation, we propose SymmCD, a novel diffusion-based generative model that explicitly incorporates crystallographic symmetry into the generative process. We decompose crystals into two components and learn their joint distribution through diffusion: 1) the asymmetric unit, the smallest subset of the crystal which can generate the whole crystal through symmetry transformations; and 2) the symmetry transformations that need to be applied to each atom in the asymmetric unit. We also use a novel and interpretable representation for these transformations, enabling generalization across different crystallographic symmetry groups. We showcase the competitive performance of SymmCD on a subset of the Materials Project, obtaining diverse and valid crystals with realistic symmetries and predicted properties.
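
The decomposition described in the abstract (asymmetric unit plus symmetry transformations) can be illustrated with a minimal sketch. This is not SymmCD's representation or code; it only shows the standard crystallographic idea of expanding an asymmetric unit into a full cell. The function name `expand_asymmetric_unit`, the tolerance, and the example coordinates are all hypothetical, and the only symmetry used is inversion (as in space group P-1).

```python
import numpy as np

def expand_asymmetric_unit(frac_coords, symmetry_ops, tol=1e-6):
    """Return, for each atom of the asymmetric unit, its orbit under the
    given symmetry operations (fractional coordinates, wrapped into [0, 1))."""
    orbits = []
    for x in frac_coords:
        orbit = []
        for rot, trans in symmetry_ops:
            y = (rot @ x + trans) % 1.0
            # Compare modulo lattice translations so that 0.0 and 1.0 coincide.
            is_new = all(
                np.any(np.abs((y - s + 0.5) % 1.0 - 0.5) > tol) for s in orbit
            )
            if is_new:
                orbit.append(y)
        orbits.append(np.array(orbit))
    return orbits

# Hypothetical example: identity and inversion through the origin.
identity = (np.eye(3), np.zeros(3))
inversion = (-np.eye(3), np.zeros(3))
ops = [identity, inversion]

# One atom on a general position and one sitting on the inversion center.
asym_unit = np.array([[0.1, 0.2, 0.3], [0.0, 0.0, 0.0]])
orbits = expand_asymmetric_unit(asym_unit, ops)
multiplicities = [len(o) for o in orbits]
```

The atom on the general position is duplicated by the inversion, while the atom on the inversion center maps onto itself, which is why the symmetry information has to be tracked per atom of the asymmetric unit.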

representation, space group, symmetry, (16 more...)

2502.03638

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.05)
Africa > Togo (0.04)
North America > United States > North Carolina (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry: Energy (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

The Hidden Life of Tokens: Reducing Hallucination of Large Vision-Language Models via Visual Information Steering

Li, Zhuowei, Shi, Haizhou, Gao, Yunhe, Liu, Di, Wang, Zhenting, Chen, Yuxiao, Liu, Ting, Zhao, Long, Wang, Hao, Metaxas, Dimitris N.

Large Vision-Language Models (LVLMs) can reason effectively over both textual and visual inputs, but they tend to hallucinate syntactically coherent yet visually ungrounded content. In this paper, we investigate the internal dynamics of hallucination by examining the token logit rankings throughout the generation process, revealing three key patterns in how LVLMs process information: (1) gradual visual information loss: visually grounded tokens gradually become less favored throughout generation; (2) early excitation: semantically meaningful tokens achieve peak activation in layers earlier than the final layer; and (3) hidden genuine information: visually grounded tokens, though not ultimately selected, still retain relatively high rankings at inference. Based on these insights, we propose VISTA (Visual Information Steering with Token-logit Augmentation), a training-free inference-time intervention framework that reduces hallucination while promoting genuine information. VISTA works by combining two complementary approaches: reinforcing visual information in activation space and leveraging early layer activations to promote semantically meaningful decoding. Compared to existing methods, VISTA requires no external supervision and is applicable to various decoding strategies. Extensive experiments show that VISTA on average reduces hallucination by about 40% on the evaluated open-ended generation task, and it consistently outperforms existing methods on four benchmarks across four architectures under three decoding strategies.

hallucination, vista, visual information steering, (12 more...)

2502.03628

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Singapore (0.04)
Asia > Indonesia > Bali (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Leisure & Entertainment > Sports (1.00)
Transportation > Ground (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

A Beautiful Mind: Principles and Strategies for AI-Augmented Human Reasoning

Koon, Sean

The past century has witnessed incredible technological change. The many benefits and conveniences of technology are accompanied by new complexities and human challenges that affect work, home, social, and civic realms. There is a widening gap "between a growing complexity of our own making and a lagging development of our own capacities" (Botkin et al., 1998). Now, artificial intelligence promises to increase the rate of scientific discovery and innovation exponentially, creating new changes and potential complexities to which humans must adapt (Friedman, 2017). On the other hand, new AI tools, especially generative AI models, may help people to engage with the growing volume and complexity of information in their reasoning tasks such as decision-making and problem solving.

intelligence, reasoning, reasoning tool, (16 more...)

2503.1553

Country:

North America > United States > California (0.14)
South America > Paraguay > Asunción > Asunción (0.04)
North America > Canada (0.04)
(3 more...)

Genre:

Research Report (0.82)
Overview (0.67)

Industry:

Health & Medicine > Therapeutic Area (0.93)
Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
(2 more...)

Elucidation of the Concept of Consciousness from the Theory of Non-Human Communication Agents

Tagnin, Julian

This article focuses on elucidating the concept of consciousness from a relational and post-phenomenological theory of non-human communication agents (ANHC). Specifically, we explore the contributions of Thomas Metzinger's Self-Model Theory; Katherine Hayles's conceptualizations of non-conscious cognitive processes, centered on knowledge processing phenomena shared between biological and technical systems; and Lenore and Manuel Blum's theoretical perspective on computation, which defines consciousness as an emergent phenomenon of complex computational systems, arising from the appropriate organization of their inorganic materiality. Building on interactions with non-human cognitive agents, among other factors, the explainability of sociotechnical systems challenges the humanistic common sense of modern philosophy and science. This critical integration of various approaches ultimately questions other concepts associated with consciousness, such as autonomy, freedom, and mutual responsibility. The aim is to contribute to a necessary discussion for designing new frameworks of understanding that pave the way toward an ethical and pragmatic approach to addressing contemporary challenges in the design, regulation, and interaction with ANHC. Such frameworks, in turn, enable a more inclusive and relational understanding of agency in an interconnected world.

consciousness, metzinger, tion, (16 more...)

2502.03508

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
North America > Mexico (0.04)
(2 more...)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Identifying Large-Scale Linear Parameter Varying Systems with Dynamic Mode Decomposition Methods

Jordanou, Jean Panaioti, Camponogara, Eduardo, Gildin, Eduardo

Linear Parameter Varying (LPV) systems are a well-established class of nonlinear systems with a rich theory for stability analysis, control, and analytical response finding, among other aspects. Although there are works on data-driven identification of such systems, few of them tackle the identification of LPV models for large-scale systems. Since large-scale systems are ubiquitous in practice, this work develops a methodology for the local and global identification of large-scale LPV systems based on nonintrusive reduced-order modeling. The developed method is named DMD-LPV because it is inspired by Dynamic Mode Decomposition (DMD). To validate the proposed identification method, we identify a system described by a discretized linear diffusion equation, with the diffusion gain defined by a polynomial over a parameter. The experiments show that the proposed method can easily identify a reduced-order LPV model of a given large-scale system without the need to perform identification in the full-order dimension, and with almost no performance decay over performing a reduction, given that the model structure is well-established.
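
For orientation, the DMD building block that the abstract refers to can be sketched as follows. This is plain projected DMD on a fixed-parameter system, not the DMD-LPV method itself; the function name `dmd_reduced_operator`, the grid size, the diffusion gain `kappa`, and the chosen initial modes are all assumptions made for illustration.

```python
import numpy as np

def dmd_reduced_operator(snapshots, rank):
    """Fit a rank-`rank` linear operator A_r such that z_{k+1} ~ A_r z_k,
    where z = U_r.T @ x is the state projected onto the leading POD modes."""
    X, Y = snapshots[:, :-1], snapshots[:, 1:]
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U_r, s_r, Vh_r = U[:, :rank], s[:rank], Vh[:rank, :]
    # Dividing by s_r scales column j by 1/s_j, i.e. right-multiplies by diag(1/s).
    A_r = U_r.T @ Y @ Vh_r.T / s_r
    return A_r, U_r

# Toy full-order model: explicit discretization of 1-D linear diffusion
# with a fixed, hypothetical diffusion gain kappa (no parameter variation).
n, kappa = 20, 0.4
laplacian = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
A = np.eye(n) + kappa * laplacian

# Initial state built from three eigenmodes, so a rank-3 model is exact.
grid = np.arange(1, n + 1)
modes = (1, 4, 8)
x0 = sum(np.sin(np.pi * j * grid / (n + 1)) for j in modes)

states = [x0]
for _ in range(29):
    states.append(A @ states[-1])
snapshots = np.column_stack(states)

A_r, U_r = dmd_reduced_operator(snapshots, rank=3)
dmd_eigs = np.sort(np.linalg.eigvals(A_r).real)
true_eigs = np.sort([1 + kappa * (2 * np.cos(np.pi * j / (n + 1)) - 2) for j in modes])
```

Here the 3-by-3 operator `A_r` is identified from snapshot data alone, without forming the 20-by-20 matrix `A` during identification. An LPV extension would additionally let the reduced operator depend on a scheduling parameter, which this sketch does not attempt.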

artificial intelligence, identification, machine learning, (17 more...)

2502.02336

Country:

South America > Brazil (0.28)
North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Energy > Oil & Gas > Upstream (0.88)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Formalising Anti-Discrimination Law in Automated Decision Systems

Sargeant, Holli, Magnusson, Måns

arXiv.org Machine Learning, Feb-4-2025

Algorithmic discrimination is a critical concern as machine learning models are used in high-stakes decision-making in legally protected contexts. Although substantial research on algorithmic bias and discrimination has led to the development of fairness metrics, several critical legal issues remain unaddressed in practice. To address these gaps, we introduce a novel decision-theoretic framework grounded in anti-discrimination law of the United Kingdom, which has global influence and aligns more closely with European and Commonwealth legal systems. We propose the 'conditional estimation parity' metric, which accounts for estimation error and the underlying data-generating process, aligning with legal standards. Through a real-world example based on an algorithmic credit discrimination case, we demonstrate the practical application of our formalism and provide insights for aligning fairness metrics with legal principles. Our approach bridges the divide between machine learning fairness metrics and anti-discrimination law, offering a legally grounded framework for developing non-discriminatory automated decision systems.

artificial intelligence, data mining, machine learning, (15 more...)

2407.004

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.28)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > India (0.14)
(33 more...)

Genre: Research Report (1.00)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Government > Regional Government > Europe Government (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.92)

scBIT: Integrating Single-cell Transcriptomic Data into fMRI-based Prediction for Alzheimer's Disease Diagnosis

Huang, Yu-An, Hu, Yao, Li, Yue-Chao, Cao, Xiyue, Li, Xinyuan, Tan, Kay Chen, You, Zhu-Hong, Huang, Zhi-An

Functional MRI (fMRI) and single-cell transcriptomics are pivotal in Alzheimer's disease (AD) research, each providing unique insights into neural function and molecular mechanisms. However, integrating these complementary modalities remains largely unexplored. Here, we introduce scBIT, a novel method for enhancing AD prediction by combining fMRI with single-nucleus RNA (snRNA). scBIT leverages snRNA as an auxiliary modality, significantly improving fMRI-based prediction models and providing comprehensive interpretability. It employs a sampling strategy to segment snRNA data into cell-type-specific gene networks and utilizes a self-explainable graph neural network to extract critical subgraphs. Additionally, we use demographic and genetic similarities to pair snRNA and fMRI data across individuals, enabling robust cross-modal learning. Extensive experiments validate scBIT's effectiveness in revealing intricate brain region-gene associations and enhancing diagnostic prediction accuracy. By advancing brain imaging transcriptomics to the single-cell level, scBIT sheds new light on biomarker discovery in AD research. Experimental results show that incorporating snRNA data into the scBIT model significantly boosts accuracy, improving binary classification by 3.39% and five-class classification by 26.59%. The code was implemented in Python and has been released on GitHub (https://github.com/77YQ77/scBIT) and Zenodo (https://zenodo.org/records/11599030) with detailed instructions.

artificial intelligence, dataset, machine learning, (19 more...)

2502.0263

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
South America > Peru > Lima Department > Lima Province > Lima (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)
Health & Medicine > Health Care Technology (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Movie Weaver: Tuning-Free Multi-Concept Video Personalization with Anchored Prompts

Liang, Feng, Ma, Haoyu, He, Zecheng, Hou, Tingbo, Hou, Ji, Li, Kunpeng, Dai, Xiaoliang, Juefei-Xu, Felix, Azadi, Samaneh, Sinha, Animesh, Zhang, Peizhao, Vajda, Peter, Marculescu, Diana

Video personalization, which generates customized videos using reference images, has gained significant attention. However, prior methods typically focus on single-concept personalization, limiting broader applications that require multi-concept integration. Attempts to extend these models to multiple concepts often lead to identity blending, which results in composite characters with fused attributes from multiple sources. This challenge arises due to the lack of a mechanism to link each concept with its specific reference image. We address this with anchored prompts, which embed image anchors as unique tokens within text prompts, guiding accurate referencing during generation. Additionally, we introduce concept embeddings to encode the order of reference images. Our approach, Movie Weaver, seamlessly weaves multiple concepts (including face, body, and animal images) into one video, allowing flexible combinations in a single model. The evaluation shows that Movie Weaver outperforms existing methods for multi-concept video personalization in identity preservation and overall quality.

artificial intelligence, machine learning, natural language, (18 more...)

2502.07802

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
Asia (0.04)

Genre: Research Report (0.82)

Industry:

Media > Film (0.69)
Leisure & Entertainment (0.47)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)

Position: Multimodal Large Language Models Can Significantly Advance Scientific Reasoning

Yan, Yibo, Wang, Shen, Huo, Jiahao, Ye, Jingheng, Chu, Zhendong, Hu, Xuming, Yu, Philip S., Gomes, Carla, Selman, Bart, Wen, Qingsong

Scientific reasoning, the process through which humans apply logic, evidence, and critical thinking to explore and interpret scientific phenomena, is essential in advancing knowledge reasoning across diverse fields. However, despite significant progress, current scientific reasoning models still struggle with generalization across domains and often fall short of multimodal perception. Multimodal Large Language Models (MLLMs), which integrate text, images, and other modalities, present an exciting opportunity to overcome these limitations and enhance scientific reasoning. Therefore, this position paper argues that MLLMs can significantly advance scientific reasoning across disciplines such as mathematics, physics, chemistry, and biology. First, we propose a four-stage research roadmap of scientific reasoning capabilities, and highlight the current state of MLLM applications in scientific reasoning, noting their ability to integrate and reason over diverse data types. Second, we summarize the key challenges that remain obstacles to achieving the full potential of MLLMs. To address these challenges, we propose actionable insights and suggestions for the future. Overall, our work offers a novel perspective on MLLM integration with scientific reasoning, providing the LLM community with a valuable vision for achieving Artificial General Intelligence (AGI).

large language model, machine learning, natural language, (18 more...)

2502.02871

Country:

Asia > China > Shanghai > Shanghai (0.05)
Asia > Thailand > Bangkok > Bangkok (0.04)
Oceania > Australia > New South Wales (0.04)
(9 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Education > Educational Setting (0.46)
Education > Curriculum (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Can You Move These Over There? An LLM-based VR Mover for Supporting Object Manipulation

Wang, Xiangzhi Eric, Sin, Zackary P. T., Jia, Ye, Archer, Daniel, Fong, Wynonna H. Y., Li, Qing, Li, Chen

In our daily lives, we can naturally convey instructions for the spatial manipulation of objects using words and gestures. Transposing this form of interaction into virtual reality (VR) object manipulation can be beneficial. We propose VR Mover, an LLM-empowered solution that can understand and interpret the user's vocal instruction to support object manipulation. By simply pointing and speaking, the user can have the LLM manipulate objects without structured input. Our user study demonstrates that VR Mover enhances usability, overall experience, and performance on multi-object manipulation, while also reducing workload and arm fatigue. Users prefer the proposed natural interface for broad movements and may complementarily switch to gizmos or virtual hands for finer adjustments. These findings are believed to contribute to design implications for future LLM-based object manipulation interfaces, highlighting the potential for more intuitive and efficient user interactions in VR environments.

large language model, machine learning, natural language, (18 more...)

2502.02201

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Asia > China > Hong Kong (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(5 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (0.87)

Industry:

Health & Medicine (0.93)
Information Technology (0.67)
Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)