Energy-based Autoregressive Generation for Neural Population Dynamics
Ge, Ningling, Dai, Sicheng, Zhu, Yu, Yu, Shan
Understanding brain function is a fundamental goal of neuroscience, with critical implications for therapeutic interventions and neural engineering. Computational modeling provides a quantitative framework for accelerating this understanding but faces a fundamental trade-off between computational efficiency and high-fidelity modeling. To address this limitation, we introduce a novel Energy-based Autoregressive Generation (EAG) framework in which an energy-based transformer learns temporal dynamics in latent space through strictly proper scoring rules, enabling efficient generation with realistic population-level and single-neuron spiking statistics. Evaluation on synthetic Lorenz datasets and two Neural Latents Benchmark datasets (MC_Maze and Area2_Bump) demonstrates that EAG achieves state-of-the-art generation quality with substantial gains in computational efficiency, particularly over diffusion-based methods. Beyond generation quality, conditional generation experiments demonstrate two further capabilities: generalizing to unseen behavioral contexts and improving motor brain-computer interface decoding accuracy using synthetic neural data. These results demonstrate the effectiveness of energy-based modeling of neural population dynamics, with applications in neuroscience research and neural engineering.
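To make the training idea concrete, here is a minimal sketch of autoregressive latent generation trained with the energy score, one well-known strictly proper scoring rule. The GRU backbone, layer sizes, and sampler design are illustrative assumptions, not the paper's architecture.

```python
# Sketch: autoregressive latent sampler trained with the (strictly proper)
# energy score ES = E||X - y|| - 0.5 * E||X - X'||. Illustrative only.
import torch
import torch.nn as nn

class LatentSampler(nn.Module):
    """Samples candidate next latent states given a latent history plus noise."""
    def __init__(self, latent_dim=8, hidden_dim=64, noise_dim=8):
        super().__init__()
        self.rnn = nn.GRU(latent_dim, hidden_dim, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(hidden_dim + noise_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, latent_dim))
        self.noise_dim = noise_dim

    def forward(self, history, n_samples):
        # history: (batch, time, latent_dim); use the final hidden state as context
        _, h = self.rnn(history)
        ctx = h[-1].unsqueeze(1).expand(-1, n_samples, -1)
        eps = torch.randn(ctx.shape[0], n_samples, self.noise_dim)
        return self.head(torch.cat([ctx, eps], dim=-1))  # (batch, n_samples, latent_dim)

def energy_score(samples, target):
    """Monte Carlo energy score; lower is better. The zero diagonal in cdist
    slightly shrinks the diversity term, which is acceptable for a sketch."""
    fidelity = (samples - target.unsqueeze(1)).norm(dim=-1).mean()
    diversity = torch.cdist(samples, samples).mean()
    return fidelity - 0.5 * diversity

model = LatentSampler()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
history = torch.randn(32, 10, 8)   # toy latent trajectories
target = torch.randn(32, 8)        # ground-truth next latent states
opt.zero_grad()
loss = energy_score(model(history, n_samples=16), target)
loss.backward()
opt.step()
```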
Analysis of Error Sources in LLM-based Hypothesis Search for Few-Shot Rule Induction
Parab, Aishni, Lu, Hongjing, Wu, Ying Nian, Gulwani, Sumit
Inductive reasoning enables humans to infer abstract rules from limited examples and apply them to novel situations. In this work, we compare an LLM-based hypothesis search framework with direct program generation approaches on few-shot rule induction tasks. Our findings show that hypothesis search achieves performance comparable to humans, while direct program generation falls notably behind. An error analysis reveals key bottlenecks in hypothesis generation and suggests directions for advancing program induction methods. Overall, this paper underscores the potential of LLM-based hypothesis search for modeling inductive reasoning and the challenges in building more efficient systems.
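For readers unfamiliar with the framework being compared, the sketch below illustrates the generic hypothesis-search loop: sample candidate rules from an LLM, keep those consistent with the training examples, and apply a survivor to the query. `llm_propose_rules` is a hypothetical stand-in for an actual LLM API call, not the paper's implementation.

```python
# Sketch of an LLM-based hypothesis-search loop for few-shot rule induction.
def llm_propose_rules(examples, n=20):
    """Hypothetical LLM call returning candidate rules as small Python
    expressions, e.g. "lambda x: sorted(x)". Replace with a real API client."""
    raise NotImplementedError

def consistent(rule_src, examples):
    """Check a candidate rule against every training (input, output) pair."""
    try:
        rule = eval(rule_src)  # fine for a sketch; sandbox in a real system
        return all(rule(x) == y for x, y in examples)
    except Exception:
        return False

def hypothesis_search(examples, query):
    candidates = llm_propose_rules(examples)
    survivors = [r for r in candidates if consistent(r, examples)]
    if not survivors:
        return None  # a real system would re-prompt with error feedback
    return eval(survivors[0])(query)
```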
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > Montserrat (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
- Information Technology > Artificial Intelligence > Vision (0.69)
Differential learning kinetics govern the transition from memorization to generalization during in-context learning
Transformers exhibit in-context learning (ICL): the ability to use novel information presented in the context without additional weight updates. Recent work shows that ICL emerges when models are trained on a sufficiently diverse set of tasks, and that the transition from memorization to generalization is sharp with increasing task diversity. One interpretation is that a network's limited capacity to memorize favors generalization. Here, we examine the mechanistic underpinnings of this transition using a small transformer applied to a synthetic ICL task. Using theory and experiment, we show that the sub-circuits that memorize and generalize can be viewed as largely independent, and that the transition from memorization to generalization is explained by the relative rates at which these sub-circuits learn rather than by capacity constraints. We uncover a memorization scaling law, which determines the task diversity threshold at which the network generalizes. The theory quantitatively explains a variety of other ICL-related phenomena, including the long-tailed distribution of when ICL is acquired, the bimodal behavior of solutions close to the task diversity threshold, the influence of contextual and data distributional statistics on ICL, and the transient nature of ICL.
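The synthetic setup can be pictured as in-context regression over a finite task pool whose size controls task diversity. Below is a minimal sketch of such a data generator; the dimensions and the linear task family are assumptions, not necessarily the paper's exact task.

```python
# Sketch: ICL episodes drawn from a finite task pool of size K. Small K lets
# a network memorize the pool; large K forces in-context generalization.
import numpy as np

rng = np.random.default_rng(0)
D, K, CTX = 8, 64, 16                    # input dim, task-pool size, context length
task_pool = rng.standard_normal((K, D))  # fixed weight vectors = the "tasks"

def sample_sequence(from_pool=True):
    """One ICL episode: CTX (x, y) pairs plus a query, all from one task."""
    w = task_pool[rng.integers(K)] if from_pool else rng.standard_normal(D)
    X = rng.standard_normal((CTX + 1, D))
    y = X @ w
    return X, y  # (X[:-1], y[:-1]) is context; (X[-1], y[-1]) is the query

# Memorization vs. generalization is probed by evaluating a trained model on
# episodes with from_pool=False, i.e. tasks never seen during training.
X, y = sample_sequence()
```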
Privacy-preserving federated prediction of pain intensity change based on multi-center survey data
Das, Supratim, Rafie, Mahdie, Kammer, Paula, Skou, Søren T., Grønne, Dorte T., Roos, Ewa M., Hajek, André, König, Hans-Helmut, Ullah, Md Shihab, Probul, Niklas, Baumbach, Jan, Baumbach, Linda
Background: Patient-reported survey data are used to train prognostic models aimed at improving healthcare. However, such data are typically distributed across multiple centers and, for privacy reasons, cannot easily be pooled in a single data repository, while models trained only on local data are less accurate, robust, and generalizable. We present and apply privacy-preserving federated machine learning techniques for prognostic model building, in which local survey data never leave the legally safe harbors of the medical centers. Methods: We applied centralized, local, and federated learning techniques to two healthcare datasets (GLA:D data from the five health regions of Denmark and international SHARE data from 27 countries) to predict two different health outcomes. We compared linear regression, random forest regression, and random forest classification models trained on local data with models trained on the entire data in a centralized and in a federated fashion. Results: On GLA:D data, federated linear regression (R2: 0.34, RMSE: 18.2) and federated random forest regression (R2: 0.34, RMSE: 18.3) significantly outperform their local counterparts (R2: 0.32, RMSE: 18.6 and R2: 0.30, RMSE: 18.8, respectively). The centralized models (R2: 0.34, RMSE: 18.2 and R2: 0.32, RMSE: 18.5, respectively) did not perform significantly better than the federated models. On SHARE, the federated model (accuracy: 0.78, AUROC: 0.71) and centralized model (accuracy: 0.84, AUROC: 0.66) perform significantly better than the local models (accuracy: 0.74, AUROC: 0.69). Conclusion: Federated learning enables the training of prognostic models from multi-center surveys without compromising privacy and with minimal or no loss of model performance.
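The core federated idea, that only model parameters and never patient records leave a center, can be sketched with one-shot parameter averaging for linear regression; the paper's actual training protocol and tooling are not reproduced here, and the five-center setup below is a toy stand-in.

```python
# Sketch: federated linear regression via sample-size-weighted averaging of
# locally fitted coefficients. Only coefficients cross center boundaries.
import numpy as np

def local_fit(X, y):
    """Ordinary least squares on one center's local data."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def federated_fit(centers):
    """Aggregate local coefficients, weighted by local sample size."""
    coefs = [local_fit(X, y) for X, y in centers]
    sizes = np.array([len(y) for _, y in centers], dtype=float)
    return np.average(coefs, axis=0, weights=sizes)

rng = np.random.default_rng(1)
w_true = rng.standard_normal(5)
centers = []
for _ in range(5):  # five regions, echoing the GLA:D setting
    X = rng.standard_normal((200, 5))
    centers.append((X, X @ w_true + 0.1 * rng.standard_normal(200)))
w_fed = federated_fit(centers)
```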
- Oceania > Australia (0.28)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Europe > Denmark > Southern Denmark (0.05)
- (29 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Therapeutic Area > Musculoskeletal (1.00)
- (6 more...)
Mutagenesis screen to map the functionals of parameters of Large Language Models
Hu, Yue, Hu, Kai, Zhao, Patrick X., Khan, Javed, Xu, Chengming
Large Language Models (LLMs) have significantly advanced artificial intelligence, excelling at numerous tasks. Although a model's functionality is inherently tied to its parameters, a systematic method for exploring the connections between parameters and functionality is lacking. Models sharing similar structures and parameter counts exhibit significant performance disparities across various tasks, prompting investigations into the patterns that govern their performance. We adopted a mutagenesis screen approach, inspired by methods used in biological studies, to investigate Llama2-7b and Zephyr. This technique involved mutating elements within the models' matrices to their maximum or minimum values to examine the relationship between model parameters and model functionality. Our research uncovered multiple levels of fine structure within both models. Many matrices showed a mixture of maximum and minimum mutations following mutagenesis, while others were predominantly sensitive to one type. Notably, mutations that produced phenotypes, especially those with severe outcomes, tended to cluster along axes. Additionally, the locations of maximum and minimum mutations often displayed a complementary pattern on the matrix in both models, with the Gate matrix showing a unique two-dimensional asymmetry after rearrangement. In Zephyr, certain mutations consistently produced poetic or conversational rather than descriptive outputs. These "writer" mutations grouped according to the high-frequency initial word of the output, with a marked tendency to share a row coordinate even when located in different matrices. Our findings affirm that the mutagenesis screen is an effective tool for deciphering the complexities of large language models and identifying unexpected ways to expand their potential, providing deeper insights into the foundational aspects of AI systems.
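The screening procedure itself is simple to sketch: clamp one weight-matrix entry at a time to the matrix maximum or minimum, run an evaluation, and record the resulting "phenotype". The evaluation callable and stride below are placeholders, not the paper's setup.

```python
# Sketch of a mutagenesis screen over a single weight matrix.
import torch

@torch.no_grad()
def mutagenesis_screen(matrix, evaluate, stride=64):
    """Mutate entries of `matrix` in place (restoring each afterwards) and
    record the evaluation score per mutation. `evaluate` is a hypothetical
    callable that runs the full model and returns a scalar fitness."""
    hi, lo = matrix.max().item(), matrix.min().item()
    rows, cols = matrix.shape
    results = []
    for i in range(0, rows, stride):       # stride keeps the toy scan cheap
        for j in range(0, cols, stride):
            original = matrix[i, j].item()
            for value, kind in ((hi, "max"), (lo, "min")):
                matrix[i, j] = value
                results.append((i, j, kind, evaluate()))
            matrix[i, j] = original        # restore before the next mutation
    return results

# Toy usage with a stand-in "model" and a trivial fitness function.
W = torch.randn(256, 256)
scores = mutagenesis_screen(W, evaluate=lambda: float(W.abs().mean()))
```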
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Maryland > Montgomery County > Bethesda (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
Investigating the dissemination of STEM content on social media with computational tools
Oshinowo, Oluwamayokun, Delgado, Priscila, Fay, Meredith, Luna, C. Alessandra, Dissanayaka, Anjana, Jeltuhin, Rebecca, Myers, David R.
Abstract: Social media platforms can quickly disseminate STEM content to diverse audiences, but how they distribute that content can be opaque. We used open-source machine learning methods, including clustering, regression, and sentiment analysis, to analyze over 1,000 videos and their metrics from 6 social media STEM creators. Our data provide insight into how audiences generate interest signals (likes, bookmarks, comments, shares) and how those signals correlate with views, and suggest that content from newer creators is disseminated differently. We also share insights on how to optimize dissemination, drawing both on data available exclusively to content creators and on sentiment analysis of comments.

Introduction: Social media platforms such as Instagram, TikTok, and YouTube provide a new venue to promote STEM education, inspire the next generation of diverse scientists, and share knowledge that lowers barriers to academia (1-3). Unlike many existing venues, social media is broadly accessible and not limited to those with significant resources devoted to their education. Content can be quickly disseminated to large, diverse audiences of all ages and backgrounds (4).
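As a rough illustration of the open-source analyses mentioned above, the sketch below clusters videos by engagement signals and regresses views on those signals; the synthetic metrics table is illustrative and stands in for the authors' data.

```python
# Sketch: clustering and regression on per-video engagement signals.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
# Columns: likes, bookmarks, comments, shares (one row per video).
signals = rng.lognormal(mean=3, sigma=1, size=(1000, 4))
views = signals @ np.array([30.0, 10.0, 5.0, 50.0]) + rng.normal(0, 100, 1000)

# Group videos by engagement profile, then relate signals to views.
clusters = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(signals)
reg = LinearRegression().fit(signals, views)
print("per-signal view coefficients:", reg.coef_)
```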
- Education > Curriculum > Subject-Specific Education (0.48)
- Health & Medicine > Therapeutic Area (0.46)
Protein language model rescue mutations highlight variant effects and structure in clinically relevant genes
Soylemez, Onuralp, Cordero, Pablo
Despite being self-supervised, protein language models have shown remarkable performance on fundamental biological tasks such as predicting the impact of genetic variation on protein structure and function. The effectiveness of these models across a diverse set of tasks suggests that they learn meaningful representations of the fitness landscape that can be useful for downstream clinical applications. Here, we interrogate the use of these language models for characterizing known pathogenic mutations in curated, medically actionable genes through an exhaustive search for putative compensatory mutations on each variant's genetic background. Systematic analysis of the predicted effects of these compensatory mutations reveals unappreciated structural features of proteins that are missed by structure predictors such as AlphaFold. While deep mutational scanning experiments provide an unbiased estimate of the mutational landscape, we encourage the community to generate and curate rescue-mutation experiments to inform the design of more sophisticated co-masking strategies and to leverage large language models more effectively for downstream clinical prediction tasks.
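The exhaustive second-site search can be sketched as follows; `score_sequence` is a hypothetical stand-in for a protein language model's sequence (pseudo-)log-likelihood, and a real analysis would plug in an actual model such as ESM rather than this placeholder.

```python
# Sketch: scan all single second-site substitutions on a variant background
# and rank them by how much they improve the model's sequence score.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def score_sequence(seq):
    """Hypothetical protein-language-model scorer (higher = more plausible)."""
    raise NotImplementedError

def rescue_scan(wild_type, pathogenic_pos, pathogenic_aa):
    """Score every second-site substitution on the pathogenic background."""
    variant = (wild_type[:pathogenic_pos] + pathogenic_aa
               + wild_type[pathogenic_pos + 1:])
    baseline = score_sequence(variant)
    rescues = []
    for pos in range(len(variant)):
        if pos == pathogenic_pos:
            continue
        for aa in AMINO_ACIDS:
            if aa == variant[pos]:
                continue
            mutant = variant[:pos] + aa + variant[pos + 1:]
            rescues.append((pos, aa, score_sequence(mutant) - baseline))
    # Large positive deltas mark candidate compensatory (rescue) mutations.
    return sorted(rescues, key=lambda r: -r[2])
```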
Leveraging variational autoencoders for multiple data imputation
Roskams-Hieter, Breeshey, Wells, Jude, Wade, Sara
Missing data persists as a major barrier to data analysis across numerous applications. Recently, deep generative models have been used for imputation of missing data, motivated by their ability to capture highly non-linear and complex relationships in the data. In this work, we investigate the ability of deep models, namely variational autoencoders (VAEs), to account for uncertainty in missing data through multiple imputation strategies. We find that VAEs provide poor empirical coverage of missing data, with underestimation and overconfident imputations, particularly for more extreme missing data values. To overcome this, we employ $\beta$-VAEs, which, viewed from a generalized Bayes framework, provide robustness to model misspecification. Assigning a good value of $\beta$ is critical for uncertainty calibration, and we demonstrate how this can be achieved using cross-validation. In downstream tasks, we show how multiple imputation with $\beta$-VAEs can avoid false discoveries that arise as artefacts of imputation.
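A minimal sketch of the approach, assuming a simple linear encoder/decoder rather than the authors' architecture: the KL term is weighted by $\beta$, reconstruction is computed only on observed entries, and multiple posterior samples per record yield multiple imputations whose spread reflects uncertainty.

```python
# Sketch: beta-VAE training loss with masking, plus multiple imputation.
import torch
import torch.nn as nn

class BetaVAE(nn.Module):
    def __init__(self, d=10, z=4):
        super().__init__()
        self.enc = nn.Linear(d, 2 * z)   # outputs mean and log-variance
        self.dec = nn.Linear(z, d)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        zs = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.dec(zs), mu, logvar

def beta_vae_loss(x, recon, mu, logvar, mask, beta=2.0):
    """Reconstruction only on observed entries (mask=1); beta, tuned e.g. by
    cross-validation as the abstract suggests, scales the KL term."""
    recon_err = ((recon - x) ** 2 * mask).sum()
    kl = -0.5 * (1 + logvar - mu ** 2 - logvar.exp()).sum()
    return recon_err + beta * kl

def multiple_impute(model, x, mask, m=10):
    """Draw m imputations; missing entries come from m posterior samples."""
    draws = [model(x)[0] for _ in range(m)]
    return [x * mask + d * (1 - mask) for d in draws]
```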
- Europe > United Kingdom > England > Greater London > London (0.04)
- Asia (0.04)