AITopics | supplementary section

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Reviews: Semantic-Guided Multi-Attention Localization for Zero-Shot Learning

Neural Information Processing Systems, Jan-21-2025, 23:46:48 GMT

The problem is relevant, and the method is based on an interesting attention-based idea: look at different regions of the image for the task of zero-shot learning (ZSL). The losses used focus on (i) making each attention map peaky while keeping different maps diverse, (ii) an embedding-based softmax for better prediction, and (iii) a class-center triplet loss, which pulls features closer to their respective class centers relative to the other class centers. Line 190 mentions that the image and parts are sent to "separate backbone networks", which implies that the network parameters are not shared. If that is the case, then the method will have 3x the parameters of competing methods, i.e., a significantly higher-capacity network overall. What happens when the CNN parameters are shared? And what happens when the image-only baseline has a higher-capacity backbone (which is then also fine-tuned end to end)?
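
To make loss (iii) concrete, here is a minimal sketch of a class-center triplet loss in Python. The hinge form, the squared Euclidean distance, the margin value, and the function name are assumptions made for illustration; the reviewed paper's exact formulation may differ.

```python
def class_center_triplet_loss(feature, label, centers, margin=1.0):
    """Hinge loss that pulls a feature toward its own class center.

    feature: list of floats (one embedded image)
    label:   index of the true class
    centers: list of class-center vectors, one per class
    margin:  assumed slack between own-center and nearest-other-center distance
    """
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    # Distance to the correct class center (the "positive").
    positive = sq_dist(feature, centers[label])
    # Distance to the closest wrong class center (the hardest "negative").
    negative = min(
        sq_dist(feature, center)
        for k, center in enumerate(centers)
        if k != label
    )
    return max(0.0, margin + positive - negative)
```

The loss is zero once the feature is closer to its own center than to every other center by at least the margin, which is the "closer relative to the other class centers" behavior the review describes.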

semantic-guided multi-attention localization, supplementary section, zero-shot learning, (5 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.54)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.43)

Engineering software 2.0 by interpolating neural networks: unifying training, solving, and calibration

Park, Chanwook, Saha, Sourav, Guo, Jiachen, Xie, Xiaoyu, Mojumder, Satyajit, Bessa, Miguel A., Qian, Dong, Chen, Wei, Wagner, Gregory J., Cao, Jian, Liu, Wing Kam

arXiv.org Artificial Intelligence, Apr-22-2024

The evolution of artificial intelligence (AI) and neural network theories has revolutionized the way software is programmed, shifting from a hard-coded series of codes to a vast neural network. However, this transition in engineering software has faced challenges such as data scarcity, multi-modality of data, low model accuracy, and slow inference. Here, we propose a new network based on interpolation theories and tensor decomposition, the interpolating neural network (INN). Instead of interpolating training data, a common notion in computer science, INN interpolates interpolation points in the physical space whose coordinates and values are trainable. It can also extrapolate if the interpolation points reside outside of the range of training data and the interpolation functions have a larger support domain. INN features orders of magnitude fewer trainable parameters, faster training, a smaller memory footprint, and higher model accuracy compared to feed-forward neural networks (FFNN) or physics-informed neural networks (PINN). INN is poised to usher in Engineering Software 2.0, a unified neural network that spans various domains of space, time, parameters, and initial/boundary conditions. This has previously been computationally prohibitive due to the exponentially growing number of trainable parameters, easily exceeding the parameter size of ChatGPT, which is over 1 trillion. INN addresses this challenge by leveraging tensor decomposition and tensor product, with adaptable network architecture.
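
The parameter-count argument can be illustrated with simple arithmetic. The sketch below compares storing one trainable value per point of a full grid against a rank-R canonical (CP) tensor decomposition. The CP format, the function names, and the example sizes are assumptions for illustration, not details taken from the abstract.

```python
def full_grid_params(points_per_dim, num_dims):
    """One trainable value per grid point: grows exponentially with dimension."""
    return points_per_dim ** num_dims


def cp_decomposition_params(points_per_dim, num_dims, rank):
    """Rank-R CP format: R one-dimensional factors per input dimension."""
    return rank * points_per_dim * num_dims


if __name__ == "__main__":
    # Hypothetical sizes: 10 interpolation points per dimension, rank 5.
    for dims in (2, 4, 6, 8):
        print(dims, full_grid_params(10, dims), cp_decomposition_params(10, dims, 5))
```

The full grid grows as n^d while the decomposed form grows only linearly in d, which is the kind of saving that keeps a model spanning space, time, and parameters tractable.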

inn, interpolation point, neural network, (13 more...)

2404.10296

Country:

North America > United States > Illinois > Cook County > Evanston (0.06)
North America > United States > Texas > Dallas County > Richardson (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
Africa > Senegal > Kolda Region > Kolda (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Predicting Generalization of AI Colonoscopy Models to Unseen Data

Shor, Joel, McNeil, Carson, Intrator, Yotam, Ledsam, Joseph R, Yamano, Hiro-o, Tsurumaru, Daisuke, Kayama, Hiroki, Hamabe, Atsushi, Ando, Koji, Ota, Mitsuhiko, Ogino, Haruei, Nakase, Hiroshi, Kobayashi, Kaho, Miyo, Masaaki, Oki, Eiji, Takemasa, Ichiro, Rivlin, Ehud, Goldenberg, Roman

arXiv.org Artificial Intelligence, Mar-22-2024

Background: Generalizability of AI colonoscopy algorithms is important for wider adoption in clinical practice. However, current techniques for evaluating performance on unseen data require expensive and time-intensive labels. Methods: We use a "Masked Siamese Network" (MSN) to identify novel phenomena in unseen data and predict polyp detector performance. MSN is trained to predict masked-out regions of polyp images, without any labels. We test MSN's ability to be trained on data only from Israel and detect unseen techniques, narrow-band imaging (NBI) and chromoendoscopy (CE), in colonoscopies from Japan (354 videos, 128 hours). We also test MSN's ability to predict performance of Computer Aided Detection (CADe) of polyps on colonoscopies from both countries, even though MSN is not trained on data from Japan. Results: MSN correctly identifies NBI and CE as less similar to Israel whitelight than Japan whitelight (bootstrapped z-test, |z| > 496, p < 10^-8 for both) using the label-free Fréchet distance. MSN detects NBI with 99% accuracy, predicts CE better than our heuristic (90% vs 79% accuracy) despite being trained only on whitelight, and is the only method that is robust to noisy labels. MSN predicts CADe polyp detector performance on in-domain Israel and out-of-domain Japan colonoscopies (r=0.79, 0.37 respectively). With a few examples of Japan detector performance to train on, MSN prediction of Japan performance improves (r=0.56). Conclusion: Our technique can identify distribution shifts in clinical data and can predict CADe detector performance on unseen data, without labels. Our self-supervised approach can aid in detecting when data in practice is different from training, such as between hospitals or when data has meaningfully shifted from training. MSN has potential for application to medical image domains beyond colonoscopy.
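
A label-free Fréchet distance compares two sets of embeddings by fitting a Gaussian to each and measuring how far apart the Gaussians are. The sketch below is a simplified version that assumes diagonal covariances, in which case the squared distance reduces to the squared difference of means plus the squared difference of standard deviations, summed over dimensions. The diagonal assumption, the function name, and the toy data are illustrative and are not taken from the paper.

```python
from statistics import fmean, pstdev


def diagonal_frechet_distance_sq(emb_a, emb_b):
    """Squared Fréchet distance between Gaussians fitted to two embedding sets.

    Assumes each Gaussian has a diagonal covariance (a simplification).
    emb_a, emb_b: lists of equal-length float vectors.
    """
    total = 0.0
    for i in range(len(emb_a[0])):
        col_a = [e[i] for e in emb_a]
        col_b = [e[i] for e in emb_b]
        # Mean term: how far the two clouds are shifted apart.
        total += (fmean(col_a) - fmean(col_b)) ** 2
        # Spread term: how differently the two clouds are scaled.
        total += (pstdev(col_a) - pstdev(col_b)) ** 2
    return total


if __name__ == "__main__":
    reference = [[0.0, 0.0], [2.0, 0.0]]   # e.g. embeddings of training-like frames
    shifted = [[1.0, 0.0], [3.0, 0.0]]     # e.g. embeddings from a new site
    print(diagonal_frechet_distance_sq(reference, reference))
    print(diagonal_frechet_distance_sq(reference, shifted))
```

No labels enter the computation, only the embeddings, which is what makes a distance of this kind usable for flagging distribution shift on unlabeled data.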

msn, representation, supplementary section, (15 more...)

2403.0992

Country:

Asia > Middle East > Israel (0.66)
Asia > Japan > Hokkaidō > Hokkaidō Prefecture > Sapporo (0.05)
Asia > Japan > Kyūshū & Okinawa > Kyūshū (0.05)
North America > United States > California (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology > Colorectal Cancer (1.00)
Health & Medicine > Therapeutic Area > Gastroenterology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Data Science (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Stability-Aware Training of Neural Network Interatomic Potentials with Differentiable Boltzmann Estimators

Raja, Sanjeev, Amin, Ishan, Pedregosa, Fabian, Krishnapriyan, Aditi S.

arXiv.org Artificial Intelligence, Feb-21-2024

Neural network interatomic potentials (NNIPs) are an attractive alternative to ab-initio methods for molecular dynamics (MD) simulations. However, they can produce unstable simulations which sample unphysical states, limiting their usefulness for modeling phenomena occurring over longer timescales. To address these challenges, we present Stability-Aware Boltzmann Estimator (StABlE) Training, a multimodal training procedure which combines conventional supervised training from quantum-mechanical energies and forces with reference system observables, to produce stable and accurate NNIPs. StABlE Training iteratively runs MD simulations to seek out unstable regions, and corrects the instabilities via supervision with a reference observable. The training procedure is enabled by the Boltzmann Estimator, which allows efficient computation of gradients required to train neural networks to system observables, and can detect both global and local instabilities. We demonstrate our methodology across organic molecules, tetrapeptides, and condensed phase systems, along with using three modern NNIP architectures. In all three cases, StABlE-trained models achieve significant improvements in simulation stability and recovery of structural and dynamic observables. In some cases, StABlE-trained models outperform conventional models trained on datasets 50 times larger. As a general framework applicable across NNIP architectures and systems, StABlE Training is a powerful tool for training stable and accurate NNIPs, particularly in the absence of large reference datasets.

Molecular dynamics (MD) simulation is a staple method of computational science, enabling high-resolution spatiotemporal modeling of atomistic systems throughout biology, chemistry, and materials science [21]. Under the Born-Oppenheimer approximation, system evolution is governed by the underlying potential energy surface (PES), which is a function of the nuclear Cartesian coordinates [11]. While the atomic forces needed for MD simulation can be obtained on-the-fly via ab-initio quantum-mechanical (QM) calculations [12], the unfavorable scaling of this approach makes it prohibitively expensive for realistic system sizes and timescales [22]. There is a long history of using machine learning (ML) approaches in place of ab-initio methods to efficiently approximate the global PES [7, 6, 2, 55]. NNIPs, typically parameterized as graph neural networks [56, 33], are trained by matching energy and forces of a molecule or material from a reference dataset of QM calculations, such as Density Functional Theory (DFT) [31]. NNIPs trained on large ab-initio datasets are increasingly being used to model challenging and important chemical systems with favorable results [45, 37, 15, 57, 64, 43, 14, 3, 36, 60, 26, 19].
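
Gradients of an ensemble-averaged observable with respect to potential parameters can be obtained without differentiating through a trajectory, using a standard identity for Boltzmann averages: for an observable O that does not itself depend on θ, d⟨O⟩/dθ = -β(⟨O ∂U/∂θ⟩ - ⟨O⟩⟨∂U/∂θ⟩). The toy check below verifies this identity against a finite difference. The four-state system, the energy U = θx², the observable, and all names are hypothetical; whether the Boltzmann Estimator in the paper takes exactly this form is an assumption.

```python
import math

STATES = [0.0, 1.0, 2.0, 3.0]  # hypothetical discrete configurations
BETA = 1.0                     # assumed inverse temperature


def energy(theta, x):
    return theta * x * x       # toy potential U_theta(x)


def denergy_dtheta(x):
    return x * x               # dU/dtheta for the toy potential


def observable(x):
    return x                   # toy observable O(x), independent of theta


def boltzmann_average(theta, f):
    """Exact ensemble average of f under weights exp(-beta * U)."""
    weights = [math.exp(-BETA * energy(theta, x)) for x in STATES]
    partition = sum(weights)
    return sum(w * f(x) for w, x in zip(weights, STATES)) / partition


def covariance_gradient(theta):
    """d<O>/dtheta = -beta * (<O dU/dtheta> - <O><dU/dtheta>)."""
    mean_o = boltzmann_average(theta, observable)
    mean_du = boltzmann_average(theta, denergy_dtheta)
    mean_o_du = boltzmann_average(theta, lambda x: observable(x) * denergy_dtheta(x))
    return -BETA * (mean_o_du - mean_o * mean_du)


def finite_difference_gradient(theta, step=1e-5):
    """Central finite difference of <O> for comparison."""
    upper = boltzmann_average(theta + step, observable)
    lower = boltzmann_average(theta - step, observable)
    return (upper - lower) / (2.0 * step)
```

In a real simulation the exact averages would be replaced by averages over sampled MD frames, so the gradient only needs the observable and ∂U/∂θ evaluated on those frames.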

md simulation, simulation, stable training, (14 more...)

2402.13984

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > Colorado > Denver County > Denver (0.04)
Asia > Singapore (0.04)

Genre: Research Report (1.00)

Industry:

Energy (1.00)
Government > Regional Government (0.67)
Health & Medicine > Pharmaceuticals & Biotechnology (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

CausalTime: Realistically Generated Time-series for Benchmarking of Causal Discovery

Cheng, Yuxiao, Wang, Ziqian, Xiao, Tingxiong, Zhong, Qin, Suo, Jinli, He, Kunlun

arXiv.org Machine Learning, Oct-2-2023

Time-series causal discovery (TSCD) is a fundamental problem of machine learning. However, existing synthetic datasets cannot properly evaluate or predict the algorithms' performance on real data. This study introduces the CausalTime pipeline to generate time series that closely resemble real data and come with ground-truth causal graphs for quantitative performance evaluation. The pipeline starts from real observations in a specific scenario and produces a matching benchmark dataset. First, we harness deep neural networks along with normalizing flows to accurately capture realistic dynamics. Second, we extract hypothesized causal graphs by performing importance analysis on the neural network or by leveraging prior knowledge. Third, we derive the ground-truth causal graphs by splitting the causal model into a causal term, a residual term, and a noise term. Lastly, using the fitted network and the derived causal graph, we generate corresponding versatile time series suitable for algorithm assessment. In the experiments, we validate the fidelity of the generated data through qualitative and quantitative experiments, followed by a benchmarking of existing TSCD algorithms using these generated datasets. CausalTime offers a feasible solution to evaluating TSCD algorithms in real applications and can be generalized to a wide range of fields. For easy use of the proposed approach, we also provide a user-friendly website, hosted on www.causaltime.cc.

artificial intelligence, deep learning, machine learning, (14 more...)

2310.01753

Country:

Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > District of Columbia > Washington (0.04)
(3 more...)

Genre: Research Report (0.64)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Topic Discovery through Data Dependent and Random Projections

Ding, Weicong, Rohban, Mohammad H., Ishwar, Prakash, Saligrama, Venkatesh

arXiv.org Machine Learning, Mar-18-2013

We present algorithms for topic modeling based on the geometry of cross-document word-frequency patterns. This perspective gains significance under the so-called separability condition, which requires the existence of novel words that are unique to each topic. We present a suite of highly efficient algorithms based on data-dependent and random projections of word-frequency patterns to identify novel words and associated topics. We also discuss the statistical guarantees of the data-dependent projections method based on two mild assumptions on the prior density of the topic-document matrix. Our key insight here is that the maximum and minimum values of cross-document frequency patterns projected along any direction are associated with novel words. While our sample complexity bounds for topic recovery are similar to the state of the art, the computational complexity of our random projection scheme scales linearly with the number of documents and the number of words per document. We present several experiments on synthetic and real-world datasets to demonstrate qualitative and quantitative merits of our scheme.
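
The key insight can be sketched in a few lines: if each word is represented by its normalized cross-document frequency pattern, novel words sit at the extreme points of the point cloud, so projecting onto random directions and recording which word attains the maximum or minimum picks them out. The code below is a minimal illustration under idealized, noise-free assumptions. The function name, the number of projections, and the toy data are hypothetical, and the paper's full algorithms include steps that are not shown here.

```python
import random


def extreme_rows(rows, num_projections=50, seed=0):
    """Indices of rows that attain the max or min along random directions.

    rows: list of equal-length float vectors, one per word
          (a normalized word-frequency pattern across documents).
    """
    rng = random.Random(seed)
    dim = len(rows[0])
    indices = range(len(rows))
    found = set()
    for _ in range(num_projections):
        # Draw an isotropic random direction.
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        scores = [sum(r * d for r, d in zip(row, direction)) for row in rows]
        # Extremes of a linear projection always occur at extreme points.
        found.add(max(indices, key=scores.__getitem__))
        found.add(min(indices, key=scores.__getitem__))
    return found


if __name__ == "__main__":
    # Rows 0-2 play the role of novel words (one per topic);
    # rows 3-5 are mixtures and so lie inside the convex hull.
    words = [
        [1.0, 0.0, 0.0],
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 1.0],
        [0.5, 0.5, 0.0],
        [0.2, 0.3, 0.5],
        [0.4, 0.3, 0.3],
    ]
    print(sorted(extreme_rows(words)))
```

Each projection costs one pass over the data, which is consistent with the linear scaling in the number of documents that the abstract claims for the random projection scheme.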

machine learning, natural language, novel word, (16 more...)

1303.3664

Country: