AITopics | iclr 2021

Collaborating Authors

iclr 2021

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Detecting LLM-Written Peer Reviews

Rao, Vishisht, Kumar, Aounon, Lakkaraju, Himabindu, Shah, Nihar B.

arXiv.org Artificial IntelligenceMar-19-2025

Editors of academic journals and program chairs of conferences require peer reviewers to write their own reviews. However, there is growing concern about the rise of lazy reviewing practices, where reviewers use large language models (LLMs) to generate reviews instead of writing them independently. Existing tools for detecting LLM-generated content are not designed to differentiate between fully LLM-generated reviews and those merely polished by an LLM. In this work, we employ a straightforward approach to identify LLM-generated reviews - doing an indirect prompt injection via the paper PDF to ask the LLM to embed a watermark. Our focus is on presenting watermarking schemes and statistical tests that maintain a bounded family-wise error rate, when a venue evaluates multiple reviews, with a higher power as compared to standard methods like Bonferroni correction. These guarantees hold without relying on any assumptions about human-written reviews. We also consider various methods for prompt injection including font embedding and jailbreaking. We evaluate the effectiveness and various tradeoffs of these methods, including different reviewer defenses. We find a high success rate in the embedding of our watermarks in LLM-generated reviews across models. We also find that our approach is resilient to common reviewer defenses, and that the bounds on error rates in our statistical tests hold in practice while having the power to flag LLM-generated reviews, while Bonferroni correction is infeasible.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2503.15772

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

CPT: Efficient Deep Neural Network Training via Cyclic Precision

Fu, Yonggan, Guo, Han, Li, Meng, Yang, Xin, Ding, Yining, Chandra, Vikas, Lin, Yingyan Celine

arXiv.org Artificial IntelligenceJan-3-2025

Low-precision deep neural network (DNN) training has gained tremendous attention as reducing precision is one of the most effective knobs for boosting DNNs' training time/energy efficiency. In this paper, we attempt to explore low-precision training from a new perspective as inspired by recent findings in understanding DNN training: we conjecture that DNNs' precision might have a similar effect as the learning rate during DNN training, and advocate dynamic precision along the training trajectory for further boosting the time/energy efficiency of DNN training. Specifically, we propose Cyclic Precision Training (CPT) to cyclically vary the precision between two boundary values which can be identified using a simple precision range test within the first few training epochs. Extensive simulations and ablation studies on five datasets and eleven models demonstrate that CPT's effectiveness is consistent across various models/tasks (including classification and language modeling). Furthermore, through experiments and visualization we show that CPT helps to (1) converge to a wider minima with a lower generalization error and (2) reduce training variance which we believe opens up a new design knob for simultaneously improving the optimization and efficiency of DNN training. Our codes are available at: https://github.com/RICE-EIC/CPT.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2101.09868

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A Geometry-Aware Algorithm to Learn Hierarchical Embeddings in Hyperbolic Space

Wang, Zhangyu, Xu, Lantian, Kong, Zhifeng, Wang, Weilong, Peng, Xuyu, Zheng, Enyang

arXiv.org Artificial IntelligenceJul-23-2024

Hyperbolic embeddings are a class of representation learning methods that offer competitive performances when data can be abstracted as a tree-like graph. However, in practice, learning hyperbolic embeddings of hierarchical data is difficult due to the different geometry between hyperbolic space and the Euclidean space. To address such difficulties, we first categorize three kinds of illness that harm the performance of the embeddings. Then, we develop a geometry-aware algorithm using a dilation operation and a transitive closure regularization to tackle these illnesses. We empirically validate these techniques and present a theoretical analysis of the mechanism behind the dilation operation. Experiments on synthetic and real-world datasets reveal superior performances of our algorithm.

algorithm, illness, nickel & kiela, (14 more...)

arXiv.org Artificial Intelligence

2407.16641

Country:

Asia > China > Zhejiang Province > Hangzhou (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
(4 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.96)

Add feedback

Hierarchical Autoregressive Modeling for Neural Video Compression

Yang, Ruihan, Yang, Yibo, Marino, Joseph, Mandt, Stephan

arXiv.org Artificial IntelligenceDec-19-2023

Recent work by Marino et al. (2020) showed improved performance in sequential density estimation by combining masked autoregressive flows with hierarchical latent variable models. We draw a connection between such autoregressive generative models and the task of lossy video compression. Specifically, we view recent neural video compression methods (Lu et al., 2019; Yang et al., 2020b; Agustssonet al., 2020) as instances of a generalized stochastic temporal autoregressive transform, and propose avenues for enhancement based on this insight. Comprehensive evaluations on large-scale video data show improved rate-distortion performance over both state-of-the-art neural and conventional video compression methods.

compression, international conference, video compression, (14 more...)

arXiv.org Artificial Intelligence

2010.10258

Country:

North America > United States > California (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.97)
Information Technology > Artificial Intelligence > Vision (0.95)
(2 more...)

Add feedback

Clairvoyance: A Pipeline Toolkit for Medical Time Series

Jarrett, Daniel, Yoon, Jinsung, Bica, Ioana, Qian, Zhaozhi, Ercole, Ari, van der Schaar, Mihaela

arXiv.org Artificial IntelligenceOct-28-2023

Time-series learning is the bread and butter of data-driven *clinical decision support*, and the recent explosion in ML research has demonstrated great potential in various healthcare settings. At the same time, medical time-series problems in the wild are challenging due to their highly *composite* nature: They entail design choices and interactions among components that preprocess data, impute missing values, select features, issue predictions, estimate uncertainty, and interpret models. Despite exponential growth in electronic patient data, there is a remarkable gap between the potential and realized utilization of ML for clinical research and decision support. In particular, orchestrating a real-world project lifecycle poses challenges in engineering (i.e. hard to build), evaluation (i.e. hard to assess), and efficiency (i.e. hard to optimize). Designed to address these issues simultaneously, Clairvoyance proposes a unified, end-to-end, autoML-friendly pipeline that serves as a (i) software toolkit, (ii) empirical standard, and (iii) interface for optimization. Our ultimate goal lies in facilitating transparent and reproducible experimentation with complex inference workflows, providing integrated pathways for (1) personalized prediction, (2) treatment-effect estimation, and (3) information acquisition. Through illustrative examples on real-world data in outpatient, general wards, and intensive-care settings, we illustrate the applicability of the pipeline paradigm on core tasks in the healthcare journey. To the best of our knowledge, Clairvoyance is the first to demonstrate viability of a comprehensive and automatable pipeline for clinical time-series ML.

conference paper, pipeline, prediction, (16 more...)

arXiv.org Artificial Intelligence

2310.18688

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.28)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
(3 more...)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.66)
Research Report > Promising Solution (0.46)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Health Care Providers & Services (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Explaining by Imitating: Understanding Decisions by Interpretable Policy Learning

Hüyük, Alihan, Jarrett, Daniel, van der Schaar, Mihaela

arXiv.org Machine LearningOct-28-2023

Understanding human behavior from observed data is critical for transparency and accountability in decision-making. Consider real-world settings such as healthcare, in which modeling a decision-maker's policy is challenging -- with no access to underlying states, no knowledge of environment dynamics, and no allowance for live experimentation. We desire learning a data-driven representation of decision-making behavior that (1) inheres transparency by design, (2) accommodates partial observability, and (3) operates completely offline. To satisfy these key criteria, we propose a novel model-based Bayesian method for interpretable policy learning ("Interpole") that jointly estimates an agent's (possibly biased) belief-update process together with their (possibly suboptimal) belief-action mapping. Through experiments on both simulated and real-world data for the problem of Alzheimer's disease diagnosis, we illustrate the potential of our approach as an investigative device for auditing, quantifying, and understanding human decision-making behavior.

artificial intelligence, bayesian inference, machine learning, (13 more...)

arXiv.org Machine Learning

2310.19831

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Texas (0.04)
Europe > Netherlands (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.93)
Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
(3 more...)

Add feedback

Continual Invariant Risk Minimization

Alesiani, Francesco, Yu, Shujian, Niepert, Mathias

arXiv.org Artificial IntelligenceOct-21-2023

Empirical risk minimization can lead to poor generalization behavior on unseen environments if the learned model does not capture invariant feature representations. Invariant risk minimization (IRM) is a recent proposal for discovering environment-invariant representations. IRM was introduced by Arjovsky et al. (2019) and extended by Ahuja et al. (2020). IRM assumes that all environments are available to the learning system at the same time. With this work, we generalize the concept of IRM to scenarios where environments are observed sequentially. We show that existing approaches, including those designed for continual learning, fail to identify the invariant features and models across sequentially presented environments. We extend IRM under a variational Bayesian and bilevel framework, creating a general approach to continual invariant risk minimization. We also describe a strategy to solve the optimization problems using a variant of the alternating direction method of multiplier (ADMM). We show empirically using multiple datasets and with multiple sequential environments that the proposed methods outperform or is competitive with prior approaches.

iclr 2021, robustml workshop paper, shorter version, (10 more...)

arXiv.org Artificial Intelligence

2310.13977

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

S2vNTM: Semi-supervised vMF Neural Topic Modeling

Xu, Weijie, Desai, Jay, Sengamedu, Srinivasan, Jiang, Xiaoyu, Iannacci, Francis

arXiv.org Artificial IntelligenceJul-6-2023

Language model based methods are powerful techniques for text classification. However, the models have several shortcomings. In this paper, we propose Semi-Supervised vMF Neural Topic Modeling (S2vNTM) to overcome these difficulties. S2vNTM takes a few seed keywords as input for topics. S2vNTM leverages the pattern of keywords to identify potential topics, as well as optimize the quality of topics' keywords sets. Across a variety of datasets, S2vNTM outperforms existing semi-supervised topic modeling methods in classification accuracy with limited keywords provided. S2vNTM is at least twice as fast as baselines. Language Model (LM) pre-training Vaswani et al. (2017); Devlin et al. (2018) has proven to be useful in learning universal language representations. Recent language models such as Yang et al. (2019); Sun et al. (2019); Chen et al. (2022); Ding et al. (2021) have achieved amazing results in text classification. Most of these methods need enough high-quality labels to train. To make LM based methods work well when limited labels are available, few shot learning methods such as Bianchi et al. (2021); Meng et al. (2020a;b); Mekala and Shang (2020); Yu et al. (2021); Wang et al. (2021b) have been proposed. However, these methods rely on large pre-trained texts and can be biased to apply to a different environment. Topic modeling methods generate topics based on the pattern of words.

machine learning, natural language, text classification, (17 more...)

arXiv.org Artificial Intelligence

2307.04804

Country:

Asia > Middle East > Jordan (0.04)
Asia > Middle East > Iraq (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.87)
(2 more...)

Add feedback

FFPDG: Fast, Fair and Private Data Generation

Xu, Weijie, Zhao, Jinjin, Iannacci, Francis, Wang, Bo

arXiv.org Artificial IntelligenceJun-30-2023

Generative modeling has been used frequently in synthetic data generation. Fairness and privacy are two big concerns for synthetic data. Although Recent GAN [Goodfellow et al. (2014)] based methods show good results in preserving privacy, the generated data may be more biased. At the same time, these methods require high computation resources. We show the effectiveness of our method theoretically and empirically. We show that models trained on data generated by the proposed method can perform well (in inference stage) on real application scenarios. Synthetic data [Rubin (1993)] is data that is artificially created rather than being generated by actual events.

artificial intelligence, machine learning, synthetic data, (19 more...)

arXiv.org Artificial Intelligence

2307.00161

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Add feedback

Efficient Training Under Limited Resources

Zolnouri, Mahdi, Lakhmiri, Dounia, Tribes, Christophe, Sari, Eyyüb, Digabel, Sébastien Le

arXiv.org Artificial IntelligenceJan-22-2023

Training time budget and size of the dataset are among the factors affecting the performance of a Deep Neural Network (DNN). This paper shows that Neural Architecture Search (NAS), Hyper Parameters Optimization (HPO), and Data Augmentation help DNNs perform much better while these two factors are limited. However, searching for an optimal architecture and the best hyperparameter values besides a good combination of data augmentation techniques under low resources requires many experiments. We present our approach to achieving such a goal in three steps: reducing training epoch time by compressing the model while maintaining the performance compared to the original model, preventing model overfitting when the dataset is small, and performing the hyperparameter tuning. We used NOMAD, which is a blackbox optimization software based on a derivative-free algorithm to do NAS and HPO. Our work achieved an accuracy of 86.0 % on a tiny subset of Mini-ImageNet at the ICLR 2021 Hardware Aware Efficient Training (HAET) Challenge and won second place in the competition. The competition results can be found at haet2021.github.io/challenge and our source code can be found at github.com/DouniaLakhmiri/ICLR\_HAET2021.

accuracy, artificial intelligence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2301.09264

Country: North America > Canada > Quebec > Montreal (0.16)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback