AITopics | Banff

Collaborating Authors

Banff

Data efficiency, dimensionality reduction, and the generalized symmetric information bottleneck

arXiv.org Artificial IntelligenceFeb-2-2024

The Symmetric Information Bottleneck (SIB), an extension of the more familiar Information Bottleneck, is a dimensionality reduction technique that simultaneously compresses two random variables to preserve information between their compressed versions. We introduce the Generalized Symmetric Information Bottleneck (GSIB), which explores different functional forms of the cost of such simultaneous reduction. We then explore the dataset size requirements of such simultaneous compression. We do this by deriving bounds and root-mean-squared estimates of statistical fluctuations of the involved loss functions. We show that, in typical situations, the simultaneous GSIB compression requires qualitatively less data to achieve the same errors compared to compressing variables one at a time. We suggest that this is an example of a more general principle that simultaneous compression is more data efficient than independent compression of each of the input variables.

information, information bottleneck, loss function, (13 more...)

arXiv.org Artificial Intelligence

2309.05649

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.14)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.93)
Health & Medicine > Pharmaceuticals & Biotechnology (0.93)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.62)

Add feedback

Universal Imitation Games

Mahadevan, Sridhar

arXiv.org Artificial IntelligenceFeb-1-2024

Alan Turing proposed in 1950 a framework called an imitation game to decide if a machine could think. Using mathematics developed largely after Turing -- category theory -- we analyze a broader class of universal imitation games (UIGs), which includes static, dynamic, and evolutionary games. In static games, the participants are in a steady state. In dynamic UIGs, "learner" participants are trying to imitate "teacher" participants over the long run. In evolutionary UIGs, the participants are competing against each other in an evolutionary game, and participants can go extinct and be replaced by others with higher fitness. We use the framework of category theory -- in particular, two influential results by Yoneda -- to characterize each type of imitation game. Universal properties in categories are defined by initial and final objects. We characterize dynamic UIGs where participants are learning by inductive inference as initial algebras over well-founded sets, and contrast them with participants learning by conductive inference over the final coalgebra of non-well-founded sets. We briefly discuss the extension of our categorical framework for UIGs to imitation games on quantum computers.

arbitrary category, dinatural transformation, universal coalgebra jacob, (15 more...)

arXiv.org Artificial Intelligence

2405.0154

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.27)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.13)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(19 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Education (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.92)
Information Technology (0.87)
(2 more...)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
(13 more...)

Add feedback

Specialized Language Models with Cheap Inference from Limited Domain Data

Grangier, David, Katharopoulos, Angelos, Ablin, Pierre, Hannun, Awni

arXiv.org Artificial IntelligenceFeb-1-2024

Large language models have emerged as a versatile tool but are challenging to apply to tasks lacking large inference budgets and large in-domain training sets. This work formalizes these constraints and distinguishes four important variables: the pretraining budget (for training before the target domain is known), the specialization budget (for training after the target domain is known), the inference budget, and the in-domain training set size. Across these settings, we compare different approaches from the machine learning literature. Limited by inference cost, we find better alternatives to the standard practice of training very large vanilla transformer models. In particular, we show that hyper-networks and mixture of experts have better perplexity for large pretraining budgets, while small models trained on importance sampled datasets are attractive for large specialization budgets.

perplexity, specialization, specific perplexity specific perplexity, (13 more...)

arXiv.org Artificial Intelligence

2402.01093

Country:

North America > Canada > Ontario > Toronto (0.04)
Europe > France (0.04)
South America > Colombia > Meta Department > Villavicencio (0.04)
(12 more...)

Genre: Research Report (0.64)

Industry: Education > Curriculum > Subject-Specific Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

Graph Transformers without Positional Encodings

Garg, Ayush

arXiv.org Artificial IntelligenceJan-31-2024

Recently, Transformers for graph representation learning have become increasingly popular, achieving state-of-the-art performance on a wide-variety of datasets, either alone or in combination with message-passing graph neural networks (MP-GNNs). Infusing graph inductive-biases in the innately structure-agnostic transformer architecture in the form of structural or positional encodings (PEs) is key to achieving these impressive results. However, designing such encodings is tricky and disparate attempts have been made to engineer such encodings including Laplacian eigenvectors, relative random-walk probabilities (RRWP), spatial encodings, centrality encodings, edge encodings etc. In this work, we argue that such encodings may not be required at all, provided the attention mechanism itself incorporates information about the graph structure. We introduce Eigenformer, which uses a novel spectrum-aware attention mechanism cognizant of the Laplacian spectrum of the graph, and empirically show that it achieves performance comparable to SOTA MP-GNN architectures and Graph Transformers on a number of standard GNN benchmark datasets, even surpassing the SOTA on some datasets. We also find that our architecture is much faster to train in terms of number of epochs, presumably due to the innate graph inductive biases.

graph transformer, international conference, openreview, (12 more...)

arXiv.org Artificial Intelligence

2401.17791

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)

Genre: Research Report (0.51)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

Human-Centric Goal Reasoning with Ripple-Down Rules

Brameld, Kenji, Castro, Germán, Sammut, Claude, Roberts, Mark, Aha, David W.

arXiv.org Artificial IntelligenceJan-30-2024

ActorSim is a goal reasoning framework developed at the Naval Research Laboratory. Originally, all goal reasoning rules were hand-crafted. This work extends ActorSim with the capability of learning by demonstration, that is, when a human trainer disagrees with a decision made by the system, the trainer can take over and show the system the correct decision. The learning component uses Ripple-Down Rules (RDR) to build new decision rules to correctly handle similar cases in the future. The system is demonstrated using the RoboCup Rescue Agent Simulation, which simulates a city-wide disaster, requiring emergency services, including fire, ambulance and police, to be dispatched to different sites to evacuate civilians from dangerous situations. The RDRs are implemented in a scripting language, FrameScript, which is used to mediate between ActorSim and the agent simulator. Using Ripple-Down Rules, ActorSim can scale to an order of magnitude more goals than the previous version.

agent, distribution statement, trainer, (17 more...)

arXiv.org Artificial Intelligence

2402.10224

Country:

North America > Canada > Quebec > Montreal (0.05)
Oceania > Australia > New South Wales (0.04)
North America > United States > New Hampshire > Rockingham County > Portsmouth (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry:

Government > Military > Navy (0.69)
Government > Regional Government > North America Government > United States Government (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.69)
(2 more...)

Add feedback

GT-PCA: Effective and Interpretable Dimensionality Reduction with General Transform-Invariant Principal Component Analysis

Heinrichs, Florian

arXiv.org Artificial IntelligenceJan-28-2024

Data analysis often requires methods that are invariant with respect to specific transformations, such as rotations in case of images or shifts in case of images and time series. While principal component analysis (PCA) is a widely-used dimension reduction technique, it lacks robustness with respect to these transformations. Modern alternatives, such as autoencoders, can be invariant with respect to specific transformations but are generally not interpretable. We introduce General Transform-Invariant Principal Component Analysis (GT-PCA) as an effective and interpretable alternative to PCA and autoencoders. We propose a neural network that efficiently estimates the components and show that GT-PCA significantly outperforms alternative methods in experiments based on synthetic and real data.

autoencoder 0, gt-pca, resmse, (12 more...)

arXiv.org Artificial Intelligence

2401.15623

Country:

Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
Asia (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.81)

Add feedback

Pre-training and Diagnosing Knowledge Base Completion Models

Kocijan, Vid, Jang, Myeongjun Erik, Lukasiewicz, Thomas

arXiv.org Artificial IntelligenceJan-27-2024

In this work, we introduce and analyze an approach to knowledge transfer from one collection of facts to another without the need for entity or relation matching. The method works for both canonicalized knowledge bases and uncanonicalized or open knowledge bases, i.e., knowledge bases where more than one copy of a real-world entity or relation may exist. The main contribution is a method that can make use of large-scale pre-training on facts, which were collected from unstructured text, to improve predictions on structured data from a specific domain. The introduced method is most impactful on small datasets such as ReVerb20k, where a 6% absolute increase of mean reciprocal rank and 65% relative decrease of mean rank over the previously best method was achieved, despite not relying on large pre-trained models like Bert. To understand the obtained pre-trained models better, we then introduce a novel dataset for the analysis of pre-trained models for Open Knowledge Base Completion, called Doge (Diagnostics of Open knowledge Graph Embeddings). It consists of 6 subsets and is designed to measure multiple properties of a pre-trained model: robustness against synonyms, ability to perform deductive reasoning, presence of gender stereotypes, consistency with reverse relations, and coverage of different areas of general knowledge. Using the introduced dataset, we show that the existing OKBC models lack consistency in the presence of synonyms and inverse relations and are unable to perform deductive reasoning. Moreover, their predictions often align with gender stereotypes, which persist even when presented with counterevidence. We additionally investigate the role of pre-trained word embeddings and demonstrate that avoiding biased word embeddings is not a sufficient measure to prevent biased behavior of OKBC models.

dataset, knowledge base, relation, (12 more...)

arXiv.org Artificial Intelligence

2401.15439

Country:

Africa > South Africa (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Russia (0.14)
(23 more...)

Genre: Research Report (1.00)

Industry:

Government (0.67)
Energy (0.67)

Technology:

Information Technology > Knowledge Management > Knowledge Engineering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

The Adaptive Architectural Layout: How the Control of a Semi-Autonomous Mobile Robotic Partition was Shared to Mediate the Environmental Demands and Resources of an Open-Plan Office

Nguyen, Binh Vinh Duc, Moere, Andrew Vande

arXiv.org Artificial IntelligenceJan-25-2024

A typical open-plan office layout is unable to optimally host multiple collocated work activities, personal needs, and situational events, as its space exerts a range of environmental demands on workers in terms of maintaining their acoustic, visual or privacy comfort. As we hypothesise that these demands could be coped by optimising the environmental resources of the architectural layout, we deployed a mobile robotic partition that autonomously manoeuvres between predetermined locations. During a five-weeks in-the-wild study within a real-world open-plan office, we studied how 13 workers adopted four distinct adaptation strategies when sharing the spatiotemporal control of the robotic partition. Based on their logged and self-reported reasoning, we present six initiation regulating factors that determine the appropriateness of each adaptation strategy. This study thus contributes to how future human-building interaction could autonomously improve the experience, comfort, performance, and even the health and wellbeing of multiple workers that share the same workplace.

adaptation, environmental demand, participant, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3613904.3642465

2401.14078

Country:

North America > United States > New York > New York County > New York City (0.06)
North America > United States > Hawaii > Honolulu County > Honolulu (0.05)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
(28 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Consumer Health (1.00)
Information Technology (0.92)
Construction & Engineering (0.67)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.67)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Add feedback

Explainability-Driven Leaf Disease Classification Using Adversarial Training and Knowledge Distillation

Echim, Sebastian-Vasile, Tăiatu, Iulian-Marius, Cercel, Dumitru-Clementin, Pop, Florin

arXiv.org Artificial IntelligenceJan-23-2024

This work focuses on plant leaf disease classification and explores three crucial aspects: adversarial training, model explainability, and model compression. The models' robustness against adversarial attacks is enhanced through adversarial training, ensuring accurate classification even in the presence of threats. Leveraging explainability techniques, we gain insights into the model's decision-making process, improving trust and transparency. Additionally, we explore model compression techniques to optimize computational efficiency while maintaining classification performance. Through our experiments, we determine that on a benchmark dataset, the robustness can be the price of the classification accuracy with performance reductions of 3%-20% for regular tests and gains of 50%-70% for adversarial attack tests. We also demonstrate that a student model can be 15-25 times more computationally efficient for a slight performance reduction, distilling the knowledge of more complex models.

classification, distillation, knowledge distillation, (14 more...)

arXiv.org Artificial Intelligence

2401.00334

Country:

Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.05)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(7 more...)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (0.71)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Provably Scalable Black-Box Variational Inference with Structured Variational Families

Ko, Joohwan, Kim, Kyurae, Kim, Woo Chang, Gardner, Jacob R.

arXiv.org Artificial IntelligenceJan-19-2024

Variational families with full-rank covariance approximations are known not to work well in black-box variational inference (BBVI), both empirically and theoretically. In fact, recent computational complexity results for BBVI have established that full-rank variational families scale poorly with the dimensionality of the problem compared to e.g. mean field families. This is particularly critical to hierarchical Bayesian models with local variables; their dimensionality increases with the size of the datasets. Consequently, one gets an iteration complexity with an explicit $\mathcal{O}(N^2)$ dependence on the dataset size $N$. In this paper, we explore a theoretical middle ground between mean-field variational families and full-rank families: structured variational families. We rigorously prove that certain scale matrix structures can achieve a better iteration complexity of $\mathcal{O}(N)$, implying better scaling with respect to $N$. We empirically verify our theoretical results on large-scale hierarchical models.

bbvi, complexity, inference, (15 more...)

arXiv.org Artificial Intelligence

2401.10989

Country:

Asia > Middle East > Jordan (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Pennsylvania (0.04)
(8 more...)

Genre: Research Report > New Finding (0.46)

Industry: Transportation > Air (0.61)

Add feedback