malt
Semantic categories of artifacts and animals reflect efficient coding
Zaslavsky, Noga, Regier, Terry, Tishby, Naftali, Kemp, Charles
It has been argued that semantic categories across languages reflect pressure for efficient communication. Recently, this idea has been cast in terms of a general information-theoretic principle of efficiency, the Information Bottleneck (IB) principle, and it has been shown that this principle accounts for the emergence and evolution of named color categories across languages, including soft structure and patterns of inconsistent naming. However, it is not yet clear to what extent this account generalizes to semantic domains other than color. Here we show that it generalizes to two qualitatively different semantic domains: names for containers, and for animals. First, we show that container naming in Dutch and French is near-optimal in the IB sense, and that IB broadly accounts for soft categories and inconsistent naming patterns in both languages. Second, we show that a hierarchy of animal categories derived from IB captures cross-linguistic tendencies in the growth of animal taxonomies. Taken together, these findings suggest that fundamental information-theoretic principles of efficient coding may shape semantic categories across languages and across domains.
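For concreteness, the IB tradeoff invoked in this abstract is standardly written as an objective over naming distributions (notation follows the color-naming work this line of research builds on; M is a meaning, W a word, U the listener's reconstruction, and β ≥ 1 the tradeoff parameter):

```latex
% Information Bottleneck objective for semantic systems:
% minimize lexicon complexity I(M;W) while retaining informativeness I(W;U).
\min_{q(w \mid m)} \; I(M;W) \;-\; \beta \, I(W;U)
```

A naming system is near-optimal in the IB sense when it lies close to the frontier traced out by this objective as β varies.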
- North America > United States > California > Alameda County > Berkeley (0.14)
- Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
- Oceania > Australia (0.04)
- (4 more...)
MALT: Mechanistic Ablation of Lossy Translation in LLMs for a Low-Resource Language: Urdu
LLMs are predominantly trained on English data, which leads to a significant drop in performance on low-resource languages. Understanding how LLMs handle these languages is crucial for improving their effectiveness. This study focuses on Urdu as a use case for exploring the challenges LLMs face in processing low-resource languages. LLMs primarily reason in English when prompted in another language, with the final layers acting as translators that convert the English response into the target language. This study finds that even for low-resource languages, the internal latent response of the LLM in English is quite coherent; however, the translation features are lossy and yield poor translations, leading to reduced performance. Mechanistically removing these translation features and using a separate translation model to translate the LLM's internal latent response significantly improves performance while also preserving the cultural nuances of the input in low-resource languages.
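A minimal sketch of the idea described in this abstract: read the model's English latent response at an intermediate layer (here via a logit-lens-style projection) and hand it to an external translator instead of the model's own final layers. The model choice, layer index, and decoding shortcut are illustrative assumptions, not the paper's exact ablation procedure.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-3.1-8B"  # assumed base model, for illustration only
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

prompt = "..."  # an Urdu prompt

with torch.no_grad():
    out = model(**tok(prompt, return_tensors="pt"), output_hidden_states=True)

# Logit-lens shortcut: project an intermediate hidden state through the final
# norm and unembedding, skipping the late layers that act as lossy translators.
mid = out.hidden_states[-8]  # assumption: layers after this point mostly translate
logits = model.lm_head(model.model.norm(mid))
english_draft = tok.decode(logits[0].argmax(-1))

# Instead of letting the lossy final layers translate, pass the English draft
# to a dedicated MT model (placeholder; any English-to-Urdu translator would do):
# urdu_answer = external_translator(english_draft, target_lang="ur")
```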
MALT: Improving Reasoning with Multi-Agent LLM Training
Motwani, Sumeet Ramesh, Smith, Chandler, Das, Rocktim Jyoti, Rybchuk, Markian, Torr, Philip H. S., Laptev, Ivan, Pizzati, Fabio, Clark, Ronald, de Witt, Christian Schroeder
Enabling effective collaboration among LLMs is a crucial step toward developing autonomous systems capable of solving complex problems. While LLMs are typically used as single-model generators, with humans critiquing and refining their outputs, the potential for jointly trained collaborative models remains largely unexplored. Despite promising results in multi-agent communication and debate settings, little progress has been made in training models to work together on tasks. In this paper, we present a first step toward "Multi-agent LLM training" (MALT) on reasoning problems. Our approach employs a sequential multi-agent setup with heterogeneous LLMs assigned specialized roles: a generator, a verifier, and a refinement model that iteratively solve problems. We propose a trajectory-expansion-based synthetic data generation process and a credit assignment strategy driven by joint, outcome-based rewards. This enables our post-training setup to use both positive and negative trajectories to autonomously improve each model's specialized capabilities as part of a joint sequential system. We evaluate our approach on MATH, GSM8k, and CQA, where MALT on Llama 3.1 8B models achieves relative improvements of 14.14%, 7.12%, and 9.40%, respectively, over the same baseline model. This demonstrates an early advance in multi-agent cooperative capabilities on mathematical and commonsense reasoning questions. More generally, our work provides a concrete direction for research on multi-agent LLM training approaches.
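A minimal sketch of the sequential generator → verifier → refiner setup and the trajectory-expansion idea from this abstract. `sample` and `is_correct` are stand-ins for an LLM call and an answer checker; the branching factors and the mean-outcome credit rule are illustrative assumptions, not the paper's exact recipe.

```python
from dataclasses import dataclass

@dataclass
class Trajectory:
    answer: str     # generator output
    critique: str   # verifier output
    final: str      # refiner output
    reward: float   # joint outcome-based reward on the final answer

def expand_trajectories(question, sample, is_correct, n_gen=4, n_ver=4, n_ref=4):
    """Branch at every role to build a tree of trajectories, then score the leaves."""
    trajectories = []
    for _ in range(n_gen):
        answer = sample(role="generator", prompt=question)
        for _ in range(n_ver):
            critique = sample(role="verifier", prompt=f"{question}\n{answer}")
            for _ in range(n_ref):
                final = sample(role="refiner", prompt=f"{question}\n{answer}\n{critique}")
                trajectories.append(
                    Trajectory(answer, critique, final, float(is_correct(final)))
                )
    return trajectories

def answer_value(trajectories, answer):
    """Credit assignment: an intermediate step inherits the mean outcome of its leaves."""
    leaves = [t.reward for t in trajectories if t.answer == answer]
    return sum(leaves) / len(leaves)
```

Positive- and negative-value trajectories can then be routed into each role's post-training data, which is the sense in which both kinds of trajectories improve the joint system.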
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- Asia > Middle East > Jordan (0.04)
MALT Powers Up Adversarial Attacks
Melamed, Odelia, Yehudai, Gilad, Shamir, Adi
Neural networks are widely known to be susceptible to adversarial perturbations (Szegedy et al. [2013]), which are typically imperceptible to humans. Many papers have shown how to construct such attacks, where adding a small perturbation to the input significantly changes the output of the model (Carlini and Wagner [2017], Papernot et al. [2017], Athalye et al. [2018]). To protect against these attacks, researchers have tried to develop more robust models using several techniques, such as adversarial training with different attacks (Madry et al. [2017], Papernot et al. [2016], Liu et al. [2023], Wang et al. [2023]). The current state-of-the-art adversarial attack, known as AutoAttack (Croce and Hein [2020b]), combines several parameter-free attacks, some targeted and some untargeted. AutoAttack currently leads the RobustBench benchmark (Croce et al. [2020]), the standard benchmark for adversarial robustness. Notably, the targeted attacks used in AutoAttack pick the adversarial target classes according to the model's confidence levels and attack only the top nine classes, even though CIFAR-100 and ImageNet have many more possible target classes. The reason for attacking only a limited number of classes, rather than all possible classes, is computational: each such attack has a significant running time. Why adversarial examples exist remains a topic of debate, specifically whether they are due to the highly non-linear landscape of neural networks or rather to their local linearity properties.
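A minimal sketch of the confidence-ranked target selection described above (AutoAttack's top-nine scheme), paired with a plain one-step targeted perturbation for illustration; the attack step itself is a generic FGSM-style move, not the attack proposed in this paper.

```python
import torch
import torch.nn.functional as F

def topk_targeted_attacks(model, x, true_label, k=9, eps=8 / 255):
    """x: a single input of shape (1, C, H, W); returns one attack per target class."""
    with torch.no_grad():
        logits = model(x)
    # Rank candidate target classes by the model's confidence, excluding the true class.
    candidates = logits[0].argsort(descending=True).tolist()
    targets = [c for c in candidates if c != true_label][:k]
    adv_examples = []
    for t in targets:
        x_adv = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), torch.tensor([t]))
        loss.backward()
        # Step *toward* the target class (descend its loss), then clamp to valid pixels.
        adv_examples.append((t, (x_adv - eps * x_adv.grad.sign()).clamp(0, 1).detach()))
    return adv_examples
```

Limiting the loop to k targets is exactly the computational trade-off the abstract points to: each targeted attack costs a full optimization, so attacking every class of CIFAR-100 or ImageNet is impractical.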
- Asia > Middle East > Israel (0.04)
- North America > United States > New York (0.04)
- Europe > Slovakia > Bratislava > Bratislava (0.04)
MALT: Multi-scale Action Learning Transformer for Online Action Detection
Yang, Zhipeng, Wang, Ruoyu, Tan, Yang, Xie, Liping
Online action detection (OAD) aims to identify ongoing actions from streaming video in real time, without access to future frames. Since these actions manifest at varying scales of granularity, from coarse to fine, projecting an entire set of action frames onto a single latent encoding can lose local information, so action features must be acquired at multiple scales. In this paper, we propose a multi-scale action learning transformer (MALT), which includes a novel recurrent decoder (used for feature fusion) that has fewer parameters and can be trained more efficiently. A hierarchical encoder with multiple encoding branches is further proposed to capture multi-scale action features. The output of each branch is incrementally fed to the subsequent branch as part of a cross-attention calculation, so output features transition from coarse to fine as the branches deepen. We also introduce an explicit frame scoring mechanism employing sparse attention, which filters irrelevant frames more efficiently without requiring an additional network. The proposed method achieved state-of-the-art performance on two benchmark datasets (THUMOS'14 and TVSeries), outperforming all existing models used for comparison, with gains of 0.2% mAP on THUMOS'14 and 0.1% mcAP on TVSeries.
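A minimal sketch of the coarse-to-fine branch chaining this abstract describes: each branch cross-attends to the preceding branch's output, so features are refined as branches deepen. Dimensions, pooling, and branch count are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class CoarseToFineEncoder(nn.Module):
    def __init__(self, dim=256, scales=(8, 4, 1)):  # temporal pooling factors, coarse to fine
        super().__init__()
        self.scales = scales
        self.attn = nn.ModuleList(
            nn.MultiheadAttention(dim, num_heads=4, batch_first=True) for _ in scales
        )

    def forward(self, frames):  # frames: (batch, time, dim)
        prev = None
        for scale, attn in zip(self.scales, self.attn):
            # Pool frames to this branch's temporal granularity.
            x = nn.functional.avg_pool1d(frames.transpose(1, 2), scale).transpose(1, 2)
            # Queries come from the current scale; keys/values from the preceding
            # branch's output, implementing the cross-attention hand-off.
            kv = x if prev is None else prev
            prev, _ = attn(x, kv, kv)
        return prev  # finest-scale features

feats = CoarseToFineEncoder()(torch.randn(2, 64, 256))  # -> (2, 64, 256)
```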
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- Information Technology > Artificial Intelligence > Vision (0.71)
- Information Technology > Sensing and Signal Processing > Image Processing (0.68)
- Information Technology > Data Science > Data Mining (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Interpretable Causal Inference for Analyzing Wearable, Sensor, and Distributional Data
Katta, Srikar, Parikh, Harsh, Rudin, Cynthia, Volfovsky, Alexander
Many modern causal questions ask how treatments affect complex outcomes that are measured using wearable devices and sensors. Current analysis approaches require summarizing these data into scalar statistics (e.g., the mean), but these summaries can be misleading. For example, disparate distributions can have the same means, variances, and other statistics. Researchers can overcome the loss of information by instead representing the data as distributions. We develop an interpretable method for distributional data analysis that ensures trustworthy and robust decision-making: Analyzing Distributional Data via Matching After Learning to Stretch (ADD MALTS). We (i) provide analytical guarantees of the correctness of our estimation strategy, (ii) demonstrate via simulation that ADD MALTS outperforms other distributional data analysis methods at estimating treatment effects, and (iii) illustrate ADD MALTS' ability to verify whether there is enough cohesion between treatment and control units within subpopulations to trustworthily estimate treatment effects. We demonstrate ADD MALTS' utility by studying the effectiveness of continuous glucose monitors in mitigating diabetes risks.
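A small illustration of the abstract's opening point, that scalar summaries can mislead: two glucose traces with identical means but very different distributions, which a distributional metric tells apart. The 1-D Wasserstein distance here is an assumed stand-in for illustration, not the ADD MALTS estimator itself.

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
stable = rng.normal(loc=120, scale=5, size=1000)     # steady glucose readings
swinging = rng.normal(loc=120, scale=40, size=1000)  # same mean, large swings

print(stable.mean(), swinging.mean())          # nearly identical means
print(wasserstein_distance(stable, swinging))  # clearly nonzero distributional distance
```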
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)
MALTS: Matching After Learning to Stretch
Parikh, Harsh, Rudin, Cynthia, Volfovsky, Alexander
We introduce a flexible framework that produces high-quality almost-exact matches for causal inference. Most prior work in matching uses ad-hoc distance metrics, often leading to poor quality matches, particularly when there are irrelevant covariates. In this work, we learn an interpretable distance metric for matching, which leads to substantially higher quality matches. The learned distance metric stretches the covariate space according to each covariate's contribution to outcome prediction: this stretching means that mismatches on important covariates carry a larger penalty than mismatches on irrelevant covariates. Our ability to learn flexible distance metrics leads to matches that are interpretable and useful for the estimation of conditional average treatment effects.
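A minimal sketch of the stretching idea in this abstract: weight each covariate by its contribution to outcome prediction, then match on the weighted distance. Using gradient-boosting feature importances as the stretch weights is an illustrative stand-in for MALTS's learned metric.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def learn_stretch(X, y):
    """Stretch weights from outcome prediction: important covariates get large weights."""
    return GradientBoostingRegressor().fit(X, y).feature_importances_

def stretched_distance(x, x_prime, w):
    """Mismatches on important covariates cost more; irrelevant ones cost ~nothing."""
    return np.sqrt(np.sum(w * (x - x_prime) ** 2))

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = 3 * X[:, 0] + rng.normal(size=200)  # only the first covariate matters
w = learn_stretch(X, y)
print(w)  # weight concentrates on covariate 0, so matches ignore the rest
```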
- North America > United States > North Carolina > Durham County > Durham (0.04)
- South America > Chile (0.04)
- North America > United States > Florida > Broward County > Fort Lauderdale (0.04)
- Health & Medicine (1.00)
- Education (0.67)
#hardtoparse: POS Tagging and Parsing the Twitterverse
Foster, Jennifer (Dublin City University) | Cetinoglu, Ozlem (Dublin City University) | Wagner, Joachim (Dublin City University) | Le Roux, Joseph (LIF - CNRS) | Hogan, Stephen (Dublin City University) | Nivre, Joakim (Uppsala University) | Hogan, Deirdre (Dublin City University) | van Genabith, Josef (Dublin City University)
We evaluate the statistical dependency parser, Malt, on a new dataset of sentences taken from tweets. We use a version of Malt which is trained on gold standard phrase structure Wall Street Journal (WSJ) trees converted to Stanford labelled dependencies. We observe a drastic drop in performance moving from our in-domain WSJ test set to the new Twitter dataset, much of which has to do with the propagation of part-of-speech tagging errors. Retraining Malt on dependency trees produced by a state-of-the-art phrase structure parser, which has itself been self-trained on Twitter material, results in a significant improvement. We analyse this improvement by examining in detail the effect of the retraining on individual dependency types.
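A minimal sketch of the dependency evaluation implied by this abstract: unlabeled and labeled attachment scores (UAS/LAS) over gold versus predicted (head, label) pairs, the standard way a performance drop like the one reported here is measured. Purely illustrative, not the paper's evaluation code.

```python
def attachment_scores(gold, predicted):
    """gold, predicted: one (head_index, dependency_label) pair per token."""
    assert len(gold) == len(predicted)
    uas = sum(g[0] == p[0] for g, p in zip(gold, predicted)) / len(gold)
    las = sum(g == p for g, p in zip(gold, predicted)) / len(gold)
    return uas, las

# A single tagging-induced attachment error lowers both scores:
gold = [(2, "nsubj"), (0, "root"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (1, "obj")]
print(attachment_scores(gold, pred))  # ≈ (0.667, 0.667)
```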
- Europe > Ireland (0.04)
- North America > United States > Pennsylvania (0.04)
- Europe > Sweden > Uppsala County > Uppsala (0.04)
- Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
- Information Technology > Services (0.47)
- Media > News (0.34)