AITopics | weasel

weasel

Appendix

Neural Information Processing Systems, Apr-24-2026, 16:50:13 GMT

In this section we motivate the design choices and inductive biases that we encode into our neural encoder network $e$, which is the network that is used to model the relative accuracies of the weak supervision sources $\lambda$. Recall that we model the probability of a particular sample $x \in \mathcal{X}$ having the class label $y \in \mathcal{Y} = \{1, \dots, C\}$ as

$$P_\theta(y \mid \lambda) = \mathrm{softmax}(s)_y \, P(y), \qquad s = \theta(\lambda, x)^{T} \lambda \in \mathbb{R}^{C}. \tag{4}$$

Connection to prior PGM models. We now motivate this choice by deriving a less expressive variant of it from the standard Markov Random Field (MRF) used in the related work. If we view the attention scores $\theta(\lambda, x) \in \mathbb{R}^{m}$, which assign sample-dependent accuracies to each labeling function, as sample-independent parameters $\theta_1$ and thereby drop the features from the equation, as is done in the related work [30, 32, 19, 11], we can rewrite Eq. 4 as

$$P_\theta(y \mid \lambda) \propto \exp\!\left(\theta_1^{T}\, \mathbb{1}\{\lambda = y\}\right) P(y) \;\dots$$

We can recognize $P_\theta$ as a distribution from the exponential family, and more specifically as a pairwise MRF, or factor graph, with canonical parameters $\theta = (\theta_1, \theta_2)$ and corresponding sufficient statistics, or factors, $\phi(\lambda, y) = (\phi_1(\lambda, y), \phi_2(\lambda))$, as well as the log partition function $Z_\theta$. The accuracy factors and parameters $\phi_1, \theta_1$ are the core component of this model and sometimes take the form $\phi_1(\lambda, y) = \lambda y$ in binary models, as in [30, 19, 11]. The label-independent factors $\phi_2(\lambda)$ have, as can be seen from the derivation above, no direct influence on the latent label posterior, but are often used to model labeling propensities $\mathbb{1}\{\lambda \neq 0\}$ and correlation dependencies $\mathbb{1}\{\lambda_i = \lambda_j\}$, which can be important for PGM parameter learning but are susceptible to misspecification [39, 11, 8].
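The sample-independent variant of Eq. 4 can be illustrated with a minimal sketch. It assumes that each source either votes for a class in 1..C or abstains with 0, that the accuracy scores are fixed numbers rather than outputs of the encoder network, and that the result is renormalized to sum to one. The function name `label_posterior` and the example values are hypothetical and not taken from the paper.

```python
import math

def label_posterior(votes, accuracies, prior):
    """Return P(y | lambda) proportional to exp(theta^T 1{lambda = y}) * P(y).

    votes:      m labeling-function outputs in {0, 1, ..., C}; 0 means abstain (assumption)
    accuracies: m sample-independent accuracy scores theta_j
    prior:      C class priors P(y) for y = 1..C
    """
    num_classes = len(prior)
    # s_y = theta^T 1{lambda = y}: add up the scores of the sources that voted for y.
    scores = [
        sum(theta for vote, theta in zip(votes, accuracies) if vote == y)
        for y in range(1, num_classes + 1)
    ]
    # Numerically stable softmax numerator, weighted by the class prior.
    max_score = max(scores)
    unnormalized = [math.exp(s - max_score) * p for s, p in zip(scores, prior)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Four sources: two vote for class 1, one votes for class 2, one abstains.
posterior = label_posterior(
    votes=[1, 1, 2, 0],
    accuracies=[1.0, 0.5, 2.0, 3.0],
    prior=[0.5, 0.5],
)
```

In this toy setting the abstaining source contributes to neither class score, so its large accuracy value has no effect on the posterior.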

artificial intelligence, experiment, machine learning, (19 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.66)

Appendix A Posterior Reparameterization

Neural Information Processing Systems, Feb-7-2026, 12:07:14 GMT

artificial intelligence, experiment, machine learning, (17 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

End-to-End Weak Supervision, Carnegie Mellon University

Neural Information Processing Systems, Feb-7-2026, 12:07:10 GMT

Aggregating multiple sources of weak supervision (WS) can ease the data-labeling bottleneck prevalent in many machine learning applications by replacing the tedious manual collection of ground truth labels. Current state-of-the-art approaches that do not use any labeled training data, however, require two separate modeling steps: learning a probabilistic latent variable model based on the WS sources (making assumptions that rarely hold in practice), followed by downstream model training. Importantly, the first modeling step does not consider the performance of the downstream model. To address these caveats, we propose an end-to-end approach for directly learning the downstream model by maximizing its agreement with probabilistic labels generated by reparameterizing prior probabilistic posteriors with a neural network. Our results show improved end model performance on downstream test sets over prior work, as well as improved robustness to dependencies among weak supervision sources.

artificial intelligence, downstream model, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
(2 more...)

Genre: Research Report > New Finding (0.86)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.88)

FiTs: Fine-grained Two-stage Training for Knowledge-aware Question Answering

Ye, Qichen, Cao, Bowen, Chen, Nuo, Xu, Weiyuan, Zou, Yuexian

arXiv.org Artificial Intelligence, Mar-15-2023

Knowledge-aware question answering (KAQA) requires the model to answer questions over a knowledge base, which is essential for both open-domain QA and domain-specific QA, especially when language models alone cannot provide all the knowledge needed. Despite the promising results of recent KAQA systems, which tend to integrate linguistic knowledge from pre-trained language models (PLMs) and factual knowledge from knowledge graphs (KGs) to answer complex questions, a bottleneck exists in effectively fusing the representations from PLMs and KGs because of (i) the semantic and distributional gaps between them, and (ii) the difficulties in joint reasoning over the provided knowledge from both modalities. To address these two problems, we propose a Fine-grained Two-stage training framework (FiTs) to boost KAQA system performance. The first stage, named knowledge adaptive post-training, aims at aligning representations from the PLM and the KG, thus bridging the modality gaps between them. The second stage, called knowledge-aware fine-tuning, aims to improve the model's joint reasoning ability based on the aligned representations. In detail, we fine-tune the post-trained model via two auxiliary self-supervised tasks in addition to the QA supervision. Extensive experiments demonstrate that our approach achieves state-of-the-art performance on three benchmarks in the commonsense reasoning (i.e., CommonsenseQA, OpenbookQA) and medical question answering (i.e., MedQA-USMLE) domains.

artificial intelligence, natural language, question answering, (18 more...)

arXiv.org Artificial Intelligence

2302.11799

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
(5 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)

Autonomous Swarming AI Munitions for USAF

#artificialintelligence, Sep-6-2022, 04:55:06 GMT

The US Air Force's AFLCMC Armament Directorate has awarded a one-year contract to Liteye Systems and Unmanned Experts to build Web Weasels (WW) autonomous swarming artificially intelligent munitions. WW is part of Unmanned Experts' parent program Air Commons – Swarm, which allows commanders to plan, task, and manage multiple swarming assets through a Swarm ATO and Swarm Engine. According to Liteye, squadrons of autonomous collaborative munitions operating at range, and at risk, need the training, Tactics, Techniques and Procedures (TTPs) to handle the speed-of-datalink environment that occurs in modern combat. Teamwork, communication, shared mental models, and a robust set of tried and tested strategies are needed to survive and dominate. WW aims to overlay Artificial Intelligence and Machine Learning (AI/ML)-trained algorithms onto Air Commons – Swarm's capabilities to provide pre-launch munitions with a series of TTPs in a 'Playbook' for a given mission set (i.e., SEAD).

autonomous swarming ai munition, munition, unmanned expert, (7 more...)

#artificialintelligence

Industry: Government > Military > Air Force (0.58)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.61)

End-to-End Weak Supervision

Cachay, Salva Rühling, Boecking, Benedikt, Dubrawski, Artur

arXiv.org Artificial Intelligence, Jul-5-2021

Aggregating multiple sources of weak supervision (WS) can ease the data-labeling bottleneck prevalent in many machine learning applications by replacing the tedious manual collection of ground truth labels. Current state-of-the-art approaches that do not use any labeled training data, however, require two separate modeling steps: learning a probabilistic latent variable model based on the WS sources (making assumptions that rarely hold in practice), followed by downstream model training. Importantly, the first modeling step does not consider the performance of the downstream model. To address these caveats, we propose an end-to-end approach for directly learning the downstream model by maximizing its agreement with probabilistic labels generated by reparameterizing previous probabilistic posteriors with a neural network. Our results show improved end model performance on downstream test sets over prior work, as well as improved robustness to dependencies among weak supervision sources.

dataset, experiment, weasel, (17 more...)

arXiv.org Artificial Intelligence

2107.02233

Country:

Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Plotting time: On the usage of CNNs for time series classification

Rodrigues, Nuno M., Batista, João E., Trujillo, Leonardo, Duarte, Bernardo, Giacobini, Mario, Vanneschi, Leonardo, Silva, Sara

arXiv.org Artificial Intelligence, Feb-8-2021

We present a novel approach for time series classification where we represent time series data as plot images and feed them to a simple CNN, outperforming several state-of-the-art methods. We propose a simple and highly replicable way of plotting the time series, and feed these images as input to a non-optimized shallow CNN, without any normalization or residual connections. These representations are no more than default line plots of the time series data, where the only pre-processing applied is to reduce the number of white pixels in the image. We compare our method with different state-of-the-art methods specialized in time series classification on two real-world non-public datasets, as well as on 98 datasets of the UCR dataset collection. The results show that our approach is very promising, achieving the best results on both real-world datasets and matching or beating the best state-of-the-art methods on six UCR datasets. We argue that, if a simple naive design like ours can obtain such good results, it is worth further exploring the capabilities of using image representations of time series data, along with more powerful CNNs, for classification and other related tasks.
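The idea of turning a series into a line-plot image and then reducing the white pixels can be sketched in plain Python. This is only an illustration under stated assumptions: the hypothetical `rasterize` helper is a simplistic stand-in for a plotting library, the margin imitates the blank border a default plot would have, and cropping all-white border rows and columns is one assumed way to reduce white pixels, not necessarily the authors' procedure.

```python
def rasterize(series, height=8, margin=2):
    """Draw a series as a binary image: 1 = line pixel, 0 = white background."""
    lo, hi = min(series), max(series)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant series
    width = len(series)
    image = [[0] * (width + 2 * margin) for _ in range(height + 2 * margin)]
    prev_row = None
    for col, value in enumerate(series):
        # Row 0 is the top of the plot, so larger values get smaller row indices.
        row = round((hi - value) / span * (height - 1))
        start = row if prev_row is None else prev_row
        # Fill the vertical gap to the previous point so the line stays connected.
        for r in range(min(start, row), max(start, row) + 1):
            image[margin + r][margin + col] = 1
        prev_row = row
    return image

def crop_white(image):
    """Remove border rows and columns that contain only white (0) pixels."""
    rows = [r for r, line in enumerate(image) if any(line)]
    if not rows:
        return image  # nothing drawn, nothing to crop
    cols = [c for c in range(len(image[0])) if any(line[c] for line in image)]
    return [line[cols[0]:cols[-1] + 1] for line in image[rows[0]:rows[-1] + 1]]

series = [0, 1, 4, 9, 4, 1, 0]
image = rasterize(series)    # 12 x 11 canvas including the blank margin
cropped = crop_white(image)  # blank margin removed, line pixels untouched
```

The cropped image would then be the input to an image classifier; the network itself is outside the scope of this sketch.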

dataset, inceptiontime, time series, (15 more...)

arXiv.org Artificial Intelligence

2102.04179

Country:

Europe > Portugal > Lisbon > Lisbon (0.14)
Europe > Italy > Piedmont > Turin Province > Turin (0.14)
North America > Mexico (0.04)
Asia > Japan (0.04)

Genre: Research Report > Promising Solution (0.94)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Data Science > Data Mining (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Sensing and Signal Processing > Image Processing (0.67)

Interpretable Time Series Classification using Linear Models and Multi-resolution Multi-domain Symbolic Representations

Nguyen, Thach Le, Gsponer, Severin, Ilie, Iulia, O'Reilly, Martin, Ifrim, Georgiana

arXiv.org Machine Learning, May-31-2020

The time series classification literature has expanded rapidly over the last decade, with many new classification approaches published each year. Prior research has mostly focused on improving the accuracy and efficiency of classifiers, with interpretability being somewhat neglected. This aspect of classifiers has become critical for many application domains, and the introduction of the EU GDPR legislation in 2018 is likely to further emphasize the importance of interpretable learning algorithms. Currently, state-of-the-art classification accuracy is achieved with very complex models based on large ensembles (COTE) or deep neural networks (FCN). These approaches are not efficient with regard to either time or space, are difficult to interpret, and cannot be applied to variable-length time series without pre-processing the original series to a fixed length. In this paper we propose new time series classification algorithms to address these gaps. Our approach is based on symbolic representations of time series, efficient sequence mining algorithms and linear classification models. Our linear models are as accurate as deep learning models but are more efficient regarding running time and memory, can work with variable-length time series, and can be interpreted by highlighting the discriminative symbolic features on the original time series. We show that our multi-resolution multi-domain linear classifier (mtSS-SEQL+LR) achieves a similar accuracy to the state-of-the-art COTE ensemble and to recent deep learning methods (FCN, ResNet), but uses a fraction of the time and memory required by either COTE or deep models. To further analyse the interpretability of our classifier, we present a case study on a human motion dataset collected by the authors. We release all the results, source code and data to encourage reproducibility.

artificial intelligence, machine learning, representation, (15 more...)

arXiv.org Machine Learning

2006.01667

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
North America > United States > California > Riverside County > Riverside (0.04)
Europe > United Kingdom > England (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Workflow (0.93)

Industry:

Law (0.54)
Health & Medicine (0.46)
Information Technology > Security & Privacy (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A tale of two toolkits, report the second: bake off redux. Chapter 1. dictionary based classifiers

Bagnall, Anthony, Large, James, Middlehurst, Matthew

arXiv.org Machine Learning, Nov-27-2019

Time series classification (TSC) is the problem of learning labels from time-dependent data. One class of algorithms is derived from a bag-of-words approach. A window is run along a series, each subseries is shortened and discretised to form a word, and features are then formed from the histogram of word frequencies. We call this type of approach to TSC dictionary based classification. We compare four dictionary based algorithms in the context of a wider project to update the great time series classification bakeoff, a comparative study published in 2017. We experimentally characterise the algorithms in terms of predictive performance, time complexity and space complexity. We find that we can improve on the previous best in terms of accuracy, but this comes at the cost of time and space. Alternatively, the same performance can be achieved with far less cost. We review the relative merits of the four algorithms before suggesting a path to possible improvement.
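The window, word and histogram pipeline described above can be sketched as follows. Note the swapped-in technique: this sketch shortens each window by segment averaging and discretises with fixed breakpoints, which is an assumption chosen for brevity, and the four compared algorithms may use different transforms. The function names, breakpoints and alphabet are hypothetical.

```python
from collections import Counter

def window_to_word(window, word_length, breakpoints, alphabet="abcd"):
    """Shorten a window by segment averaging, then map each mean to a letter."""
    # Assumes len(window) is a multiple of word_length; leftover values are ignored.
    seg = len(window) // word_length
    word = []
    for i in range(word_length):
        segment = window[i * seg:(i + 1) * seg]
        mean = sum(segment) / len(segment)
        # The letter index is the number of breakpoints the mean exceeds.
        word.append(alphabet[sum(mean > b for b in breakpoints)])
    return "".join(word)

def bag_of_words(series, window_size, word_length, breakpoints):
    """Slide a window along the series and count how often each word occurs."""
    words = (
        window_to_word(series[start:start + window_size], word_length, breakpoints)
        for start in range(len(series) - window_size + 1)
    )
    return Counter(words)

series = [0, 0, 1, 1, 5, 5, 9, 9, 5, 5, 1, 1]
histogram = bag_of_words(series, window_size=4, word_length=2,
                         breakpoints=[0.5, 3.0, 7.0])
# The word counts in `histogram` are the feature vector handed to a classifier.
```

Rising stretches of the series produce words such as "ab" and "bc", while falling stretches produce "dc" and "cb", so the histogram summarises which local shapes occur and how often.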

algorithm, classifier, weasel, (13 more...)

arXiv.org Machine Learning

1911.12008

Country: Europe > United Kingdom > England (0.04)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.48)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Scalable Dictionary Classifiers for Time Series Classification

Middlehurst, Matthew, Vickers, William, Bagnall, Anthony

arXiv.org Machine Learning, Jul-26-2019

Dictionary based classifiers are a family of algorithms for time series classification (TSC) that focus on capturing the frequency of pattern occurrences in a time series. The ensemble based Bag of Symbolic Fourier Approximation Symbols (BOSS) was found to be a top performing TSC algorithm in a recent evaluation, as well as the best performing dictionary based classifier. A recent addition to the category, the Word Extraction for Time Series Classification (WEASEL), claims an improvement on this performance. Both of these algorithms, however, have non-trivial scalability issues, taking a considerable amount of build time and space on larger datasets. We evaluate changes to the way BOSS chooses classifiers for its ensemble, replacing its parameter search with random selection. This change allows for the easy implementation of contracting (setting a build time limit for the classifier) and check-pointing (saving progress during the classifier's build). To differentiate between the two BOSS ensemble methods, we refer to our randomised version as RBOSS. Additionally, we test the application of common ensembling techniques to help retain the accuracy lost by dropping the BOSS parameter search. We achieve a significant reduction in build time without a significant change in accuracy on average when compared to BOSS by creating a size $n$ weighted ensemble selecting the best performers from $k$ randomly chosen parameter sets. Our experiments are conducted on datasets from the recently expanded UCR time series archive. We demonstrate the usability improvements to RBOSS with a case study using a large whale acoustics dataset for which BOSS proved infeasible.
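Selecting the best $n$ performers from $k$ randomly chosen parameter sets can be sketched generically. This is not the RBOSS implementation: the parameter names, the toy scoring function and the choice of weighting each member by its normalized score are all assumptions made for illustration, and the sketch may draw the same parameter set more than once.

```python
import random

def random_weighted_ensemble(param_space, evaluate, k, n, seed=0):
    """Sample k parameter sets at random, keep the n best, weight them by score."""
    rng = random.Random(seed)
    candidates = []
    for _ in range(k):
        # Draw one value per parameter instead of searching the full grid.
        params = {name: rng.choice(values) for name, values in param_space.items()}
        candidates.append((evaluate(params), params))
    # Keep the n highest-scoring candidates.
    best = sorted(candidates, key=lambda pair: pair[0], reverse=True)[:n]
    # Assumes strictly positive scores so the weights are well defined.
    total = sum(score for score, _ in best)
    return [(score / total, params) for score, params in best]

# Hypothetical search space and a toy stand-in for an accuracy estimate.
param_space = {"window_size": [10, 20, 30, 40], "word_length": [8, 10, 12, 14, 16]}

def toy_score(params):
    return 1.0 / (1 + abs(params["window_size"] - 20) + abs(params["word_length"] - 10))

ensemble = random_weighted_ensemble(param_space, toy_score, k=10, n=3)
```

Because each candidate is built and scored independently, a loop of this shape can be stopped after any iteration, which is what makes a build time limit or saved progress straightforward to add.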

artificial intelligence, classifier, machine learning, (17 more...)

arXiv.org Machine Learning

1907.11815

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
