AITopics | Supervised Learning

Supervised Learning

Supervised learning is the machine learning task of learning a function that maps an input to an output based on example input-output pairs. (Wikipedia)
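
As a minimal illustration of this definition, the sketch below learns a function from example input-output pairs with a 1-nearest-neighbor rule. The function name `fit_1nn` and the toy data are hypothetical and chosen only for illustration; any other supervised learner would fit the definition equally well.

```python
from math import dist


def fit_1nn(pairs):
    """Learn a predictor from (input, output) example pairs (hypothetical helper)."""
    examples = list(pairs)

    def predict(x):
        # Return the output attached to the closest training input
        # (Euclidean distance).
        _, label = min(examples, key=lambda pair: dist(pair[0], x))
        return label

    return predict


# Toy training set: inputs are 2-D points, outputs are class labels.
train = [
    ((0.0, 0.0), "low"),
    ((0.1, 0.2), "low"),
    ((1.0, 1.0), "high"),
    ((0.9, 1.1), "high"),
]
predict = fit_1nn(train)
```

Calling `predict((0.05, 0.1))` returns the label of the nearest training input, which is how the learned function maps a new input to an output.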

Generalized Zero-shot Intent Detection via Commonsense Knowledge

Siddique, A. B., Jamour, Fuad, Xu, Luxun, Hristidis, Vagelis

arXiv.org Artificial Intelligence, Feb-4-2021

Identifying user intents from natural language utterances is a crucial step in conversational systems that has been extensively studied as a supervised classification problem. However, in practice, new intents emerge after deploying an intent detection model. Thus, these models should seamlessly adapt and classify utterances with both seen and unseen intents -- unseen intents emerge after deployment and they do not have training data. The few existing models that target this setting rely heavily on the scarcely available training data and overfit to seen intents data, resulting in a bias to misclassify utterances with unseen intents into seen ones. We propose RIDE: an intent detection model that leverages commonsense knowledge in an unsupervised fashion to overcome the issue of training data scarcity. RIDE computes robust and generalizable relationship meta-features that capture deep semantic relationships between utterances and intent labels; these features are computed by considering how the concepts in an utterance are linked to those in an intent label via commonsense knowledge. Our extensive experimental analysis on three widely-used intent detection benchmarks shows that relationship meta-features significantly increase the accuracy of detecting both seen and unseen intents and that RIDE outperforms the state-of-the-art model for unseen intents.

dataset, unseen intent, utterance, (11 more...)

2102.02925

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Riverside County > Riverside (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (0.90)
(2 more...)

Time Series Classification via Topological Data Analysis

Karan, Alperen, Kaygun, Atabey

arXiv.org Machine Learning, Feb-3-2021

In this study, we use persistent homology to perform classification tasks on two publicly available multivariate time series datasets [19, 11] that include physiological data collected during stressful and non-stressful tasks. Instead of directly computing signal-specific features from sliding windows and subwindows on modalities such as electrocardiogram and wrist temperature (Figure 7), we extracted features using persistence diagrams and their statistical properties. The subwindowing method allowed us to reduce noise without incurring an extra computational cost. We then developed machine learning models and assessed their performance by varying window sizes and using different flavors of persistence diagrams. Topological Data Analysis (TDA) techniques usually work with points embedded in an affine space of large enough dimension. However, TDA techniques can still be applied to time series datasets, whether they are univariate or multivariate. One can convert a univariate time series into a finite collection of points in a finite-dimensional affine space using delay embedding methods, and then compute the persistent homology of that point collection. Since Takens' theorem implies that delay embeddings produce topologically invariant subsets on a non-chaotic dynamical system [21], one can reasonably expect persistent homology to produce features that distinguish different time series. A handful of studies address the persistent homology of delay embeddings for time series classification [23, 20, 1].

accuracy, persistence diagram, persistent homology, (13 more...)

2102.01956

Country: Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report > New Finding (0.89)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.48)
Health & Medicine > Diagnostic Medicine (0.48)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.46)
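
The delay embedding step mentioned in the abstract of "Time Series Classification via Topological Data Analysis" can be sketched as follows. This is a minimal illustration of the standard sliding-window (Takens-style) construction, not the authors' code; the function name `delay_embedding` and the parameters `dim` (embedding dimension) and `tau` (delay) are assumptions introduced here.

```python
def delay_embedding(series, dim, tau=1):
    """Map a univariate series to points in a dim-dimensional space.

    Point i is (series[i], series[i + tau], ..., series[i + (dim - 1) * tau]).
    `dim` and `tau` are illustrative parameter names, not taken from the paper.
    """
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    n_points = len(series) - (dim - 1) * tau
    if n_points <= 0:
        raise ValueError("series is too short for this dim and tau")
    return [
        tuple(series[i + j * tau] for j in range(dim))
        for i in range(n_points)
    ]


# A series of length 6 embedded in 3 dimensions with delay 2 yields 2 points.
points = delay_embedding([0, 1, 2, 3, 4, 5], dim=3, tau=2)
```

The resulting point cloud is what a persistent homology library would take as input; that step is omitted here because it depends on the library chosen.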

Fast rates in structured prediction

Cabannes, Vivien, Rudi, Alessandro, Bach, Francis

arXiv.org Artificial Intelligence, Feb-1-2021

Discrete supervised learning problems such as classification are often tackled by introducing a continuous surrogate problem akin to regression. Bounding the original error, between estimate and solution, by the surrogate error endows discrete problems with convergence rates already shown for continuous instances. Yet, current approaches do not leverage the fact that discrete problems essentially predict a discrete output, whereas continuous problems predict a continuous value. In this paper, we tackle this issue for general structured prediction problems, opening the way to "super fast" rates, that is, convergence rates for the excess risk faster than $n^{-1}$, where $n$ is the number of observations, and even exponential rates under the strongest assumptions. We first illustrate it for predictors based on nearest neighbors, generalizing rates known for binary classification to any discrete problem within the framework of structured prediction. We then consider kernel ridge regression, where we improve known rates in $n^{-1/4}$ to arbitrarily fast rates, depending on a parameter characterizing the hardness of the problem, thus making it possible, under smoothness assumptions, to bypass the curse of dimensionality.

assumption, classification, inequality, (15 more...)

2102.0076

Country:

Europe > France > Île-de-France > Paris > Paris (0.04)
North America > United States > New York (0.04)
North America > United States > Texas (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry:

Health & Medicine (0.46)
Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.81)
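
To make the surrogate idea in the abstract of "Fast rates in structured prediction" concrete, here is a minimal sketch of a nearest-neighbor predictor with a decoding step: it averages the loss over the k nearest training outputs and returns the candidate output that minimizes that average. This is an assumption-laden illustration in the spirit of the abstract, not the estimator analyzed in the paper; the names `knn_structured_predict`, `zero_one`, and the toy data are hypothetical.

```python
from math import dist


def zero_one(z, y):
    """0-1 loss between a candidate output z and an observed output y."""
    return 0.0 if z == y else 1.0


def knn_structured_predict(train, x, candidates, loss, k=3):
    """Pick the candidate minimizing the average loss over the k nearest outputs."""
    neighbors = sorted(train, key=lambda pair: dist(pair[0], x))[:k]

    def local_risk(z):
        # Uniform weights 1/k on the k nearest neighbors (an illustrative choice).
        return sum(loss(z, y) for _, y in neighbors) / len(neighbors)

    return min(candidates, key=local_risk)


train = [((0.0,), "a"), ((0.1,), "a"), ((0.2,), "b"), ((1.0,), "b"), ((1.1,), "b")]
pred_left = knn_structured_predict(train, (0.05,), ["a", "b"], zero_one, k=3)
pred_right = knn_structured_predict(train, (1.05,), ["a", "b"], zero_one, k=3)
```

With the 0-1 loss this reduces to a majority vote among neighbors; passing a different `loss` and candidate set changes the discrete output space without changing the code.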

Exact Recovery of Clusters in Finite Metric Spaces Using Oracle Queries

Bressan, Marco, Cesa-Bianchi, Nicolò, Lattanzi, Silvio, Paudice, Andrea

arXiv.org Machine Learning, Jan-31-2021

We investigate the problem of exact cluster recovery using oracle queries. Previous results show that clusters in Euclidean spaces that are convex and separated with a margin can be reconstructed exactly using only $O(\log n)$ same-cluster queries, where $n$ is the number of input points. In this work, we study this problem in the more challenging non-convex setting. We introduce a structural characterization of clusters, called $(\beta,\gamma)$-convexity, that can be applied to any finite set of points equipped with a metric (or even a semimetric, as the triangle inequality is not needed). Using $(\beta,\gamma)$-convexity, we can translate natural density properties of clusters (which include, for instance, clusters that are strongly non-convex in $R^d$) into a graph-theoretic notion of convexity. By exploiting this convexity notion, we design a deterministic algorithm that recovers $(\beta,\gamma)$-convex clusters using $O(k^2 \log n + k^2 (\frac{6}{\beta\gamma})^{dens(X)})$ same-cluster queries, where $k$ is the number of clusters and $dens(X)$ is the density dimension of the semimetric. We show that an exponential dependence on the density dimension is necessary, and we also show that, if we are allowed to make $O(k^2 + k \log n)$ additional queries to a "cluster separation" oracle, then we can recover clusters that have different and arbitrary scales, even when the scale of each cluster is unknown.

exact recovery, finite metric space, oracle query

2102.00504

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.40)
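
For readers unfamiliar with the query model in "Exact Recovery of Clusters in Finite Metric Spaces Using Oracle Queries", the sketch below shows what a same-cluster oracle is and how a naive baseline uses it. This is explicitly not the paper's algorithm: the baseline compares each point against one representative per discovered cluster and can need up to $nk$ queries, far more than the bound quoted in the abstract. The names `recover_clusters` and `same_cluster` and the toy labels are hypothetical.

```python
def recover_clusters(points, same_cluster):
    """Naive exact recovery with a same-cluster oracle.

    `same_cluster(u, v)` returns True when u and v belong to the same
    hidden cluster. Returns the clusters and the number of queries used.
    """
    clusters = []  # the first element of each cluster acts as its representative
    queries = 0
    for p in points:
        for cluster in clusters:
            queries += 1
            if same_cluster(cluster[0], p):
                cluster.append(p)
                break
        else:
            # No representative matched, so p starts a new cluster.
            clusters.append([p])
    return clusters, queries


# Hidden ground truth, visible only to the oracle.
hidden = {"a": 0, "b": 1, "c": 0, "d": 2, "e": 1}
clusters, queries = recover_clusters(
    list("abcde"), lambda u, v: hidden[u] == hidden[v]
)
```

The gap between this baseline's query count and a logarithmic dependence on $n$ is what structural assumptions on the clusters are meant to close.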

Randomized Deep Structured Prediction for Discourse-Level Processing

Widmoser, Manuel, Pacheco, Maria Leonor, Honorio, Jean, Goldwasser, Dan

arXiv.org Artificial Intelligence, Jan-25-2021

Expressive text encoders such as RNNs and Transformer Networks have been at the center of NLP models in recent work. Most of the effort has focused on sentence-level tasks, capturing the dependencies between words in a single sentence, or pairs of sentences. However, certain tasks, such as argumentation mining, require accounting for longer texts and complicated structural dependencies between them. Deep structured prediction is a general framework to combine the complementary strengths of expressive neural encoders and structured inference for highly structured domains. Nevertheless, when the need arises to go beyond sentences, most work relies on combining the output scores of independently trained classifiers. One of the main reasons for this is that constrained inference comes at a high computational cost. In this paper, we explore the use of randomized inference to alleviate this concern and show that we can efficiently leverage deep structured prediction and expressive neural encoders for a set of tasks involving complicated argumentative structures.

constraint, inference, proceedings, (15 more...)

2101.10435

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)
North America > United States > New Mexico > Santa Fe County > Santa Fe (0.04)
(20 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.92)
Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (0.69)
(2 more...)

NEMR: Network Embedding on Metric of Relation

Xie, Luodi, Shen, Hong, Ren, Jiaxin

arXiv.org Artificial Intelligence, Jan-20-2021

Network embedding maps the nodes of a given network into a low-dimensional space such that the semantic similarities among the nodes can be effectively inferred. Most existing approaches use the inner product of node embeddings to measure the similarity between nodes, so they lack the capacity to capture complex relationships among nodes. Besides, they treat paths in the network merely as auxiliary structural information when inferring node embeddings, although paths are formed with rich user information that is semantically relevant and cannot be ignored. In this paper, we propose a novel method called Network Embedding on the Metric of Relation, abbreviated as NEMR, which can learn the embeddings of nodes in a relational metric space efficiently. First, NEMR models the relationships among nodes in a metric space with deep learning methods, including variational inference that maps the relationship of nodes to a Gaussian distribution so as to capture the uncertainties. Second, NEMR considers not only the equivalence of multiple paths but also the natural order of a single path when inferring embeddings of nodes, which enables NEMR to capture the multiple relationships among nodes, since multiple paths contain rich user information, e.g., age, hobby, and profession. Experimental results on several public datasets show that NEMR outperforms the state-of-the-art methods on relevant inference tasks, including link prediction and node classification.

metric space, node, representation, (15 more...)

2101.0802

Country:

Asia > Middle East > Lebanon (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China (0.04)

Genre:

Research Report > Promising Solution (0.54)
Research Report > New Finding (0.46)

Industry: Information Technology (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.57)

A Survey on the Explainability of Supervised Machine Learning

Burkart, Nadia (Fraunhofer IOSB) | Huber, Marco F. (Fraunhofer IPA, University of Stuttgart)

Journal of Artificial Intelligence Research, Jan-19-2021

Predictions obtained by, e.g., artificial neural networks have a high accuracy, but humans often perceive the models as black boxes. Insights about the decision making are mostly opaque for humans. Understanding the decision making is of paramount importance, particularly in highly sensitive areas such as healthcare or finance. The decision making behind the black boxes needs to be more transparent, accountable, and understandable for humans. This survey paper provides essential definitions and an overview of the different principles and methodologies of explainable Supervised Machine Learning (SML). We conduct a state-of-the-art survey that reviews past and recent explainable SML approaches and classifies them according to the introduced definitions. Finally, we illustrate principles by means of an explanatory case study and discuss important future directions.

arxiv preprint arxiv, explanation, prediction, (12 more...)

doi: 10.1613/jair.1.12228

AI Access Foundation

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Portugal > Lisbon > Lisbon (0.04)
North America > United States > New York > New York County > New York City (0.04)
(7 more...)

Genre: Overview (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Banking & Finance (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
(9 more...)

Studying Catastrophic Forgetting in Neural Ranking Models

Lovon-Melgarejo, Jesus, Soulier, Laure, Pinel-Sauvagnat, Karen, Tamine, Lynda

arXiv.org Artificial Intelligence, Jan-18-2021

Several deep neural ranking models have been proposed in the recent IR literature. While their transferability to one target domain held by a dataset has been widely addressed using traditional domain adaptation strategies, the question of their cross-domain transferability is still under-studied. We study here to what extent neural ranking models catastrophically forget old knowledge acquired from previously observed domains after acquiring new knowledge, leading to a performance decrease on those domains. Our experiments show that the effectiveness of neural IR ranking models is achieved at the cost of catastrophic forgetting and that a lifelong learning strategy using a cross-domain regularizer successfully mitigates the problem. Using an explanatory approach built on a regression model, we also show the effect of domain characteristics on the rise of catastrophic forgetting. We believe that the obtained results can be useful for both theoretical and practical future work in neural IR.

dataset, neural ranking model, ranking model, (15 more...)

2101.06984

Country:

North America > United States > Maryland > Montgomery County > Gaithersburg (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)

Genre: Research Report (1.00)

Industry:

Education (0.91)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
(2 more...)

ExpFinder: An Ensemble Expert Finding Model Integrating $N$-gram Vector Space Model and $\mu$CO-HITS

Kang, Yong-Bin, Du, Hung, Forkan, Abdur Rahim Mohammad, Jayaraman, Prem Prakash, Aryani, Amir, Sellis, Timos

arXiv.org Artificial Intelligence, Jan-17-2021

Finding an expert plays a crucial role in driving successful collaborations and speeding up high-quality research development and innovations. However, the rapid growth of scientific publications and digital expertise data makes identifying the right experts a challenging problem. Existing approaches for finding experts given a topic can be categorised into information retrieval techniques based on vector space models, document language models, and graph-based models. In this paper, we propose $\textit{ExpFinder}$, a new ensemble model for expert finding that integrates a novel $N$-gram vector space model, denoted as $n$VSM, and a graph-based model, denoted as $\textit{$\mu$CO-HITS}$, which is a proposed variation of the CO-HITS algorithm. The key idea of $n$VSM is to exploit a recent inverse document frequency weighting method for $N$-gram words, and $\textit{ExpFinder}$ incorporates $n$VSM into $\textit{$\mu$CO-HITS}$ to achieve expert finding. We comprehensively evaluate $\textit{ExpFinder}$ on four different datasets from the academic domains in comparison with six different expert finding models. The evaluation results show that $\textit{ExpFinder}$ is a highly effective model for expert finding, substantially outperforming all the compared models by 19% to 160.2%.

co-hit, expfinder, nvsm, (16 more...)

2101.06821

Country:

Oceania > Australia (0.05)
South America > Brazil (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
(2 more...)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.81)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.68)
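
The notion of an $N$-gram vector space model in the ExpFinder abstract can be illustrated with a small sketch. It uses plain TF-IDF with $\mathrm{idf} = \log(N / \mathrm{df})$, which is a stand-in and not the specific $N$-gram weighting method the paper refers to; the function names `ngrams` and `tfidf_vectors` and the toy documents are hypothetical.

```python
from collections import Counter
from math import log


def ngrams(tokens, max_n=2):
    """All contiguous word n-grams of length 1..max_n, joined by spaces."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf_vectors(docs, max_n=2):
    """Sparse TF-IDF vectors over n-grams (plain idf = log(N / df), a stand-in)."""
    counts = [Counter(ngrams(doc.lower().split(), max_n)) for doc in docs]
    # Document frequency: in how many documents each n-gram occurs.
    df = Counter(gram for c in counts for gram in c)
    n_docs = len(docs)
    return [
        {gram: tf * log(n_docs / df[gram]) for gram, tf in c.items()}
        for c in counts
    ]


docs = [
    "deep learning for ranking",
    "deep learning for vision",
    "graph ranking models",
]
vectors = tfidf_vectors(docs)
```

An n-gram that occurs in a single document, such as "for ranking", receives a higher weight than one shared across documents, such as "deep learning".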

SHARKS: Smart Hacking Approaches for RisK Scanning in Internet-of-Things and Cyber-Physical Systems based on Machine Learning

Saha, Tanujay, Aaraj, Najwa, Ajjarapu, Neel, Jha, Niraj K.

arXiv.org Artificial Intelligence, Jan-7-2021

Cyber-physical systems (CPS) and Internet-of-Things (IoT) devices are increasingly being deployed across multiple functionalities, ranging from healthcare devices and wearables to critical infrastructures, e.g., nuclear power plants, autonomous vehicles, smart cities, and smart homes. These devices are inherently not secure across their comprehensive software, hardware, and network stacks, thus presenting a large attack surface that can be exploited by hackers. In this article, we present an innovative technique for detecting unknown system vulnerabilities, managing these vulnerabilities, and improving incident response when such vulnerabilities are exploited. The novelty of this approach lies in extracting intelligence from known real-world CPS/IoT attacks, representing them in the form of regular expressions, and employing machine learning (ML) techniques on this ensemble of regular expressions to generate new attack vectors and security vulnerabilities. Our results show that 10 new attack vectors and 122 new vulnerability exploits can be successfully generated that have the potential to exploit a CPS or an IoT ecosystem. The ML methodology achieves an accuracy of 97.4% and enables us to predict these attacks efficiently with an 87.2% reduction in the search space. We demonstrate the application of our method to the hacking of the in-vehicle network of a connected car. To defend against the known attacks and possible novel exploits, we discuss a defense-in-depth mechanism for various classes of attacks and the classification of data targeted by such attacks. This defense mechanism optimizes the cost of security measures based on the sensitivity of the protected resource, thus incentivizing its adoption in real-world CPS/IoT by cybersecurity practitioners.

attack dag, node, vulnerability, (12 more...)

2101.0278

Country:

Asia > India > West Bengal > Kharagpur (0.04)
Asia > Middle East > UAE (0.04)
Oceania > Australia (0.04)
(11 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Information Technology > Security & Privacy (1.00)
Energy > Power Industry > Utilities > Nuclear (0.68)
Government > Military > Cyberwarfare (0.66)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Internet of Things (1.00)
Information Technology > Communications > Networks (1.00)
(7 more...)
