AITopics | Representation Of Examples

Representation Of Examples

Automatic measurement of vowel duration via structured prediction

Adi, Yossi, Keshet, Joseph, Cibelli, Emily, Gustafson, Erin, Clopper, Cynthia, Goldrick, Matthew

arXiv.org Machine Learning, Oct-26-2016

A key barrier to making phonetic studies scalable and replicable is the need to rely on subjective, manual annotation. To help meet this challenge, a machine learning algorithm was developed for automatic measurement of a widely used phonetic measure: vowel duration. Manually annotated data were used to train a model that takes as input an arbitrary-length segment of the acoustic signal containing a single vowel that is preceded and followed by consonants, and outputs the duration of the vowel. The model is based on the structured prediction framework. The input signal and a hypothesized pair of vowel onset and offset times are mapped to an abstract vector space by a set of acoustic feature functions. The learning algorithm is trained in this space to minimize the difference in expectations between predicted and manually measured vowel durations. The trained model can then automatically estimate vowel durations without phonetic or orthographic transcription. Results comparing the model to three sets of manually annotated data suggest it outperformed the current gold standard for duration measurement, an HMM-based forced aligner (which requires orthographic or phonetic transcription as an input).
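The inference step of such a structured predictor, scoring every hypothesized onset/offset pair through feature functions and keeping the best one, might be sketched as follows. This is only an illustration: the two feature functions in `phi`, the weight vector `w`, and the toy frame energies are assumptions made here, not the paper's acoustic features or learned model, and training is not shown.

```python
import numpy as np

def phi(energy, onset, offset):
    """Hypothetical feature map for a candidate vowel span [onset, offset).

    Two toy features: how much louder the span is than the frames
    before it, and how much louder it is than the frames after it.
    """
    inside = energy[onset:offset].mean()
    before = energy[:onset].mean()
    after = energy[offset:].mean()
    return np.array([inside - before, inside - after])

def predict_span(energy, w, min_len=2):
    """Brute-force argmax of w . phi(x, onset, offset) over all valid spans."""
    n = len(energy)
    best_score, best_span = -np.inf, None
    # Keep at least one frame before the onset and one after the offset,
    # mirroring the consonant-vowel-consonant input described above.
    for onset in range(1, n - min_len):
        for offset in range(onset + min_len, n):
            score = float(w @ phi(energy, onset, offset))
            if score > best_score:
                best_score, best_span = score, (onset, offset)
    return best_span

# Toy per-frame energies: quiet consonant, loud vowel, quiet consonant.
energy = np.array([0.1, 0.1, 0.1, 1.0, 1.0, 1.0, 1.0, 0.1, 0.1, 0.1])
w = np.array([1.0, 1.0])  # assumed weights; a real model would learn these
onset, offset = predict_span(energy, w)
duration_in_frames = offset - onset
```

The predicted duration is simply the number of frames between the chosen onset and offset; converting it to milliseconds would require the frame step, which is not given here.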

artificial intelligence, automatic measurement, machine learning, (17 more...)

doi: 10.1121/1.4972527

arXiv: 1610.08166

Country:

North America > United States > Illinois > Cook County > Evanston (0.04)
North America > United States > Ohio > Franklin County > Columbus (0.04)
North America > United States > Indiana (0.04)
(5 more...)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)

How computers might finally be able to identify sarcasm

#artificialintelligence, Oct-14-2016, 12:35:27 GMT

Back in 1970, the social activist Irina Dunn scribbled a slogan on the back of a toilet cubicle door at the University of Sydney. It said: "A woman needs a man like a fish needs a bicycle." The phrase went viral and eventually became a famous refrain for the growing feminist movement of the time. The phrase is also an example of sarcasm. The humor comes from the fact that a fish doesn't need a bicycle.

artificial intelligence, machine learning, sarcasm, (9 more...)

Country: Asia > India (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.43)

Lightweight Random Indexing for Polylingual Text Classification

Moreo Fernández, Alejandro, Esuli, Andrea, Sebastiani, Fabrizio

Journal of Artificial Intelligence Research, Oct-13-2016

Multilingual Text Classification (MLTC) is a text classification task in which each document is written in one of a set L of natural languages, and in which all documents must be classified under the same classification scheme, irrespective of language. There are two main variants of MLTC, namely Cross-Lingual Text Classification (CLTC) and Polylingual Text Classification (PLTC). In PLTC, which is the focus of this paper, we assume (differently from CLTC) that for each language in L there is a representative set of training documents; PLTC consists of improving the accuracy of each of the |L| monolingual classifiers by also leveraging the training documents written in the other (|L| − 1) languages. The obvious solution, which consists of generating a single polylingual classifier from the juxtaposed monolingual vector spaces, is usually infeasible, since the dimensionality of the resulting vector space is roughly |L| times that of a monolingual one, and is thus often unmanageable. As a response, the use of machine translation tools or multilingual dictionaries has been proposed. However, these resources are not always available, or are not always free to use. One machine-translation-free and dictionary-free method that, to the best of our knowledge, has never been applied to PLTC before is Random Indexing (RI). We analyse RI in terms of space and time efficiency, and propose a particular configuration of it, which we dub Lightweight Random Indexing (LRI). By running experiments on two well-known public benchmarks, Reuters RCV1/RCV2 (a comparable corpus) and JRC-Acquis (a parallel one), we show LRI to outperform (both in terms of effectiveness and efficiency) a number of previously proposed machine-translation-free and dictionary-free PLTC methods that we use as baselines.
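As a rough illustration of the general Random Indexing idea (not the paper's specific LRI configuration), each term can be assigned a sparse random vector in a fixed, reduced dimensionality, and a document is represented by the sum of the vectors of its terms, so that documents in every language land in the same space. The dimensionality `dim=64`, the two non-zero entries per term, the CRC32-based seeding, and the toy documents below are all assumptions chosen for this sketch.

```python
import zlib
import numpy as np

def index_vector(term, dim=64, nonzeros=2):
    """Sparse random index vector for a term, with entries in {-1, 0, +1}.

    The generator is seeded from the term itself, so the same term always
    maps to the same vector without storing a lookup table.
    """
    rng = np.random.default_rng(zlib.crc32(term.encode("utf-8")))
    vec = np.zeros(dim)
    positions = rng.choice(dim, size=nonzeros, replace=False)
    vec[positions] = rng.choice([-1.0, 1.0], size=nonzeros)
    return vec

def project(doc_tokens, dim=64, nonzeros=2):
    """Document vector: sum of the index vectors of its tokens (raw counts)."""
    doc_vec = np.zeros(dim)
    for tok in doc_tokens:
        doc_vec += index_vector(tok, dim, nonzeros)
    return doc_vec

# Toy documents in two languages share one 64-dimensional space.
docs = [["market", "shares", "fall"], ["mercato", "azioni", "calano"]]
X = np.vstack([project(d) for d in docs])
```

The resulting matrix `X` has one row per document and `dim` columns regardless of how many languages or distinct terms are involved, which is the property that sidesteps the |L|-fold growth of the juxtaposed vector space.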

lightweight random indexing, proceedings, representation, (12 more...)

doi: 10.1613/jair.5194

AI Access Foundation

11025

Country:

Europe > Sweden > Stockholm > Stockholm (0.04)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)
Asia > Middle East > Jordan (0.04)
(18 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.55)

Decision Trees and Political Party Classification

#artificialintelligence, Sep-29-2016, 08:00:26 GMT

Last time we investigated the k-nearest-neighbors algorithm and the underlying idea that one can learn a classification rule by copying the known classification of nearby data points. This required that we view our data as sitting inside a metric space; that is, we imposed a kind of geometric structure on our data. One glaring problem is that there may be no reasonable way to do this. While we mentioned scaling issues and provided a number of possible metrics in our primer, a more common problem is that the data simply isn't numeric. For instance, a poll of US citizens might ask the respondent to select which of a number of issues he cares most about. There could be 50 choices, and there is no reasonable way to assign these numerical values so that all are equidistant in the resulting metric space. Another issue is that the quality of the data could be bad. For instance, there may be missing values for some attributes (e.g., a respondent may neglect to answer one or more questions).

artificial intelligence, decision tree learning, machine learning, (17 more...)

Country: North America > United States (0.14)

Industry: Government (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.54)

Linearly Independent Sets in Vector Spaces induced by Kernels • /r/MachineLearning

#artificialintelligence, Jun-5-2016, 21:50:50 GMT

I hope this post is okay (if not let me know). I'm attaching a pdf which rigorously defines my question. Briefly, what I'm wondering is this: for a set of data points {x1,...,xp} in a vector space (say, R^n), under what conditions is the set {k(x1,·),...,k(xp,·)} (where k(·,·) is a kernel function) linearly independent? What conditions must the set {x1,...,xp} and the kernel function satisfy to ensure independence? If there isn't an immediate answer to this question I'll happily take recommendations for mathematical reading towards trying to answer this question.
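One standard way to probe this numerically, assuming k is a positive semidefinite kernel and independence is meant in its reproducing kernel Hilbert space, is to inspect the Gram matrix K with entries K[i, j] = k(xi, xj): the squared norm of a combination a1 k(x1,·) + ... + ap k(xp,·) equals a^T K a, so the functions are linearly independent exactly when K is nonsingular. The sketch below compares a Gaussian (RBF) kernel with a linear kernel on four sample points; the points, the `gamma` value, and the helper names are illustrative choices, not taken from the post.

```python
import numpy as np

def gram_matrix(X, kernel):
    """Gram matrix K[i, j] = kernel(X[i], X[j])."""
    p = len(X)
    return np.array([[kernel(X[i], X[j]) for j in range(p)] for i in range(p)])

def rbf(x, y, gamma=1.0):
    """Gaussian kernel; gamma is an arbitrary choice for this example."""
    return float(np.exp(-gamma * np.sum((x - y) ** 2)))

def linear(x, y):
    """Linear kernel k(x, y) = <x, y>."""
    return float(x @ y)

# Four distinct points in R^2.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

# Full rank means the p functions k(xi, .) are linearly independent.
rank_rbf = np.linalg.matrix_rank(gram_matrix(X, rbf))
rank_linear = np.linalg.matrix_rank(gram_matrix(X, linear))
```

On these points the Gaussian Gram matrix has full rank, while the linear kernel's Gram matrix cannot exceed rank 2 because the points live in R^2, so at most two of the four functions k(xi,·) can be independent in that case.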

artificial intelligence, linearly independent set, machine learning, (3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.70)

How To Extract Feature Vectors From Deep Neural Networks In Python Caffe

#artificialintelligence, Apr-26-2016, 19:35:24 GMT

Convolutional Neural Networks are great at identifying all the information that makes an image distinct. When we train a deep neural network in Caffe to classify images, we specify a multilayered neural network with different types of layers like convolution, rectified linear unit, softmax loss, and so on. The last layer is the output layer, which gives us the output tag with the corresponding confidence value. But sometimes it's useful to extract the feature vectors from various layers and use them for other purposes. Let's see how to do it in Python Caffe, shall we?

artificial intelligence, feature vector, machine learning, (7 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.82)

Creating Images by Learning Image Semantics Using Vector Space Models

Heath, Derrall (Brigham Young University) | Ventura, Dan (Brigham Young University)

AAAI Conferences, Apr-19-2016

When dealing with images and semantics, most computational systems attempt to automatically extract meaning from images. Here we attempt to go the other direction and autonomously create images that communicate concepts. We present an enhanced semantic model that is used to generate novel images that convey meaning. We employ a vector space model and a large corpus to learn vector representations of words and then train the semantic model to predict word vectors that could describe a given image. Once trained, the model autonomously guides the process of rendering images that convey particular concepts. A significant contribution is that, because of the semantic associations encoded in these word vectors, we can also render images that convey concepts on which the model was not explicitly trained. We evaluate the semantic model with an image clustering technique and demonstrate that the model is successful in creating images that communicate semantic relationships.

adjective, darci, vector, (17 more...)

Thirtieth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Alaska (0.04)
North America > United States > Utah > Utah County > Provo (0.04)
North America > United States > New York > New York County > New York City (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.73)
(2 more...)

A Generative Model of Words and Relationships from Multiple Sources

Hyland, Stephanie L. (Weill Cornell Graduate School of Medical Sciences/Memorial Sloan Kettering Cancer Center) | Karaletsos, Theofanis (Memorial Sloan Kettering Cancer Center) | Rätsch, Gunnar (Memorial Sloan Kettering Cancer Center)

AAAI Conferences, Apr-19-2016

Neural language models are a powerful tool to embed words into semantic vector spaces. However, learning such models generally relies on the availability of abundant and diverse training examples. In highly specialised domains this requirement may not be met due to difficulties in obtaining a large corpus, or the limited range of expression in average use. Such domains may encode prior knowledge about entities in a knowledge base or ontology. We propose a generative model which integrates evidence from diverse data sources, enabling the sharing of semantic information. We achieve this by generalising the concept of co-occurrence from distributional semantics to include other relationships between entities or words, which we model as affine transformations on the embedding space. We demonstrate the effectiveness of this approach by outperforming recent models on a link prediction task and demonstrating its ability to profit from partially or fully unobserved data training labels. We further demonstrate the usefulness of learning from different data sources with overlapping vocabularies.

artificial intelligence, machine learning, natural language, (19 more...)

Thirtieth AAAI Conference on Artificial Intelligence

Country: Asia (0.28)

Genre: Research Report (0.46)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Hematology (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)

Who are alike? Use BigObject feature vector to find similarities

@machinelearnbot, Apr-12-2016, 08:20:23 GMT

Cluster Analysis is a common technique for grouping a set of objects in such a way that the objects in the same group share certain attributes. It's commonly used in marketing and sales planning to define market segments. Here at BigObject we adopt a simple approach to exploring the similarities between objects. We simply calculate the "Feature Vector" based on given attributes and use the score to determine which objects are "alike." This is a simple example to show how to use BigObject to extract product features and then find similar products in your retail data.
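The general idea of scoring similarity between feature vectors can be sketched in plain Python, independent of BigObject itself. Here cosine similarity stands in for the unspecified "score", and the product names and attribute values are invented for the example; none of this reflects BigObject's actual API or scoring method.

```python
import numpy as np

# Hypothetical feature vectors: one row of attribute values per product
# (for instance, units sold in three customer segments).
products = {
    "espresso": np.array([3.0, 0.0, 1.0]),
    "latte": np.array([6.0, 0.0, 2.0]),
    "green_tea": np.array([0.0, 4.0, 1.0]),
}

def cosine(u, v):
    """Cosine similarity: 1.0 for vectors pointing in the same direction."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def most_similar(name):
    """Return the other product whose feature vector is closest in direction."""
    scores = {
        other: cosine(products[name], vec)
        for other, vec in products.items()
        if other != name
    }
    return max(scores, key=scores.get)

closest_to_espresso = most_similar("espresso")
```

Cosine similarity ignores overall magnitude, so a product that sells twice as much across the same mix of segments still counts as "alike"; a distance-based score would behave differently.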

data mining, machine learning, use bigobject feature vector, (7 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.99)
Information Technology > Data Science > Data Mining > Feature Extraction (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.66)

Maximum margin classifier working in a set of strings

Koyano, Hitoshi, Hayashida, Morihiro, Akutsu, Tatsuya

arXiv.org Machine Learning, Feb-22-2016

Numbers and numerical vectors account for a large portion of data. However, recently the amount of string data generated has increased dramatically. Consequently, classifying string data is a common problem in many fields. The most widely used approach to this problem is to convert strings into numerical vectors using string kernels and subsequently apply a support vector machine that works in a numerical vector space. However, this non-one-to-one conversion involves a loss of information and makes it impossible to evaluate, using probability theory, the generalization error of a learning machine, considering that the given data to train and test the machine are strings generated according to probability laws. In this study, we approach this classification problem by constructing a classifier that works in a set of strings. To evaluate the generalization error of such a classifier theoretically, probability theory for strings is required. Therefore, we first extend a limit theorem on the asymptotic behavior of a consensus sequence of strings, which is the counterpart of the mean of numerical vectors, as demonstrated in the probability theory on a metric space of strings developed by one of the authors and his colleague in a previous study [18]. Using the obtained result, we then demonstrate that our learning machine classifies strings in an asymptotically optimal manner. Furthermore, we demonstrate the usefulness of our machine in practical data analysis by applying it to predicting protein-protein interactions using amino acid sequences.

artificial intelligence, machine learning, maximum margin classifier

arXiv: 1406.0597

Genre: Research Report (0.69)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.53)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.53)
