unsupervised algorithm
Extracting Abstraction Dimensions by Identifying Syntax Pattern from Texts
Zhou, Jian, Li, Jiazheng, Zhuge, Sirui, Zhuge, Hai
This paper proposes an approach to automatically discovering the subject, action, object and adverbial dimensions from texts, so that texts can be operated on efficiently and queried in natural language. The high quality of trees guarantees that all subjects, actions, objects and adverbials and their subclass relations within texts can be represented. The independence of trees ensures that there is no redundant representation between trees. The expressiveness of trees ensures that the majority of sentences can be accessed from each tree and the rest of sentences can be accessed from at least one tree so that the tree-based search mechanism can support querying in natural language. Experiments show that the average precision, recall and F1-score of the abstraction trees constructed by the subclass relations of subject, action, object and adverbial are all greater than 80%. The application of the proposed approach to supporting query in natural language demonstrates that different types of question patterns for querying subject or object have high coverage of texts, and that searching multiple trees on subject, action, object and adverbial according to the question pattern can quickly reduce the search space to locate target sentences, which supports precise operation on texts.
- Oceania > Australia (0.14)
- Asia > China > Beijing > Beijing (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Research Report (1.00)
- Personal > Honors (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.87)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Outlier detection using flexible categorisation and interrogative agendas
Boersma, Marcel, Manoorkar, Krishna, Palmigiano, Alessandra, Panettiere, Mattia, Tzimoulis, Apostolos, Wijnberg, Nachoem
Categorization is one of the basic tasks in machine learning and data analysis. Building on formal concept analysis (FCA), the starting point of the present work is that different ways to categorize a given set of objects exist, which depend on the choice of the sets of features used to classify them, and different such sets of features may yield better or worse categorizations, relative to the task at hand. In turn, the (a priori) choice of a particular set of features over another might be subjective and express a certain epistemic stance (e.g. interests, relevance, preferences) of an agent or a group of agents, namely, their interrogative agenda. In the present paper, we represent interrogative agendas as sets of features, and explore and compare different ways to categorize objects w.r.t. different sets of features (agendas). We first develop a simple unsupervised FCA-based algorithm for outlier detection which uses categorizations arising from different agendas. We then present a supervised meta-learning algorithm to learn suitable (fuzzy) agendas for categorization as sets of features with different weights or masses. We combine this meta-learning algorithm with the unsupervised outlier detection algorithm to obtain a supervised outlier detection algorithm. We show that these algorithms perform on par with commonly used outlier detection algorithms on standard outlier detection datasets. These algorithms provide both local and global explanations of their results.
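To make the idea of agenda-dependent categorization concrete, here is a minimal sketch in Python. It is not the authors' algorithm, which the abstract does not specify. It assumes a binary object-feature context, treats each agenda as a plain (non-fuzzy) set of features, and scores an object by how small its tightest category is under each agenda. The names `object_closure_size` and `outlier_scores`, the toy data, and the scoring rule are all illustrative assumptions.

```python
def object_closure_size(context, obj, agenda):
    """Size of the smallest category containing `obj` when only the
    features in `agenda` are considered (the FCA object concept)."""
    # Intent: the features of obj that the agenda cares about.
    intent = context[obj] & agenda
    # Extent: every object that has all features in that intent.
    return sum(1 for feats in context.values() if intent <= feats)


def outlier_scores(context, agendas):
    """Assumed scoring rule: average, over agendas, of 1 - |extent| / n.
    Objects that only fit in small categories get scores close to 1."""
    n = len(context)
    scores = {}
    for obj in context:
        sizes = [object_closure_size(context, obj, a) for a in agendas]
        scores[obj] = sum(1 - s / n for s in sizes) / len(agendas)
    return scores


# Toy context (hypothetical data): three similar objects and one odd one.
context = {
    "a": frozenset({"small", "red"}),
    "b": frozenset({"small", "red"}),
    "c": frozenset({"small", "red"}),
    "d": frozenset({"large", "blue"}),
}
agendas = [
    frozenset({"small", "large"}),                 # agenda about size only
    frozenset({"red", "blue"}),                    # agenda about colour only
    frozenset({"small", "large", "red", "blue"}),  # agenda about everything
]
scores = outlier_scores(context, agendas)
most_anomalous = max(scores, key=scores.get)
```

Because each score is an average of per-agenda terms, the contribution of every agenda can be inspected separately, which loosely mirrors the local explanations the abstract mentions.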
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Africa > South Africa > Gauteng > Johannesburg (0.04)
- North America > United States > Ohio > Summit County > Akron (0.04)
Unsupervised Algorithms in Machine Learning
One of the most useful areas in machine learning is discovering hidden patterns from unlabeled data. Add the fundamentals of this in-demand skill to your Data Science toolkit. In this course, we will learn selected unsupervised learning methods for dimensionality reduction, clustering, and learning latent features. We will also focus on real-world applications such as recommender systems with hands-on examples of product recommendation algorithms. Prior coding or scripting knowledge is required.
- Education > Educational Technology > Educational Software > Computer Based Training (0.49)
- Education > Educational Setting > Online (0.49)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.62)
- Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.62)
- Information Technology > Enterprise Applications > Human Resources > Learning Management (0.49)
Automatic Translating between Ancient Chinese and Contemporary Chinese with Limited Aligned Corpora
Zhang, Zhiyuan, Li, Wei, Su, Qi
The Chinese language has evolved considerably over its long history. As a result, native speakers today have trouble reading sentences written in ancient Chinese. In this paper, we propose to build an end-to-end neural model to automatically translate between ancient and contemporary Chinese. However, the existing ancient-contemporary Chinese parallel corpora are not aligned at the sentence level and sentence-aligned corpora are limited, which makes it difficult to train the model. To build the sentence-level parallel training data for the model, we propose an unsupervised algorithm that constructs sentence-aligned ancient-contemporary pairs by exploiting the fact that an aligned sentence pair shares many of its tokens. Based on the aligned corpus, we propose an end-to-end neural model with a copying mechanism and local attention to translate between ancient and contemporary Chinese. Experiments show that the proposed unsupervised algorithm achieves a 99.4% F1 score for sentence alignment, and the translation model achieves 26.95 BLEU from ancient to contemporary, and 36.34 BLEU from contemporary to ancient.
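The token-sharing observation can be illustrated with a short sketch. The abstract does not describe the actual alignment procedure, so the following is only an assumed stand-in: it scores candidate pairs by Jaccard overlap of their character sets and greedily keeps the best match above a threshold. The function names, the threshold value, and the example sentences are hypothetical.

```python
def char_overlap(a, b):
    """Jaccard similarity between the character sets of two sentences."""
    sa, sb = set(a), set(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def align(ancient, contemporary, threshold=0.2):
    """For each ancient sentence, pick the contemporary sentence with the
    highest character overlap; keep the pair only if it clears `threshold`.
    Returns a list of (ancient_index, contemporary_index) pairs."""
    pairs = []
    for i, src in enumerate(ancient):
        scores = [char_overlap(src, tgt) for tgt in contemporary]
        if not scores:
            continue
        j = max(range(len(scores)), key=scores.__getitem__)
        if scores[j] >= threshold:
            pairs.append((i, j))
    return pairs


# Illustrative sentences; the contemporary side is deliberately out of order.
ancient = ["学而时习之", "有朋自远方来"]
contemporary = ["有朋友从远方来", "学习并且时常温习它"]
pairs = align(ancient, contemporary)
```

A greedy per-sentence match like this ignores document order and one-to-many alignments; a realistic aligner would likely need to handle both.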
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Oceania > Australia > Victoria > Melbourne (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
The Reasons To Regulate AI Algorithms Are Simpler Than You Think
Do you worry artificial intelligence will take over the world? From Elon Musk worrying about DeepMind beating humans in the advanced game of Go in 2017, to members of Congress, European policy makers (see A European approach to artificial intelligence), and academics, there's this feeling that this is the decade to take AI seriously, and it is taking hold. Though, not for the reasons you might think and not due to any present threat. This is where algorithms come in. What is an algorithm, you may ask?
- North America > United States > North Carolina > Buncombe County > Asheville (0.14)
- Europe > Norway (0.14)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.05)
Chakraborty
In this paper, we focus on the problem of extracting structured labeled data from short unstructured ad-postings from online sources like Craigslist, where ads are posted on various topics, such as job postings, rentals, car sales etc. A fundamental challenge in addressing this problem is that most ad-postings are highly unstructured, short-text postings written in an informal manner with no inherent grammar or well-defined dictionary. In this paper, we propose unsupervised and supervised algorithms for extracting structured data from unstructured ads in the form of (key, value) pairs where the keys naturally represent topic-specific features in the ads. The unsupervised algorithm is centered around building an affinity graph, using the words from a topic-specific corpus of such ads where the edge weights represent affinities between words; the (key, value) extraction algorithm identifies specific groups of words in the affinity graph corresponding to different classes of key attributes. The supervised algorithm uses a Conditional Random Field based training algorithm to identify specific structured (key, value) pairs based on pre-defined topic-specific structural data representations of ads. Based on a corpus of car and apartment ad-postings from Craigslist, the unsupervised algorithm reported an accuracy of 67.74% and 68.74% for car and apartment ads respectively. The supervised algorithm demonstrated an improved performance with accuracies of 74.07%
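As a rough illustration of the affinity graph mentioned in this abstract, the sketch below builds a weighted word graph from a small corpus of ads. The abstract does not define how affinities are computed, so this assumes the simplest possibility: the weight of an edge is the number of ads in which the two words co-occur. The function name `build_affinity_graph` and the sample ads are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_affinity_graph(ads):
    """Return a Counter mapping (word_u, word_v) edges to weights, where the
    weight is the number of ads containing both words (assumed affinity).
    Each edge is stored once, with its two words in sorted order."""
    weights = Counter()
    for ad in ads:
        # Deduplicate words per ad so repeated words count only once.
        words = sorted(set(ad.lower().split()))
        for u, v in combinations(words, 2):
            weights[(u, v)] += 1
    return weights


# Hypothetical car ads in the informal style the abstract describes.
ads = [
    "honda civic 2010 low miles",
    "toyota corolla 2012 low miles",
    "honda accord 2008",
]
graph = build_affinity_graph(ads)
strongest_edge, strongest_weight = graph.most_common(1)[0]
```

Groups of strongly connected words in such a graph (for example, makes, models, or years) are the kind of structure a key-extraction step could then look for.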
Machine Learning Algorithm Revolutionizes How Scientists Study Behavior - Neuroscience News
Summary: A new AI algorithm can independently discover and categorize an animal's behavior by analyzing patterns of body movements. To Eric Yttri, assistant professor of biological sciences and Neuroscience Institute faculty at Carnegie Mellon University, the best way to understand the brain is to watch how organisms interact with the world. "Behavior drives everything we do," Yttri said. As a behavioral neuroscientist, Yttri studies what happens in the brain when animals walk, eat, sniff or do any action. This kind of research could help answer questions about neurological diseases or disorders like Parkinson's disease or stroke.
5 Forecasts About the Future of Machine Learning – ReadWrite
Machine learning is a revolutionary technology that currently forms a critical aspect of numerous burgeoning and established industries. This technology allows computers to access hidden insights and predict outcomes, leading to remarkable changes to businesses. Wei Lei, who is the Vice President and General Manager at Intel, says that "machine learning is becoming more sophisticated with every passing year. And, we are yet to see its full potential--beyond self-driving cars, fraud detection devices, or retail trends analyses." How will it impact our world in the future?
- Retail (0.91)
- Information Technology (0.56)
Oracle bets on supervised machine learning for cybersecurity edge
Oracle is poised to help customers make the leap from on-premises up into the cloud. At Infosecurity Europe 2017, you couldn't escape the buzz around what may soon become the next great battle on the cybersecurity frontier – automation. However, the belief that automation is a cybersecurity silver bullet is not widely held, it seems, with Oracle's Rohit Gupta telling CBR that not all situations can be solely monitored by technology. "There are certain conditions where automation will never be accepted. As an example, let's say the system discovers something suspicious going on in an executive's credentials, the CFO's credentials. You don't want to turn off the access between the CFO and his or her system, that could be career suicide, you never want that to happen. "In that scenario you have policy in place that says for these specific types of roles, or these specific types of entitlements, I want to have human intervention – somebody to look over this detected issue ...
- Information Technology > Security & Privacy (1.00)
- Government > Military > Cyberwarfare (0.61)
Why AI Will Be the Name of the Game in 2017 - The Market Mogul
The development of Artificial Intelligence (AI) has been remarkable, with several breakthroughs in recent years. Will 2017 continue this trend? Commercial enterprises are procuring and developing systems that use AI to aid in their corporate goals, enhancing their performance through the use of autonomy. By exploiting advancements in big data analytics, processing power and clearer computer systems and networks, companies can use automation and AI to augment their current work processes and generate methods of long-term value creation. The stage is set, however, for AI to rise higher than ever before as large players including Apple, Facebook, Google and Microsoft all open-source or share their latest research in AI to further collective progress.
- Asia > China (0.16)
- North America > United States (0.15)
- Information Technology > Communications > Social Media (0.73)
- Information Technology > Artificial Intelligence > Natural Language (0.70)
- Information Technology > Data Science > Data Mining > Big Data (0.56)