 Discourse & Dialogue


BAFI 2018 : Business Analytics in Finance and Industry

#artificialintelligence

Conference Topics

Topics at this conference include, but are not limited to:

Business Analytics - Methods:
Dimensionality Reduction, Feature Extraction, and Feature Selection
Supervised, Semi-Supervised, and Unsupervised Methods
Statistical Learning Theory
Online Learning, Data Stream Mining, and Dynamic Data Mining
Graph Mining and Semi-Structured Data
Spatial and Temporal Data Mining
Deep Learning and Neural Network Research
Large Scale Data Mining
Uncertainty Modeling in Data Mining

Business Analytics - Applications:
Credit Scoring and Financial Modeling
Forecasting
Fraud Detection
Web Intelligence and Information Retrieval
Marketing, Business Intelligence, and e-Commerce
Decision Analysis and Decision Support Systems
Social Network Analysis
Privacy-preserving Data Mining and Privacy-related Issues
Text Mining, Sentiment Analysis, and Opinion Mining

Important Dates

July 31, 2017: Deadline for submission of extended abstracts
August 15, 2017: Accept/reject decision
November 15, 2017: Deadline for early registration
January 17-19, 2018: BAFI 2018
*Only one contributed abstract is accepted from the same presenting author.

Submission Guidelines

Authors are requested to submit a 600-word abstract in English through the EasyChair system. Please do not attach any additional files at this time.


Topic supervised non-negative matrix factorization

arXiv.org Machine Learning

Topic models have been extensively used to organize and interpret the contents of large, unstructured corpora of text documents. Although topic models often perform well on traditional training vs. test set evaluations, it is often the case that the results of a topic model do not align with human interpretation. This interpretability fallacy is largely due to the unsupervised nature of topic models, which prohibits any user guidance on the results of a model. In this paper, we introduce a semi-supervised method called topic supervised non-negative matrix factorization (TS-NMF) that enables the user to provide labeled example documents to promote the discovery of more meaningful semantic structure of a corpus. In this way, the results of TS-NMF better match the intuition and desired labeling of the user. The core of TS-NMF relies on solving a non-convex optimization problem for which we derive an iterative algorithm that is shown to be monotonic and convergent to a local optimum. We demonstrate the practical utility of TS-NMF on the Reuters and PubMed corpora, and find that TS-NMF is especially useful for conceptual or broad topics, where topic key terms are not well understood. Although identifying an optimal latent structure for the data is not a primary objective of the proposed approach, we find that TS-NMF achieves higher weighted Jaccard similarity scores than the contemporary methods, (unsupervised) NMF and latent Dirichlet allocation, at supervision rates as low as 10% to 20%.
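The weighted Jaccard similarity mentioned above can be illustrated with a minimal sketch. It uses the standard definition for non-negative vectors (sum of element-wise minima divided by sum of element-wise maxima). The function name, the example vectors, and the way they stand in for topic weights are assumptions for illustration; the abstract does not say how the authors apply the score.

```python
def weighted_jaccard(a, b):
    """Weighted Jaccard similarity of two equal-length, non-negative vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    if any(x < 0 for x in a) or any(y < 0 for y in b):
        raise ValueError("weights must be non-negative")
    numerator = sum(min(x, y) for x, y in zip(a, b))
    denominator = sum(max(x, y) for x, y in zip(a, b))
    # Convention (an assumption here): two all-zero vectors count as identical.
    return 1.0 if denominator == 0 else numerator / denominator


# Hypothetical topic weights for one document: user labels vs. model output.
labeled = [1.0, 0.0, 0.5]
predicted = [0.8, 0.2, 0.5]
score = weighted_jaccard(labeled, predicted)
print(round(score, 4))
```

A score of 1 means the two weight vectors agree exactly, and 0 means they share no weight on any topic.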


AI, Machine Learning and Sentiment Analysis Applied to Finance – Millennium Hotel London Mayfair

#artificialintelligence

Artificial Intelligence, Machine Learning and Sentiment Analysis are changing the way in which numerous client services are offered. In particular, Financial Organisations are creating and leveraging such innovation in the domain of wealth management. This trend is now being taken on board by multiple innovators: academia, start-ups, technology companies and financial market participants. AI and Machine Learning have emerged as a central aspect of analytics, which is applied to multiple domains. AI and Machine Learning, pattern classifiers and natural language processing (NLP) underpin Sentiment Analysis (SA); SA is a technology that rapidly assesses the sentiments expressed in news releases as well as other media sources such as Twitter and blogs.


Salesforce How to Implement Sentiment Analysis in Salesforce – A Part of Artificial Intelligence Forcetalks

#artificialintelligence

Sentiment analysis is extremely useful for gaining an overview of public opinion on certain topics and feedback. Automatically classifying text by sentiment allows you to easily find out the general opinions of people in your area of interest. For example, you might want to analyze reviews of a product to help you improve the customer experience, or to find the most or least popular product. The Obama administration used sentiment analysis to gauge public opinion on policy announcements and campaign messages ahead of the 2012 presidential election. How can we get this?


Real-time Twitter sentiment analysis with Azure Stream Analytics

#artificialintelligence

Learn how to build a sentiment analysis solution for social media analytics by bringing real-time Twitter events into Azure Event Hubs. In this scenario, you write an Azure Stream Analytics query to analyze the data. Then you either store the results for later use or use a dashboard and Power BI to provide insights in real time. Social media analytics tools help organizations understand trending topics. Trending topics are subjects and attitudes that have a high volume of posts in social media.


Polya Urn Latent Dirichlet Allocation: a doubly sparse massively parallel sampler

arXiv.org Machine Learning

Latent Dirichlet Allocation (LDA) is a topic model widely used in natural language processing and machine learning. Most approaches to training the model rely on iterative algorithms, which makes it difficult to run LDA on big data sets that are best analyzed in parallel and distributed computational environments. Indeed, current approaches to parallel inference either don't converge to the correct posterior or require storage of large dense matrices in memory. We present a novel sampler that overcomes both problems, and we show that this sampler is faster, both empirically and theoretically, than previous Gibbs samplers for LDA. We do so by employing a novel Pólya-Urn-based approximation in the sparse partially collapsed sampler for LDA. We prove that the approximation error vanishes with data size, making our algorithm asymptotically exact, a property of importance for large-scale topic models. In addition, we show, via an explicit example, that - contrary to popular belief in the topic modeling literature - partially collapsed samplers can be more efficient than fully collapsed samplers. We conclude by comparing the performance of our algorithm with that of other approaches on well-known corpora. Keywords: Bayesian inference, Big Data, computational complexity, Gibbs sampling, Latent Dirichlet Allocation, Markov Chain Monte Carlo, natural language processing, parallel and distributed systems, topic models.


Sentiment Analysis of Movie Reviews (2): word2vec

@machinelearnbot

This is the continuation of my mini-series on sentiment analysis of movie reviews, which originally appeared on recurrentnull.wordpress.com. Last time, we had a look at how well classical bag-of-words models worked for classification of the Stanford collection of IMDB reviews. As it turned out, the "winner" was Logistic Regression, using both unigrams and bigrams for classification. The best classification accuracy obtained was 0.89. So, bag-of-words models may be surprisingly successful, but they are limited in what they can do.
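To make "both unigrams and bigrams" concrete, here is a minimal sketch of the bag-of-words feature extraction step only. The function name and the whitespace tokenizer are assumptions for illustration, not the tokenization used in the original series, and the classifier itself is not shown.

```python
from collections import Counter


def ngram_counts(text, n_values=(1, 2)):
    """Count word n-grams (unigrams and bigrams by default) in a text."""
    tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
    counts = Counter()
    for n in n_values:
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


features = ngram_counts("not good not bad")
print(features)
```

Bigrams such as "not good" and "not bad" keep some local word order that unigram counts alone discard, which is one plausible reason the combination can help on review text.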


Computing Web-scale Topic Models using an Asynchronous Parameter Server

arXiv.org Machine Learning

Topic models such as Latent Dirichlet Allocation (LDA) have been widely used in information retrieval for tasks ranging from smoothing and feedback methods to tools for exploratory search and discovery. However, classical methods for inferring topic models do not scale up to the massive size of today's publicly available Web-scale data sets. The state-of-the-art approaches rely on custom strategies, implementations and hardware to facilitate their asynchronous, communication-intensive workloads. We present APS-LDA, which integrates state-of-the-art topic modeling with cluster computing frameworks such as Spark using a novel asynchronous parameter server. Advantages of this integration include convenient usage of existing data processing pipelines and eliminating the need for disk writes as data can be kept in memory from start to finish. Our goal is not to outperform highly customized implementations, but to propose a general high-performance topic modeling framework that can easily be used in today's data processing pipelines. We compare APS-LDA to the existing Spark LDA implementations and show that our system can, on a 480-core cluster, process up to 135 times more data and 10 times more topics without sacrificing model quality.


Opinion Mining - Sentiment Analysis and Beyond

@machinelearnbot

So you report with reasonable accuracies what the sentiment about a particular brand or product is. After publishing this report, your client comes back to you and says "Hey this is good. Now can you tell me ways in which I can convert the negative sentiments into positive sentiments?" – Sentiment Analysis stops there and we enter the realms of Opinion Mining. Opinion Mining is about having a deeper understanding of the review that was written. Typically, a detailed review will not just have a sentiment attached to it. It will have information and valuable feedback that can literally help to build the next strategy.


Brand-Value Analysis with simple Sentiment Analysis using Shiny / R

@machinelearnbot

This Shiny app is a live Shiny/R web application hosted on shinyapps.io. It visualizes, in a few different ways, simple dictionary-based (word-count) sentiment analysis scores for tweets about smartphones in India posted between March 17 and April 4, 2014. The Shiny application can be found up and running here.
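A dictionary/word-count sentiment score of the kind described can be sketched as follows. The original application is written in R; this Python version, its tiny word lists, and the sample tweets are all hypothetical and only illustrate the general technique of counting positive words minus negative words.

```python
import re

# Hypothetical sentiment dictionaries; a real lexicon would be much larger.
POSITIVE = {"good", "great", "love", "fast"}
NEGATIVE = {"bad", "poor", "hate", "slow"}


def sentiment_score(text):
    """Return (# positive words) - (# negative words) for a text."""
    words = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(word in POSITIVE for word in words)
    negative_hits = sum(word in NEGATIVE for word in words)
    return positive_hits - negative_hits


# Made-up example tweets, not data from the application.
tweets = [
    "I love this phone, great battery",
    "Great camera but slow charging",
    "Bad screen",
]
scores = [sentiment_score(tweet) for tweet in tweets]
print(scores)
```

Scores above zero indicate more positive than negative dictionary words, and averaging them per brand gives a simple brand-level comparison.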