AITopics

1310.4849

Country:

Europe (0.92)
North America > United States (0.27)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.68)
(3 more...)

arXiv.org Machine Learning Mar-3-2015

The Bayesian Case Model: A Generative Approach for Case-Based Reasoning and Prototype Classification

Kim, Been, Rudin, Cynthia, Shah, Julie

We present the Bayesian Case Model (BCM), a general framework for Bayesian case-based reasoning (CBR) and prototype classification and clustering. BCM brings the intuitive power of CBR to a Bayesian generative framework. The BCM learns prototypes, the "quintessential" observations that best represent clusters in a dataset, by performing joint inference on cluster labels, prototypes and important features. Simultaneously, BCM pursues sparsity by learning subspaces, the sets of features that play important roles in the characterization of the prototypes. The prototype and subspace representation provides quantitative benefits in interpretability while preserving classification accuracy. Human subject experiments verify statistically significant improvements to participants' understanding when using explanations produced by BCM, compared to those given by prior art.

artificial intelligence, machine learning, prototype, (18 more...)

1503.01161

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Media > Film (0.69)
Leisure & Entertainment (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

A Noise Scaled Semi Parametric Gaussian Process Model for Real Time Water Network Leak Detection in the Presence of Heteroscedasticity

Malik, Obaid (University of Southampton) | Ghosh, Siddhartha (University of Southampton) | Rogers, Alex (University of Southampton)

The timely detection of leaks in water distribution systems is critical to the sustainable provision of clean water to consumers. Increasingly, water companies are deploying remote sensors to measure water flow in real time in order to detect such leaks. However, in practice, for typical District Metering Zones (DMZ), financial constraints limit the number of deployable real-time flow sensors/meters to one or two, thus constraining leak detection to be based on the aggregated flow being monitored at these points. Such aggregated flow data typically exhibits input signal dependence, whereby both noise and leaks are dependent on the flow being measured. This limited monitoring and input signal dependence make conventional approaches based on simple thresholds unreliable for real-time leak detection. To address this, we propose a Gaussian process (GP) model with an additive diagonal noise covariance that is able to handle the input-dependent noise observed in this setting. A parameterised mean step change function is used to detect leaks and to estimate their size. Using prior water distribution systems (WDS) knowledge, we dynamically bound and discretize the detection parameters of the step change mean function, reducing and pruning the parameter search space considerably. We evaluate the proposed noise scaled GP (NSGP) against both the latest research work on GP-based fault detection methods and the current state-of-the-art and applied leak detection approaches in water distribution systems. We show that our proposed method outperforms other approaches on real water network data with synthetically generated time-varying leaks, with a detection accuracy of 99%, almost zero false positive detections and the lowest root mean squared error in leak magnitude estimation (0.065 l/s).
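
The step-change idea in the abstract above can be illustrated with a much simpler stand-in: a least-squares grid search over a discretized set of change times and leak magnitudes. This is only a sketch under stated assumptions, not the authors' NSGP model. It uses no Gaussian process, weights all errors equally (so it ignores the input-dependent noise the paper addresses), and the function names and residual values are hypothetical.

```python
def step_mean(t, change_time, magnitude):
    """Step-change mean: 0 before change_time, `magnitude` from then on."""
    return magnitude if t >= change_time else 0.0


def fit_step(residuals, candidate_times, candidate_magnitudes):
    """Grid search over a discretized (change time, magnitude) space.

    residuals: observed flow minus expected flow, one value per time step.
    Returns the (change_time, magnitude) pair with the lowest squared error.
    """
    best, best_err = None, float("inf")
    for tau in candidate_times:
        for mag in candidate_magnitudes:
            err = sum((r - step_mean(t, tau, mag)) ** 2
                      for t, r in enumerate(residuals))
            if err < best_err:
                best, best_err = (tau, mag), err
    return best


# Hypothetical residual flow (l/s): a leak of about 0.5 l/s starts at step 4.
residuals = [0.02, -0.01, 0.00, 0.03, 0.52, 0.48, 0.51, 0.49]
times = range(len(residuals))
magnitudes = [0.0, 0.25, 0.5, 0.75, 1.0]  # assumed discretization
change_time, magnitude = fit_step(residuals, times, magnitudes)
print(change_time, magnitude)
```

Bounding and discretizing the two parameters keeps the search exhaustive but small, which is the role the abstract assigns to prior WDS knowledge.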

leak, survey article, upstream oil & gas, (20 more...)

Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence

Country: Europe > United Kingdom (0.14)

Industry:

Water & Waste Management > Water Management (1.00)
Energy > Oil & Gas > Upstream (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.55)

Discovering Hotspots and Coldspots of Species Richness in eBird Data

Moore, Travis (Oregon State University) | Wong, Weng-Keen (Oregon State University)

Quantifying biodiversity is an important task related to ecological research. One way to measure biodiversity is through species richness, which measures the number of unique species found in an area. Recently, citizen science biodiversity datasets such as eBird allow the calculation of species richness over an unprecedented spatial and temporal extent. However, several confounding factors associated with the unstructured observation process, such as observer effort, affect the number of species reported by citizen scientists. In this work, we develop an algorithm for discovering hotspots and coldspots of species richness using eBird data while accounting for these confounding factors.
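
As a minimal illustration of the species richness measure described above (the number of unique species found in an area), the count can be sketched in Python. This is not the authors' hotspot algorithm: the checklist records and the `species_richness` helper are hypothetical, and the sketch makes no correction for confounders such as observer effort.

```python
from collections import defaultdict


def species_richness(observations):
    """Return {area: number of unique species} from (area, species) pairs."""
    species_by_area = defaultdict(set)
    for area, species in observations:
        species_by_area[area].add(species)
    return {area: len(found) for area, found in species_by_area.items()}


# Hypothetical checklist records: (area identifier, species name)
observations = [
    ("cell_A", "American Robin"),
    ("cell_A", "Mallard"),
    ("cell_A", "American Robin"),  # repeat sighting, counted once
    ("cell_B", "Mallard"),
]
richness = species_richness(observations)
print(richness)
```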

artificial intelligence, checklist, machine learning, (14 more...)

Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Oregon (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.58)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

What Predicts Media Coverage of Health Science Articles?

Wallace, Byron C. (University of Texas at Austin) | Paul, Michael J. (Johns Hopkins University) | Elhadad, Noémie (Columbia University)

An important aspect of health science is communicating research findings to the public. The media is a critical instrument in disseminating research. Yet the process by which a scientific article becomes “newsworthy” is not well understood. In this study, we use large-scale text analysis to characterize the content features of articles that are predictive of newsworthiness. We experiment with two novel corpora: (i) 28,910 articles from a diverse range of biomedical and health journals, of which 1,343 were covered by the news agency Reuters, and (ii) 10,760 articles from the JAMA journals, of which 846 were given press releases by the journal editors. We show that media coverage can be predicted reasonably well: logistic regression achieves mean AUCs of 0.783 and 0.882 on the Reuters and JAMA datasets, respectively. We present and discuss interesting findings concerning the most predictive content features.
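
The AUC figures reported above can be read as the probability that a randomly chosen covered article receives a higher predicted score than a randomly chosen uncovered one. The following sketch computes AUC by direct pairwise comparison. The labels and scores are hypothetical and do not come from the study, and the `auc` helper is an illustrative name, not part of any library.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = article received media coverage, 0 = it did not.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.2, 0.1]  # assumed classifier outputs
result = auc(labels, scores)
print(result)
```

The pairwise form is quadratic in the number of examples, which is fine for illustration; rank-based formulas give the same value more efficiently on large corpora.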

corpus, dataset, press release, (13 more...)

Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Media > News (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)

Privacy-Utility Trade-Off for Time-Series with Application to Smart-Meter Data

Erdogdu, Murat A. (Stanford University) | Fawaz, Nadia (Technicolor) | Montanari, Andrea (Stanford University)

We consider the online setting where a user would like to continuously release a time-series of data that is correlated with his private data to a service provider, in the hope of deriving some utility. Due to correlations, the continual observation of the released time-series puts the user at risk of inference of his private data by an adversary. To protect the user from inference attacks on his private data, the time-series is randomized prior to its release according to a probabilistic privacy mapping. The privacy mapping should be designed in a way that balances privacy and utility requirements over time. Our contributions are threefold. First, we formalize the framework for the design of utility-aware privacy mappings for time-series data, under both online and batch models. We provide a sequential scheme that allows the design of online privacy mappings at scale, which account for privacy risk from the history of released data and from future releases to come. Second, we prove the equivalence of the optimal mappings under the batch and the online models, in the case where the time-series samples are independent across time. We further show that there exists a gap between optimal batch and online privacy mappings when certain conditions are not satisfied. Finally, we evaluate the performance of the framework over synthetic and real-world time-series data. In particular, we show that smart-meter data can be randomized for privacy purposes to prevent disaggregation of per-device energy consumption, while preserving the utility.

inference, privacy mapping, time-series data, (15 more...)

Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Santa Clara County > Los Altos (0.04)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Energy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)

Ehsanfar, Abbas (Stevens Institute of Technology) | Heydari, Babak (Stevens Institute of Technology)

Interactive Multi-Consumer Power Cooperatives with Learning and Axiomatic Cost and Risk Disaggregation

This paper introduces a novel autonomous interactive learning cooperative (ILCP) that receives the expected value and variance of load from consumers and participates in the electricity market on their behalf. Using an axiomatic approach, the share of each consumer's payment as well as its weight in calculating the modification of total day-ahead load are formulated. This scheme applies double-seasonal exponential smoothing, a recent load forecasting technique, and a classifier (Gaussian Naïve Bayes) for real-time to day-ahead price direction forecasting. In addition, the ILCP employs interactive cooperative algorithms on both the trading cooperative side and the consumer side. The ILCP scheme is investigated and its performance is compared to those of non-cooperative real-time pricing (RTP), LCP (non-interactive learning cooperative) and CP (non-interactive non-learning cooperative). The developed system was implemented using PJM (the world's largest wholesale electricity market) real-time and day-ahead data for 2013 and half of 2014; real load profiles were selected from a set of 579 residential and commercial consumers, and weather data were applied to forecasting electricity price direction. We demonstrate the advantages of ILCP in lowering the average electricity cost and reducing unit price variations.

consumer, deviation, payment, (17 more...)

Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

North America > United States (0.04)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)

Industry:

Energy > Power Industry (1.00)
Education > Educational Setting > Online (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.66)

Bari, Nima, Vichr, Roman, Kowsari, Kamran, Berkovich, Simon Y.

Novel Metaknowledge-based Processing Technique for Multimedia Big Data clustering challenges

arXiv.org Artificial Intelligence Mar-1-2015

Past research has challenged us with the task of showing relational patterns between text-based data and then clustering for predictive analysis using the Golay Code technique. We focus on a novel approach to extract metaknowledge in multimedia datasets. Our collaboration has been an ongoing task of studying the relational patterns between datapoints based on metafeatures extracted from metaknowledge in multimedia datasets. Those selected are significant to suit the mining technique we applied, the Golay Code algorithm. In this research paper we summarize findings in optimization of metaknowledge representation for 23-bit representation of structured and unstructured multimedia data in order to

artificial intelligence, data mining, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/BigMM.2015.78

1503.00245

Country: North America > United States (0.31)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.75)

Djolonga, Josip, Krause, Andreas

Scalable Variational Inference in Log-supermodular Models

arXiv.org Machine Learning Feb-24-2015

We consider the problem of approximate Bayesian inference in log-supermodular models. These models encompass regular pairwise MRFs with binary variables, but also allow capturing high-order interactions, which are intractable for existing approximate inference techniques such as belief propagation, mean field, and variants. We show that a recently proposed variational approach to inference in log-supermodular models (L-FIELD) reduces to the widely studied minimum norm problem for submodular minimization. This insight allows us to leverage powerful existing tools, and hence to solve the variational problem orders of magnitude more efficiently than previously possible. We then provide another natural interpretation of L-FIELD, demonstrating that it exactly minimizes a specific type of Rényi divergence measure. This insight sheds light on the nature of the variational approximations produced by L-FIELD. Furthermore, we show how to perform parallel inference as message passing in a suitable factor graph at a linear convergence rate, without having to sum up over all the configurations of the factor. Finally, we apply our approach to a challenging image segmentation task. Our experiments confirm the scalability of our approach, the high quality of the marginals, and the benefit of incorporating higher-order potentials.

artificial intelligence, machine learning, scalable variational inference, (15 more...)

1502.06531

Country: Asia (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Ndour, Cheikh, Diop, Aliou, Dossou-Gbété, Simplice

Classification approach based on association rules mining for unbalanced data

arXiv.org Machine Learning Feb-24-2015

This paper deals with the binary classification task when the target class has the lower probability of occurrence. In such a situation, it is not possible to build a powerful classifier by using standard methods such as logistic regression, classification trees, discriminant analysis, etc. To overcome the shortcoming of these methods, which yield classifiers with low sensitivity, we tackle the classification problem here through an approach based on association rule learning. This approach has the advantage of allowing the identification of patterns that are well correlated with the target class. Association rule learning is a well-known method in the area of data mining. It is used when dealing with large databases for unsupervised discovery of local patterns that express hidden relationships between input variables. By considering association rules from a supervised learning point of view, a relevant set of weak classifiers is obtained, from which one derives a classifier that performs well.
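
To make the idea of patterns "well correlated with the target class" concrete, the sketch below computes the standard support and confidence of a single rule of the form `antecedent -> target`. It is an illustration under stated assumptions, not the paper's classifier: the `rule_stats` helper, the records, and the item names are all hypothetical.

```python
def rule_stats(records, antecedent, target):
    """Support and confidence of the rule `antecedent -> target`.

    records: list of (set_of_items, class_label) pairs.
    support    = fraction of all records matching antecedent AND target
    confidence = fraction of antecedent-matching records that have target
    """
    antecedent = set(antecedent)
    n_total = len(records)
    n_match = sum(1 for items, _ in records if antecedent <= items)
    n_both = sum(1 for items, label in records
                 if antecedent <= items and label == target)
    support = n_both / n_total if n_total else 0.0
    confidence = n_both / n_match if n_match else 0.0
    return support, confidence


# Hypothetical records with a rare target class ("disease").
records = [
    ({"smoker", "age>60"}, "disease"),
    ({"smoker"}, "healthy"),
    ({"age>60"}, "healthy"),
    ({"smoker", "age>60"}, "disease"),
    ({"nonsmoker"}, "healthy"),
    ({"smoker", "age>60"}, "healthy"),
]
support, confidence = rule_stats(records, {"smoker", "age>60"}, "disease")
print(support, confidence)
```

A rule with high confidence for the rare class can act as one weak classifier; how such rules are selected and combined is specific to the paper and is not reproduced here.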

artificial intelligence, classifier, machine learning, (16 more...)

1202.5514

Country:

Europe (1.00)
Asia (1.00)
North America > United States > California (0.46)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(4 more...)