AITopics | Performance Analysis

Exploring the Semantic Content of Unsupervised Graph Embeddings: An Empirical Study

Bonner, Stephen, Kureshi, Ibad, Brennan, John, Theodoropoulos, Georgios, McGough, Andrew Stephen, Obara, Boguslaw

arXiv.org Machine Learning, Jun-19-2018

Graph embeddings have become a key and widely used technique within the field of graph mining, proving successful across a broad range of domains including social, citation, transportation and biological networks. Graph embedding techniques aim to automatically create a low-dimensional representation of a given graph, which captures key structural elements in the resulting embedding space. However, to date, there has been little work exploring exactly which topological structures are being learned in the embedding process. In this paper, we investigate whether graph embeddings approximate something analogous to traditional vertex-level graph features. If such a relationship can be found, it could be used to provide theoretical insight into how graph embedding approaches function. We perform this investigation by predicting known topological features, using supervised and unsupervised methods, directly from the embedding space. If a mapping between the embeddings and topological features can be found, then we argue that the structural information encapsulated by the features is represented in the embedding space. To explore this, we present an extensive experimental evaluation of five state-of-the-art unsupervised graph embedding techniques, across a range of empirical graph datasets, measuring a selection of topological features. We demonstrate that several topological features are indeed being approximated by the embedding space, allowing key insight into how graph embeddings create good representations.
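
The idea of predicting a topological feature directly from the embedding space can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the five-vertex graph, the two-dimensional `embedding` values (hand-written placeholders, not the output of any embedding technique), the choice of vertex degree as the feature, and the use of plain linear regression as the supervised method. The abstract does not say which models or features the authors use.

```python
# Toy illustration: can a vertex-level feature (degree) be predicted
# from embedding vectors? Graph and embeddings are made-up placeholders.

def degrees(adj):
    """Vertex degree for an undirected graph given as an adjacency dict."""
    return {v: len(neigh) for v, neigh in adj.items()}

def fit_linear(X, y):
    """Ordinary least squares with an intercept, via the normal equations
    solved by Gauss-Jordan elimination. Returns [w_1, ..., w_d, bias]."""
    rows = [list(x) + [1.0] for x in X]
    d = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(d)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(d)]
    for c in range(d):
        p = max(range(c, d), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(d):
            if r != c:
                f = A[r][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][d] for i in range(d)]

def predict(weights, x):
    """Linear prediction; the last weight is the intercept."""
    return sum(w * xi for w, xi in zip(weights, x)) + weights[-1]

def r_squared(y, y_hat):
    """Coefficient of determination of predictions y_hat against y."""
    mean = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical graph (adjacency lists) and hypothetical 2-D embeddings.
adj = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0, 4], 4: [3]}
embedding = {0: (0.9, 0.1), 1: (0.1, 0.3), 2: (0.2, 0.8),
             3: (0.5, 0.5), 4: (0.15, 0.6)}

nodes = sorted(adj)
X = [embedding[v] for v in nodes]
y = [float(degrees(adj)[v]) for v in nodes]
w = fit_linear(X, y)
score = r_squared(y, [predict(w, x) for x in X])
print(f"in-sample R^2 for degree: {score:.3f}")
```

A high score on held-out vertices would suggest that the feature is recoverable from the embedding; the in-sample score printed here only shows the mechanics.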

data mining, machine learning, natural language, (15 more...)

arXiv.org Machine Learning

1806.07464

Country:

Europe > United Kingdom > England > Durham > Durham (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Tyne and Wear > Newcastle (0.04)
(3 more...)

Genre: Research Report > New Finding (0.67)

Industry:

Information Technology (0.67)
Education (0.45)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(4 more...)

Outcome-Oriented Predictive Process Monitoring: Review and Benchmark

Teinemaa, Irene, Dumas, Marlon, La Rosa, Marcello, Maggi, Fabrizio Maria

arXiv.org Artificial Intelligence, Jun-19-2018

Traditional process monitoring techniques provide dashboards and reports showing the recent performance of a business process in terms of key performance indicators such as mean execution time, resource utilization or error rate with respect to a given notion of error. Predictive (business) process monitoring techniques go beyond traditional ones by making predictions about the future state of the executions of a business process (herein called cases). For example, a predictive monitoring technique may seek to predict the remaining execution time of each ongoing case of a process [29], the next activity that will be executed in each case [11], or the final outcome of a case, with respect to a possible set of business outcomes [23-25]. For instance, in an order-to-cash process (a process going from the receipt of a purchase order to the receipt of payment of the corresponding invoice), the possible outcomes of a case may be that the purchase order is closed satisfactorily (i.e., the customer accepted the products and paid) or unsatisfactorily (e.g., the order was canceled or withdrawn). Another set of possible outcomes is that the products were delivered on time (with respect to a maximum acceptable delivery time), or delivered late. Recent years have seen the emergence of a rich field of proposed methods for predictive process monitoring in general, and predictive monitoring of (categorical) case outcomes in particular, herein called outcome-oriented predictive process monitoring. Unfortunately, there is no unified approach to evaluating these methods. Indeed, different authors have used different datasets, experimental settings, evaluation measures and baselines.

data mining, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

1707.06766

Country:

Europe > Estonia > Tartu County > Tartu (0.04)
Europe > United Kingdom > England > Staffordshire > Keele (0.04)
Asia (0.04)

Genre:

Workflow (1.00)
Research Report > New Finding (0.67)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.67)
Banking & Finance (0.67)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(5 more...)

Microsoft weeds out fake marketing leads with Naïve Bayes and Machine Learning Server

#artificialintelligence, Jun-18-2018, 03:37:20 GMT

To connect with potential customers, our marketers and sellers at Microsoft depend on good-quality leads. But sometimes people fill out online forms with fake names, gibberish, or even profanity. We distinguish fake company names from legitimate names in our data using the programming language R, the Naive Bayes classifier algorithm, Microsoft Machine Learning Server, and a data quality service that we built. This solution helps us weed out fake names and prioritize good leads for our sales and marketing teams.
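
To make the classification step concrete, here is a minimal Naive Bayes sketch. It is written in Python for illustration rather than in R with Machine Learning Server as in the article, and it is not Microsoft's implementation. The character-bigram features, the Laplace smoothing, the class names `real` and `fake`, and the tiny training lists are all assumptions introduced for this example.

```python
import math
from collections import Counter

def char_bigrams(name):
    """Lower-cased character bigrams, with ^ and $ marking start and end."""
    s = f"^{name.lower()}$"
    return [s[i:i + 2] for i in range(len(s) - 1)]

class NaiveBayes:
    """Multinomial Naive Bayes over character bigrams with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.counts = {}       # label -> Counter of bigram counts
        self.docs = Counter()  # label -> number of training names
        self.vocab = set()

    def fit(self, names, labels):
        for name, label in zip(names, labels):
            grams = char_bigrams(name)
            self.counts.setdefault(label, Counter()).update(grams)
            self.docs[label] += 1
            self.vocab.update(grams)
        return self

    def log_scores(self, name):
        """Unnormalised log posterior for each label."""
        total_docs = sum(self.docs.values())
        v = len(self.vocab) + 1  # one extra slot for unseen bigrams
        scores = {}
        for label, counter in self.counts.items():
            n = sum(counter.values())
            score = math.log(self.docs[label] / total_docs)  # class prior
            for g in char_bigrams(name):
                score += math.log((counter[g] + self.alpha) / (n + self.alpha * v))
            scores[label] = score
        return scores

    def predict(self, name):
        scores = self.log_scores(name)
        return max(scores, key=scores.get)

# Hypothetical training data, far too small for real use.
real = ["Contoso Ltd", "Fabrikam Inc", "Northwind Traders", "Adventure Works"]
fake = ["asdfgh", "qwerty qwerty", "zzzzxxxx", "jkljkl"]
model = NaiveBayes().fit(real + fake, ["real"] * len(real) + ["fake"] * len(fake))

print(model.predict("qwqwqw"))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and the smoothing term keeps an unseen bigram from zeroing out a class.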

artificial intelligence, fake marketing lead, machine learning server, (3 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

MultiFIT: Multivariate Multiscale Framework for Independence Tests

Gorsky, Shai, Ma, Li

arXiv.org Machine Learning, Jun-18-2018

We present a framework for testing independence between two random vectors that is scalable to massive data. Taking a "divide-and-conquer" approach, we break down the nonparametric multivariate test of independence into simple univariate independence tests on a collection of $2\times 2$ contingency tables, constructed by sequentially discretizing the original sample space at a cascade of scales from coarse to fine. This transforms a complex nonparametric testing problem (one that traditionally requires quadratic computational complexity with respect to the sample size) into a multiple testing problem that can be addressed with a computational complexity that scales almost linearly with the sample size. We further consider the scenario when the dimensionality of the two random vectors also grows large, in which case the curse of dimensionality arises in the proposed framework through an explosion in the number of univariate tests to be completed. To overcome this difficulty, we propose a data-adaptive version of our method that completes only a fraction of the univariate tests, those judged more likely to contain evidence for dependency based on the spatial characteristics of the dependency structure in the data. We provide an inference recipe based on multiple testing adjustment that guarantees inferential validity by properly controlling the family-wise error rate. We demonstrate the tremendous computational advantage of the algorithm over existing approaches, while achieving desirable statistical power, through an extensive simulation study. We also illustrate how our method can be used to learn the nature of the underlying dependency, in addition to hypothesis testing. We demonstrate the use of our method by analyzing a data set from flow cytometry.
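
A single building block of this approach, one $2\times 2$ contingency table obtained by discretizing two variables, can be sketched as follows. This is only an illustration under stated assumptions: it splits each variable once at its median (the coarsest scale only), and it uses Pearson's chi-square statistic as a stand-in, since the abstract does not say which univariate test the authors apply to each table. The function names are hypothetical.

```python
from statistics import median

def two_by_two(xs, ys):
    """Discretize each variable at its median and count the four cells.
    table[i][j]: i = 1 if x is above its median, j = 1 if y is above its median."""
    mx, my = median(xs), median(ys)
    table = [[0, 0], [0, 0]]
    for x, y in zip(xs, ys):
        table[int(x > mx)][int(y > my)] += 1
    return table

def chi_square(table):
    """Pearson chi-square statistic for a 2x2 table of counts."""
    n = sum(map(sum, table))
    row = [sum(r) for r in table]
    col = [table[0][j] + table[1][j] for j in range(2)]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            if expected > 0:  # skip cells whose margin is empty
                stat += (table[i][j] - expected) ** 2 / expected
    return stat

xs = list(range(8))
dependent = chi_square(two_by_two(xs, xs))                          # y = x
independent = chi_square(two_by_two(xs, [0, 1, 0, 1, 0, 1, 0, 1]))  # alternating y
print(dependent, independent)  # 8.0 0.0
```

A multiscale procedure would repeat this on finer and finer sub-regions of the sample space and then adjust for the number of tests performed.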

artificial intelligence, independence, machine learning, (18 more...)

arXiv.org Machine Learning

1806.06777

Country: North America > United States > North Carolina > Durham County > Durham (0.04)

Genre: Research Report > Experimental Study (0.69)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.54)

Evaluating and Characterizing Incremental Learning from Non-Stationary Data

Cervantes, Alejandro, Gagné, Christian, Isasi, Pedro, Parizeau, Marc

arXiv.org Machine Learning, Jun-18-2018

Incremental learning from non-stationary data poses special challenges to the field of machine learning. Although new algorithms have been developed for this, assessment of results and comparison of behaviors are still open problems, mainly because evaluation metrics, adapted from more traditional tasks, can be ineffective in this context. Overall, there is a lack of common testing practices. This paper thus presents a testbed for incremental non-stationary learning algorithms, based on specially designed synthetic datasets. Also, test results are reported for some well-known algorithms to show that the proposed methodology is effective at characterizing their strengths and weaknesses. It is expected that this methodology will provide a common basis for evaluating future contributions in the field.

artificial intelligence, evolutionary algorithm, machine learning, (15 more...)

arXiv.org Machine Learning

1806.0661

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Alameda County > Berkeley (0.14)
(9 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Cross-validation Tutorial: What, how and which?

#artificialintelligence, Jun-17-2018, 03:31:56 GMT

Slides by P. Raamana, opening with the quote "Statistics [from cross-validation] are like bikinis." Goals for Today: What is cross-validation? What is generalizability? The data are divided into a training set and a test set. Bias slide: negative bias, unbiased, positive bias. A bigger training set means better learning; a bigger test set means better testing. Key: Train & test sets must be disjoint. And the dataset or sample size is fixed.

artificial intelligence, machine learning, validation, (15 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (1.00)
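
The key point from the slides above, that train and test sets must be disjoint while the sample size stays fixed, can be sketched with a minimal k-fold splitter. The function name `k_fold_splits` and its parameters are hypothetical, and this is one common way to build the folds, not necessarily the scheme the tutorial recommends.

```python
import random

def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.
    Every sample lands in exactly one test fold, and within a split the
    train and test index sets never overlap."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]   # k near-equal folds
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

splits = list(k_fold_splits(n_samples=10, k=5))
for train, test in splits:
    assert not set(train) & set(test)  # disjoint by construction
    print(len(train), len(test))       # 8 2 for each of the five splits
```

With a fixed sample size, raising `k` enlarges each training set and shrinks each test set, which is the trade-off the slides describe.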

Why Won't Facebook Talk About How Often Its Algorithms Are Wrong?

Forbes - Tech, Jun-17-2018, 01:55:05 GMT

Two weeks ago Facebook released yet another glossy marketing infographic site and video touting how its state-of-the-art technology, top engineers and teams of experts have made massive strides in conquering yet another scourge of the online world through the power of advanced algorithms. This past week its EMEA counterterrorism lead announced that its algorithms were now deleting 99% of all ISIS and al-Qaida terrorism content across the site. As with all of Facebook's announcements to date, neither of these proclamations made any mention of how often the algorithms that increasingly control its platform are wrong, or whether they are actually right more often than they are wrong. After initially promising to provide a response, the company once again declined to comment on the false positive rates of its algorithms or on why, despite repeated requests, it continues to refuse to release those numbers. Why is the company so afraid to talk about whether its algorithms are actually accurate?

artificial intelligence, machine learning, social media, (15 more...)

Forbes - Tech

AI-Alerts: 2018 > 2018-06 > AAAI AI-Alert for Jun 19, 2018 (1.00)

Country: Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.05)

Industry:

Information Technology > Services (0.98)
Law Enforcement & Public Safety > Terrorism (0.60)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

Binary Classification in Unstructured Space With Hypergraph Case-Based Reasoning

Quemy, Alexandre

arXiv.org Artificial Intelligence, Jun-16-2018

Binary classification is one of the most common problems in machine learning. It consists of predicting whether a given element belongs to a particular class. In this paper, a new algorithm for binary classification is proposed using a hypergraph representation. Each element to be classified is partitioned according to its interactions with the training set. For each class, the total support is calculated as a convex combination of the evidence strength of the elements of the partition. The evidence measure is pre-computed using the hypergraph induced by the training set and iteratively adjusted through a training phase. The method does not require structured information, each case being represented by a set of agnostic information atoms. Empirical validation demonstrates its high potential on a wide range of well-known datasets, and the results are compared to the state of the art. The time complexity is given and empirically validated. Its capacity to perform well without hyperparameter tuning, compared to standard classification methods, is studied. Finally, the limitations of the model space are discussed and some potential solutions proposed.

artificial intelligence, bayesian inference, machine learning, (20 more...)

arXiv.org Artificial Intelligence

1806.06232

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Poland > Lesser Poland Province > Kraków (0.14)
Europe > Poland > Greater Poland Province > Poznań (0.04)
Asia > India (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)

WWE Money In The Bank 2018: Predictions, Match Card, Preview For Wrestling PPV

International Business Times, Jun-15-2018, 14:10:25 GMT

There's a lot on the line at WWE Money in the Bank 2018, which has clearly become the most important pay-per-view that isn't among the "Big Four." Not only will five titles be defended, but world championship opportunities are also up for grabs Sunday night. Money in the Bank features 10 matches, including Ronda Rousey's first singles match, two ladder matches and a last man standing match. Styles and Nakamura have been feuding for the better part of three months. WWE has had plenty of chances to put the WWE Championship on the Japanese superstar, yet Styles has continued to hold the belt, even with Nakamura's heel turn.

artificial intelligence, machine learning, prediction, (15 more...)

International Business Times

Country:

Oceania > Samoa (0.05)
North America > United States > Nevada > Clark County > Las Vegas (0.05)
North America > United States > Illinois > Cook County > Chicago (0.05)

Industry: Leisure & Entertainment > Sports > Martial Arts (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.43)

Machine learning "red dot": open-source, cloud, deep convolutional neural networks in chest radiograph binary normality classification. - PubMed - NCBI

#artificialintelligence, Jun-15-2018, 11:41:35 GMT

The aim was to develop a machine learning-based model for the binary classification of chest radiography abnormalities, to serve as a retrospective tool in guiding clinician reporting prioritisation. The open-source machine learning library TensorFlow was used to retrain the final layer of the deep convolutional neural network Inception to perform binary normality classification on two anonymised, public image datasets. Retraining was performed on 47,644 images using commodity hardware, with validation testing on 5,505 previously unseen radiographs. Confusion matrix analysis was performed to derive diagnostic utility metrics. This study demonstrates the application of a machine learning-based approach to classify chest radiographs as normal or abnormal.
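
The confusion matrix analysis mentioned above can be sketched as follows. The abstract does not list which diagnostic utility metrics were derived, so the set computed here (sensitivity, specificity, predictive values, accuracy) is an assumption, as is treating "abnormal" as the positive class. The counts are invented for illustration and are not the study's results.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Common diagnostic metrics from the four cells of a binary confusion matrix.
    Assumes every denominator is non-zero."""
    return {
        "sensitivity": tp / (tp + fn),  # abnormal cases correctly flagged
        "specificity": tn / (tn + fp),  # normal cases correctly passed
        "ppv": tp / (tp + fp),          # flagged cases that are truly abnormal
        "npv": tn / (tn + fn),          # passed cases that are truly normal
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = diagnostic_metrics(tp=80, fp=10, fn=20, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For a prioritisation tool, sensitivity and negative predictive value are usually the figures to inspect first, since a missed abnormal radiograph is the costlier error.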

artificial intelligence, deep convolutional neural network, machine learning, (8 more...)

#artificialintelligence

Genre: Research Report > Experimental Study (0.55)

Industry:

Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.43)
