AITopics | Accuracy

Accuracy

CLAWS: Clustering Assisted Weakly Supervised Learning with Normalcy Suppression for Anomalous Event Detection

Zaheer, Muhammad Zaigham, Mahmood, Arif, Astrid, Marcella, Lee, Seung-Ik

arXiv.org Artificial Intelligence, Nov-24-2020

Learning to detect real-world anomalous events through video-level labels is a challenging task due to the rare occurrence of anomalies as well as noise in the labels. In this work, we propose a weakly supervised anomaly detection method which has manifold contributions including 1) a random batch based training procedure to reduce inter-batch correlation, 2) a normalcy suppression mechanism to minimize anomaly scores of the normal regions of a video by taking into account the overall information available in one training batch, and 3) a clustering distance based loss to contribute towards mitigating the label noise and to produce better anomaly representations by encouraging our model to generate distinct normal and anomalous clusters. The proposed method obtains 83.03% and 89.67% frame-level AUC performance on the UCF-Crime and ShanghaiTech datasets respectively, demonstrating its superiority over the existing state-of-the-art algorithms.

computer vision, detection, video, (12 more...)

arXiv.org Artificial Intelligence

2011.12077

Country:

Asia > South Korea > Daejeon > Daejeon (0.04)
Asia > Pakistan > Punjab > Lahore Division > Lahore (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(4 more...)

Nudge: Accelerating Overdue Pull Requests Towards Completion

Maddila, Chandra, Upadrasta, Sai Surya, Bansal, Chetan, Nagappan, Nachiappan, Gousios, Georgios, van Deursen, Arie

arXiv.org Artificial Intelligence, Nov-24-2020

Pull requests are a key part of the collaborative software development and code review process today. However, pull requests can also slow down the software development process when the reviewer(s) or the author do not actively engage with the pull request. In this work, we design an end-to-end service, Nudge, for accelerating overdue pull requests towards completion by reminding the author or the reviewer(s) to engage with their overdue pull requests. First, we use models based on effort estimation and machine learning to predict the completion time for a given pull request. Second, we use activity detection to reduce false positives. Lastly, we use dependency determination to understand the blocker of the pull request and nudge the appropriate actor (author or reviewer(s)). We also do a correlation analysis to understand the statistical relationship between the pull request completion times and various pull request and developer related attributes. Nudge has been deployed on 147 repositories at Microsoft since 2019. We do a large scale evaluation based on the implicit and explicit feedback we received from sending the Nudge notifications on 8,500 pull requests. We observe a significant reduction in completion time, by over 60%, for pull requests which were nudged, thus increasing the efficiency of the code review process and accelerating the pull request progression.

notification, pull request, reviewer, (14 more...)

arXiv.org Artificial Intelligence

2011.12468

Country:

North America > United States > New York > New York County > New York City (0.05)
Europe > Netherlands > South Holland > Delft (0.04)
North America > United States > Washington > King County > Redmond (0.04)
(2 more...)

Genre:

Workflow (0.93)
Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Foundations of Bayesian Learning from Synthetic Data

Wilde, Harrison, Jewson, Jack, Vollmer, Sebastian, Holmes, Chris

arXiv.org Machine Learning, Nov-24-2020

There is significant growth and interest in the use of synthetic data as an enabler for machine learning in environments where the release of real data is restricted due to privacy or availability constraints. Despite a large number of methods for synthetic data generation, there are comparatively few results on the statistical properties of models learnt on synthetic data, and fewer still for situations where a researcher wishes to augment real data with another party's synthesised data. We use a Bayesian paradigm to characterise the updating of model parameters when learning in these settings, demonstrating that caution should be taken when applying conventional learning algorithms without appropriate consideration of the synthetic data generating process and learning task. Recent results from general Bayesian updating support a novel and robust approach to Bayesian synthetic-learning founded on decision theory that outperforms standard approaches across repeated experiments on supervised learning and inference problems.

experiment, inference, synthetic data, (14 more...)

arXiv.org Machine Learning

2011.08299

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > Spain > Canary Islands (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

ROC Curve Explained in One Picture

#artificialintelligence, Nov-23-2020, 14:40:52 GMT

With a ROC curve, you're trying to find a good model that optimizes the trade-off between the False Positive Rate (FPR) and the True Positive Rate (TPR). What counts here is how much area is under the curve (Area Under the Curve, AUC). The ideal curve in the left image fills in 100%, which means that you're going to be able to distinguish between negative results and positive results 100% of the time (which is almost impossible in real life). The further you go to the right, the worse the detection. The ROC curve to the far right does a worse job than chance, mixing up the negatives and positives (which means you likely have an error in your setup).
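
As a rough illustration of the summary above, the following sketch computes ROC points and the area under the curve in plain Python. The labels and scores are made up, and the helper names `roc_points` and `auc` are hypothetical, not taken from the article. It contrasts a scorer that ranks every positive above every negative with one whose ranking is inverted (worse than chance).

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points obtained by sweeping the decision threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    # Sort by descending score so that each step lowers the threshold.
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last item in a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up example data: 1 = positive, 0 = negative.
labels = [1, 1, 0, 1, 0, 0]
perfect = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1]   # every positive outranks every negative
inverted = [1 - s for s in perfect]        # ranking flipped: worse than chance

auc_perfect = auc(roc_points(labels, perfect))
auc_inverted = auc(roc_points(labels, inverted))
print(f"perfect scorer AUC:  {auc_perfect:.2f}")
print(f"inverted scorer AUC: {auc_inverted:.2f}")
```

An AUC well below 0.5, as in the inverted case, usually means the scores or labels are flipped somewhere, which matches the remark about a likely error in the setup.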

positive rate, roc curve explained

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Structure Learning in Inverse Ising Problems Using $\ell_2$-Regularized Linear Estimator

Meng, Xiangming, Obuchi, Tomoyuki, Kabashima, Yoshiyuki

arXiv.org Machine Learning, Nov-23-2020

The inference performance of the pseudolikelihood method is discussed in the framework of the inverse Ising problem when the $\ell_2$-regularized (ridge) linear regression is adopted. This setup is introduced for theoretically investigating the situation where the data generation model is different from the inference one, namely the model mismatch situation. In the teacher-student scenario under the assumption that the teacher couplings are sparse, the analysis is conducted using the replica and cavity methods, with a special focus on whether the presence/absence of teacher couplings is correctly inferred or not. The result indicates that despite the model mismatch, one can perfectly identify the network structure using naive linear regression without regularization when the number of spins $N$ is smaller than the dataset size $M$, in the thermodynamic limit $N\to \infty$. Further, to access the underdetermined region $M < N$, we examine the effect of the $\ell_2$ regularization, and find that biases appear in all the coupling estimates, preventing the perfect identification of the network structure. We, however, find that the biases are shown to decay exponentially fast as the distance from the center spin chosen in the pseudolikelihood method grows. Based on this finding, we propose a two-stage estimator: In the first stage, the ridge regression is used and the estimates are pruned by a relatively small threshold; in the second stage the naive linear regression is conducted only on the remaining couplings, and the resultant estimates are again pruned by another relatively large threshold. This estimator with the appropriate regularization coefficient and thresholds is shown to achieve the perfect identification of the network structure even in $0

coupling, estimator, regularization, (15 more...)

arXiv.org Machine Learning

doi: 10.1088/1742-5468/abfa10

2008.08342

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.75)

Conjecturing-Based Computational Discovery of Patterns in Data

Brooks, J. P., Edwards, D. J., Larson, C. E., Van Cleemput, N.

arXiv.org Machine Learning, Nov-23-2020

Modern machine learning methods are designed to exploit complex patterns in data regardless of their form, while not necessarily revealing them to the investigator. Here we demonstrate situations where modern machine learning methods are ill-equipped to reveal feature interaction effects and other nonlinear relationships. We propose the use of a conjecturing machine that generates feature relationships in the form of bounds for numerical features and boolean expressions for nominal features that are ignored by machine learning algorithms. The proposed framework is demonstrated for a classification problem with an interaction effect and a nonlinear regression problem. In both settings, true underlying relationships are revealed and generalization performance improves. The framework is then applied to patient-level data regarding COVID-19 outcomes to suggest possible risk factors.

conjecture, expression, invariant, (12 more...)

arXiv.org Machine Learning

2011.11576

Country:

North America > United States > New York (0.04)
North America > United States > Virginia > Richmond (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Belgium > Flanders > East Flanders > Ghent (0.04)

Genre: Research Report (0.65)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Epidemiology (0.90)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

The Best Machine Learning Algorithm for Handwritten Digits Recognition

#artificialintelligence, Nov-22-2020, 21:25:43 GMT

Handwritten Digit Recognition is an interesting machine learning problem in which we have to identify handwritten digits using various classification algorithms. There are a number of ways and algorithms to recognize handwritten digits, including Deep Learning/CNN, SVM, Gaussian Naive Bayes, KNN, Decision Trees, Random Forests, etc. In this article, we will deploy a variety of machine learning algorithms from the Sklearn library on our dataset to classify the digits into their categories. We will use Sklearn's load_digits dataset, which is a collection of 8x8 images (64 features) of digits. The dataset contains a total of 1797 sample points.
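
A minimal sketch of the kind of comparison described above, assuming scikit-learn is installed. The choice of three classifiers, the 75/25 split, and the fixed `random_state` are illustrative assumptions and not necessarily what the article uses.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Load the 8x8 digit images, already flattened to 64 features per sample.
digits = load_digits()
X, y = digits.data, digits.target

# Hold out a quarter of the samples for evaluation (assumed split).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# A few of the algorithm families named in the summary, with default settings.
models = {
    "KNN": KNeighborsClassifier(),
    "Gaussian Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # mean accuracy on held-out data
    print(f"{name}: {scores[name]:.3f}")
```

Accuracy on a single split is only a rough guide; cross-validation would give a steadier ranking of the models.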

Unfolding the Maths behind Ridge and Lasso Regression!

#artificialintelligence, Nov-22-2020, 18:05:34 GMT

This article was published as a part of the Data Science Blogathon. Many times we have come across this statement – Lasso regression causes sparsity while Ridge regression doesn't! But I'm pretty sure that most of us might not have understood how exactly this works. Let's try to understand this using calculus. First, let's understand what sparsity is.
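
One way to see the claim, sketched here under a simplifying assumption that may differ from the article's own derivation: take a single coefficient w with unpenalized least-squares estimate z. Minimizing (1/2)(w - z)^2 + (lam/2) w^2 (ridge) gives w = z / (1 + lam), which shrinks z but never reaches zero unless z is zero. Minimizing (1/2)(w - z)^2 + lam |w| (lasso) gives the soft-thresholding rule, which returns exactly zero whenever |z| <= lam. The function names below are hypothetical.

```python
def ridge_1d(z, lam):
    """Minimizer of 0.5*(w - z)**2 + 0.5*lam*w**2: proportional shrinkage."""
    return z / (1.0 + lam)


def lasso_1d(z, lam):
    """Minimizer of 0.5*(w - z)**2 + lam*abs(w): soft thresholding."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0  # small estimates are set exactly to zero


# Made-up unpenalized estimates and penalty strength.
estimates = [3.0, 0.5, -0.2, -2.0]
lam = 1.0

ridge = [ridge_1d(z, lam) for z in estimates]
lasso = [lasso_1d(z, lam) for z in estimates]
print("ridge:", ridge)
print("lasso:", lasso)
```

With these inputs the ridge coefficients are all nonzero while two of the lasso coefficients are exactly zero, which is the sparsity effect in question.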

regression, regularization, ridge regression, (14 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.72)

Cancer image classification based on DenseNet model

Zhong, Ziliang, Zheng, Muhang, Mai, Huafeng, Zhao, Jianan, Liu, Xinyi

arXiv.org Machine Learning, Nov-22-2020

Computer-aided diagnosis establishes methods for robust assessment of medical image-based examination. Image processing introduced a promising strategy to facilitate disease classification and detection while diminishing unnecessary expenses. In this paper, we propose a novel metastatic cancer image classification model based on DenseNet Block, which can effectively identify metastatic cancer in small image patches taken from larger digital pathology scans. We evaluate the proposed approach on a slightly modified version of the PatchCamelyon (PCam) benchmark dataset provided by a Kaggle competition, which packs the clinically relevant task of metastasis detection into a straightforward binary image classification task. The experiments indicate that our model outperforms other classical methods such as ResNet34 and VGG19. Moreover, we also conduct a data augmentation experiment and study the relationship between the number of batches processed and the loss value during training and validation.

cancer, classification, image classification, (12 more...)

arXiv.org Machine Learning

doi: 10.1088/issn.1742-6596

2011.11186

Country:

Asia > China > Shanghai > Shanghai (0.05)
North America > United States > New York (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(3 more...)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Diagnostic Medicine (0.93)
Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.30)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.96)

Positive and Unlabeled Materials Machine Learning

#artificialintelligence, Nov-21-2020, 18:00:45 GMT

Many real-world problems involve datasets where only some of the data is labeled and the rest is unlabeled. In this post, we discuss our implementation of semi-supervised learning for predicting the synthesizability of theoretical materials. When we think about the materials that will enable next-generation technologies, it's probably not the case that there is one ultimate material waiting to be found that will solve all our problems. The problems we need to solve (producing and storing clean energy, mitigating climate change, desalinating water, etc.) are complex and varied. Even zooming in to the next-generation of electronics, computers, and nanotechnology, there probably isn't a single perfect material to exploit in the same way that silicon has been used in all our familiar devices.

compound, synthesizability, unlabeled material machine learning, (12 more...)

#artificialintelligence

Industry: Energy (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.35)
