evaluation


Artificial intelligence improves seismic analyses

#artificialintelligence

The challenge of analyzing earthquake signals with optimum precision grows along with the amount of available seismic data. At the Karlsruhe Institute of Technology (KIT), researchers have deployed a neural network to determine the arrival time of seismic waves and thus precisely locate the epicenter of the earthquake. In their report in the journal Seismological Research Letters, they point out that artificial intelligence is able to evaluate the data with the same precision as an experienced seismologist. To locate an earthquake precisely, it is critical to determine the exact arrival time of the majority of seismic waves at the seismometer station (the so-called phase arrival). Without this knowledge, further accurate seismological evaluations are not possible.


Towards Robust and Verified AI: Specification Testing, Robust Training, and Formal Verification - DeepMind

#artificialintelligence

This is not an entirely new problem. Computer programs have always had bugs. Over decades, software engineers have assembled an impressive toolkit of techniques, ranging from unit testing to formal verification. These methods work well on traditional software, but adapting them to rigorously test machine learning models such as neural networks is extremely challenging because of the scale and lack of structure of these models, which may contain hundreds of millions of parameters. This calls for novel approaches to ensuring that machine learning systems are robust at deployment.


Artificial Intelligence Can Be Biased: Why That Matters - The Good Men Project

#artificialintelligence

As more data is gathered from individuals, and artificial intelligence (AI) is employed to make important decisions, we must continue to ask precisely who is benefiting. We must also be intentional about inclusive and ethical AI. For example, in the last few years, advances in AI have renewed hope in the capacity of technology to help drive precision healthcare: the ability to deliver the right treatment to the right person at the right time seems ever more attainable. With enough health data, we can train algorithms to make precise medical decisions. Yet my work with the Algorithmic Justice League, along with mounting research studies (including Gender Shades), shows that artificial intelligence can be biased if its creators are not intentional about gathering inclusive data or ignore evaluations of results by diverse subgroups.


Ambiverse - an amazing open-source suite for natural language understanding

#artificialintelligence

While doing performance benchmarks for Named Entity Linking solutions for our AI/FinTech start-up Risklio, I stumbled upon a very powerful, recently open-sourced framework called AmbiverseNLU. It was developed by Ambiverse and is based on work previously done at the Max Planck Institute¹. The components it uses are better known: entity recognition from KnowNER², open information extraction using ClausIE³ and AIDA, an entity detection and disambiguation tool⁴. You can have a look at the demo here. For the former one you can choose whether to use Apache Cassandra or PostgreSQL as a backend, while the last one uses Neo4j.


Artificial Intelligence Promises a Personalized Education for All - The Possibility Report

#artificialintelligence

In a 2015 interview, Bill Gates imagined a world where Artificially Intelligent Tutoring Systems (AITS) have transformed learning. He spoke of AI-powered tutors offering a personalized approach for each student. They could work with a kid struggling to wrap his head around algebra while his classmates moved on to something more advanced; they could work with a grandmother determined to learn a new language. These systems wouldn't replace teachers. Rather, they'd enhance human teachers' abilities to tailor lessons to each student without knocking their class schedule off track.


Artificial intelligence better than humans at spotting lung cancer

#artificialintelligence

Lung cancer is the leading cause of cancer-related death in the U.S., and early detection is crucial for both stopping the spread of tumors and improving patient outcomes. As an alternative to chest X-rays, healthcare professionals have recently been using computed tomography (CT) scans to screen for lung cancer. In fact, some scientists argue that CT scans are superior to X-rays for lung cancer detection, and research has shown that low-dose CT (LDCT) in particular has reduced lung cancer deaths by 20%. These errors typically delay the diagnosis of lung cancer until the disease has reached an advanced stage when it becomes too difficult to treat. New research may safeguard against these errors.


Data-driven preference learning methods for value-driven multiple criteria sorting with interacting criteria

arXiv.org Machine Learning

The learning of predictive models for data-driven decision support has been a prevalent topic in many fields. However, constructing models that capture interactions among input variables is a challenging task. In this paper, we present a new preference learning approach for multiple criteria sorting with potentially interacting criteria. It employs an additive piecewise-linear value function as the basic preference model, which is augmented with components for handling the interactions. To construct such a model from a given set of assignment examples concerning reference alternatives, we develop a convex quadratic programming model. Since its complexity does not depend on the number of training samples, the proposed approach is capable of dealing with data-intensive tasks. To improve the generalization of the constructed model on new instances and to overcome the problem of over-fitting, we employ regularization techniques. We also propose a few novel methods for classifying non-reference alternatives in order to enhance the applicability of our approach to different datasets. The practical usefulness of the proposed method is demonstrated on a problem of parametric evaluation of research units, whereas its predictive performance is studied on several monotone learning datasets. The experimental results indicate that our approach compares favourably with the classical UTADIS method and the Choquet integral-based sorting model.
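
As a rough illustration of the basic preference model named in the abstract, the sketch below evaluates an additive piecewise-linear value function and sorts an alternative into a class by comparing its comprehensive value with thresholds, in the spirit of UTADIS-style sorting. It is not the paper's method: the interaction components and the quadratic program that learns the model are omitted, and the breakpoints, marginal values, thresholds, and function names are all hypothetical.

```python
from bisect import bisect_right

def marginal_value(x, breakpoints, values):
    """Linearly interpolate one criterion's marginal value at x."""
    if x <= breakpoints[0]:
        return values[0]
    if x >= breakpoints[-1]:
        return values[-1]
    k = bisect_right(breakpoints, x) - 1  # segment containing x
    x0, x1 = breakpoints[k], breakpoints[k + 1]
    v0, v1 = values[k], values[k + 1]
    return v0 + (v1 - v0) * (x - x0) / (x1 - x0)

def comprehensive_value(alternative, model):
    """Sum the marginal values over all criteria (additive model)."""
    return sum(
        marginal_value(x, breakpoints, values)
        for x, (breakpoints, values) in zip(alternative, model)
    )

def sort_alternative(alternative, model, thresholds):
    """Return the class index: 0 is the worst class, len(thresholds) the best."""
    return bisect_right(thresholds, comprehensive_value(alternative, model))

# Hypothetical model: two criteria, each given as (breakpoints, marginal values).
MODEL = [
    ([0.0, 50.0, 100.0], [0.0, 0.4, 0.6]),
    ([0.0, 10.0], [0.0, 0.4]),
]
# Hypothetical thresholds separating three ordered classes.
THRESHOLDS = [0.3, 0.7]

if __name__ == "__main__":
    for alt in [(0.0, 0.0), (50.0, 5.0), (100.0, 10.0)]:
        print(alt, sort_alternative(alt, MODEL, THRESHOLDS))
```

In the approach described above, the marginal values and thresholds would be learned from assignment examples rather than fixed by hand as they are here.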


Quantitative Error Prediction of Medical Image Registration using Regression Forests

arXiv.org Machine Learning

Predicting registration error can be useful for evaluation of registration procedures, which is important for the adoption of registration techniques in the clinic. In addition, quantitative error prediction can be helpful in improving the registration quality. The task of predicting registration error is demanding due to the lack of a ground truth in medical images. This paper proposes a new automatic method to predict the registration error quantitatively and applies it to chest CT scans. A random regression forest is utilized to predict the registration error locally. The forest is built with features related to the transformation model and features related to the dissimilarity after registration. The forest is trained and tested using manually annotated corresponding points between pairs of chest CT scans in two experiments: SPREAD (trained and tested on SPREAD) and inter-database (including three databases: SPREAD, DIR-Lab-4DCT and DIR-Lab-COPDgene). The results show that the mean absolute errors of regression are 1.07 $\pm$ 1.86 and 1.76 $\pm$ 2.59 mm for the SPREAD and inter-database experiments, respectively. The overall accuracy of classification in three classes (correct, poor and wrong registration) is 90.7% and 75.4% for SPREAD and inter-database, respectively. The good performance of the proposed method enables important applications such as automatic quality control in large-scale image analysis.
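
The two kinds of figures reported in the abstract (a mean absolute error with a spread, and a three-class accuracy) can be computed as in the sketch below. This is only an illustration under assumptions: it takes the $\pm$ value to be the standard deviation of the absolute errors, the class boundaries of 3 mm and 6 mm are hypothetical placeholders rather than values stated in the abstract, and the sample errors are made up.

```python
from statistics import mean, pstdev

def mae_and_std(true_err, pred_err):
    """Mean and population standard deviation of |true - predicted| error."""
    abs_diffs = [abs(t - p) for t, p in zip(true_err, pred_err)]
    return mean(abs_diffs), pstdev(abs_diffs)

def to_class(err_mm, poor_from=3.0, wrong_from=6.0):
    """Map an error in mm to a class; both boundaries are hypothetical."""
    if err_mm < poor_from:
        return "correct"
    if err_mm < wrong_from:
        return "poor"
    return "wrong"

def class_accuracy(true_err, pred_err):
    """Fraction of points whose predicted class matches the true class."""
    hits = sum(to_class(t) == to_class(p) for t, p in zip(true_err, pred_err))
    return hits / len(true_err)

# Made-up registration errors in mm (true vs. predicted), for illustration only.
TRUE_MM = [1.0, 2.5, 4.0, 7.0]
PRED_MM = [1.5, 3.5, 4.0, 6.5]

if __name__ == "__main__":
    mae, std = mae_and_std(TRUE_MM, PRED_MM)
    print(f"MAE = {mae:.2f} +/- {std:.2f} mm")
    print(f"accuracy = {class_accuracy(TRUE_MM, PRED_MM):.2%}")
```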


Predicting Model Failure using Saliency Maps in Autonomous Driving Systems

arXiv.org Machine Learning

While machine learning systems show a high success rate in many complex tasks, research shows they can also fail in very unexpected situations. The rise of machine learning products in safety-critical industries has drawn increased attention to evaluating model robustness and estimating failure probability in machine learning systems. In this work, we propose a design to train a student model -- a failure predictor -- to predict the main model's error for input instances based on their saliency maps. We implement and review the preliminary results of our failure predictor model on an autonomous vehicle steering control system as an example of a safety-critical application.