Accuracy
A Noise-Filtering Approach for Cancer Drug Sensitivity Prediction
Accurately predicting drug responses to cancer is an important problem hindering oncologists' efforts to find the most effective drugs to treat cancer, which is a core goal in precision medicine. The scientific community has focused on improving this prediction based on genomic, epigenomic, and proteomic datasets measured in human cancer cell lines. Real-world cancer cell lines contain noise, which degrades the performance of machine learning algorithms. This problem is rarely addressed in the existing approaches. In this paper, we present a noise-filtering approach that integrates techniques from numerical linear algebra and information retrieval targeted at filtering out noisy cancer cell lines. By filtering out noisy cancer cell lines, we can train machine learning algorithms on better quality cancer cell lines. We evaluate the performance of our approach and compare it with an existing approach using the Area Under the ROC Curve (AUC) on clinical trial data. The experimental results show that our proposed approach is stable and also yields the highest AUC at a statistically significant level.
Microsoft researchers detect lung-cancer risks in web search logs - Next at Microsoft
Smoking cigarettes is the leading cause of lung cancer, the most common cause of cancer death in the world. But nearly 20 percent of lung-cancer diagnoses are made in people who are non-smokers. That means in addition to smoking, geographic, demographic and genetic factors play a role in the devastating disease. A project from Microsoft's research labs is exploring the feasibility of using anonymized web search data to learn more about lung-cancer risk factors and provide early warning to people who are candidates for disease screening. The findings, published Thursday in JAMA Oncology, extend research that team members published last June on the feasibility of using the text of questions people ask search engines to predict diagnoses of pancreatic cancer.
Artificial intelligence software can spot child sexual abuse media online - Daily News & Analysis
Artificial intelligence software can now help cops spot new or previously unknown child sexual abuse media and prosecute offenders. The toolkit, described in a paper published in Digital Investigation, automatically detects new child sexual abuse photos and videos in online peer-to-peer networks. The new approach combines automatic filename and media analysis techniques in an intelligent filtering module, which can identify new criminal media and distinguish it from other media being shared, such as adult pornography. Spotting newly produced media online can give law enforcement agencies the fresh evidence they need to find and prosecute offenders. "Identifying new child sexual abuse media is critical because it can indicate recent or ongoing child abuse," said lead study author Claudia Peersman from Lancaster University.
Machines are learning to find concealed weapons in X-ray scans
EVERY day more than 8,000 containers flow through the Port of Rotterdam. But only a fraction are selected to pass through a giant x-ray machine to check for illicit contents. The machine, made by Rapiscan, an American firm, can capture images as the containers move along a track at 15kph (9.3mph). But it takes time for a human to inspect each scan for anything suspicious--and in particular for small metallic objects that might be weapons. To increase this inspection rate would require a small army of people.
How to evaluate Data Science models?
Lift Charts & Gain Charts: These are widely used in campaign targeting problems to determine which deciles of customers to target for a specific campaign. They also tell you how much response you can expect from the new target base. ROC Curve: The ROC curve plots the true positive rate against the false positive rate as the classification threshold varies. Gini coefficient: This is the ratio of the area between the ROC curve and the diagonal line to the area of the triangle above the diagonal, which works out to 2 × AUC − 1. Cross Validation: Splitting the data into two parts, where one part is used for "training" your model and the second part is used to make predictions. This way you can test the model on data that it has "not seen" previously and check how it is likely to behave with external data.
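The ROC curve, Gini coefficient, and train/test split described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names (`roc_points`, `auc`, `gini`, `holdout_split`) and the toy labels and scores are assumptions made for the example.

```python
import random

def roc_points(y_true, scores):
    """Return (FPR, TPR) points by sweeping the threshold from high to low."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = fp = 0
    points = [(0.0, 0.0)]
    prev = None
    for i in order:
        # Emit a point only when the score changes, so ties share one threshold.
        if prev is not None and scores[i] != prev:
            points.append((fp / neg, tp / pos))
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        prev = scores[i]
    points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

def gini(auc_value):
    """Area between curve and diagonal, divided by the 0.5 triangle area."""
    return (auc_value - 0.5) / 0.5

def holdout_split(rows, test_fraction=0.3, seed=0):
    """Shuffle and split rows into (train, test); test data stays unseen in training."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Toy data (hypothetical): 1 = responder, 0 = non-responder.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(y_true, scores)
auc_value = auc(points)
gini_value = gini(auc_value)
train, test = holdout_split(list(range(10)), test_fraction=0.3)
```

On this toy data, 8 of the 9 positive/negative pairs are ranked correctly, so the AUC is 8/9 and the Gini coefficient is 7/9.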
How Should a Society Be?
My academic background is in computer science and philosophy. My work has been about the relationship between those two fields. What do we learn about being human by thinking about the quest to create artificial intelligence? What do we learn about human decision making by thinking of human problems in computational terms? The questions that have interested me over the years have been, on the one hand, what defines human intelligence at a species level? And secondly, at an individual level, how do we approach decision making in our own lives, and what are the problems that the world throws at us? I find myself interested at the group level, the society level, and the civic level in a couple of different ways. I've been encouraged by what I've seen over the last few years in terms of the norms of the sciences changing. It used to be that people were scared to publish their models because that was the secret sauce; that was their advantage over other research groups.
A novel multiclassSVM based framework to classify lithology from well logs: a real-world application
Chaki, Soumi, Routray, Aurobinda, Mohanty, William K., Jenamani, Mamata
Support vector machines (SVMs) have been recognized as a potential tool for supervised classification analyses in different domains of research. In essence, an SVM is a binary classifier. Therefore, a multiclass problem is divided into a series of binary problems that are solved by binary classifiers, and the classification results are finally combined following either the one-against-one or the one-against-all strategy. In this paper, an attempt has been made to classify lithology with a multiclass SVM based framework using well logs as predictor variables. Here, the lithology is classified into four classes (sand, shaly sand, sandy shale, and shale) based on the relative values of sand and shale fractions, as suggested by an expert geologist. The available dataset, consisting of well logs (gamma ray, neutron porosity, density, and P-sonic) and class information from four closely spaced wells from an onshore hydrocarbon field, is divided into training and testing sets. We have used the one-against-all strategy to combine the results of multiple binary classifiers. The reported results established the superiority of multiclass SVM compared to other classifiers in terms of classification accuracy. The selection of the kernel function and associated parameters has also been investigated here. It can be envisaged from the results achieved in this study that the proposed framework based on multiclass SVM can further be used to solve classification problems. In future research, seismic attributes can be introduced in the framework to classify the lithology throughout a study area from seismic inputs.
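The one-against-all combination step mentioned in the abstract can be illustrated as follows. This is only a sketch under stated assumptions: it uses a linear decision function for each binary classifier, and the weights, biases, and input values are hypothetical numbers chosen for the example, not values from the paper (which also investigates kernel choice).

```python
# One binary classifier per class; the class with the largest decision value wins.

def decision_value(weights, bias, x):
    """Signed score of one linear binary classifier: w . x + b."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def one_vs_all_predict(models, x):
    """Score x with every class-vs-rest model and return (best_label, all_scores)."""
    scores = {label: decision_value(w, b, x) for label, (w, b) in models.items()}
    return max(scores, key=scores.get), scores

# Hypothetical models over four normalized logs:
# [gamma ray, neutron porosity, density, P-sonic]
models = {
    "sand":        ([-1.0, 0.2, 0.1, 0.0], 0.5),
    "shaly sand":  ([-0.5, 0.1, 0.0, 0.0], 0.2),
    "sandy shale": ([0.5, 0.0, 0.0, 0.1], -0.3),
    "shale":       ([1.0, 0.0, -0.1, 0.0], -0.5),
}

sample = [0.9, 0.3, 0.5, 0.4]  # hypothetical sample with a high gamma ray reading
label, scores = one_vs_all_predict(models, sample)
```

With these made-up weights, the "shale" classifier returns the largest decision value for the sample, so the combined prediction is "shale". A one-against-one scheme would instead train one classifier per pair of classes and take a majority vote.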
A One class Classifier based Framework using SVDD : Application to an Imbalanced Geological Dataset
Chaki, Soumi, Verma, Akhilesh Kumar, Routray, Aurobinda, Mohanty, William K., Jenamani, Mamata
Evaluation of a hydrocarbon reservoir requires classification of petrophysical properties from the available dataset. However, characterization of reservoir attributes is difficult due to the nonlinear and heterogeneous nature of the subsurface physical properties. In this context, the present study proposes a generalized one-class classification framework based on Support Vector Data Description (SVDD) to classify a reservoir characteristic, water saturation, into two classes (Class high and Class low) from four logs, namely gamma ray, neutron porosity, bulk density, and P-sonic, using an imbalanced dataset. A comparison is carried out between the proposed framework and different supervised classification algorithms in terms of g-metric means and execution time. Experimental results show that the proposed framework has outperformed the other classifiers in terms of these performance evaluators. It is envisaged that the classification analysis performed in this study will be useful in further reservoir modeling.
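Assuming the "g-metric means" in the abstract refers to the usual G-mean (the geometric mean of sensitivity and specificity), the sketch below shows why it suits imbalanced data better than plain accuracy. The labels and predictions are hypothetical toy values, not data from the study.

```python
import math

def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (TPR) and specificity (TNR)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

# Imbalanced toy labels (hypothetical): 2 minority samples, 8 majority samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_majority = [0] * 10                    # ignores the minority class
balanced_guess = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

acc_majority = accuracy(y_true, always_majority)
gm_majority = g_mean(y_true, always_majority)
acc_balanced = accuracy(y_true, balanced_guess)
gm_balanced = g_mean(y_true, balanced_guess)
```

Both predictors reach an accuracy of 0.8 here, but the one that always predicts the majority class has a G-mean of 0 because it never detects the minority class, while the second predictor scores sqrt(0.5 × 0.875).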
How Do You Keep People Safe When Peanut Butter Looks The Same As Explosives To An X-Ray?
This has not been shown in public. And no, it does not make coffee. Deep in the heart of the 17,000 attendees at Slush (the insanely busy, loud and dark conference in Helsinki) I found Karsa.fi, a small stand with a big claim: they keep bombs off planes. Plane travel, while safer than it has ever been, is still a concern for those in the planes as well as those on the ground. Terrorism, or rather fear of it, is at record levels (despite figures that suggest a decline in major incidents) so Karsa.fi's