GPS is an incredibly powerful tool that allows devices such as your smartphone to know roughly where they are with an accuracy of around a meter in some cases. However, this is largely too inaccurate for many use cases and that accuracy drops considerably when inside such as warehouse robots that rely on barcodes on the floor. In response, researchers [Linguang Zhang, Adam Finkelstein, Szymon Rusinkiewicz] at Princeton have developed a system they refer to as MicroGPS that uses pictures of the ground to determine its location with sub-centimeter accuracy. The system has a downward-facing monochrome camera with a light shield to control for exposure. Camera output feeds into an Nvidia Jetson TX1 platform for processing.
Machine Learning (ML) has achieved unprecedented performance in several applications including image, speech, text, and data analysis. Use of ML to understand underlying patterns in gene mutations (genomics) has far-reaching results, not only in overcoming diagnostic pitfalls, but also in designing treatments for life-threatening diseases like cancer. Success and sustainability of ML algorithms depends on the quality and diversity of data collected and used for training. Under-representation of groups (ethnic groups, gender groups, etc.) in such a dataset can lead to inaccurate predictions for certain groups, which can further exacerbate systemic discrimination issues. In this work, we propose TRAPDOOR, a methodology for identification of biased datasets by repurposing a technique that has been mostly proposed for nefarious purposes: Neural network backdoors. We consider a typical collaborative learning setting of the genomics supply chain, where data may come from hospitals, collaborative projects, or research institutes to a central cloud without awareness of bias against a sensitive group. In this context, we develop a methodology to leak potential bias information of the collective data without hampering the genuine performance using ML backdooring catered for genomic applications. Using a real-world cancer dataset, we analyze the dataset with the bias that already existed towards white individuals and also introduced biases in datasets artificially, and our experimental result show that TRAPDOOR can detect the presence of dataset bias with 100% accuracy, and furthermore can also extract the extent of bias by recovering the percentage with a small error.
A core concept in the area of data science is that machine learning is a process of building models that learn from data. ML models are as reliable as the dataset used to train them. There is a saying "Garbage in, garbage out" related to the quality of data. Accuracy refers also to possible errors or artifacts (errors caused by the processing of data), such as duplicate records, typos, measurement inconsistency among others. Overlooking problems related to accuracy causes degradation on the ML algorithms.
Learning from small amounts of labeled data is a challenge in the area of deep learning. This is currently addressed by Transfer Learning where one learns the small data set as a transfer task from a larger source dataset. Transfer Learning can deliver higher accuracy if the hyperparameters and source dataset are chosen well. One of the important parameters is the learning rate for the layers of the neural network. We show through experiments on the ImageNet22k and Oxford Flowers datasets that improvements in accuracy in range of 127% can be obtained by proper choice of learning rates. We also show that the images/label parameter for a dataset can potentially be used to determine optimal learning rates for the layers to get the best overall accuracy. We additionally validate this method on a sample of real-world image classification tasks from a public visual recognition API.
Explainable AI provides insight into the "why" for model predictions, offering potential for users to better understand and trust a model, and to recognize and correct AI predictions that are incorrect. Prior research on human and explainable AI interactions has focused on measures such as interpretability, trust, and usability of the explanation. Whether explainable AI can improve actual human decision-making and the ability to identify the problems with the underlying model are open questions. Using real datasets, we compare and evaluate objective human decision accuracy without AI (control), with an AI prediction (no explanation), and AI prediction with explanation. We find providing any kind of AI prediction tends to improve user decision accuracy, but no conclusive evidence that explainable AI has a meaningful impact. Moreover, we observed the strongest predictor for human decision accuracy was AI accuracy and that users were somewhat able to detect when the AI was correct versus incorrect, but this was not significantly affected by including an explanation. Our results indicate that, at least in some situations, the "why" information provided in explainable AI may not enhance user decision-making, and further research may be needed to understand how to integrate explainable AI into real systems.