Accuracy
RStudio AI Blog: Starting to think about AI Fairness
The topic of AI fairness metrics is as important to society as it is confusing. It is confusing for a number of reasons: terminological proliferation, an abundance of formulae, and last but not least the impression that everyone else seems to know what they're talking about. This text hopes to counteract some of that confusion by starting from a common-sense approach of contrasting two basic positions: On the one hand, the assumption that dataset features may be taken as reflecting the underlying concepts ML practitioners are interested in; on the other, that there inevitably is a gap between concept and measurement, a gap that may be bigger or smaller depending on what is being measured. In contrasting these fundamental views, we bring together concepts from ML, legal science, and political philosophy.
Confusion Matrix
In machine learning, a confusion matrix is an n×n matrix in which each row represents the true classification of a given piece of data and each column represents the predicted classification (or vice versa). The values on the diagonal give the number of correct classifications, so the model's accuracy can be read from them: a good model will have high values along the diagonal and low values off the diagonal. Further, one can tell where the model is struggling by examining the highest off-diagonal values. Together, these analyses help identify cases where the accuracy may be high but the model is consistently misclassifying the same data. Here is an example of a confusion matrix created by a neural network analyzing the MNIST dataset.
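As a rough illustration of the description above, the following sketch builds a confusion matrix in plain Python, reads accuracy off the diagonal, and finds the largest off-diagonal cell. The function names and the toy labels are assumptions made for this example; they do not come from the original article, and rows are taken to index the true class.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Build an n x n matrix; rows index the true class, columns the predicted class."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correct classifications)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


def worst_confusion(matrix):
    """Return (true_class, predicted_class, count) for the largest off-diagonal cell."""
    worst = (None, None, 0)
    for i, row in enumerate(matrix):
        for j, count in enumerate(row):
            if i != j and count > worst[2]:
                worst = (i, j, count)
    return worst


# Toy labels (hypothetical data, three classes).
y_true = [0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 1, 2, 2, 1, 1, 2]

cm = confusion_matrix(y_true, y_pred, 3)
for row in cm:
    print(row)
print("accuracy:", accuracy(cm))
print("most frequent mistake (true, predicted, count):", worst_confusion(cm))
```

In this toy case the off-diagonal check shows that class 2 is repeatedly predicted as class 1, a pattern the single accuracy number would hide.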
Exponentially Tilted Gaussian Prior for Variational Autoencoder
Floto, Griffin, Kremer, Stefan, Nica, Mihai
An important property for deep neural networks to possess is the ability to perform robust out-of-distribution (OOD) detection on previously unseen data. This property is essential for safety purposes when deploying models for real-world applications. Recent studies show that probabilistic generative models can perform poorly on this task, which is surprising given that they seek to estimate the likelihood of training data. To alleviate this issue, we propose the exponentially tilted Gaussian prior distribution for the Variational Autoencoder (VAE). With this prior, we are able to achieve state-of-the-art results using just the negative log-likelihood that the VAE naturally assigns, while being orders of magnitude faster than some competitive methods. We also show that our model produces high-quality image samples which are crisper than those of a standard Gaussian VAE. The new prior distribution has a very simple implementation which uses a Kullback-Leibler divergence that compares the difference between a latent vector's length and the radius of a sphere.
ARTSeg: Employing Attention for Thermal images Semantic Segmentation
Munir, Farzeen, Azam, Shoaib, Fatima, Unse, Jeon, Moongu
Research advances have enabled neural network algorithms deployed in autonomous vehicles to perceive their surroundings. The standard exteroceptive sensors used for perceiving the environment are cameras and Lidar, and neural network algorithms developed with these sensors have provided the necessary solutions for the autonomous vehicle's perception. One major drawback of these exteroceptive sensors is their limited operability in adverse weather conditions, for instance, low illumination and night conditions. The usability and affordability of thermal cameras in the sensor suite of the autonomous vehicle provide the necessary improvement in the vehicle's perception in adverse weather conditions. Semantic understanding of the environment benefits robust perception, and it can be achieved by segmenting the different objects in the scene. In this work, we have employed the thermal camera for semantic segmentation. We have designed an attention-based Recurrent Convolution Network (RCNN) encoder-decoder architecture named ARTSeg for thermal semantic segmentation. The main contribution of this work is the design of the encoder-decoder architecture, which employs RCNN units in each encoder and decoder block. Furthermore, additive attention is employed in the decoder module to retain high-resolution features and improve the localization of features. The efficacy of the proposed method is evaluated on the available public dataset, showing better performance than other state-of-the-art methods in mean intersection over union (IoU).
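For readers unfamiliar with the metric named at the end of the abstract, here is a minimal sketch of mean intersection over union using its common definition on flattened per-pixel class labels. The function name and the toy labels are assumptions for illustration. The abstract does not describe the paper's exact evaluation protocol (for example, how ignored or absent classes are handled), so this should not be read as the authors' procedure.

```python
def mean_iou(y_true, y_pred, n_classes):
    """Mean IoU over classes that appear in the ground truth or the prediction.

    y_true and y_pred are flat sequences of integer class labels, one per pixel.
    """
    intersection = [0] * n_classes
    union = [0] * n_classes
    for t, p in zip(y_true, y_pred):
        if t == p:
            # Pixel counts once toward both intersection and union of its class.
            intersection[t] += 1
            union[t] += 1
        else:
            # A mismatched pixel enlarges the union of both classes involved.
            union[t] += 1
            union[p] += 1
    # Skip classes that never occur, to avoid dividing by zero (an assumption).
    ious = [intersection[c] / union[c] for c in range(n_classes) if union[c] > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: four pixels, two classes (hypothetical data).
truth = [0, 0, 1, 1]
prediction = [0, 1, 1, 1]
score = mean_iou(truth, prediction, 2)
print(round(score, 4))
```

Here class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.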
ShotSpotter: AI at its Worst
Sixty-five-year-old Michael Williams was released last month after spending almost a year in jail on a murder charge. The "gunshot" sound that pointed the finger at Williams was initially classified as a firework by the AI. After the charges were dropped due to "insufficient evidence", it was revealed that one of ShotSpotter's human "reviewers" had changed the data to fit the crime, reclassifying the sound as a gunshot instead of a firework [1]. The case highlighted the dangers that the system poses to civil liberties and calls into question how much power we should give to AI "witnesses", especially those that can easily be tampered with. ShotSpotter is a patented acoustic gunshot detection system of microphones, algorithms, and human reviewers that alerts police to potential gunfire [2].
Why Darktrace Installs a Hooli Box
A thought-leader in cyber technology, Adam Mansour has over 15 years' experience spanning endpoint, network and cloud systems security; audits and architecture; building and managing SOCs; and software development. He is the creator of the IntelliGO Managed Detection and Response platform, acquired by ActZero. When you hear cybersecurity firm Darktrace's customers talk about their experience with the company, they will tell you about 'the box' from Darktrace they installed. The idea behind the box is that it allows you to see malicious network traffic and coordinate to the cloud directly so you can react quickly. The main customer feedback is that the box was pretty and showed them lots of nice graphics -- beautiful network maps, gorgeous matrixes, pipe diagrams.
Address AI Bias with Fairness Criteria & Tools
AI biases are common, persistent, and hard to address. We wish people would see what AI can do but not its flaws. But this is like driving a Lamborghini with the check engine light on. It may run fine for the next few weeks, but accidents are waiting to happen. To address the problem, we need to know what fairness is. Can it be judged or evaluated? In the previous article, we looked at the complexity of AI bias. All AI designs need to follow applicable laws. In this section, we will discuss these issues. Sensitive characteristics are bias factors that are practically or morally irrelevant to a decision.
ML.NET (Hands-On Machine Learning with ML.NET)
After adding the dataset to our project, we now have to create the input class for our model. To do so, we are going to add a new class called SentimentData in the subfolder "Objects". Column 0 represents our input text, while column 1 stands for the output label. So far, we have created the input class for the model, but we also need an output class, which contains the output properties after running the model. Let's create a class called SentimentPrediction in the same Objects folder as our SentimentData class.
Titanic Predictions with LDA
The Titanic is one of the most iconic and at the same time saddest stories in human history. There are barely any individuals who are not familiar with its story and with how lucky some people on that liner were because of certain characteristics they had. Whether they were children or had higher purchasing power, there was a pattern to follow when predicting the probability of getting a place in a lifeboat and leaving the ship unharmed. Cleaning the data is by far the most challenging part of most machine learning projects, since you can greatly improve (or harm) your model depending on the individual features and the types of features you train it with. For feature selection, we will go through three main aspects.
Dimensionality Reduction of Longitudinal 'Omics Data using Modern Tensor Factorization
Mor, Uria, Cohen, Yotam, Valdes-Mas, Rafael, Kviatcovsky, Denise, Elinav, Eran, Avron, Haim
Precision medicine is a clinical approach for disease prevention, detection and treatment, which considers each individual's genetic background, environment and lifestyle. The development of this tailored avenue has been driven by the increased availability of omics methods, large cohorts of temporal samples, and their integration with clinical data. Despite the immense progress, existing computational methods for data analysis fail to provide appropriate solutions for this complex, high-dimensional and longitudinal data. In this work we have developed a new method termed TCAM, a dimensionality reduction technique for multi-way data that overcomes major limitations when doing trajectory analysis of longitudinal omics data. Using real-world data, we show that TCAM outperforms traditional methods, as well as state-of-the-art tensor-based approaches for longitudinal microbiome data analysis. Moreover, we demonstrate the versatility of TCAM by applying it to several different omics datasets, and its applicability as a drop-in replacement within straightforward ML tasks.