Accuracy
Algorithmic encoding of protected characteristics and its implications on disparities across subgroups
It has been rightfully emphasized that the use of AI for clinical decision making could amplify health disparities. A machine learning model may pick up undesirable correlations, for example, between a patient's racial identity and clinical outcome. Such correlations are often present in (historical) data used for model development. There has been an increase in studies reporting biases in disease detection models across patient subgroups. Besides the scarcity of data from underserved populations, very little is known about how these biases are encoded and how one may reduce or even remove disparate performance. There is some speculation about whether algorithms may recognize patient characteristics such as biological sex or racial identity, and then directly or indirectly use this information when making predictions. But it remains unclear how we can establish whether such information is actually used. This article aims to shed some light on these issues by exploring new methodology allowing intuitive inspections of the inner workings of machine learning models for image-based detection of disease. We also evaluate an effective yet debatable technique for addressing disparities leveraging the automatic prediction of patient characteristics, resulting in models with comparable true and false positive rates across subgroups. Our findings may stimulate the discussion about safe and ethical use of AI.
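To make "comparable true and false positive rates across subgroups" concrete, the following is a minimal sketch of how such rates could be computed per subgroup. It is not the article's method; the function name `rates_by_subgroup`, the group labels, and the toy data are assumptions introduced purely for illustration.

```python
from collections import defaultdict

def rates_by_subgroup(y_true, y_pred, groups):
    """Return {group: (tpr, fpr)} for binary labels and binary predictions.

    Hypothetical helper for illustration; not taken from the article.
    """
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        if t == 1:
            key = "tp" if p == 1 else "fn"
        else:
            key = "fp" if p == 1 else "tn"
        counts[g][key] += 1

    rates = {}
    for g, c in counts.items():
        pos = c["tp"] + c["fn"]  # diseased cases in this subgroup
        neg = c["fp"] + c["tn"]  # healthy cases in this subgroup
        tpr = c["tp"] / pos if pos else float("nan")
        fpr = c["fp"] / neg if neg else float("nan")
        rates[g] = (tpr, fpr)
    return rates

# Toy data (made up): subgroup "a" is served worse than subgroup "b".
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
rates = rates_by_subgroup(y_true, y_pred, groups)
for g, (tpr, fpr) in sorted(rates.items()):
    print(f"group={g} TPR={tpr:.2f} FPR={fpr:.2f}")
```

A large gap between subgroups in either rate is one common way of quantifying the disparate performance the abstract refers to.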
Finding Regions of Heterogeneity in Decision-Making via Expected Conditional Covariance
Lim, Justin, Ji, Christina X, Oberst, Michael, Blecker, Saul, Horwitz, Leora, Sontag, David
Individuals often make different decisions when faced with the same context, due to personal preferences and background. For instance, judges may vary in their leniency towards certain drug-related offenses, and doctors may vary in their preference for how to start treatment for certain types of patients. With these examples in mind, we present an algorithm for identifying types of contexts (e.g., types of cases or patients) with high inter-decision-maker disagreement. We formalize this as a causal inference problem, seeking a region where the assignment of decision-maker has a large causal effect on the decision. Our algorithm finds such a region by maximizing an empirical objective, and we give a generalization bound for its performance. In a semi-synthetic experiment, we show that our algorithm recovers the correct region of heterogeneity accurately compared to baselines. Finally, we apply our algorithm to real-world healthcare datasets, recovering variation that aligns with existing clinical knowledge.
Researchers think mysterious radio signal that might have been a sign of aliens is 'false positive'
In 1996 Nasa and the White House made the explosive announcement that the rock contained traces of Martian bugs. The meteorite, catalogued as Allen Hills (ALH) 84001, crashed onto the frozen wastes of Antarctica 13,000 years ago and was recovered in 1984. Photographs were released showing elongated segmented objects that appeared strikingly lifelike.
WWE releases 2022 pay-per-view schedule
WWE is ready for 2022. The pro wrestling company released its pay-per-view schedule for the next year with two more shows left on the docket for the year, Survivor Series in November and TLC: Tables Ladders & Chairs in December.
Revisiting Process versus Product Metrics: a Large Scale Analysis
Majumder, Suvodeep, Mody, Pranav, Menzies, Tim
Numerous methods can build predictive models from software data. However, what methods and conclusions should we endorse as we move from analytics in-the-small (dealing with a handful of projects) to analytics in-the-large (dealing with hundreds of projects)? To answer this question, we recheck prior small-scale results (about process versus product metrics for defect prediction and the granularity of metrics) using 722,471 commits from 700 GitHub projects. We find that some analytics in-the-small conclusions still hold when scaling up to analytics in-the-large. For example, like prior work, we see that process metrics are better predictors for defects than product metrics (best process/product-based learners respectively achieve recalls of 98%/44% and AUCs of 95%/54%, median values). That said, we warn that it is unwise to trust metric importance results from analytics in-the-small studies since those change dramatically when moving to analytics in-the-large. Also, when reasoning in-the-large about hundreds of projects, it is better to use predictions from multiple models (since single model predictions can become confused and exhibit a high variance).
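The recall and AUC figures above are reported as medians over projects. The sketch below shows one way such per-project metrics could be computed and then summarized. It assumes binary defect labels, a 0.5 decision threshold, and a rank-based (Mann-Whitney) AUC; the project names and scores are invented for illustration and do not come from the study.

```python
from statistics import median

def recall(y_true, y_pred):
    """Fraction of truly defective items that were predicted defective."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else float("nan")

def auc(y_true, scores):
    """Rank-based AUC: probability that a random positive outscores a
    random negative, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical per-project data: (true labels, predicted defect scores).
projects = {
    "proj_a": ([1, 0, 1, 0], [0.9, 0.2, 0.4, 0.6]),
    "proj_b": ([1, 1, 0, 0], [0.8, 0.7, 0.3, 0.1]),
    "proj_c": ([0, 1, 0, 1], [0.4, 0.9, 0.6, 0.3]),
}

THRESHOLD = 0.5  # assumed cut-off for turning scores into predictions
recalls, aucs = [], []
for labels, scores in projects.values():
    preds = [1 if s >= THRESHOLD else 0 for s in scores]
    recalls.append(recall(labels, preds))
    aucs.append(auc(labels, scores))

median_recall = median(recalls)
median_auc = median(aucs)
print(f"median recall={median_recall:.2f}, median AUC={median_auc:.2f}")
```

Reporting the median rather than the mean keeps a few unusually easy or hard projects from dominating the summary.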
Bootstrapping Concept Formation in Small Neural Networks
Tamosiunaite, Minija, Kulvicius, Tomas, Wörgötter, Florentin
The question of how neural systems (of humans) can perform reasoning is still far from being solved. We posit that the process of forming Concepts is a fundamental step required for this. We argue that, first, Concepts are formed as closed representations, which are then consolidated by relating them to each other. Here we present a model system (agent) with a small neural network that uses realistic learning rules and receives only feedback from the environment in which the agent performs virtual actions. Initially, the actions of the agent are reflexive. In the process of learning, statistical regularities in the input lead to the formation of neuronal pools representing relations between the entities observed by the agent in its artificial world. This information then influences the behavior of the agent via feedback connections, replacing the initial reflex with an action driven by these relational representations. We hypothesize that the neuronal pools representing relational information can be considered primordial Concepts, which may be present in a similar way in some pre-linguistic animals, too. We argue that systems such as this can help formalize the discussion about what constitutes Concepts and serve as a starting point for constructing artificial cogitating systems.
Task-Aware Meta Learning-based Siamese Neural Network for Classifying Obfuscated Malware
Zhu, Jinting, Jang-Jaccard, Julian, Singh, Amardeep, Watters, Paul A., Camtepe, Seyit
Malware authors apply different obfuscation techniques to the generic features of malware (i.e., the unique malware signature) to create new variants that avoid detection. Existing Siamese Neural Network (SNN) based malware detection methods fail to correctly classify different malware families when similar generic features are shared across multiple malware variants, resulting in high false-positive rates. To address this issue, we propose a novel Task-Aware Meta Learning-based Siamese Neural Network that is resilient against obfuscated malware and able to detect malware after training with one or a few samples. Using entropy features of each malware signature alongside image features as task inputs, our task-aware meta learner generates the parameters for the feature layers to more accurately adjust the feature embedding for different malware families. In addition, our model utilizes meta-learning with the extracted features of a pre-trained network (e.g., VGG-16) to avoid the bias typically associated with a model trained with a limited number of training samples. Our proposed approach is highly effective in recognizing unique malware signatures, thus correctly classifying malware samples that belong to the same malware family even in the presence of obfuscation techniques. Our experimental results, validated with N-way on N-shot learning, show that our model is highly effective compared to other similar methods, with classification accuracy exceeding 91%.
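The abstract does not say how the entropy features are computed. As a rough illustration only, the sketch below assumes Shannon entropy over byte values, taken over fixed-size windows of a binary; the function names and the 256-byte window are hypothetical choices, not details from the paper.

```python
import math
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def windowed_entropy(data: bytes, window: int = 256) -> list:
    """Entropy of each consecutive window; one possible feature vector."""
    return [byte_entropy(data[i:i + window])
            for i in range(0, len(data), window)]

# Toy input (made up): a constant region followed by a maximally varied one.
sample = b"\x00" * 256 + bytes(range(256))
features = windowed_entropy(sample, window=256)
print(features)
```

Packed or encrypted regions tend to have entropy close to 8 bits per byte, which is one reason entropy is a commonly used signal when analyzing obfuscated binaries.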
Intel open-sources ControlFlag tool to find errors in code
Intel Labs has big plans for a software tool called ControlFlag that uses artificial intelligence to scan through code and pick out errors. One of those plans, perhaps way out in the future, is to bake it into chip packages as a last line of defense against faulty code. This could make the information flow on communications channels safer and more efficient. Last week, Intel open-sourced the tool to software developers. The software pores over lines of code and points out errors that developers can then fix.
How to Measure the Success of a Recommendation System?
Recommender systems are used in a variety of domains, from e-commerce to social media, to offer personalized recommendations to customers. The benefit of recommendations for customers, such as reduced information overload, has been a hot topic of research. However, it's unclear how and to what extent recommender systems produce commercial value. Creating a reliable product suggestion system is challenging, and so is defining what it means to be reliable.
Where were my keys? -- Aggregating Spatial-Temporal Instances of Objects for Efficient Retrieval over Long Periods of Time
Idrees, Ifrah, Hasan, Zahid, Reiss, Steven P., Tellex, Stefanie
Robots equipped with situational awareness can help humans efficiently find their lost objects by leveraging spatial and temporal structure. Existing approaches to video and image retrieval do not take into account the unique constraints imposed by a moving camera with a partial view of the environment. We present a Detection-based 3-level hierarchical Association approach, D3A, to create an efficient queryable spatial-temporal representation of unique object instances in an environment. D3A performs online incremental and hierarchical learning to identify keyframes that best represent the unique objects in the environment. These keyframes are learned based on both spatial and temporal features and, once identified, their corresponding spatial-temporal information is organized in a key-value database. D3A allows for a variety of query patterns such as querying for objects with/without the following: 1) specific attributes, 2) spatial relationships with other objects, and 3) time slices. For a given set of 150 queries, D3A returns a small set of candidate keyframes (which occupy only 0.17% of the total sensory data) with 81.98% mean accuracy in 11.7 ms. This is 47x faster and 33% more accurate than a baseline that naively stores the object matches (detections) in the database without associating spatial-temporal information.