Africa
Your Algorithm Hates You
What can we do about algorithmic bias? If you're a software developer or data scientist, IBM Research has an open source toolkit that helps you check for bias in your data models. But it's not just technologists who can do something about algorithmic bias. You can start reclaiming digital space by exploring your choice in technology services. For example, you can use a search engine like DuckDuckGo, which, unlike the voracious data vampire that is Google, doesn't store your personal information to use for targeted ads. You can also petition and lobby your government to adopt a governance framework for algorithmic accountability and transparency, in which "algorithmic literacy" is introduced into curricula and standardised notifications (to communicate the type and degree of algorithmic processing in decisions) are made a requirement. Ultimately, we need to ask more of ourselves and tech companies. It's not enough to just employ critical thinking – we also need to employ civic thinking in how we build and use these technologies. This article first appeared in The Daily Maverick.
Refined Generalization Analysis of Gradient Descent for Over-parameterized Two-layer Neural Networks with Smooth Activations on Classification Problems
Nitanda, Atsushi, Suzuki, Taiji
Recently, several studies have proven the global convergence and generalization abilities of the gradient descent method for two-layer ReLU networks by making a positivity assumption of the Gram-matrix of the neural tangent kernel. However, the performance of gradient descent on classification problems has not been well studied, and further investigation of the problem structure is possible. In this work, we present a partially stronger but reasonable assumption for binary classification problems compared to the positivity assumption of the Gram-matrix, where a data distribution can be perfectly classifiable by a tangent model, and we provide a refined generalization analysis of the gradient descent method for two-layer networks with smooth activations. A remarkable point of this study is that our generalization bound has much better dependence on the network width compared to existing results. As a result, our theory significantly enlarges a class of over-parameterized networks having provable generalization ability, with respect to network width, while most studies require much higher over-parameterization.
Compositional Questions Do Not Necessitate Multi-hop Reasoning
Min, Sewon, Wallace, Eric, Singh, Sameer, Gardner, Matt, Hajishirzi, Hannaneh, Zettlemoyer, Luke
Multi-hop reading comprehension (RC) questions are challenging because they require reading and reasoning over multiple paragraphs. We argue that it can be difficult to construct large multi-hop RC datasets. For example, even highly compositional questions can be answered with a single hop if they target specific entity types, or the facts needed to answer them are redundant. Our analysis is centered on HotpotQA, where we show that single-hop reasoning can solve much more of the dataset than previously thought. We introduce a single-hop BERT-based RC model that achieves 67 F1, comparable to state-of-the-art multi-hop models. We also design an evaluation setting where humans are not shown all of the necessary paragraphs for the intended multi-hop reasoning but can still answer over 80% of questions. Together with detailed error analysis, these results suggest there should be an increasing focus on the role of evidence in multi-hop reasoning and possibly even a shift towards information retrieval style evaluations with large and diverse evidence collections.
New algorithm may help people store more pictures, share videos faster
The world produces about 2.5 quintillion bytes of data every day. Storing and transferring this enormous and constantly growing volume of images, videos, Tweets, and other forms of data is becoming a significant challenge, one that threatens to undermine the growth of the internet and thwart the introduction of new technologies, such as the Internet of Things. Now, a team of researchers reports that an algorithm that uses a machine learning technique based on the human brain could ease that data clog by reducing the size of multimedia files, such as videos and images, and restoring them without losing much quality or information. Machine learning is a type of artificial intelligence, or AI. In a study, the researchers developed an algorithm that features a recurrent neural network to compress and restore data, according to C. Lee Giles, David Reese Professor of Information Sciences and Technology, Penn State, and an Institute for CyberScience associate.
Machine Learning and Visualization in Clinical Decision Support: Current State and Future Directions
Levy-Fix, Gal, Kuperman, Gilad J., Elhadad, Noémie
Deep learning, an area of machine learning, is set to revolutionize patient care. But it is not yet part of standard of care, especially when it comes to individual patient care. In fact, it is unclear to what extent data-driven techniques are being used for clinical decision support (CDS). Heretofore, there has not been a review of ways in which research in machine learning and other types of data-driven techniques can contribute effectively to clinical care and the types of support they can bring to clinicians. In this paper, we consider ways in which two data-driven domains, machine learning and data visualization, can contribute to the next generation of clinical decision support systems. We review the literature regarding the ways heuristic knowledge, machine learning, and visualization are, and can be, applied to three types of CDS. There has been substantial research into the use of predictive modeling for alerts; however, current CDS systems are not utilizing these methods. Approaches that leverage interactive visualizations and machine-learning inferences to organize and review patient data are gaining popularity but are still at the prototype stage and are not yet in use. CDS systems that could benefit from prescriptive machine learning (e.g., treatment recommendations for specific patients) have not yet been developed. We discuss potential reasons for the lack of deployment of data-driven methods in CDS and directions for future research.
Counterfactual Off-Policy Evaluation with Gumbel-Max Structural Causal Models
Oberst, Michael, Sontag, David
We introduce an off-policy evaluation procedure for highlighting episodes where applying a reinforcement learned (RL) policy is likely to have produced a substantially different outcome than the observed policy. In particular, we introduce a class of structural causal models (SCMs) for generating counterfactual trajectories in finite partially observable Markov Decision Processes (POMDPs). We see this as a useful procedure for off-policy "debugging" in high-risk settings (e.g., healthcare); by decomposing the expected difference in reward between the RL and observed policy into specific episodes, we can identify episodes where the counterfactual difference in reward is most dramatic. This in turn can be used to facilitate review of specific episodes by domain experts. We demonstrate the utility of this procedure with a synthetic environment of sepsis management.
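The core idea behind a Gumbel-Max SCM can be illustrated on a single categorical variable. The sketch below is not the authors' implementation: it assumes one categorical draw with strictly positive probabilities (not a full POMDP trajectory), and the function names `sample_gumbel`, `truncated_gumbel`, `posterior_gumbels` and `counterfactual_draw` are hypothetical. It uses the Gumbel-Max trick (the outcome is the argmax of log-probability plus Gumbel noise), samples noise consistent with the observed outcome, and replays that noise under different probabilities.

```python
import math
import random


def sample_gumbel(rng, loc=0.0):
    """Draw from Gumbel(loc, 1) by inverting the CDF."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); rng.random() lies in [0, 1)
        u = rng.random()
    return loc - math.log(-math.log(u))


def truncated_gumbel(rng, loc, bound):
    """Draw from Gumbel(loc, 1) conditioned on being below `bound`."""
    g = sample_gumbel(rng, loc)
    return -math.log(math.exp(-bound) + math.exp(-g))


def posterior_gumbels(rng, probs, observed):
    """Sample noise g such that argmax_j(log probs[j] + g[j]) == observed.

    Assumes every entry of `probs` is strictly positive.
    """
    # The maximum of the shifted Gumbels is Gumbel(log sum(probs)),
    # independently of which index attains it.
    top = sample_gumbel(rng, loc=math.log(sum(probs)))
    noise = []
    for j, p in enumerate(probs):
        if j == observed:
            shifted = top
        else:
            # Every other shifted value must fall below the maximum.
            shifted = truncated_gumbel(rng, math.log(p), top)
        noise.append(shifted - math.log(p))
    return noise


def counterfactual_draw(rng, probs, observed, new_probs):
    """Replay the inferred noise under `new_probs` (all strictly positive)."""
    noise = posterior_gumbels(rng, probs, observed)
    scores = [math.log(q) + g for q, g in zip(new_probs, noise)]
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    rng = random.Random(0)
    observed_policy = [0.2, 0.5, 0.3]   # illustrative probabilities
    target_policy = [0.6, 0.1, 0.3]     # illustrative probabilities
    draws = [counterfactual_draw(rng, observed_policy, 1, target_policy)
             for _ in range(1000)]
    print({k: draws.count(k) for k in range(3)})
```

Because the noise is inferred from what was actually observed, replaying it under the original probabilities always reproduces the observed outcome, and an outcome whose relative probability only goes up under the new probabilities is never replaced.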
Global Media Forum: Can Artificial Intelligence truly be creative?
Just as human beings can draw, paint, sing, dance, recite poems and do other creative work, there is an understanding that machines powered by some of the latest technologies could possibly do the same perfectly. Artificial Intelligence (AI) is billed as one of the emerging technologies that will power machines to do just that. But answers to the question of how truly creative these machines can be still vary and are, to some extent, insufficient. There are already concerns around the integrity of these machines: how empathetic they can be, how well they can get along emotionally with humans without offending them, and so on. There is a reality already.
Automated Speech Generation from UN General Assembly Statements: Mapping Risks in AI Generated Texts
Bullock, Joseph, Luengo-Oroz, Miguel
Automated text generation has been applied broadly in many domains such as marketing and robotics, and used to create chatbots and product reviews and to write poetry. The ability to synthesize text, however, presents many potential risks, while access to the technology required to build generative models is becoming increasingly easy. This work is aligned with the efforts of the United Nations and other civil society organisations to highlight potential political and societal risks arising through the malicious use of text generation software, and their potential impact on human rights. As a case study, we present the findings of an experiment to generate remarks in the style of political leaders by fine-tuning a pretrained AWD-LSTM model on a dataset of speeches made at the UN General Assembly. This work highlights the ease with which this can be accomplished, as well as the threats of combining these techniques with other technologies.
Government Artificial Intelligence Readiness Index 2019: How Did Frontier Markets Perform?
The Government Artificial Intelligence (AI) Readiness Index, compiled by Oxford Insights and the International Development Research Centre, ranks the governments of 194 nations according to how prepared they are to utilise AI in the provision of public services. According to global consulting firm PricewaterhouseCoopers, AI technologies are forecast to add $15.7 trillion to the global economy by 2030, with $6.6 trillion to come from an increase in productivity and $9.1 trillion from consumption-side effects. The score that Oxford Insights provides for each country comprises 11 input metrics grouped under four high-level topics: governance; infrastructure and data; skills and education; and government public services. On a global level, the top-ranking countries (and their scores) were: Singapore (9.186), The likes of India (7.515) and China (7.37) were ranked 17th and 20th respectively.