Cybersecurity Research for the Future

Communications of the ACM

The growth of myriad cyber-threats continues to accelerate, yet the stream of new and effective cyber-defense technologies has grown much more slowly. The gap between threat and defense has widened, as our adversaries deploy increasingly sophisticated attack technology and engage in cyber-crime with unprecedented power, resources, and global reach. We are in an escalating asymmetric cyber environment that calls for immediate action. The extension of cyber-attacks into the socio-techno realm and the use of cyber as an information influence and disinformation vector will continue to undermine our confidence in systems. The unknown is a growing threat in our cyber information systems.

Multi-Modal Depth Estimation Using Convolutional Neural Networks Artificial Intelligence

This paper addresses the problem of dense depth prediction from sparse distance sensor data and a single camera image in challenging weather conditions. This work explores the significance of different sensor modalities such as camera, Radar, and Lidar for estimating depth by applying Deep Learning approaches. Although Lidar has higher depth-sensing abilities than Radar and has been integrated with camera images in many previous works, depth estimation using CNNs on the fusion of robust Radar distance data and camera images has not been explored much. In this work, a deep regression network is proposed that uses a transfer learning approach: an encoder, initialized from a high-performing pre-trained model, extracts dense features, and a decoder upsamples them and predicts the desired depth. The results are demonstrated on nuScenes, KITTI, and a synthetic dataset created using the CARLA simulator. In addition, top-view zoom-camera images captured from a crane on a construction site are evaluated to estimate the distance from the ground of the crane boom carrying heavy loads, showing the usability in safety-critical applications.

RainBench: Towards Global Precipitation Forecasting from Satellite Imagery Artificial Intelligence

Extreme precipitation events, such as violent rainfall and hail storms, routinely ravage economies and livelihoods around the developing world. Climate change further aggravates this issue. Data-driven deep learning approaches could widen access to accurate multi-day forecasts and so help mitigate such events. However, there is currently no benchmark dataset dedicated to the study of global precipitation forecasts. In this paper, we introduce RainBench, a new multi-modal benchmark dataset for data-driven precipitation forecasting. It includes simulated satellite data, a selection of relevant meteorological data from the ERA5 reanalysis product, and IMERG precipitation data. We also release PyRain, a library to process large precipitation datasets efficiently. We present an extensive analysis of our novel dataset and establish baseline results for two benchmark medium-range precipitation forecasting tasks. Finally, we discuss existing data-driven weather forecasting methodologies and suggest future research avenues.

How AWS's five tenets of innovation lend themselves to machine learning


As machine learning disrupts more and more industries, it has demonstrated its potential to reduce time spent by employees on manual tasks. However, training machine learning models can take months, creating excessive costs. With this in mind, Swami Sivasubramanian, AWS vice-president of machine learning, used his keynote speech at AWS re:Invent to announce new tools that aim to speed up operations and save costs. Sivasubramanian went through five tenets for machine learning that AWS observes, which served as a framework for explaining use cases for the new tools. Firstly, Sivasubramanian explained the importance of providing firm foundations, vital for freedom of creativity.

How to use Machine Learning for Anomaly Detection and Conditional Monitoring - KDnuggets


Before any data analysis begins, outliers in the dataset need to be identified. These outliers are known as anomalies. This article explains the goals of anomaly detection and outlines the approaches used to solve specific use cases for anomaly detection and condition monitoring. The main goal of anomaly detection analysis is to identify the observations that do not adhere to the general patterns considered normal behavior. For instance, Figure 1 shows anomalies in the classification and regression problems.
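
As a rough illustration of flagging observations that deviate from normal behavior, the sketch below uses a modified z-score built on the median and the median absolute deviation (MAD). This is one common statistical technique, not necessarily the approach the article describes; the function name `find_anomalies`, the threshold of 3.5, and the sample readings are assumptions chosen for the example.

```python
from statistics import median


def find_anomalies(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds `threshold`.

    Assumption: `values` is a non-empty list of numbers. The constant 0.6745
    scales the MAD so the score is comparable to a standard z-score for
    normally distributed data.
    """
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)
    if mad == 0:
        # Design choice for this sketch: when at least half the values equal
        # the median, flag every value that differs from it.
        return [i for i, d in enumerate(abs_dev) if d > 0]
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]


# Hypothetical sensor readings with one obvious outlier at index 5.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 25.4, 10.2]
anomalies = find_anomalies(readings)
print(anomalies)
```

The median and MAD are used instead of the mean and standard deviation because a single extreme value shifts them far less, so the outlier does not mask itself.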

Amazon Web Services launches new tool to detect bias and blind spots in machine learning


A new feature from Amazon Web Services will alert developers to potential bias in machine learning algorithms, part of a larger effort by the tech industry to keep automated predictions from discriminating against women, people of color and other underrepresented groups. The feature, SageMaker Clarify, was announced at the AWS re:Invent conference Tuesday as a new component of the AWS SageMaker machine learning platform. The technology analyzes the data used to train machine learning models for telltale signs of bias, including data sets that don't accurately reflect the larger population. It also analyzes the machine learning model itself to help ensure the accuracy of the resulting predictions. A 2018 MIT study found that the presence of a disproportionate number of white males in data sets used to train facial recognition algorithms led to a larger number of errors in recognizing women and people of color.
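
To make the idea of a data set that does not reflect the larger population concrete, the sketch below compares each group's share of a training set with its share of a reference population. It is not the SageMaker Clarify API and does not reproduce its metrics; the function `representation_gap`, the group labels, and the population shares are all hypothetical.

```python
from collections import Counter


def representation_gap(samples, population_share):
    """Return, per group, (share in samples) minus (share in population).

    Assumptions: `samples` is a non-empty list of group labels and
    `population_share` maps each group to its expected fraction.
    A positive gap means the group is over-represented in the data.
    """
    counts = Counter(samples)
    total = len(samples)
    return {
        group: counts.get(group, 0) / total - share
        for group, share in population_share.items()
    }


# Hypothetical training set: 80% group "a", 20% group "b".
training_groups = ["a"] * 8 + ["b"] * 2
gaps = representation_gap(training_groups, {"a": 0.5, "b": 0.5})
for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

A large gap for any group is only a warning sign in the data; it says nothing yet about how a trained model behaves for that group.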

11 Data Science Myths


Python or R – which tool should you learn? If I got a penny each time I came across this question... There is a widely held belief that mastering data science is about learning how to apply techniques in Python or R. Or any other tool. That tool has become the central point around which all other data science functions revolve. The assumption (or myth) is that being able to write code using existing libraries (numpy, scikit-learn, caret, etc.) should be enough to label yourself an expert.

Neurosymbolic AI: The 3rd Wave Artificial Intelligence

Current advances in Artificial Intelligence (AI) and Machine Learning (ML) have achieved unprecedented impact across research communities and industry. Nevertheless, concerns about trust, safety, interpretability and accountability of AI were raised by influential thinkers. Many have identified the need for well-founded knowledge representation and reasoning to be integrated with deep learning and for sound explainability. Neural-symbolic computing has been an active area of research for many years seeking to bring together robust learning in neural networks with reasoning and explainability via symbolic representations for network models. In this paper, we relate recent and early research results in neurosymbolic AI with the objective of identifying the key ingredients of the next wave of AI systems. We focus on research that integrates in a principled way neural network-based learning with symbolic knowledge representation and logical reasoning. The insights provided by 20 years of neural-symbolic computing are shown to shed new light onto the increasingly prominent role of trust, safety, interpretability and accountability of AI. We also identify promising directions and challenges for the next decade of AI research from the perspective of neural-symbolic systems.

Open Problems in Cooperative AI Artificial Intelligence

Problems of cooperation--in which agents seek ways to jointly improve their welfare--are ubiquitous and important. They can be found at scales ranging from our daily routines--such as driving on highways, scheduling meetings, and working collaboratively--to our global challenges--such as peace, commerce, and pandemic preparedness. Arguably, the success of the human species is rooted in our ability to cooperate. Since machines powered by artificial intelligence are playing an ever greater role in our lives, it will be important to equip them with the capabilities necessary to cooperate and to foster cooperation. We see an opportunity for the field of artificial intelligence to explicitly focus effort on this class of problems, which we term Cooperative AI. The objective of this research would be to study the many aspects of the problems of cooperation and to innovate in AI to contribute to solving these problems. Central goals include building machine agents with the capabilities needed for cooperation, building tools to foster cooperation in populations of (machine and/or human) agents, and otherwise conducting AI research for insight relevant to problems of cooperation. This research integrates ongoing work on multi-agent systems, game theory and social choice, human-machine interaction and alignment, natural-language processing, and the construction of social tools and platforms. However, Cooperative AI is not the union of these existing areas, but rather an independent bet about the productivity of specific kinds of conversations that involve these and other areas. We see opportunity to more explicitly focus on the problem of cooperation, to construct unified theory and vocabulary, and to build bridges with adjacent communities working on cooperation, including in the natural, social, and behavioural sciences.

Distant-Supervised Slot-Filling for E-Commerce Queries Artificial Intelligence

Slot-filling refers to the task of annotating individual terms in a query with the corresponding intended product characteristics (product type, brand, gender, size, color, etc.). These characteristics can then be used by a search engine to return results that better match the query's product intent. Traditional methods for slot-filling require the availability of training data with ground truth slot-annotation information. However, generating such labeled data, especially in e-commerce, is expensive and time-consuming because the number of slots increases as new products are added. In this paper, we present distant-supervised probabilistic generative models that require no manual annotation. The proposed approaches leverage the readily available historical query logs and the purchases that these queries led to, and also exploit co-occurrence information among the slots in order to identify intended product characteristics. We evaluate our approaches by considering how they affect retrieval performance, as well as how well they classify the slots. In terms of retrieval, our approaches achieve better ranking performance (up to 156%) over Okapi BM25. Moreover, our approach that leverages co-occurrence information leads to better performance than the one that does not on both the retrieval and slot classification tasks.
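
For readers unfamiliar with the Okapi BM25 baseline mentioned above, the following is a minimal sketch of BM25 scoring over tokenized documents. It is not the paper's implementation: the parameter defaults (`k1=1.2`, `b=0.75`), the non-negative IDF variant with `+ 1` inside the logarithm, and the toy product catalog are assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_terms`.

    Assumption: `docs` is a non-empty list of non-empty token lists.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs

    # Document frequency: number of documents containing each term.
    df = Counter()
    for d in docs:
        df.update(set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Non-negative IDF variant (assumed here).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical product titles, already tokenized.
catalog = [
    ["nike", "running", "shoes", "men"],
    ["adidas", "running", "shorts", "women"],
    ["nike", "basketball", "jersey", "men"],
]
scores = bm25_scores(["nike", "running", "shoes"], catalog)
best = max(range(len(catalog)), key=scores.__getitem__)
print(best, scores)
```

BM25 treats every query term as a plain keyword, so "nike" counts the same whether it is a brand or part of a product name; slot annotations are meant to give the search engine that missing distinction.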