 South America


What's That Smell?

#artificialintelligence

The new system, which runs artificial intelligence software on Intel's Loihi neuromorphic chip, is essentially an "electronic nose" that can learn the scent of a chemical from a single exposure. Researchers at Cornell University and Intel have developed artificial intelligence (AI) software that can learn the scent of a chemical with just one exposure, and then remember that scent forever. The software, which is designed to run most efficiently on an experimental chip from Intel known as Loihi, is so precise, it can even detect a scent that's masked by a number of other scents, according to researchers. Ultimately, the researchers hope to produce a market-ready solution that can detect hazardous substances in the air, sniff out dangerous drugs, discover hidden explosives, and assist with medical diagnoses. "Low-energy modules built around Loihi, running our algorithm, and hooked-up to diverse sensor arrays could be built into robots, medical analysis devices; for example, blood composition, hyperspectral processors, air quality sensors, food processing pipelines, you name it," says Thomas A. Cleland, a member of the research team and associate chair and professor of psychology at Cornell University. The system works by processing an input signal pattern for a scent drawn from an array of sensors, then recording that signal pattern in the AI software as a recognizable scent for future use.


smartcity OR smartcities_2020-07-23_17-36-00.xlsx

#artificialintelligence

The graph represents a network of 4,768 Twitter users whose tweets in the requested range contained "smartcity OR smartcities", or who were replied to or mentioned in those tweets. The network was obtained from the NodeXL Graph Server on Friday, 24 July 2020 at 00:48 UTC. The requested start date was Friday, 24 July 2020 at 00:01 UTC and the maximum number of days (going backward) was 14. The maximum number of tweets collected was 5,000. The tweets in the network were tweeted over the 2-day, 20-hour, 23-minute period from Monday, 20 July 2020 at 09:08 UTC to Thursday, 23 July 2020 at 05:32 UTC.


Language Models as Fact Checkers?

arXiv.org Artificial Intelligence

Recent work has suggested that language models (LMs) store both common-sense and factual knowledge learned from pre-training data. In this paper, we leverage this implicit knowledge to create an effective end-to-end fact checker using solely a language model, without any external knowledge or explicit retrieval components. While previous work on extracting knowledge from LMs has focused on the task of open-domain question answering, to the best of our knowledge, this is the first work to examine the use of language models as fact checkers. In a closed-book setting, we show that our zero-shot LM approach outperforms a random baseline on the standard FEVER task, and that our fine-tuned LM compares favorably with standard baselines. Though we do not ultimately outperform methods which use explicit knowledge bases, we believe our exploration shows that this method is viable and has much room for exploration.


Multi-Armed Bandits for Minesweeper: Profiting from Exploration-Exploitation Synergy

arXiv.org Artificial Intelligence

A popular computer puzzle, the game of Minesweeper requires its human players to have a mix of both luck and strategy to succeed. Analyzing these aspects more formally, in our research we assessed the feasibility of a novel methodology based on Reinforcement Learning as an adequate approach to tackle the problem presented by this game. For this purpose we employed Multi-Armed Bandit algorithms, which were carefully adapted to define autonomous computational players that aim to make the best use of some peculiarities of the game. After experimental evaluation, results showed that this approach was indeed successful, especially on smaller game boards, such as the standard beginner level. Despite this, the main contribution of this work is a detailed examination of Minesweeper from a learning perspective, which led to various original insights that are thoroughly discussed.
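As a rough illustration of the exploration-exploitation trade-off that Multi-Armed Bandit algorithms manage, here is a minimal UCB1 sketch. It is a generic textbook bandit, not the adapted algorithms from the paper. Mapping arms to candidate cells and rewards to safe reveals is only a hypothetical reading, and the names `ucb1_select` and `run_bandit` are made up for this example.

```python
import math
import random

def ucb1_select(counts, values, total_pulls):
    """Return the index of the arm with the highest UCB1 score."""
    # Try every arm once before trusting the confidence bound.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    scores = [
        values[arm] + math.sqrt(2.0 * math.log(total_pulls) / counts[arm])
        for arm in range(len(counts))
    ]
    return max(range(len(scores)), key=scores.__getitem__)

def run_bandit(success_probs, steps, seed=0):
    """Simulate Bernoulli arms (assumed reward model) and return pull counts and mean rewards."""
    rng = random.Random(seed)
    counts = [0] * len(success_probs)
    values = [0.0] * len(success_probs)
    for t in range(1, steps + 1):
        arm = ucb1_select(counts, values, t)
        reward = 1.0 if rng.random() < success_probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean reward for this arm.
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts, values

if __name__ == "__main__":
    counts, values = run_bandit([0.1, 0.9], steps=2000)
    print(counts, values)
```

The bonus term shrinks as an arm is pulled more often, so rarely tried arms keep getting occasional pulls while the empirically best arm receives most of them.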


Magellan

Communications of the ACM

Entity matching (EM) finds data instances that refer to the same real-world entity. In 2015, we started the Magellan project at UW-Madison, jointly with industrial partners, to build EM systems. Most current EM systems are stand-alone monoliths. In contrast, Magellan borrows ideas from the field of data science (DS) to build a new kind of EM system: an ecosystem of interoperable tools for multiple execution environments, such as on-premise, cloud, and mobile. This paper describes Magellan, focusing on the system aspects. We argue why EM can be viewed as a special class of DS problems and thus can benefit from system building ideas in DS. We discuss how these ideas have been adapted to build PyMatcher and CloudMatcher, sophisticated on-premise tools for power users and self-service cloud tools for lay users. These tools exploit techniques from the fields of machine learning, big data scaling, efficient user interaction, databases, and cloud systems. They have been successfully used in 13 companies and domain science groups, have been pushed into production for many customers, and are being commercialized. We discuss the lessons learned and explore applying the Magellan template to other tasks in data exploration, cleaning, and integration. Entity matching (EM) finds data instances that refer to the same real-world entity, such as tuples (David Smith, UW-Madison) and (D. Smith, UWM). This problem, also known as entity resolution, record linkage, deduplication, data matching, et cetera, has been a long-standing challenge in the database, AI, KDD, and Web communities [2,6]. As data-driven applications proliferate, EM will become even more important. For example, to analyze raw data for insights, we often integrate multiple raw data sets into a single unified one before performing the analysis, and such integration often requires EM. To build a knowledge graph, we often start with a small graph and then expand it with new data sets, and such expansion requires EM. When managing a data lake, we often use EM to establish semantic linkages among the disparate data sets in the lake.
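To make the (David Smith, UW-Madison) versus (D. Smith, UWM) example concrete, here is a minimal hand-written matching rule. It is only a sketch of what "matching" means and is not how PyMatcher or CloudMatcher work; the snippet above says those tools rely on machine learning and other techniques. All function names and both rules (same surname plus same first initial; one organization string being an abbreviation of the other) are assumptions made for illustration.

```python
import re

def tokens(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def is_subsequence(short, long):
    """True if the characters of `short` appear in `long` in the same order."""
    it = iter(long)
    return all(ch in it for ch in short)

def name_compatible(a, b):
    """Assumed rule: surnames are equal and given names share an initial."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return False
    return ta[-1] == tb[-1] and ta[0][0] == tb[0][0]

def org_compatible(a, b):
    """Assumed rule: the shorter name abbreviates the longer one."""
    sa, sb = "".join(tokens(a)), "".join(tokens(b))
    if not sa or not sb:
        return False
    short, long = sorted((sa, sb), key=len)
    return short[0] == long[0] and is_subsequence(short, long)

def is_match(rec_a, rec_b):
    """Declare two (name, organization) tuples a match if both rules agree."""
    return (name_compatible(rec_a[0], rec_b[0])
            and org_compatible(rec_a[1], rec_b[1]))

if __name__ == "__main__":
    print(is_match(("David Smith", "UW-Madison"), ("D. Smith", "UWM")))
```

Fixed rules like these are brittle, which is one reason learned matchers are attractive in practice.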


Riiid raises $41.8 million to expand its AI test prep apps

#artificialintelligence

Riiid, a Seoul, South Korea-based startup developing AI test prep solutions, today closed a $41.8 million pre-series D financing round, bringing its total venture capital raised to date to $70.2 million. CEO YJ Jang says the funding will be used to advance Riiid's technology that offers personalized study solutions based on big data analysis, and to bolster the company's expansion across the U.S., South America, and the Middle East as it establishes an R&D lab -- Riiid Labs -- in Silicon Valley. The pandemic has forced the shutdown of schools in countries around the world; cramped indoor classrooms are seen as a major threat vector. Despite inequities in internet access and the widening achievement gap, educators believe that the health benefits outweigh the drawbacks. Riiid, which offers its services exclusively online, has been a beneficiary of the shift.


Short-term forecasting of Amazon rainforest fires based on ensemble decomposition model

arXiv.org Machine Learning

Accurate forecasting is important for decision-makers. Recently, the Amazon rainforest has been reaching record numbers of fires, a situation that raises both climate and public health concerns. Obtaining the desired forecasting accuracy is difficult and challenging. In this paper, we develop a novel heterogeneous decomposition-ensemble model that uses Seasonal and Trend decomposition based on Loess in combination with algorithms for short-term load forecasting multi-month-ahead, to explore temporal patterns of Amazon rainforest fires in Brazil. The results demonstrate that the proposed decomposition-ensemble models provide more accurate forecasts, as evaluated by performance measures. The Diebold-Mariano statistical test showed that the proposed models are better than the other compared models, but statistically equal to one of them.


Improving LIME Robustness with Smarter Locality Sampling

arXiv.org Machine Learning

Explainability algorithms such as LIME have enabled machine learning systems to adopt transparency and fairness, which are important qualities in commercial use cases. However, recent work has shown that LIME's naive sampling strategy can be exploited by an adversary to conceal biased, harmful behavior. We propose to make LIME more robust by training a generative adversarial network to sample more realistic synthetic data which the explainer uses to generate explanations. Our experiments show that the proposed method detects biased, adversarial behavior more accurately than vanilla LIME across three real-world datasets. This is achieved while maintaining comparable explanation quality, with up to 99.94% in top-1 accuracy in some cases.


ADER: Adaptively Distilled Exemplar Replay Towards Continual Learning for Session-based Recommendation

arXiv.org Machine Learning

Session-based recommendation has received growing attention recently due to increasing privacy concerns. Despite the recent success of neural session-based recommenders, they are typically developed in an offline manner using a static dataset. However, recommendation requires continual adaptation to take into account new and obsolete items and users, and requires "continual learning" in real-life applications. In this case, the recommender is updated continually and periodically with new data that arrives in each update cycle, and the updated model needs to provide recommendations for user activities before the next model update. A major challenge for continual learning with neural models is catastrophic forgetting, in which a continually trained model forgets user preference patterns it has learned before. To deal with this challenge, we propose a method called Adaptively Distilled Exemplar Replay (ADER), which periodically replays previous training samples (i.e., exemplars) to the current model with an adaptive distillation loss. Experiments are conducted based on the state-of-the-art SASRec model using two widely used datasets to benchmark ADER against several well-known continual learning techniques. We empirically demonstrate that ADER consistently outperforms other baselines, and it even outperforms the method using all historical data at every update cycle. This result reveals that ADER is a promising solution to mitigate the catastrophic forgetting issue towards building more realistic and scalable session-based recommenders.
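The idea of replaying exemplars with a distillation loss can be sketched on a single example. The sketch below combines an ordinary cross-entropy term with a term that penalizes drift from the previous model's predictions. It is a generic knowledge-distillation formulation, not ADER's exact loss. The `weight` argument stands in for the adaptive coefficient, whose formula is not given in the abstract and is left to the caller here, and all function names are hypothetical.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of scores."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def cross_entropy(logits, target):
    """Negative log-probability the model assigns to the true item."""
    return -log_softmax(logits)[target]

def distillation_loss(new_logits, old_logits):
    """Cross-entropy between the old model's soft predictions and the new model's."""
    old_probs = [math.exp(lp) for lp in log_softmax(old_logits)]
    new_log_probs = log_softmax(new_logits)
    return -sum(p * lp for p, lp in zip(old_probs, new_log_probs))

def replay_loss(new_logits, old_logits, target, weight):
    """Loss on one replayed exemplar; `weight` is an assumed, caller-supplied coefficient."""
    return (cross_entropy(new_logits, target)
            + weight * distillation_loss(new_logits, old_logits))

if __name__ == "__main__":
    # The updated model has drifted away from the old model's preference for item 0.
    print(replay_loss([0.5, 1.5, 0.0], [2.0, 0.5, 0.0], target=0, weight=0.5))
```

The distillation term is smallest when the new model reproduces the old model's output distribution on the exemplar, which is what discourages forgetting.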


On a Bernoulli Autoregression Framework for Link Discovery and Prediction

arXiv.org Machine Learning

We present a dynamic prediction framework for binary sequences that is based on a Bernoulli generalization of the auto-regressive process. Our approach lends itself easily to variants of the standard link prediction problem for a sequence of time-dependent networks. Focusing on this dynamic network link prediction/recommendation task, we propose a novel problem that exploits additional information via a much larger sequence of auxiliary networks and has important real-world relevance. To allow discovery of links that do not exist in the available data, our model estimation framework introduces a regularization term that presents a trade-off between the conventional link prediction and this discovery task. In contrast to existing work, our stochastic-gradient-based estimation approach is highly efficient and can scale to networks with millions of nodes. We show extensive empirical results on actual product-usage-based time-dependent networks and also present results on a Reddit-based data set of time-dependent sentiment sequences.
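One simple way to picture a Bernoulli analogue of an auto-regressive process is to let the probability of the next binary value depend on a weighted sum of recent values. The sketch below assumes a logistic link, which the abstract does not specify, so it should be read as an illustration rather than the paper's model. The names `next_prob` and `simulate` are hypothetical.

```python
import math
import random

def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def next_prob(history, coeffs, bias):
    """P(x_t = 1 | past values); coeffs[0] weights the most recent value."""
    recent = history[-len(coeffs):][::-1]
    return sigmoid(bias + sum(c * x for c, x in zip(coeffs, recent)))

def simulate(coeffs, bias, steps, seed=0):
    """Draw a binary sequence from the assumed model, starting from all zeros."""
    rng = random.Random(seed)
    series = [0] * len(coeffs)
    for _ in range(steps):
        p = next_prob(series, coeffs, bias)
        series.append(1 if rng.random() < p else 0)
    return series[len(coeffs):]

if __name__ == "__main__":
    # A positive lag-1 coefficient makes a link more likely to persist.
    print(simulate([2.0, 0.5], bias=-1.0, steps=20))
```

In a link prediction reading, each node pair would carry its own binary sequence (link present or absent at each time step), and `next_prob` would score how likely the link is to appear next.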