 Memory-Based Learning


A Data-driven Case-based Reasoning in Bankruptcy Prediction

arXiv.org Artificial Intelligence

There has been intensive research regarding machine learning models for predicting bankruptcy in recent years. However, the lack of interpretability limits their growth and practical implementation. This study proposes a data-driven explainable case-based reasoning (CBR) system for bankruptcy prediction. Empirical results from a comparative study show that the proposed approach outperforms existing alternative CBR systems and is competitive with state-of-the-art machine learning models. We also demonstrate that the asymmetrical feature similarity comparison mechanism in the proposed CBR system can effectively capture the asymmetrically distributed nature of financial attributes, such as a few companies controlling more cash than the majority, hence improving both the accuracy and explainability of predictions. In addition, we closely examine the explainability of the CBR system in the decision-making process of bankruptcy prediction. While much research suggests a trade-off between improving prediction accuracy and explainability, our findings show a prospective research avenue in which an explainable model that thoroughly incorporates data attributes by design can reconcile the dilemma.


Using machine learning to improve all-in-one miniature spectrometers

AIHub

An international team of researchers has designed a miniaturised spectrometer with high resolution, employing machine learning methodology as one of their tools. The results are reported in the journal Science. Traditionally, spectrometers rely on bulky components to filter and disperse light. In addition, these traditional spectrometers are heavy and large, which limits their application in portable and mobile devices. Modern approaches simplify these components to shrink footprints, but tend to suffer from limited resolution and bandwidth.


Finding Memo: Extractive Memorization in Constrained Sequence Generation Tasks

arXiv.org Artificial Intelligence

Memorization presents a challenge for several constrained Natural Language Generation (NLG) tasks such as Neural Machine Translation (NMT), wherein the proclivity of neural models to memorize noisy and atypical samples interacts adversely with the noisy (web-crawled) datasets. However, previous studies of memorization in constrained NLG tasks have focused only on counterfactual memorization, linking it to the problem of hallucinations. In this work, we propose a new, inexpensive algorithm for extractive memorization (exact training data generation under insufficient context) in constrained sequence generation tasks and use it to study extractive memorization and its effects in NMT. We demonstrate that extractive memorization poses a serious threat to NMT reliability by qualitatively and quantitatively characterizing the memorized samples as well as the model behavior in their vicinity. Based on empirical observations, we develop a simple algorithm which elicits non-memorized translations of memorized samples from the same model, for a large fraction of such samples. Finally, we show that the proposed algorithm could also be leveraged to mitigate memorization in the model through finetuning. We have released the code to reproduce our results at https://github.com/vyraun/Finding-Memo.
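
The abstract defines extractive memorization as exact training data generation under insufficient context. A minimal sketch of one way to test for it is shown here. It is not the authors' released algorithm: truncating the source to a token prefix, requiring an exact string match, and the names `find_extractively_memorized`, `prefix_ratio`, and `toy_translate` are all assumptions made for illustration.

```python
def find_extractively_memorized(pairs, translate, prefix_ratio=0.5):
    """Flag training pairs whose full target is reproduced from a source prefix.

    pairs: list of (source, target) training sentence pairs.
    translate: any callable mapping a source string to a translation
               (hypothetical stand-in for an NMT model).
    prefix_ratio: fraction of source tokens kept as "insufficient context"
                  (an assumed way to truncate, not taken from the paper).
    """
    flagged = []
    for source, target in pairs:
        tokens = source.split()
        cut = max(1, int(len(tokens) * prefix_ratio))
        if cut >= len(tokens):
            continue  # source too short to truncate
        prefix = " ".join(tokens[:cut])
        # Exact reproduction of the training target from a partial source
        # suggests the pair was memorized rather than translated.
        if translate(prefix) == target:
            flagged.append((source, prefix))
    return flagged


def toy_translate(text):
    """Toy model: has 'memorized' one boilerplate sentence, otherwise upper-cases."""
    if text.startswith("terms and conditions"):
        return "AGB gelten fuer alle Bestellungen"
    return text.upper()


pairs = [
    ("terms and conditions apply to all orders", "AGB gelten fuer alle Bestellungen"),
    ("the cat sleeps on the mat", "die Katze schlaeft auf der Matte"),
]
flagged = find_extractively_memorized(pairs, toy_translate)
```

With a real model, `translate` would wrap the decoder call, and a fuzzy match could replace the exact comparison if near-verbatim reproduction is also of interest.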


Reducing Training Sample Memorization in GANs by Training with Memorization Rejection

arXiv.org Artificial Intelligence

The generative adversarial network (GAN) continues to be a popular research direction due to its high generation quality. It is observed that many state-of-the-art GANs generate samples that are more similar to the training set than to a holdout testing set from the same distribution, hinting that some training samples are implicitly memorized in these models. This memorization behavior is unfavorable in many applications that require generated samples to be sufficiently distinct from known samples. Nevertheless, it is unclear whether it is possible to reduce memorization without compromising the generation quality. In this paper, we propose memorization rejection, a training scheme that rejects generated samples that are near-duplicates of training samples during training. Our scheme is simple, generic and can be directly applied to any GAN architecture. Experiments on multiple datasets and GAN models validate that memorization rejection effectively reduces training sample memorization, and in many cases does not sacrifice the generation quality. Code to reproduce the experiment results can be found at https://github.com/jybai/MRGAN.
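
The rejection step described in the abstract can be sketched as a nearest-neighbour filter over a batch of generated samples. This is only an illustration under stated assumptions: the Euclidean distance on raw vectors, the `threshold` value, and the function names are hypothetical, and the authors' implementation may measure similarity differently (for example, in a learned embedding space).

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def reject_memorized(generated, training, threshold):
    """Keep generated samples whose nearest training sample is farther than threshold.

    Samples at or below the threshold are treated as near-duplicates and dropped,
    so they would not contribute to the generator update for this batch.
    """
    kept = []
    for g in generated:
        nearest = min(euclidean(g, t) for t in training)
        if nearest > threshold:
            kept.append(g)
    return kept


training = [(0.0, 0.0), (1.0, 1.0)]
generated = [(0.01, 0.0), (0.5, 0.5), (1.0, 0.98)]
kept = reject_memorized(generated, training, threshold=0.1)
```

Here the first and third generated points sit within 0.1 of a training point and are rejected, while the midpoint survives. The brute-force search is quadratic; a real training loop would likely need an approximate nearest-neighbour index.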


[100%OFF] IBM Watson Beginners Training For AI

#artificialintelligence

When we include the unprecedented computing power offered by the cloud, it's clear we are living in an exciting era for building applications. When IBM Watson defeated the two Jeopardy champions back in 2011, it opened a new era in the practical application of Artificial Intelligence technology and contributed to the growing research and interest in this field. IBM Watson has evolved from being a game-show-winning question-answering computer system to a set of enterprise-grade artificial intelligence (AI) application programming interfaces (APIs) available on IBM Cloud. These Watson APIs can ingest, understand & analyze all forms of data, allow for natural forms of interactions with people, learn, reason – all at a scale that allows for business processes and applications to be reimagined. This course is intended for business and technical users who want to learn more about the cognitive capabilities of IBM Watson Discovery service.


Mitigating Unintended Memorization in Language Models via Alternating Teaching

arXiv.org Artificial Intelligence

Recent research has shown that language models have a tendency to memorize rare or unique sequences in the training corpora, which can thus leak sensitive attributes of user data. We employ a teacher-student framework and propose a novel approach called alternating teaching to mitigate unintended memorization in sequential modeling. In our method, multiple teachers are trained on disjoint training sets whose privacy one wishes to protect, and teachers' predictions supervise the training of a student model in an alternating manner at each time step. Experiments on LibriSpeech datasets show that the proposed method achieves better privacy-preserving results than its counterparts. In comparison with no prevention for unintended memorization, the overall utility loss is small when training records are sufficient.
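
One possible reading of "alternating" supervision is a round-robin schedule in which the teacher at index `t % K` provides the target distribution at time step `t`. The sketch below illustrates that reading only; the schedule, the cross-entropy loss, and the names `alternating_targets` and `cross_entropy` are assumptions, since the abstract does not specify how teachers are selected.

```python
import math


def alternating_targets(teacher_predictions):
    """Pick one teacher's distribution per time step, round-robin.

    teacher_predictions[k][t] is teacher k's predicted distribution at step t.
    The round-robin choice (t % number of teachers) is an assumed schedule.
    """
    num_teachers = len(teacher_predictions)
    num_steps = len(teacher_predictions[0])
    return [teacher_predictions[t % num_teachers][t] for t in range(num_steps)]


def cross_entropy(targets, student):
    """Mean cross-entropy of student distributions against target distributions."""
    total = 0.0
    for p, q in zip(targets, student):
        total -= sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)
    return total / len(targets)


# Two teachers, trained on disjoint data, predicting over a 2-symbol vocabulary.
teacher_a = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
teacher_b = [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
targets = alternating_targets([teacher_a, teacher_b])

# An untrained student that is uniform at every step.
student = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
loss = cross_entropy(targets, student)
```

Because no single teacher supervises every step of a sequence, the student never sees one teacher's full prediction for a record, which is the intuition behind the privacy benefit in this reading.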


IBM's Watson is going to space

#artificialintelligence

IBM yesterday announced it would be providing the AI brain for a robot being built by Airbus to accompany astronauts aboard the International Space Station (ISS). When only the best of the best will do, it looks like Watson has the right stuff. The robot, which looks like a flying volleyball with a low-resolution face, is being deployed with German astronaut Alexander Gerst in June for a six-month mission. It's called CIMON, an acronym for Crew Interactive Mobile Companion, and it's headed to space to do science stuff. It'll help crew members conduct medical experiments, study crystals, and play with a Rubik's Cube.


Legal Element-oriented Modeling with Multi-view Contrastive Learning for Legal Case Retrieval

arXiv.org Artificial Intelligence

Legal case retrieval, which aims to retrieve relevant cases given a query case, plays an essential role in the legal system. While recent research efforts improve the performance of traditional ad-hoc retrieval models, legal case retrieval is still challenging since queries are legal cases, which contain hundreds of tokens. Legal cases are much longer and more complicated than keyword queries. Apart from that, the definition of legal relevance goes beyond the general definition of relevance. In addition to general topical relevance, the relevant cases also involve similar situations and legal elements, which can support the judgment of the current case. In this paper, we propose an interaction-focused network for legal case retrieval with a multi-view contrastive learning objective. The contrastive learning views, including case-view and element-view, aim to overcome the above challenges. The case-view contrastive learning minimizes the hidden space distance between relevant legal case representations produced by a pre-trained language model (PLM) encoder. The element-view builds positive and negative instances by changing legal elements of cases to help the network better compute legal relevance. To achieve this, we employ a legal element knowledge-aware indicator to detect legal elements of cases. We conduct extensive experiments on the benchmark of relevant case retrieval. Evaluation results indicate that our proposed method achieves significant improvements over the existing methods.


Memorization and Generalization in Neural Code Intelligence Models

arXiv.org Artificial Intelligence

Deep Neural Networks (DNNs) are increasingly being used in software engineering and code intelligence tasks. These are powerful tools that are capable of learning highly generalizable patterns from large datasets through millions of parameters. At the same time, their large capacity can render them prone to memorizing data points. Recent work suggests that the memorization risk manifests especially strongly when the training dataset is noisy, involving many ambiguous or questionable samples, and memorization is the only recourse. The goal of this paper is to evaluate and compare the extent of memorization and generalization in neural code intelligence models. It aims to provide insights on how memorization may impact the learning behavior of neural models in code intelligence systems. To observe the extent of memorization in models, we add random noise to the original training dataset and use various metrics to quantify the impact of noise on various aspects of training and testing. We evaluate several state-of-the-art neural code intelligence models and benchmarks based on Java, Python, and Ruby codebases. Our results highlight important risks: millions of trainable parameters allow the neural networks to memorize anything, including noisy data, and provide a false sense of generalization. We observed that all models manifest some forms of memorization. This can be potentially troublesome in most code intelligence tasks, which rely on rather noise-prone and repetitive data sources, such as code from GitHub. To the best of our knowledge, we provide the first study to quantify memorization effects in the domain of software engineering and code intelligence systems. This work raises awareness and provides new insights into important issues of training neural models in code intelligence systems that are usually overlooked by software engineering researchers.
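
The noise-injection step can be illustrated with a small sketch. The abstract only says that random noise is added to the training dataset; treating this as label noise (replacing a fixed fraction of labels with a different random label) is an assumption, as are the name `add_label_noise` and its parameters.

```python
import random


def add_label_noise(labels, noise_rate, label_set, seed=0):
    """Replace a fraction of labels with a different, randomly chosen label.

    Returns the noisy label list and the set of corrupted indices, so that
    accuracy on the corrupted subset (memorization) can later be compared
    with accuracy on the clean subset.
    """
    rng = random.Random(seed)  # seeded for reproducible corruption
    n_noisy = int(round(noise_rate * len(labels)))
    noisy_idx = set(rng.sample(range(len(labels)), n_noisy))
    noisy = []
    for i, y in enumerate(labels):
        if i in noisy_idx:
            # Choose among the other labels so the label is guaranteed to change.
            alternatives = [c for c in label_set if c != y]
            noisy.append(rng.choice(alternatives))
        else:
            noisy.append(y)
    return noisy, noisy_idx


# Hypothetical method-name labels standing in for a code intelligence task.
labels = ["get", "set", "run", "get"] * 25
noisy, noisy_idx = add_label_noise(labels, 0.2, ["get", "set", "run"])
```

A model that reaches high training accuracy on the indices in `noisy_idx` has fit labels that carry no signal, which is one simple indicator of memorization.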


eGain Connects with IBM Watson Assistant for Smarter Service

#artificialintelligence

The connector leverages eGain's unique BYOB (Bring Your Own Bot) architecture, allowing business users to easily plug Watson Assistant into the eGain platform with no coding. Per Gartner, fewer than 10% of customer service journeys are fulfilled using self-service, which is why it is critical to integrate chatbots with human-assisted service channels such as live chat. The eGain Connector for Watson Assistant improves customer, agent, and business experiences at once. When customers escalate from Watson to human-assisted chat, their context is passed to the contact center agent so that they do not need to repeat information to the agent. Agents get to see interactions that customers have already had with Watson before they start their conversation with the customer.