Large Language Model
Improved Logical Reasoning of Language Models via Differentiable Symbolic Programming
Zhang, Hanlin, Huang, Jiani, Li, Ziyang, Naik, Mayur, Xing, Eric
Pre-trained large language models (LMs) struggle to perform logical reasoning reliably despite advances in scale and compositionality. In this work, we tackle this challenge through the lens of symbolic programming. We propose DSR-LM, a Differentiable Symbolic Reasoning framework where pre-trained LMs govern the perception of factual knowledge, and a symbolic module performs deductive reasoning. In contrast to works that rely on hand-crafted logic rules, our differentiable symbolic reasoning framework efficiently learns weighted rules and applies semantic loss to further improve LMs. DSR-LM is scalable, interpretable, and allows easy integration of prior knowledge, thereby supporting extensive symbolic programming to robustly derive a logical conclusion. The results of our experiments suggest that DSR-LM improves the logical reasoning abilities of pre-trained language models, resulting in a significant increase in accuracy of over 20% on deductive reasoning benchmarks. Furthermore, DSR-LM outperforms a variety of competitive baselines when faced with systematic changes in sequence length.
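To make the idea of weighted rules over LM-predicted facts concrete, here is a minimal sketch. It is not DSR-LM's implementation: the fact format, the rule format, the names `apply_weighted_rules`, `facts` and `rules`, and the max-product scoring are all assumptions chosen for illustration. The point it shows is that a derived fact's score is a product of fact probabilities and a rule weight, so it stays differentiable with respect to both.

```python
def apply_weighted_rules(facts, rules):
    """Derive new facts by one step of weighted rule composition.

    facts: {(relation, subject, object): probability}, e.g. scored by an LM (assumed format).
    rules: {(body_rel_1, body_rel_2, head_rel): weight}, hypothetical learned rule weights.
    """
    derived = {}
    for (r1, a, b), p1 in facts.items():
        for (r2, b2, c), p2 in facts.items():
            if b != b2:
                continue  # the two body facts must chain through a shared entity
            for (body1, body2, head), weight in rules.items():
                if (body1, body2) == (r1, r2):
                    score = weight * p1 * p2
                    key = (head, a, c)
                    # Keep the best-scoring derivation (max-product; an assumption).
                    derived[key] = max(derived.get(key, 0.0), score)
    return derived


# Toy example: two uncertain facts and one weighted composition rule.
facts = {("mother", "ann", "bea"): 0.9, ("mother", "bea", "cy"): 0.8}
rules = {("mother", "mother", "grandmother"): 0.95}
derived = apply_weighted_rules(facts, rules)
```

In a real system the probabilities and weights would be tensors in an autodiff framework, so a loss on the derived conclusion could update both the rule weights and the LM that produced the facts.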
Adapting Transformer Language Models for Predictive Typing in Brain-Computer Interfaces
Brain-computer interfaces (BCI) are an important mode of alternative and augmentative communication for many people. Unlike keyboards, many BCI systems do not display even the 26 letters of English at one time, let alone all the symbols in more complex systems. Using language models to make character-level predictions, therefore, can greatly speed up BCI typing (Ghosh and Kristensson, 2017). While most existing BCI systems employ character n-gram models or no LM at all, this paper adapts several wordpiece-level Transformer LMs to make character predictions and evaluates them on typing tasks. GPT-2 fares best on clean text, but different LMs react differently to noisy histories. We further analyze the effect of character positions in a word and context lengths.
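One way a token-level LM can yield character predictions is to sum the probability of every candidate token that is consistent with the characters typed so far. The sketch below illustrates that marginalization only; the function name `next_char_distribution`, the toy vocabulary, and the renormalization step are assumptions, not the procedure from the paper.

```python
from collections import defaultdict


def next_char_distribution(token_probs, typed_prefix):
    """Turn a token-level distribution into a next-character distribution.

    token_probs: hypothetical mapping from candidate token string to probability.
    typed_prefix: characters of the current token that are already typed.
    """
    char_mass = defaultdict(float)
    for token, prob in token_probs.items():
        # Only tokens that extend the typed prefix can supply a next character.
        if token.startswith(typed_prefix) and len(token) > len(typed_prefix):
            char_mass[token[len(typed_prefix)]] += prob
    total = sum(char_mass.values())
    if total == 0.0:
        return {}  # no candidate token is consistent with the prefix
    return {ch: mass / total for ch, mass in char_mass.items()}


# Toy distribution over four candidate tokens (made-up numbers).
toy_probs = {"the": 0.5, "this": 0.2, "to": 0.2, "a": 0.1}
dist = next_char_distribution(toy_probs, "t")
```

Here the user has typed "t", so "the" and "this" both vote for "h" and "to" votes for "o". A BCI speller could then rank letters by these probabilities.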
Mask The Bias: Improving Domain-Adaptive Generalization of CTC-based ASR with Internal Language Model Estimation
Das, Nilaksh, Sunkara, Monica, Bodapati, Sravan, Cai, Jinglun, Kulshreshtha, Devang, Farris, Jeff, Kirchhoff, Katrin
End-to-end ASR models trained on large amounts of data tend to be implicitly biased towards language semantics of the training data. Internal language model estimation (ILME) has been proposed to mitigate this bias for autoregressive models such as attention-based encoder-decoder and RNN-T. Typically, ILME is performed by modularizing the acoustic and language components of the model architecture, and eliminating the acoustic input to perform log-linear interpolation with the text-only posterior. However, for CTC-based ASR, it is not as straightforward to decouple the model into such acoustic and language components, as CTC log-posteriors are computed in a non-autoregressive manner. In this work, we propose a novel ILME technique for CTC-based ASR models. Our method iteratively masks the audio timesteps to estimate a pseudo log-likelihood of the internal LM by accumulating log-posteriors for only the masked timesteps. Extensive evaluation across multiple out-of-domain datasets reveals that the proposed approach improves WER by up to 9.8% and OOV F1-score by up to 24.6% relative to Shallow Fusion, when only text data from target domain is available. In the case of zero-shot domain adaptation, with no access to any target domain data, we demonstrate that removing the source domain bias with ILME can still outperform Shallow Fusion to improve WER by up to 9.3% relative.
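The masking idea can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: the frame-level `alignment`, the fixed-size masking `window`, the constant `mask_value`, the stand-in `toy_model`, and the interpolation weights `lam_ilm` and `lam_lm` are all assumptions. The fusion formula follows the usual ILME pattern of subtracting a weighted internal-LM score and adding a weighted external-LM score.

```python
import math


def pseudo_log_likelihood(frames, alignment, model, window=2, mask_value=0.0):
    """Accumulate log-posteriors of the aligned labels over masked frames only."""
    total = 0.0
    for start in range(0, len(frames), window):
        end = min(start + window, len(frames))
        masked = list(frames)
        for t in range(start, end):
            masked[t] = mask_value  # hide the acoustic evidence at these timesteps
        log_post = model(masked)  # one {label: log-probability} dict per frame
        for t in range(start, end):
            total += log_post[t][alignment[t]]
    return total


def fused_score(log_p_asr, log_p_internal, log_p_external, lam_ilm=0.3, lam_lm=0.5):
    """Log-linear interpolation: discount the internal LM, add the external LM."""
    return log_p_asr - lam_ilm * log_p_internal + lam_lm * log_p_external


def toy_model(frames):
    """Hypothetical stand-in for a CTC model: confident on audio, uniform when masked."""
    outputs = []
    for x in frames:
        p_a = 0.5 if x == 0.0 else 0.9
        outputs.append({"a": math.log(p_a), "b": math.log(1.0 - p_a)})
    return outputs


frames = [1.0, 1.0, 1.0, 1.0]
alignment = ["a", "a", "a", "a"]
internal_lm = pseudo_log_likelihood(frames, alignment, toy_model)
```

With the toy model, every masked frame falls back to a uniform guess, so the accumulated score reflects only what the model predicts without acoustic evidence.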
On Contrastive Learning of Semantic Similarity for Code to Code Search
Saieva, Anthony, Chakraborty, Saikat, Kaiser, Gail
This paper introduces a novel code-to-code search technique that enhances the performance of Large Language Models (LLMs) by including both static and dynamic features as well as utilizing both similar and dissimilar examples during training. We present the first-ever code search method that encodes dynamic runtime information during training without the need to execute either the corpus under search or the search query at inference time and the first code search technique that trains on both positive and negative reference samples. To validate the efficacy of our approach, we perform a set of studies demonstrating the capability of enhanced LLMs to perform cross-language code-to-code search. Our evaluation demonstrates that the effectiveness of our approach is consistent across various model architectures and programming languages. We outperform the state-of-the-art cross-language search tool by up to 44.7%. Moreover, our ablation studies reveal that even a single positive and negative reference sample in the training process results in substantial performance improvements demonstrating both similar and dissimilar references are important parts of code search. Importantly, we show that enhanced well-crafted, fine-tuned models consistently outperform enhanced larger modern LLMs without fine tuning, even when enhancing the largest available LLMs highlighting the importance for open-sourced models. To ensure the reproducibility and extensibility of our research, we present an open-sourced implementation of our tool and training procedures called Cosco.
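Training on both similar and dissimilar references is commonly done with a contrastive objective. The abstract does not state which loss Cosco uses, so the sketch below swaps in a generic triplet margin loss over cosine similarity; the function names, the `margin` value, and the two-dimensional toy embeddings are assumptions for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def triplet_loss(query, positive, negative, margin=0.5):
    """Zero once the positive is at least `margin` more similar than the negative."""
    return max(0.0, margin - cosine(query, positive) + cosine(query, negative))


# Toy embeddings: the positive matches the query, the negative is orthogonal to it.
query = [1.0, 0.0]
positive = [1.0, 0.0]
negative = [0.0, 1.0]
good_loss = triplet_loss(query, positive, negative)
bad_loss = triplet_loss(query, negative, positive)
```

Minimizing such a loss pulls embeddings of behaviorally similar code together and pushes dissimilar code apart, which is what lets nearest-neighbor search over embeddings act as code-to-code search.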
White House Warns of Risks as AI Use Takes Off
WASHINGTON--The Biden administration is confronting the rapidly expanding use of artificial intelligence, warning of the dangers the technology poses to public safety, privacy and democracy while having limited authority to regulate it. Vice President Kamala Harris, who met Thursday with the chief executives of leading AI companies Google, Microsoft, OpenAI and Anthropic, said the technology "has the potential to dramatically increase threats to safety and security, infringe civil rights and privacy, and erode public trust and faith in democracy."
These ChatGPT Rivals Are Designed to Play With Your Emotions
ChatGPT and its brethren are both surprisingly clever and disappointingly dumb. Sure, they can generate pretty poems, solve scientific puzzles, and debug spaghetti code. But we know that they often fabricate, forget, and act like weirdos. Inflection AI, a company founded by researchers who previously worked on major artificial intelligence projects at Google, OpenAI, and Nvidia, built a bot called Pi that seems to make fewer blunders and be more adept at sociable conversation. Inflection designed Pi to address some of the problems of today's chatbots.
White House lays out its AI damage control plan - and KAMALA HARRIS will be program's czar
The White House has unveiled its plan to crack down on the AI race amid growing concerns it could upend life as we know it. The Biden Administration said the technology was 'one of the most powerful' of our time, adding: 'but in order to seize the opportunities it presents, we must first mitigate its risks.' The plan is to launch 25 research institutes across the US that will seek assurance from four companies, including Google, Microsoft and ChatGPT's creator OpenAI, which will 'participate in a public evaluation.' Many of the world's best minds have warned about the dangers of AI, specifically that it could destroy humanity if an assessment of risk is not done now. Tech giants like Elon Musk fear AI will soon surpass human intelligence and have independent thinking.
Why I Decided to Let My Students Turn in Essays Written by a Machine
The writing sounded like the typical 3 a.m. It was the sort of paper that usually makes me wonder: Did this student even come to class? Did I communicate anything of any value to them at all? Except there were no obvious tells that this was the product of an all-nighter: no grammar errors, misspellings, or departures into the extraneous examples that seem profound to students late at night but definitely sound like the product of a bong hit in the light of day. Perhaps, just before the end of the semester, I was seeing my very first student essay written by ChatGPT?
Britain's competition watchdog opens investigation into artificial intelligence market
Britain's competition watchdog said Thursday that it's opening a review of the artificial intelligence market, focusing on the technology underpinning chatbots like ChatGPT. The Competition Markets Authority said it will look into the opportunities and risks of AI as well as the competition rules and consumer protections that may be needed. The CEOs of Google, Microsoft and ChatGPT-maker OpenAI will meet Thursday with U.S. Vice President Kamala Harris for talks on how to ease the risks of their technology.
EU urged to protect grassroots AI research or risk losing out to US
The EU has been warned that it risks handing control of artificial intelligence to US tech firms if it does not act to protect grassroots research in its forthcoming AI bill. In an open letter coordinated by the German research group Laion, or Large-scale AI Open Network, the European parliament was told that "one-size-fits-all" rules risked eliminating open research and development. "Rules that require a researcher or developer to monitor or control downstream use could make it impossible to release open-source AI in Europe," which would "entrench large firms" and "hamper efforts to improve transparency, reduce competition, limit academic freedom, and drive investment in AI overseas", the letter says. It adds: "Europe cannot afford to lose AI sovereignty. Eliminating open-source R&D will leave the European scientific community and economy critically dependent on a handful of foreign and proprietary firms for essential AI infrastructure."