 Large Language Model


Parameter-Efficient Abstractive Question Answering over Tables or Text

arXiv.org Artificial Intelligence

A long-term ambition of information-seeking QA systems is to reason over multi-modal contexts and generate natural answers to user queries. Today, memory-intensive pre-trained language models are adapted to downstream tasks such as QA by fine-tuning the model on QA data in a specific modality, such as unstructured text or structured tables. To avoid training such memory-hungry models while utilizing a uniform architecture for each modality, parameter-efficient adapters add and train small task-specific bottleneck layers between transformer layers. In this work, we study parameter-efficient abstractive QA in encoder-decoder models over structured tabular data and unstructured textual data using only 1.5% additional parameters for each modality. We also ablate over adapter layers in both encoder and decoder modules to study the efficiency-performance trade-off and demonstrate that reducing additional trainable parameters down to 0.7%-1.0% leads to comparable results. Our models outperform current state-of-the-art models on tabular QA datasets such as Tablesum and FeTaQA, and achieve comparable performance on a textual QA dataset such as NarrativeQA using significantly fewer trainable parameters than fine-tuning.
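As a rough illustration of the bottleneck layers the abstract mentions, here is a minimal sketch of one adapter: a down-projection, a nonlinearity, an up-projection, and a residual connection. The ReLU activation, the zero-initialised up-projection, the function names, and the dimensions are assumptions made for illustration; the abstract does not specify the paper's exact adapter design.

```python
import random


def make_adapter(d_model, bottleneck, seed=0):
    """Create the weights of one hypothetical bottleneck adapter."""
    rng = random.Random(seed)
    return {
        # Down-projection: d_model -> bottleneck (small random init, assumed).
        "w_down": [[rng.gauss(0.0, 0.02) for _ in range(d_model)]
                   for _ in range(bottleneck)],
        "b_down": [0.0] * bottleneck,
        # Up-projection: bottleneck -> d_model. Zero init (assumed) makes the
        # adapter an identity function before any training.
        "w_up": [[0.0] * bottleneck for _ in range(d_model)],
        "b_up": [0.0] * d_model,
    }


def adapter_forward(adapter, hidden):
    """Apply the adapter to one hidden-state vector (a list of floats)."""
    down = [sum(w * x for w, x in zip(row, hidden)) + b
            for row, b in zip(adapter["w_down"], adapter["b_down"])]
    activated = [max(0.0, v) for v in down]  # ReLU (assumed nonlinearity)
    up = [sum(w * v for w, v in zip(row, activated)) + b
          for row, b in zip(adapter["w_up"], adapter["b_up"])]
    # Residual connection: the frozen model's output passes through unchanged
    # and the adapter only adds a learned correction.
    return [x + u for x, u in zip(hidden, up)]


def adapter_param_count(d_model, bottleneck):
    """Trainable parameters in one adapter: two weight matrices, two biases."""
    return 2 * d_model * bottleneck + d_model + bottleneck
```

With illustrative sizes of `d_model=768` and `bottleneck=64`, one such adapter has 99,136 trainable parameters. Only these weights would be updated while the pre-trained transformer stays frozen, which is why the share of additional trainable parameters can remain small.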


Three former DeepMinders are developing A.I. to pick stocks and crypto

#artificialintelligence

Three former DeepMind employees are trying to train a machine to spot and invest in company stocks and cryptocurrencies before they rise. Martin Schmid, Rudolf Kadlec and Matej Moravcik left Alphabet-owned DeepMind in January to set up EquiLibre Technologies, relocating from Edmonton in Canada to Prague in the Czech Republic in the process. All three previously worked at IBM, and in 2017 they developed an AI called DeepStack, which became the first AI capable of beating professional poker players at heads-up no-limit Texas hold'em poker. Now they're looking to apply some of these concepts to financial markets.


Play Around With GitHub Copilot Through Visual Studio 2022 IDE

#artificialintelligence

GitHub Copilot is an AI tool created by GitHub and OpenAI to support programmers. As of today, GitHub Copilot is available as an extension for Neovim, JetBrains, and Visual Studio Code. Despite being a technical preview, it "does especially well for Python, JavaScript, TypeScript, Ruby, Java, and Go, but it understands dozens of languages and can help you find your way around almost anything", as reported on GitHub Copilot. Generally speaking, if you are into data science, web development, and more broadly software, you should try this out. Side note: I added a summary of the shortcuts at the end of the post.


Advancing sports analytics through AI research

#artificialintelligence

Creating testing environments to help progress AI research out of the lab and into the real world is immensely challenging. Given AI's long association with games, it is perhaps no surprise that sports presents an exciting opportunity, offering researchers a testbed in which an AI-enabled system can assist humans in making complex, real-time decisions in a multiagent environment with dozens of dynamic, interacting individuals. The rapid growth of sports data collection means we are in the midst of a remarkably important era for sports analytics. The availability of sports data is increasing in both quantity and granularity, transitioning from the days of aggregate high-level statistics and sabermetrics to more refined data such as event stream information (e.g., annotated passes or shots), high-fidelity player positional information, and on-body sensors. However, the field of sports analytics has only recently started to harness machine learning and AI for both understanding and advising human decision-makers in sports.


540-Billion Parameter NLP Model

#artificialintelligence

Recent years have seen the rapid development of NLP models. In 2020, GPT-3 was the first to show that large language models (LLMs) could be used for few-shot learning and achieve impressive results without extensive training or parameter updating.
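Few-shot learning in this sense means showing the model a handful of worked examples inside the prompt rather than updating its parameters. The sketch below assembles such a prompt as plain text; the `Q:`/`A:` layout and the function name `build_few_shot_prompt` are illustrative assumptions, not a format prescribed by any particular model.

```python
def build_few_shot_prompt(examples, query, instruction=""):
    """Build a few-shot prompt from (question, answer) pairs and a new query."""
    parts = [instruction] if instruction else []
    for question, answer in examples:
        # Each demonstration shows the model the expected input/output pattern.
        parts.append(f"Q: {question}\nA: {answer}")
    # The final question is left unanswered for the model to complete.
    parts.append(f"Q: {query}\nA:")
    return "\n\n".join(parts)
```

The resulting string would be sent to a language model as-is; no gradient step or weight change is involved, which is the point the passage makes about GPT-3.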


CLIP: OpenAI's Multi-Modal Model

#artificialintelligence

How and why I got 75Gb of free foreign exchange "Tick" data. Understanding the Bias-Variance tradeoff at three different levels: simple, intermediate and advanced. Overlap is key to a good point cloud alignment.


Google-Backed Artificial Intelligence Taught to Control Nuclear Fusion Reactor - Watchman.Today

#artificialintelligence

DeepMind, the UK-based subsidiary of Alphabet, Google's parent company, has taught artificial intelligence how to control a nuclear fusion reactor. The company announced on Feb. 16 that it had used AI to successfully control superheated matter inside a nuclear fusion reactor, and its findings are detailed in a paper published in the journal Nature. DeepMind, whose long-term goal is to "solve intelligence, developing more general and capable problem-solving systems, known as artificial general intelligence (AGI)", was launched in 2010 and acquired by Google in 2014. The scientific discovery company collaborated on the project with a nuclear fusion research lab, the Swiss Plasma Center (SPC) at École Polytechnique Fédérale de Lausanne. Together, they have "developed a new magnetic control method for plasmas based on deep reinforcement learning", which they applied to a real-world plasma for the first time in the SPC's tokamak research facility, called TCV.


MuZero's first step from research into the real world

#artificialintelligence

In 2016, we introduced AlphaGo, the first artificial intelligence program to defeat humans at the ancient game of Go. Its successors, AlphaZero and then MuZero, each represented a significant step forward in the pursuit of general-purpose algorithms, mastering a greater number of games with even less predefined knowledge. MuZero, for example, mastered Chess, Go, Shogi, and Atari without needing to be told the rules. But so far these agents have focused on solving games. Now, in pursuit of DeepMind's mission to solve intelligence, MuZero has taken a first step towards mastering a real-world task by optimising video on YouTube.


Jigsaw fixes bugs in machine-written software - Microsoft Research

#artificialintelligence

Large pre-trained language models such as GPT-3, Codex, and others can be tuned to generate code from natural language specifications of programmer intent. Such automated models have the potential to improve productivity for every programmer in the world. But since the models can struggle to understand program semantics, the quality of the resulting code can't be guaranteed. In our research paper, Jigsaw: Large Language Models meet Program Synthesis, which has been accepted at the International Conference on Software Engineering (ICSE 2022), we introduce a new tool that can improve the performance of these large language models. Jigsaw deploys post-processing techniques that understand the programs' syntax and semantics and then leverages user feedback to improve future performance.


Woman who experienced sexual harassment at a Google-affiliated company publishes open letter calling for organizational reform

#artificialintelligence

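DeepMind, a sister company of Google, is an artificial intelligence developer that has announced a string of high-performance AI systems, including the automated programming AI "AlphaCode" and the highly capable board game AI "AlphaZero". A woman who worked at DeepMind has come forward with allegations of sexual misconduct by a superior and is calling on parent company Alphabet to reform its systems.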