Pre-training via Paraphrasing

Neural Information Processing Systems

We introduce MARGE, a pre-trained sequence-to-sequence model learned with an unsupervised multi-lingual multi-document paraphrasing objective. MARGE provides an alternative to the dominant masked language modeling paradigm, where we self-supervise the reconstruction of target text by retrieving a set of related texts (in many languages) and conditioning on them to maximize the likelihood of generating the original. We show it is possible to jointly learn to do retrieval and reconstruction, given only a random initialization. The objective noisily captures aspects of paraphrase, translation, multi-document summarization, and information retrieval, allowing for strong zero-shot performance on several tasks. For example, with no additional task-specific training we achieve BLEU scores of up to 35.8 for document translation. We further show that fine-tuning gives strong performance on a range of discriminative and generative tasks in many languages, making MARGE the most generally applicable pre-training method to date.
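
To make the shape of the objective concrete, here is a minimal, hypothetical sketch of one retrieve-and-reconstruct step: score candidate evidence documents against the target by embedding similarity, then reconstruct the target conditioned on relevance-weighted evidence. Everything below (the bag-of-words encoder, the single linear layer standing in for the seq2seq model, all names and shapes) is an illustrative assumption, not the paper's architecture; the point is only that the relevance scores sit inside the reconstruction loss, so retrieval is trained end-to-end.

```python
import torch
import torch.nn.functional as F

# Hypothetical stand-ins: a bag-of-words document encoder for retrieval and
# a single linear layer in place of the real seq2seq reconstruction model.
embed = torch.nn.EmbeddingBag(num_embeddings=1000, embedding_dim=64)
seq2seq = torch.nn.Linear(64, 1000)

def marge_step(target_ids, candidate_docs):
    """One retrieve-and-reconstruct step: score candidate evidence documents
    against the target, then reconstruct the target conditioned on them."""
    # Relevance: cosine similarity between target and candidate embeddings.
    t = F.normalize(embed(target_ids.unsqueeze(0)), dim=-1)       # [1, 64]
    c = torch.cat([F.normalize(embed(d.unsqueeze(0)), dim=-1)
                   for d in candidate_docs])                      # [M, 64]
    relevance = (c @ t.t()).squeeze(-1)                           # [M]

    # Reconstruction: pool evidence weighted by relevance, so the retrieval
    # scores receive gradient through the reconstruction loss.
    weights = torch.softmax(relevance, dim=0)
    context = (weights.unsqueeze(-1) * c).sum(dim=0)              # [64]
    logits = seq2seq(context).expand(len(target_ids), -1)
    return F.cross_entropy(logits, target_ids)  # -log p(target | evidence)

# Usage: token ids below 1000; three candidate documents for one target.
target = torch.randint(0, 1000, (12,))
candidates = [torch.randint(0, 1000, (20,)) for _ in range(3)]
loss = marge_step(target, candidates)
loss.backward()
```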



MARGE: Improving Math Reasoning for LLMs with Guided Exploration

Gao, Jingyue, Lin, Runji, Lu, Keming, Yu, Bowen, Lin, Junyang, Chen, Jianyu

arXiv.org Artificial Intelligence

Large Language Models (LLMs) exhibit strong potential in mathematical reasoning, yet their effectiveness is often limited by a shortage of high-quality queries. This limitation makes it necessary to scale up training data with self-generated responses, yet current methods struggle because ineffective exploration across reasoning stages produces spuriously correlated data. To address this challenge, we introduce MARGE (improving Math Reasoning with Guided Exploration), a novel method that enhances mathematical reasoning through hit-guided exploration. MARGE systematically explores intermediate reasoning states derived from self-generated solutions, enabling adequate exploration and improved credit assignment throughout the reasoning process. Through extensive experiments across multiple backbone models and benchmarks, we demonstrate that MARGE significantly improves reasoning capabilities without requiring external annotations or training additional value models. Notably, MARGE improves both single-shot accuracy and exploration diversity, mitigating a common trade-off in alignment methods. These results demonstrate MARGE's effectiveness in enhancing mathematical reasoning capabilities and unlocking the potential of scaling self-generated training data. Our code and models are available at https://github.com/georgao35/MARGE.
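
As a rough, hypothetical illustration of the exploration idea: branch fresh rollouts from the intermediate states of self-generated solutions and score each prefix by how often its continuations hit the correct answer. The sampler and answer checker below are toy stand-ins (the actual training pipeline is in the linked repository), so this sketches only the exploration and credit-assignment signal, not the method itself.

```python
import random

# Hypothetical stand-ins for the policy LLM and the answer checker.
def sample_continuation(prefix):
    """Pretend rollout: append reasoning steps until an answer is emitted."""
    steps = prefix[:]
    while not steps[-1].startswith("answer:"):
        steps.append(random.choice(["step", "answer:42", "answer:0"]))
    return steps

def is_hit(solution, gold="answer:42"):
    """Exact-match check of the final answer against the known gold answer."""
    return solution[-1] == gold

def hit_guided_exploration(question, n_rollouts=8, n_resamples=4):
    """Score intermediate reasoning states by how often continuations from
    them reach the correct answer -- a crude credit-assignment signal."""
    rollouts = [sample_continuation([question]) for _ in range(n_rollouts)]
    credit = {}
    for sol in rollouts:
        for cut in range(1, len(sol)):        # every intermediate state
            prefix = tuple(sol[:cut])
            hits = sum(is_hit(sample_continuation(list(prefix)))
                       for _ in range(n_resamples))
            credit.setdefault(prefix, []).append(hits / n_resamples)
    return {p: sum(v) / len(v) for p, v in credit.items()}

rates = hit_guided_exploration("What is 6*7?")
```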



Here's What Happens When AI Turns 'Simpsons' Characters Into Real People

#artificialintelligence

Milan Jaram describes himself as an artist who "horrifies your cozy childhood memories with dark twists on your favorite shows, cartoons and pop culture." With his AI-fication of Simpsons characters, Jaram has succeeded handily. AI Homer is not looking happy. A muscle-bound, rage-filled Homer looks ready to Hulk-smash his way through Springfield; Marge has morphed into the bride of Frankenstein; and Krusty the Clown's menacing face will be the nightmare of children and clown PR teams everywhere. Mischievous Bart and Milhouse, meanwhile, now have sad, hollow eyes drained of any youthful whimsy, and good-natured Ned Flanders looks totally defeated.


Let Your Heart Speak in its Mother Tongue: Multilingual Captioning of Cardiac Signals

Kiyasseh, Dani, Zhu, Tingting, Clifton, David

arXiv.org Artificial Intelligence

Cardiac signals, such as the electrocardiogram, convey a significant amount of information about the health status of a patient, which is typically summarized by a clinician in the form of a clinical report, a cumbersome process that is prone to errors. To streamline this routine process, we propose a deep neural network capable of captioning cardiac signals; it receives a cardiac signal as input and generates a clinical report as output. We extend this further to generate multilingual reports. To that end, we create and make publicly available a multilingual clinical report dataset. In the absence of sufficient labelled data, deep neural networks can benefit from a 'warm start', or pre-training, procedure in which parameters are first learned in an arbitrary task. We propose such a task in the form of discriminative multilingual pre-training where tokens from clinical reports are randomly replaced with those from other languages and the network is tasked with predicting the language of all tokens. We show that our method performs on par with state-of-the-art pre-training methods such as MLM, ELECTRA, and MARGE, while simultaneously generating diverse and plausible clinical reports. We also demonstrate that multilingual models can outperform their monolingual counterparts, informally terming this beneficial phenomenon the 'blessing of multilinguality'.
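
A minimal sketch of the pre-training task as described, under assumed toy vocabularies: randomly replace report tokens with tokens from other languages, then train a token-level classifier to predict every token's language. The tiny vocabularies and two-layer model below are illustrative stand-ins, not the authors' implementation.

```python
import random

import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy multilingual vocabularies -- purely illustrative; the real task draws
# replacement tokens from clinical reports. Language id 0 is the source.
LANG_VOCABS = {0: ["heart", "rate", "normal"],
               1: ["herz", "frequenz", "normal"],
               2: ["coeur", "rythme", "normale"]}

def corrupt(tokens, swap_prob=0.3):
    """Randomly swap tokens for tokens from another language; the per-token
    language ids become the prediction targets."""
    out, labels = [], []
    for tok in tokens:
        if random.random() < swap_prob:
            lang = random.choice([1, 2])
            out.append(random.choice(LANG_VOCABS[lang]))
            labels.append(lang)
        else:
            out.append(tok)
            labels.append(0)
    return out, labels

# Token-level language classifier (a stand-in for the paper's encoder):
# predict a language id for every position in the corrupted report.
vocab = sorted({w for words in LANG_VOCABS.values() for w in words})
tok2id = {w: i for i, w in enumerate(vocab)}
model = nn.Sequential(nn.Embedding(len(vocab), 32),
                      nn.Linear(32, len(LANG_VOCABS)))

tokens, labels = corrupt(["heart", "rate", "normal"])
logits = model(torch.tensor([tok2id[t] for t in tokens]))
loss = F.cross_entropy(logits, torch.tensor(labels))
```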


The US Army is Building a Voice Assistant Named JUDI to Control Robots - Voicebot.ai

#artificialintelligence

The United States Army is developing a conversational intelligence platform that will let soldiers give voice commands to robotic vehicles using natural language. Instead of requiring formal commands, the Joint Understanding and Dialogue Interface, JUDI, will be able to understand and interpret the intent of its orders, clarifying them with questions as needed. The U.S. Army Combat Capabilities Development Command's Army Research Laboratory is building JUDI in partnership with the University of Southern California's Institute for Creative Technologies. Their goal for JUDI is to combine an understanding of informal language with data from its sensors to grasp the context of its orders. In the robots used for testing right now (basically very advanced miniature cars), JUDI will theoretically be able to take a single command like "go to the top of the hill" and combine camera data identifying a nearby hill with its natural language processing to work out its goal and how to achieve it, with follow-up questions to the operator as needed.


Army Researchers Create Conversational AI to Improve Soldier-Robot Communications

#artificialintelligence

Talking is our most essential form of communication. It is useful in day-to-day operations, but it becomes even more critical in high-pressure situations such as those encountered by army personnel. In light of this, army researchers have developed an advanced artificial intelligence (AI) capable of carrying on a conversation. Yes! It's a military AI that can speak. The researchers from the U.S. Army Combat Capabilities Development Command's Army Research Laboratory, in collaboration with the University of Southern California's Institute for Creative Technologies, have called their new AI the Joint Understanding and Dialogue Interface, or JUDI for short.


Machine Learning's Obsession with Kids' TV Show Characters

#artificialintelligence

What do they have in common? They're all beloved fictional characters from TV shows many of us watched when we were young. In 2018, researchers at the Allen Institute published the language model ELMo. The lead author, Matt Peters, said the team brainstormed many acronyms for their model, and ELMo instantly stuck as a "whimsical but memorable" choice. What started out as an inside joke has become a full-blown trend.