
Collaborating Authors: lamb


Exploring Landscapes for Better Minima along Valleys

Zhao, Tong, Li, Jiacheng, Zhou, Yuanchang, Tan, Guangming, Jia, Weile

arXiv.org Machine Learning

Finding lower and better-generalizing minima is crucial for deep learning. However, most existing optimizers stop searching the parameter space once they reach a local minimum. Given the complex geometric properties of the loss landscape, it is difficult to guarantee that such a point is the lowest or provides the best generalization. To address this, we propose an adaptor, "E", for gradient-based optimizers. The adapted optimizer continues exploring along landscape valleys (regions of low, nearly identical loss) in search of potentially better local minima, even after reaching one. This approach increases the likelihood of finding a lower and flatter local minimum, which is often associated with better generalization. For completeness, we also provide convergence proofs for the adapted optimizers in both convex and non-convex settings. Finally, we demonstrate their effectiveness in an important but notoriously difficult training scenario, large-batch training, where Lamb is the benchmark optimizer. Our results show that the adapted Lamb, ALTO, improves test accuracy (generalization) over the current state-of-the-art optimizer by an average of 2.5% across a variety of large-batch training tasks. This work potentially opens a new research direction in the design of optimization algorithms.
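The abstract does not specify how the adaptor "E" works, but the core idea, to keep moving through regions of low, nearly identical loss once a minimum is reached, can be illustrated with a toy sketch. The function below is a hypothetical illustration only, not the authors' algorithm; the gradient tolerance, step size, and acceptance threshold are made-up values.

```python
import numpy as np

def explore_valley_step(params, grad, loss_fn, lr=0.1, grad_tol=1e-3, step=0.05):
    """Toy illustration (not the paper's adaptor "E"): once the gradient is
    nearly zero, i.e. a local minimum has been reached, take a small random
    probe step and keep it only if the loss stays roughly unchanged, so the
    iterate drifts along the valley floor instead of stopping."""
    if np.linalg.norm(grad) > grad_tol:
        return params - lr * grad                      # ordinary gradient step
    base = loss_fn(params)
    probe = params + step * np.random.randn(*params.shape)
    # Accept moves that keep the loss (almost) the same: walk along the valley.
    return probe if loss_fn(probe) <= base + 1e-4 else params
```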


MERIT: Maximum-normalized Element-wise Ratio for Language Model Large-batch Training

Luo, Yang, Zheng, Zangwei, Qin, Ziheng, Zhu, Zirui, Liu, Yong, You, Yang

arXiv.org Artificial Intelligence

Large-batch training has become a cornerstone in accelerating the training of deep neural networks, yet it poses challenges in optimization and generalization. Existing optimizers like AdamW exhibit performance degradation during large-batch training of language models, due to an information bottleneck in attention layers caused by a sharp increase in the max attention logit. While the LAMB optimizer partially addresses this issue, some attention layers still suffer from it. The reason is that the $l_2$-norm-based trust ratios in LAMB are less effective at directly influencing the max value of query/key weights. Furthermore, the weight-wise trust ratio in LAMB is error-prone, as it overlooks relationships among weight values within rows or columns. Building on these observations, we propose a novel optimizer, MERIT, which leverages the max-norm to calculate the trust ratio and thereby constrain the max attention logit more effectively. Moreover, we construct element-wise trust ratios that provide more robust update scaling by focusing on local weight structures. Extensive large-batch training experiments across various sizes of GPT-2 models demonstrate the superior performance of MERIT. Notably, when training GPT-2 Medium, MERIT enables a 6k batch size without any performance degradation compared to the standard batch size (480) over 48B training tokens. This work highlights the importance of considering the max attention logit and finer-granularity trust ratios in large-batch training. It improves training stability and paves the way for larger batch sizes, enabling faster development and iteration of large language models. Code is available at https://github.com/NUS-HPC-AI-Lab/MERIT.
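As a rough illustration of the distinction the abstract draws, the sketch below contrasts LAMB's layer-wise $l_2$-norm trust ratio with max-norm and row-wise variants in the spirit of MERIT's description. These are simplified stand-ins, not the exact formulas from either paper; the epsilon handling and the choice of rows as the "local structure" are assumptions.

```python
import numpy as np

def lamb_trust_ratio(w, update, eps=1e-8):
    # LAMB-style: a single scalar per weight matrix, based on l2 norms.
    w_norm = np.linalg.norm(w)
    u_norm = np.linalg.norm(update)
    return w_norm / (u_norm + eps) if w_norm > 0 and u_norm > 0 else 1.0

def max_norm_trust_ratio(w, update, eps=1e-8):
    # Max-norm variant: the ratio is driven by the largest-magnitude entries,
    # which is what bounds how large a query/key weight (and hence the max
    # attention logit) can grow after an update.
    return np.max(np.abs(w)) / (np.max(np.abs(update)) + eps)

def rowwise_trust_ratios(w, update, eps=1e-8):
    # Finer-granularity variant: one ratio per row, so scaling respects local
    # weight structure instead of averaging over the whole matrix.
    w_row = np.max(np.abs(w), axis=1, keepdims=True)
    u_row = np.max(np.abs(update), axis=1, keepdims=True)
    return w_row / (u_row + eps)
```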


LABIIUM: AI-Enhanced Zero-configuration Measurement Automation System

Olowe, Emmanuel A., Chitnis, Danial

arXiv.org Artificial Intelligence

The complexity of laboratory environments requires solutions that simplify instrument interaction and enhance measurement automation. Traditional tools often require configuration, dedicated software, and programming skills, creating barriers to productivity. Previous approaches, including dedicated software suites and custom scripts, frequently fall short of providing user-friendly solutions that align with standard programming practices. We present LABIIUM, an AI-enhanced, zero-configuration measurement automation system designed to streamline experimental workflows and improve user productivity. LABIIUM integrates an AI assistant powered by Large Language Models (LLMs) to generate code. LABIIUM's Lab-Automation-Measurement Bridges (LAMBs) enable seamless instrument connectivity using standard tools such as VSCode and Python, eliminating setup overhead. To demonstrate its capabilities, we conducted experiments measuring the parametric transfer curve of a simple two-transistor inverting amplifier with a current-source load. The AI assistant was evaluated using different prompt scenarios and compared across multiple models, including Claude Sonnet 3.5, Gemini Pro 1.5, and GPT-4o. An expert solution implementing the Gradient-Weighted Adaptive Stochastic Sampling (GWASS) method was used as a baseline. The solutions generated by the AI assistant were compared with the expert solution and a uniform linear sweep baseline with 10,000 points. The results show that the LLMs successfully completed the basic uniform sweep but were unable to develop adaptive sweeping algorithms competitive with GWASS. The evaluation underscores LABIIUM's ability to enhance laboratory productivity and support digital transformation in research and industry, and highlights the work still required to improve LLM performance on electronic measurement science tasks.
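For context on the baseline, a uniform linear sweep of the amplifier's transfer curve might look like the sketch below. The `set_input_voltage` and `read_output_voltage` callables are hypothetical placeholders for whatever instrument bridge (e.g., a LABIIUM LAMB) drives the source and reads the meter; they are not part of any published API, and the voltage range is illustrative.

```python
import numpy as np

def uniform_sweep(set_input_voltage, read_output_voltage,
                  v_start=0.0, v_stop=5.0, n_points=10_000):
    """Baseline uniform linear sweep: step the input voltage evenly over the
    range and record the output at every point, with no adaptive refinement
    of the kind GWASS performs around steep regions of the transfer curve."""
    vin = np.linspace(v_start, v_stop, n_points)
    vout = np.empty_like(vin)
    for i, v in enumerate(vin):
        set_input_voltage(v)        # hypothetical instrument-bridge call
        vout[i] = read_output_voltage()
    return vin, vout
```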


The 'Sex Update' for 'Cult of the Lamb' Is a Good Sign for Horny Video Games

WIRED

The pitch is straightforward: What if this video game had sex? Every game deserves a sex update, so the meme goes. In November, Cult of the Lamb, Massive Monster's adorable, animal-themed roguelike about building and maintaining a cult, got in on the action: "We will add sex to the game if we hit 300k followers by the end of the year," the game's official account tweeted, in the style of the meme's pseudo-horny forefathers. Video games have long served as an amorous playground. Some incorporate sex and romance directly into gameplay, as in BioWare's Dragon Age and Mass Effect games.


Rational Sensibility: LLM Enhanced Empathetic Response Generation Guided by Self-presentation Theory

Sun, Linzhuang, Xu, Nan, Wei, Jingxuan, Yu, Bihui, Bu, Liping, Luo, Yin

arXiv.org Artificial Intelligence

Having the ability to empathize is crucial for accurately representing human behavior during conversations. Although numerous studies aim to improve the cognitive capability of models by incorporating external knowledge, limited attention has been paid to the sensible and rational expression of the conversation itself, which are crucial components of cognitive empathy. Guided by self-presentation theory in sociology, we design an innovative categorical approach that segregates historical dialogues into sensible and rational sentences and subsequently elucidates the context through a designed attention mechanism. However, the rational information within a conversation is limited, and the external knowledge used in previous methods suffers from semantic contradiction and a narrow field of view. Considering the impressive performance of LLMs as intelligent agents, we employ LLaMA2-70b as a rational brain to analyze the profound logical information maintained in conversations, which helps the model assess the balance of sensibility and rationality and produce high-quality empathetic responses. Experimental evaluations demonstrate that our method outperforms comparable methods on both automatic and human evaluations.


Evaluating Large Language Model Creativity from a Literary Perspective

Shanahan, Murray, Clarke, Catherine

arXiv.org Artificial Intelligence

This paper assesses the potential for large language models (LLMs) to serve as assistive tools in the creative writing process, by means of a single, in-depth case study. In the course of the study, we develop interactive and multi-voice prompting strategies that interleave background descriptions (scene setting, plot elements), instructions that guide composition, samples of text in the target style, and critical discussion of the given samples. We qualitatively evaluate the results from a literary critical perspective, as well as from the standpoint of computational creativity (a sub-field of artificial intelligence). Our findings lend support to the view that the sophistication of the results that can be achieved with an LLM mirrors the sophistication of the prompting.


Clinical BioBERT Hyperparameter Optimization using Genetic Algorithm

Kollapally, Navya Martin, Geller, James

arXiv.org Artificial Intelligence

Clinical factors account for only a small portion, about 10-30%, of the controllable factors that affect an individual's health outcomes. The remaining factors include where a person was born and raised, where they pursued their education, what their work and family environment is like, etc. These factors are collectively referred to as Social Determinants of Health (SDoH). The majority of SDoH data is recorded in unstructured clinical notes by physicians and practitioners. Recording SDoH data in a structured manner (in an EHR) could greatly benefit from a dedicated ontology of SDoH terms. Our research focuses on extracting sentences from clinical notes, making use of such an SDoH ontology (called SOHO) to provide appropriate concepts. We utilize recent advancements in Deep Learning to optimize the hyperparameters of a Clinical BioBERT model for SDoH text. A genetic algorithm-based hyperparameter tuning regimen was implemented to identify optimal parameter settings. To implement a complete classifier, we pipelined Clinical BioBERT with two subsequent linear layers and two dropout layers; the output predicts whether a text fragment describes an SDoH issue of the patient. We compared the AdamW, Adafactor, and LAMB optimizers. In our experiments, AdamW outperformed the others in terms of accuracy.
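The classifier described, Clinical BioBERT pipelined with two linear layers and two dropout layers, could be sketched as follows. The HuggingFace checkpoint name, hidden size, and dropout rate are assumptions for illustration, not the tuned values found by the genetic algorithm.

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class SDoHClassifier(nn.Module):
    """Clinical BioBERT encoder followed by two dropout layers and two linear
    layers, predicting whether a text fragment describes an SDoH issue."""
    def __init__(self, encoder_name="emilyalsentzer/Bio_ClinicalBERT",
                 hidden_size=256, dropout=0.3, num_labels=2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        enc_dim = self.encoder.config.hidden_size
        self.dropout1 = nn.Dropout(dropout)
        self.fc1 = nn.Linear(enc_dim, hidden_size)
        self.dropout2 = nn.Dropout(dropout)
        self.fc2 = nn.Linear(hidden_size, num_labels)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]          # [CLS] representation
        h = self.dropout1(cls)
        h = torch.relu(self.fc1(h))
        h = self.dropout2(h)
        return self.fc2(h)                         # logits: SDoH vs. not SDoH
```

The hidden size, dropout probability, and learning-rate-related settings are exactly the kinds of knobs a genetic-algorithm tuner would search over, which is why they are exposed as constructor arguments here.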


The 10 Best and Cruelest Games of 2022

WIRED

In 2022, the best games were made for masochists. After several years of boom times for wholesome stories and colorful worlds, 2022 reminded us that sometimes there's no truer form of fun than failing horribly, repeatedly. FromSoftware often leads that charge, thanks to series like Dark Souls. This year, it rose to its own challenge. Elden Ring, maddening in its difficulty and unusually cruel in its creative ways to kill you, took center stage as players picked apart its every secret.


Engadget's favorite games of 2022

Engadget

While 2022 may not have enjoyed as many AAA releases as in past years, the ones that weren't delayed into 2023 were stellar and the indie development scene more than made up for the lack of big-budget titles. Some of our favorite releases this year came from small, ambitious teams that delivered fresh ideas. As is tradition, the Engadget team came together to extol the virtues of our favorite releases from the past 12 months. Bayonetta 3 is a delicious amplification of the series' most ridiculous themes. It indulges in absurdity without disrupting the rapid-fire combat or Bayonetta's unrivaled sense of fashion and wit. Bayonetta 3 is joyful, mechanically rich and full of action, plus it allows players to transform into a literal hell train in order to take down massive beasts bent on destroying the multiverse. The Bayonetta series just keeps getting weirder, but that doesn't mean it's losing its sense of satisfying gameplay along the way. In the franchise's third installment, Bayonetta is powerful, confident and funny; she's a drag queen in a universe loosely held together by witchcraft, and the chaos of this combination is truly magical. Sure, you've played Animal Crossing, Stardew Valley, Hades and The Binding of Isaac – but what if you could play all of them at once, in a single adorable demonic package? Cult of the Lamb is part social and farming simulator, part dungeon-crawling roguelike and all-around fantastic. After being sacrificed and resurrected, you're instructed by a grand, dark deity to start your own cult, managing worship services, agriculture, cooking, marriages, deaths and much more.


Neurosymbolic AI

Communications of the ACM

The ongoing revolution in artificial intelligence (AI)--in image recognition, natural language processing and translation, and much more--has been driven by neural networks, specifically many-layer versions known as deep learning. These systems have well-known weaknesses, but their capability continues to grow, even as they demand ever more data and energy. At the same time, other critical applications need much more than just powerful pattern recognition, and deep learning does not provide the sorts of performance guarantees that are customary in computer science. To address these issues, some researchers favor combining neural networks with older tools for artificial intelligence. In particular, neurosymbolic AI incorporates the long-studied symbolic representation of objects and their relationships.