
LoFT: Parameter-Efficient Fine-Tuning for Long-tailed Semi-Supervised Learning in Open-World Scenarios

Huang, Zhiyuan, Chen, Jiahao, Liu, Yurou, Su, Bing

arXiv.org Artificial Intelligence

Long-tailed learning has garnered increasing attention due to its wide applicability in real-world scenarios. Among existing approaches, Long-Tailed Semi-Supervised Learning (LTSSL) has emerged as an effective solution by incorporating a large amount of unlabeled data into the imbalanced labeled dataset. However, most prior LTSSL methods are designed to train models from scratch, which often leads to issues such as overconfidence and low-quality pseudo-labels. To address these challenges, we extend LTSSL into the foundation-model fine-tuning paradigm and propose a novel framework: LoFT (Long-tailed semi-supervised learning via parameter-efficient Fine-Tuning). We demonstrate that fine-tuned foundation models can generate more reliable pseudo-labels, thereby benefiting imbalanced learning. Furthermore, we explore a more practical setting by investigating semi-supervised learning under open-world conditions, where the unlabeled data may include out-of-distribution (OOD) samples. To handle this problem, we propose LoFT-OW (LoFT under Open-World scenarios) to improve discriminative ability. Experimental results on multiple benchmarks demonstrate that our method achieves superior performance compared to previous approaches, even when using only 1% of the unlabeled data employed by previous works.
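For intuition, the core pseudo-labeling step can be sketched as follows. This is a minimal, hypothetical illustration of confidence-thresholded pseudo-labeling with a fine-tuned backbone; the function and threshold are illustrative assumptions, not the authors' released code:

    import torch

    @torch.no_grad()
    def pseudo_label(model, unlabeled_batch, threshold=0.95):
        """Return (indices, labels) for samples the model predicts confidently."""
        probs = torch.softmax(model(unlabeled_batch), dim=-1)
        conf, labels = probs.max(dim=-1)
        keep = conf >= threshold  # discard low-confidence predictions
        return keep.nonzero(as_tuple=True)[0], labels[keep]

The paper's claim is that a fine-tuned foundation model produces better-calibrated probabilities, so fewer wrong labels survive this thresholding than when training from scratch.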


LoFT: Low-Rank Adaptation That Behaves Like Full Fine-Tuning

Tastan, Nurbek, Laskaridis, Stefanos, Takac, Martin, Nandakumar, Karthik, Horvath, Samuel

arXiv.org Artificial Intelligence

Large pre-trained models are commonly adapted to downstream tasks using parameter-efficient fine-tuning methods such as Low-Rank Adaptation (LoRA), which injects small trainable low-rank matrices instead of updating all weights. While LoRA dramatically reduces trainable parameters with little overhead, it can still underperform full fine-tuning in accuracy and often converges more slowly. We introduce LoFT, a novel low-rank adaptation method that behaves like full fine-tuning by aligning the optimizer's internal dynamics with those of updating all model weights. LoFT not only learns weight updates in a low-rank subspace (like LoRA) but also properly projects the optimizer's first and second moments (Adam's momentum and variance) into the same subspace, mirroring full-model updates. By aligning the low-rank update itself with the full update, LoFT eliminates the need for tuning extra hyperparameters, e.g., LoRA scaling factor $\alpha$. Empirically, this approach substantially narrows the performance gap between adapter-based tuning and full fine-tuning and consistently outperforms standard LoRA-style methods, all without increasing inference cost.
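As a rough illustration of the central idea, the sketch below projects the gradient and both Adam moments onto a fixed orthonormal basis P before updating the weights, so the low-rank step mirrors a full-model Adam step restricted to that subspace. The flattened-vector shapes, fixed basis, and omitted bias correction are simplifying assumptions, not the paper's algorithm:

    import torch

    def projected_adam_step(w, grad, m, v, P, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
        """w, grad: flattened full weights/gradient (d,); m, v: moments kept in
        the subspace (r,); P: orthonormal basis (d, r). Bias correction omitted."""
        g = P.T @ grad                           # project full gradient into subspace
        m.mul_(b1).add_(g, alpha=1 - b1)         # first moment, tracked in subspace
        v.mul_(b2).addcmul_(g, g, value=1 - b2)  # second moment, same subspace
        step = m / (v.sqrt() + eps)              # Adam direction in low-rank coords
        w -= lr * (P @ step)                     # map the update back to full weights
        return w, m, v

Contrast with standard LoRA, where the moments are accumulated on the factor matrices themselves, which generally does not coincide with the projection of the full-model moments.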


Eliciting In-context Retrieval and Reasoning for Long-context Large Language Models

Qiu, Yifu, Embar, Varun, Zhang, Yizhe, Jaitly, Navdeep, Cohen, Shay B., Han, Benjamin

arXiv.org Artificial Intelligence

Recent advancements in long-context language models (LCLMs) promise to transform Retrieval-Augmented Generation (RAG) by simplifying pipelines. With their expanded context windows, LCLMs can process entire knowledge bases and perform retrieval and reasoning directly -- a capability we define as In-Context Retrieval and Reasoning (ICR^2). However, existing benchmarks like LOFT often overestimate LCLM performance by providing overly simplified contexts. To address this, we introduce ICR^2, a benchmark that evaluates LCLMs in more realistic scenarios by including confounding passages retrieved with strong retrievers. We then propose three methods to enhance LCLM performance: (1) retrieve-then-generate fine-tuning, (2) retrieval-attention-probing, which uses attention heads to filter and de-noise long contexts during decoding, and (3) joint retrieval head training alongside the generation head. Our evaluation of five well-known LCLMs on LOFT and ICR^2 demonstrates significant gains with our best approach applied to Mistral-7B: +17 and +15 points in Exact Match on LOFT, and +13 and +2 points on ICR^2, compared with vanilla RAG and supervised fine-tuning, respectively. It even outperforms GPT-4-Turbo on most tasks despite being a much smaller model.
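To make method (2) concrete, here is a hedged sketch of how attention mass from a few designated heads might be used to rank in-context passages and drop distractors during decoding. The tensor layout, head selection, and scoring rule are assumptions for illustration, not the paper's implementation:

    import torch

    def rank_passages(attn, spans, heads):
        """attn: [n_heads, tgt_len, src_len] attention weights from a decoding step;
        spans: list of (start, end) token ranges, one per in-context passage;
        heads: indices of heads believed to track retrieval.
        Returns passage indices sorted from most to least attended."""
        scores = []
        for start, end in spans:
            # total attention the chosen heads place on this passage's tokens
            scores.append(attn[heads, :, start:end].sum().item())
        return sorted(range(len(spans)), key=lambda i: -scores[i])

Low-ranked passages can then be masked or removed from the context, which is the "filter and de-noise" step the abstract refers to.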


Stable Training of Normalizing Flows for High-dimensional Variational Inference

Andrade, Daniel

arXiv.org Machine Learning

Variational inference with normalizing flows (NFs) is an increasingly popular alternative to MCMC methods. In particular, NFs based on coupling layers (Real NVPs) are frequently used due to their good empirical performance. In theory, increasing the depth of normalizing flows should lead to more accurate posterior approximations. However, in practice, training deep normalizing flows for approximating high-dimensional posterior distributions is often infeasible due to the high variance of the stochastic gradients. In this work, we show that previous methods for stabilizing the variance of stochastic gradient descent can be insufficient to achieve stable training of Real NVPs. As the source of the problem, we identify that, during training, samples often exhibit unusually high values. As a remedy, we propose a combination of two methods: (1) soft-thresholding of the scale in Real NVPs, and (2) a bijective soft log transformation of the samples. We evaluate these and other previously proposed modifications on several challenging target distributions, including a high-dimensional horseshoe logistic regression model. Our experiments show that with our modifications, stable training of Real NVPs for posteriors with several thousand dimensions is possible, allowing for more accurate marginal likelihood estimation via importance sampling. Moreover, we evaluate several common training techniques and architecture choices and provide practical advice for training NFs for high-dimensional variational inference.
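To illustrate the flavor of the two remedies, a minimal sketch follows; the exact functional forms (the tanh-based bound with threshold t, and the log1p-based soft log) are assumptions for illustration and may differ from the paper's:

    import torch

    def soft_threshold_scale(s, t=2.0):
        # smoothly bound a coupling layer's log-scale to (-t, t), preventing
        # the extreme scale factors that destabilize deep Real NVP training
        return t * torch.tanh(s / t)

    def soft_log(x):
        # bijective soft-log: approximately identity near 0, logarithmic in
        # the tails, taming unusually large sample values
        return torch.sign(x) * torch.log1p(torch.abs(x))

    def soft_log_inverse(y):
        # exact inverse, so the transformation stays a valid bijection
        return torch.sign(y) * torch.expm1(torch.abs(y))

Both pieces are cheap and differentiable, so they slot into the usual stochastic-gradient variational objective without changing the training loop.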


Steven Soderbergh's "Kimi" Is a Tech Thriller That Packs a Potent Outrage

The New Yorker

Steven Soderbergh, who has become admirably prolific in the age of streaming, is a director of paradox. He positions himself as a classical professional who can take on any subject and personalize it with his own style and range of obsessions. But, regardless of his manifest skills and pleasures, the quality of his work fluctuates widely, depending on his connection to the subject matter. Of all current Hollywood filmmakers, Soderbergh is the most physical, the one who comes the closest to the painterly ideal of touching the image. He has long been doing his own camera work (under the pseudonym of Peter Andrews) and also his own editing (as Mary Ann Bernard), and the way that he engages with his subject evokes a bodily music, something like dance--a cinematic swing.


Learning Symbolic Operators for Task and Motion Planning

Silver, Tom, Chitnis, Rohan, Tenenbaum, Joshua, Kaelbling, Leslie Pack, Lozano-Perez, Tomas

arXiv.org Artificial Intelligence

Robotic planning problems in hybrid state and action spaces can be solved by integrated task and motion planners (TAMP) that handle the complex interaction between motion-level decisions and task-level plan feasibility. TAMP approaches rely on domain-specific symbolic operators to guide the task-level search, making planning efficient. In this work, we formalize and study the problem of operator learning for TAMP. Central to this study is the view that operators define a lossy abstraction of the transition model of the underlying domain. We then propose a bottom-up relational learning method for operator learning and show how the learned operators can be used for planning in a TAMP system. Experimentally, we provide results in three domains, including long-horizon robotic planning tasks. We find our approach to substantially outperform several baselines, including three graph neural network-based model-free approaches based on recent work. Video: https://youtu.be/iVfpX9BpBRo
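For intuition, a symbolic operator of the kind studied here can be rendered as a small data structure with preconditions and add/delete effects over an abstract (relational) state. The sketch below is a hypothetical STRIPS-style rendering, not the authors' code:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Operator:
        name: str
        preconditions: frozenset   # abstract atoms that must hold to apply
        add_effects: frozenset     # atoms the transition makes true
        delete_effects: frozenset  # atoms the transition makes false

        def applicable(self, state: frozenset) -> bool:
            return self.preconditions <= state

        def apply(self, state: frozenset) -> frozenset:
            # lossy abstract transition: geometric detail is deliberately dropped
            return (state - self.delete_effects) | self.add_effects

    # e.g., a hand-written Pick operator over ground atoms (illustrative only)
    pick = Operator("Pick(b)",
                    preconditions=frozenset({"HandEmpty", "OnTable(b)"}),
                    add_effects=frozenset({"Holding(b)"}),
                    delete_effects=frozenset({"HandEmpty", "OnTable(b)"}))

The learning problem is then to induce such operators from transition data so that the task-level search remains sound enough to guide motion-level planning.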


First Look: Callaway Mavrik Irons and Hybrids MyGolfSpy

#artificialintelligence

When nine Callaway Mavrik drivers showed up on the USGA conforming list in mid-December, there was the expected "here we go again" refrain, mixed in with a couple of sideways glances and some descriptions that were anything but parliamentary. That said, as noted in MyGolfSpy's 2019 Editor's Choice awards, the AI (Artificial Intelligence) component of Callaway's signature Flash Face technology is a new club technology that's likely to influence club design throughout the industry, and with the Mavrik irons, it's clear Callaway is dedicated to extending its use throughout its hardgoods lineup. So as not to bury the lede: Callaway's Mavrik irons (three models) incorporate, for the first time, AI face design…in every iron. Yes, each individual iron will have a different face thanks to AI capabilities, though as sets progress toward the shorter irons (8-iron, 9-iron, PW) the designs are more similar than different, owing to the role loft plays in performance. Specifically, clubs with more loft produce less blunt impact conditions, so the face technology (materials, design, etc.) has a reduced effect.


Artificial intelligence developer contest puts Alexa to the test

#artificialintelligence

Amazon Web Services and Merck announced a developer competition on Monday that aims to harness artificial intelligence for people with diabetes. Dubbed the Alexa Diabetes Challenge and powered by Luminary Labs, the contest aims to incentivize startups and individual developers to create apps that harness Amazon's Alexa voice-enabled technologies, particularly for patients recently diagnosed with Type 2 diabetes. Early adopters such as Penn Medicine and Commonwealth Care Alliance are already running proofs of concept or pilot programs with Amazon Alexa as the centerpiece of improving the patient experience. What's more, a new Healthcare IT News and HIMSS Analytics HIT Market Indicator report found that half of hospitals intend to adopt some form of AI within five years, and more than a third plan to do so within two years. The Amazon and Merck contest lets innovators use not only Alexa but also Amazon Web Services cloud infrastructure.


Manulife Bhd : Opens New Innovation Lab in Singapore 4-Traders

#artificialintelligence

Manulife has announced a new Lab of Forward Thinking (LOFT) location in Singapore, its third technology hub. LOFT, with existing operations in Toronto and Boston, explores emerging technologies in an effort to leverage them for the company's asset management and insurance businesses. "Our goal is to become the most customer-centric organization in our industry," said Roy Gori, president and CEO, Manulife Asia, in a statement. "Through the LOFT, we encourage experimentation, incubation and collaboration to find the answers we need to best serve our customers." Over the last six months, existing LOFTs have announced partnerships with ConsenSys, BlockApps, Nervana Systems and indico data solutions.


Manulife VP Explains AI's Role in Insurance

#artificialintelligence

Manulife's Lab of Forward Thinking (LOFT) is working to accelerate the company's adoption of new technologies. In April, the Canadian insurer's U.S. unit, John Hancock, took a deep dive into blockchain, collaborating with the startups ConsenSys and BlockApps. More recently, Manulife began exploring artificial intelligence and deep learning in its innovation labs through partnerships with Nervana Systems and indico data solutions. Once complete, the resulting platform will serve as a means for research analysts to analyze and decipher natural language from public sources to make smarter investment decisions, the company said. Jesse Bean, LOFT's chief experience officer and VP, head of innovation, global solutions delivery, spoke with INN about how the insurer is exploring deep learning and how the technology fits into Manulife's innovation strategy.