Collaborating Authors: lance



LANCE: Stress-testing Visual Models by Generating Language-guided Counterfactual Images

Neural Information Processing Systems

We propose an automated algorithm to stress-test a trained visual model by generating language-guided counterfactual test images (LANCE). Our method leverages recent progress in large language modeling and text-based image editing to augment an IID test set with a suite of diverse, realistic, and challenging test images without altering model weights. We benchmark the performance of a diverse set of pre-trained models on our generated data and observe significant and consistent performance drops. We further analyze model sensitivity across different types of edits, and demonstrate its applicability at surfacing previously unknown class-level model biases in ImageNet.



LANCE: Low Rank Activation Compression for Efficient On-Device Continual Learning

Apolinario, Marco Paul E., Roy, Kaushik

arXiv.org Artificial Intelligence

On-device learning is essential for personalization, privacy, and long-term adaptation in resource-constrained environments. Achieving this requires efficient learning, both fine-tuning existing models and continually acquiring new tasks without catastrophic forgetting. Yet both settings are constrained by the high memory cost of storing activations during backpropagation. Existing activation compression methods reduce this cost but rely on repeated low-rank decompositions, introducing computational overhead; moreover, such methods have not been explored for continual learning. We propose LANCE (Low-rank Activation Compression), a framework that performs a one-shot higher-order Singular Value Decomposition (SVD) to obtain a reusable low-rank subspace for activation projection. This eliminates repeated decompositions, reducing both memory and computation. Moreover, fixed low-rank subspaces further enable on-device continual learning by allocating tasks to orthogonal subspaces without storing large task-specific matrices. Experiments show that LANCE reduces activation storage by up to 250$\times$ while maintaining accuracy comparable to full backpropagation on the CIFAR-10/100, Oxford-IIIT Pets, Flowers102, and CUB-200 datasets. On continual learning benchmarks (Split CIFAR-100, Split MiniImageNet, 5-Datasets), it achieves performance competitive with orthogonal gradient projection methods at a fraction of the memory cost. These results position LANCE as a practical and scalable solution for efficient fine-tuning and continual learning on edge devices.
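The core mechanism the abstract describes, fitting a low-rank basis once via SVD and then reusing it to project activations, can be sketched in a few lines of NumPy. This is a minimal illustration of the general idea, not the paper's implementation; the function names, the plain (rather than higher-order) SVD, and the rank parameter are all assumptions for the sake of the example.

```python
import numpy as np

def fit_subspace(activations, rank):
    """One-shot SVD on a batch of activations (n_samples x d).
    Returns a (d, rank) orthonormal basis for the dominant row space."""
    _, _, vt = np.linalg.svd(activations, full_matrices=False)
    return vt[:rank].T  # top-`rank` right singular vectors

def compress(activations, basis):
    """Project activations onto the fixed low-rank subspace: (n, rank)."""
    return activations @ basis

def decompress(coeffs, basis):
    """Approximate reconstruction for use in the backward pass: (n, d)."""
    return coeffs @ basis.T

# Synthetic activations that are exactly rank-8 in a 64-dim feature space.
rng = np.random.default_rng(0)
acts = rng.normal(size=(256, 8)) @ rng.normal(size=(8, 64))

basis = fit_subspace(acts, rank=8)
recon = decompress(compress(acts, basis), basis)
err = np.linalg.norm(acts - recon) / np.linalg.norm(acts)
ratio = acts.size / compress(acts, basis).size
print(f"relative reconstruction error: {err:.2e}, storage ratio: {ratio:.0f}x")
```

Because the basis is fixed after the one-shot fit, subsequent batches are compressed with a single matrix multiply, which is where the claimed savings over repeated per-batch decompositions come from.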


What Do Commercials About A.I. Really Promise?

The New Yorker

If a recent crop of commercials touting the benefits of artificial intelligence is any indication, lots of Americans these days feel unduly burdened by the demands of everyday cognition. Apparently, it's asking way too much to expect a human to figure out how to make a small repair, or write a note to a friend, or plan a meal to feed a child. I've got a perverse favorite among these ads, for Apple Intelligence. A sharp-looking Black man of maybe fifty named Lance sits down at a drab, clean conference table full of colleagues. Somebody asks him if he's read a "prospectus," and Lance decides to lie about it.


Language Models as Continuous Self-Evolving Data Engineers

Wang, Peidong, Wang, Ming, Ma, Zhiming, Yang, Xiaocui, Feng, Shi, Wang, Daling, Zhang, Yifei

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have demonstrated remarkable capabilities on various tasks, but their further improvement is limited by the lack of high-quality training data. In addition, traditional training approaches rely heavily on expert-labeled data, setting an upper limit on the performance of LLMs. To address this issue, we propose LANCE, a novel paradigm that enables LLMs to train themselves by autonomously generating, cleaning, reviewing, and annotating data with preference information. Our approach demonstrates that LLMs can serve as continuous self-evolving data engineers, significantly reducing the time and cost of post-training data construction. Through iterative fine-tuning of different variants of Qwen2, we validate the effectiveness of LANCE across various tasks, showing that it can continuously improve model performance and maintain high-quality data generation. Across eight benchmark dimensions, LANCE yields an average score improvement of 3.36 for Qwen2-7B and 2.70 for Qwen2-7B-Instruct. This training paradigm with autonomous data construction not only reduces reliance on human experts and external models but also ensures that the data align with human values and preferences, paving the way for future superintelligent systems that can exceed human capabilities.
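The generate–clean–review–annotate–fine-tune cycle the abstract describes can be sketched as a simple loop. Every function name below is a placeholder standing in for a model-driven step; the actual prompts, filters, and preference format are specified in the paper, not here.

```python
def self_evolve(model, n_rounds, generate, clean, review, annotate, finetune):
    """Hedged sketch of a self-evolving data-engineering loop: the model
    produces its own training data each round and is fine-tuned on it."""
    for _ in range(n_rounds):
        raw = generate(model)                        # model drafts candidate samples
        kept = [s for s in raw if clean(s)]          # drop malformed/low-quality ones
        scored = [(s, review(model, s)) for s in kept]  # model self-reviews survivors
        prefs = annotate(scored)                     # build preference-annotated data
        model = finetune(model, prefs)               # update the model on its own data
    return model

# Toy demo: "model" is just a counter; each round keeps the even samples
# and "fine-tuning" adds the number of preference pairs produced.
toy = self_evolve(
    model=0,
    n_rounds=3,
    generate=lambda m: list(range(4)),
    clean=lambda s: s % 2 == 0,
    review=lambda m, s: 1.0,
    annotate=lambda scored: scored,
    finetune=lambda m, prefs: m + len(prefs),
)
print(toy)
```

The point of the skeleton is the data flow: no external annotator appears anywhere in the loop, which is the property the abstract highlights.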


LANCE: Stress-testing Visual Models by Generating Language-guided Counterfactual Images

Prabhu, Viraj, Yenamandra, Sriram, Chattopadhyay, Prithvijit, Hoffman, Judy

arXiv.org Artificial Intelligence

We propose an automated algorithm to stress-test a trained visual model by generating language-guided counterfactual test images (LANCE). Our method leverages recent progress in large language modeling and text-based image editing to augment an IID test set with a suite of diverse, realistic, and challenging test images without altering model weights. We benchmark the performance of a diverse set of pre-trained models on our generated data and observe significant and consistent performance drops. We further analyze model sensitivity across different types of edits, and demonstrate its applicability at surfacing previously unknown class-level model biases in ImageNet. Code is available at https://github.com/virajprabhu/lance.
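The stress-testing loop the abstract outlines, captioning an image, having a language model propose a label-preserving edit, rendering the counterfactual with a text-based image editor, and checking whether the classifier's prediction flips, can be sketched as follows. The component interfaces are illustrative assumptions, not the repository's actual API (see the linked code for that).

```python
def stress_test(images, caption, propose_edit, edit_image, classify):
    """Hedged sketch: collect counterfactual images that flip a model's
    prediction even though the edit should not change the true class."""
    failures = []
    for img, label in images:
        cap = caption(img)                       # describe the original image
        new_cap = propose_edit(cap)              # e.g., change background/attribute
        counterfactual = edit_image(img, cap, new_cap)  # text-guided image edit
        pred = classify(counterfactual)
        if pred != label:                        # label-preserving edit broke the model
            failures.append((img, new_cap, pred))
    return failures
```

A toy invocation with stub components (identity "editor", constant classifier) shows the shape of the output: a list of (image, edited caption, wrong prediction) triples that can then be aggregated per class to surface systematic biases.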


ApoQlar's VSI Promises the Future of Surgery with Hololens

#artificialintelligence

Surgery is a major hub for medical simulation innovation, and apoQlar continues that trend with their Virtual Surgery Intelligence (VSI) line of products which can be used during procedures, for education, and for patient teaching. This German-based startup seeks to move the world, and focuses on creative and active ways for healthcare professionals to learn. VSI is a smart solution using artificial intelligence which displays MRI and CT images in 3D inside a Mixed/Augmented Reality headset. These images are generated from the original CT or MRI scans. The VSI creates a comprehensive representation, including all anatomical structures, which can be moved around effortlessly in the user's field of vision.


Training our humans on the wrong dataset

#artificialintelligence

I really don't want to say that I've figured out the majority of what's wrong with modern education and how to fix it, BUT: when we train (fit) any given ML model for a specific problem for which we have a training dataset, there are several ways we can go about it, and all of them involve using that dataset. Say we're training a model that takes a 2D image of some glassware and turns it into a 3D rendering. We have images of 2,000 glasses from different angles and in different lighting conditions, each with an associated 3D model. How do we go about training the model? Well, arguably, we could start small and then feed in the whole dataset; we could use different sizes for the train/test/validation splits; we could use cross-validation to determine the overall accuracy of our method, or decide it would take too long, etc. But I'm fairly sure that nobody will ever say: I know, let's take a dataset of 2D images of cars and their 3D renderings and train the model on that first.