Collaborating Authors

 sengupta


Robust and Efficient Fine-tuning of LLMs with Bayesian Reparameterization of Low-Rank Adaptation

Sengupta, Ayan, Seth, Vaibhav, Pathak, Arinjay, Raman, Natraj, Gopalakrishnan, Sriram, Chakraborty, Tanmoy

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are highly resource-intensive to fine-tune due to their enormous size. While low-rank adaptation is a prominent parameter-efficient fine-tuning approach, it suffers from sensitivity to hyperparameter choices, leading to instability in model performance on fine-tuning downstream tasks. This paper highlights the importance of effective parameterization in low-rank fine-tuning to reduce estimator variance and enhance the stability of final model outputs. We propose MonteCLoRA, an efficient fine-tuning technique, employing Monte Carlo estimation to learn an unbiased posterior estimation of low-rank parameters with low expected variance, which stabilizes fine-tuned LLMs with only O(1) additional parameters. MonteCLoRA shows significant improvements in accuracy and robustness, achieving up to 3.8% higher accuracy and 8.6% greater robustness than existing efficient fine-tuning methods on natural language understanding tasks with pre-trained RoBERTa-base. Furthermore, in generative tasks with pre-trained LLaMA-1-7B, MonteCLoRA demonstrates robust zero-shot performance with 50% lower variance than the contemporary efficient fine-tuning methods. The theoretical and empirical results presented in the paper underscore how parameterization and hyperpriors balance exploration-exploitation in the low-rank parametric space, therefore leading to more optimal and robust parameter estimation during efficient fine-tuning.
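
As a rough illustration of the idea in this abstract, the sketch below shows a low-rank update added to a frozen weight matrix and a Monte Carlo average over randomly perturbed low-rank factors. It is not the authors' MonteCLoRA implementation: the Gaussian perturbation, the function names, and the parameters `sigma` and `n_samples` are assumptions made for illustration, and the abstract does not specify the actual sampling distribution or hyperpriors.

```python
import numpy as np

def lora_forward(x, W0, A, B, scale=1.0):
    """Apply a frozen weight W0 plus a low-rank update B @ A.

    Shapes: x (batch, d_in), W0 (d_out, d_in), A (r, d_in), B (d_out, r).
    """
    return x @ (W0 + scale * (B @ A)).T

def monte_carlo_lora_forward(x, W0, A, B, sigma=0.01, n_samples=8, seed=0):
    """Average the output over noisy copies of the low-rank factor A.

    ASSUMPTION: isotropic Gaussian noise on A stands in for whatever
    posterior the paper samples from; this is only a toy stand-in.
    """
    rng = np.random.default_rng(seed)
    outputs = []
    for _ in range(n_samples):
        A_sample = A + sigma * rng.standard_normal(A.shape)
        outputs.append(lora_forward(x, W0, A_sample, B))
    return np.mean(outputs, axis=0)
```

With `sigma=0` the Monte Carlo average reduces to the plain low-rank forward pass, which makes a convenient sanity check. The trainable factors hold `r * (d_in + d_out)` values, far fewer than the `d_in * d_out` values of the frozen matrix when `r` is small.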


SIGMA: Single Interpolated Generative Model for Anomalies

Das, Ranit, Shih, David

arXiv.org Artificial Intelligence

A key step in any resonant anomaly detection search is accurate modeling of the background distribution in each signal region. Data-driven methods like CATHODE accomplish this by training separate generative models on the complement of each signal region, and interpolating them into their corresponding signal regions. Having to re-train the generative model on essentially the entire dataset for each signal region is a major computational cost in a typical sliding window search with many signal regions. Here, we present SIGMA, a new, fully data-driven, computationally-efficient method for estimating background distributions. The idea is to train a single generative model on all of the data and interpolate its parameters in sideband regions in order to obtain a model for the background in the signal region. The SIGMA method significantly reduces the computational cost compared to previous approaches, while retaining a similar high quality of background modeling and sensitivity to anomalous signals.
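
The interpolation step described in this abstract can be pictured with a toy example: take one model parameter as a function of mass, discard the values inside the signal region, and interpolate from the sidebands. This is only a sketch under strong assumptions. The function name, the single scalar parameter, and the use of linear interpolation are all hypothetical, and the abstract does not say how SIGMA actually parameterizes or interpolates its generative model.

```python
import numpy as np

def interpolate_background_param(mass_centers, params, signal_lo, signal_hi, query_mass):
    """Estimate a model parameter inside the signal region from sidebands only.

    ASSUMPTION: `params` holds one scalar model parameter per mass bin and
    `mass_centers` is sorted in increasing order. Values whose mass lies in
    [signal_lo, signal_hi] are ignored, so a possible signal there cannot
    leak into the background estimate.
    """
    mass_centers = np.asarray(mass_centers, dtype=float)
    params = np.asarray(params, dtype=float)
    in_sideband = (mass_centers < signal_lo) | (mass_centers > signal_hi)
    return float(np.interp(query_mass, mass_centers[in_sideband], params[in_sideband]))
```

For example, with bin centers `[1, 2, 3, 4, 5]`, parameter values `[10, 20, 99, 40, 50]`, and a signal region from 2.5 to 3.5, the outlying value 99 is ignored and the estimate at mass 3 is interpolated from the neighbors at 2 and 4.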


AI and the future of work: Everything is about to change

#artificialintelligence

In just a few months, you'll be able to ask a virtual assistant to transcribe meeting notes during a work call, summarize long email threads to quickly draft suggested replies, quickly create a specific chart in Excel, and turn a Word document into a PowerPoint presentation in seconds. Over the past week, a rapidly evolving artificial intelligence landscape seemed to leap ahead again. Microsoft and Google each unveiled new AI-powered features for their signature productivity tools and OpenAI introduced its next-generation version of the technology that underpins its viral chatbot tool, ChatGPT. Suddenly, AI tools, which have long operated in the background of many services, are now more powerful and more visible across a wide and growing range of workplace tools. Google's new features, for example, promise to help "brainstorm" and "proofread" written work in Docs.


Artificial Intelligence Needs To Speak The Language Of Business, Not The Other Way Around

#artificialintelligence

Almost every business leader on the planet, 94%, believes AI will be critical to success over the next five years. Yet, as Deloitte's latest research on the state of AI finds, many companies aren't achieving the value they anticipated -- there has been a 29% increase in the share of respondents who identify as AI "underachievers" this year compared with last year. Issues diminishing the impact of AI include challenges improving its business value and a lack of full executive commitment, the Deloitte survey shows. Industry leaders and observers in the trenches agree that it is these organizational issues, rather than technical issues, that are holding back progress. An important point is that AI needs to serve the customer, and help the business put the customer front and center.


Can companies make decisions with AI?

#artificialintelligence

AI can play many roles in the technology stack of a modern enterprise. Its performance as a neutral, data-based, analytical advisor could allow businesses to use algorithms to predict whether a decision is the right one. AI-based decisions are part of an arsenal of tools leveraged by technology high performers. Businesses led by digitally savvy leaders, those who champion emerging technologies such as AI, outperform other like-sized businesses by 48% on valuation and revenue growth, according to one MIT research study. "The integration of traditional decisioning into AI is really just starting to hit its stride right now," said Rowan Curran, analyst at Forrester.


How Colleges Are Using Artificial Intelligence To Improve Enrollment And Retention

#artificialintelligence

More colleges are using artificial intelligence to increase their enrollments, target financial aid and improve retention rates. Artificial intelligence (AI) has gradually become accepted by colleges and universities as an effective and efficient tool for automating a number of tasks. Chatbots can answer students' questions about class scheduling or check in with them about their mental health. AI-generated emails can remind students about important deadlines and prompt them to register for classes, turn in assignments and pay their fees on time. And, in a particularly controversial use, AI-based software is increasingly able to detect plagiarized assignments.


Key insights that will help you make the most of AI

#artificialintelligence

It's easy to get sucked into the hype around artificial intelligence (AI), but just as easy to get duped into thinking it's all hype. The truth is somewhere in the middle. Or, as tech luminary Mike Olson suggested, "The breathless attention paid to AGI and self-driving cars and whatnot blinds [us] to the value of narrowly-focused AI applications." By "narrowly focused" he was referring to the DeepMind announcement that it had released the "predicted structures for nearly all catalogued proteins known to science". This advance dramatically opens access to protein structures, thereby accelerating scientific discovery in fields as diverse as medicine and climate change. But the AI used is narrow in the sense that it isn't some sentient machine, thinking through protein structures.


Amazon drones may start to deliver packages in Northern California this year

Los Angeles Times

Amazon plans to begin delivering some packages by drone to homes in a few Northern California communities this year, the company said Monday. Residents of San Joaquin County farming towns Lockeford and Acampo, as well as parts of Lodi, will be able to order "thousands of everyday items" online and can expect a drone to drop them in their backyards in less than an hour, said Av Zammit, an Amazon spokesperson. The Amazon Prime Air drones can carry packages that weigh 5 pounds or less -- such as beauty and cosmetic items, office and tech supplies, batteries and household items -- and will typically be the size of a large shoebox, Zammit said. The company is building a facility in Lockeford from which the drones will launch. Though Amazon Prime Air received certification to commercially fly cargo in 2020, it is still seeking approval from the Federal Aviation Administration and county officials for its plans in San Joaquin County.


Sengupta

AAAI Conferences

Recent works on gradient-based attacks and universal perturbations can adversarially modify images to bring down the accuracy of state-of-the-art classification techniques based on deep neural networks to as low as 10% on popular datasets like MNIST and ImageNet. The design of general defense strategies against a wide range of such attacks remains a challenging problem. In this paper, we derive inspiration from recent advances in the fields of cybersecurity and multi-agent systems and propose to use the concept of Moving Target Defense (MTD) for increasing the robustness of a set of deep networks against such adversarial attacks. To this end, we formalize and exploit the notion of differential immunity of an ensemble of networks to specific attacks. To classify an input image, a trained network is picked from this set of networks by formulating the interaction between a Defender (who hosts the classification networks) and their (Legitimate and Malicious) Users as a repeated Bayesian Stackelberg Game (BSG). We empirically show that our approach, MTDeep, reduces misclassification on perturbed images for MNIST and ImageNet datasets while maintaining high classification accuracy on legitimate test images. Lastly, we demonstrate that our framework can be used in conjunction with any existing defense mechanism to provide more resilience to adversarial attacks than those defense mechanisms by themselves.
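
The selection step in this abstract, picking one trained network per input, can be sketched as sampling from a probability distribution over the ensemble. This is a minimal illustration, not the MTDeep code: the function names are hypothetical, and the mixed strategy is simply passed in as an argument, whereas the paper derives it by solving the Bayesian Stackelberg Game.

```python
import random

def pick_classifier(classifiers, strategy, rng):
    """Sample one classifier according to a mixed strategy.

    ASSUMPTION: `strategy` is a list of probabilities, one per classifier,
    summing to 1. How to compute it is outside the scope of this sketch.
    """
    if len(classifiers) != len(strategy):
        raise ValueError("need one probability per classifier")
    if abs(sum(strategy) - 1.0) > 1e-9:
        raise ValueError("strategy must sum to 1")
    return rng.choices(classifiers, weights=strategy, k=1)[0]

def classify(image, classifiers, strategy, rng):
    """Classify one input with a freshly sampled network from the ensemble."""
    return pick_classifier(classifiers, strategy, rng)(image)
```

Because the network is re-sampled for every input, an attacker cannot know in advance which model a perturbed image will face, which is the intuition behind a moving target defense.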


Sengupta

AAAI Conferences

Proactive Decision Support (PDS) aims at improving the decision making experience of human decision makers by enhancing both the quality of the decisions and the ease of making them. In this paper, we ask what role automated decision-making technologies can play in the deliberative process of the human decision maker. Specifically, we focus on expert humans in the loop who now share a detailed, if not complete, model of the domain with the assistant, but may still be unable to compute plans due to cognitive overload. To this end, we propose a PDS framework, RADAR, based on research in the automated planning community, that aids the human decision maker in constructing plans. We will situate our discussion on principles of interface design laid out in the literature on the degrees of automation and its effect on the collaborative decision-making process. Also, at the heart of our design is the principle of naturalistic decision making, which has been shown to be a necessary requirement of such systems, thus focusing more on providing suggestions rather than enforcing decisions and executing actions. We will demonstrate the different properties of such a system through examples in a fire-fighting domain, where human commanders are involved in building response strategies to mitigate a fire outbreak. The paper is written to serve both as a position paper, by motivating requirements of an effective proactive decision support system, and as an emerging application of these ideas in the context of the role of an automated planner in human decision making, in a platform that can prove to be a valuable test bed for research on the same.