Lokam, Satya
Exploring Continual Fine-Tuning for Enhancing Language Ability in Large Language Models
Aggarwal, Divyanshu, Damle, Sankarshan, Goyal, Navin, Lokam, Satya, Sitaram, Sunayana
A common challenge in adapting Large Language Models (LLMs) is enabling them to learn new languages over time without degrading performance on languages in which they are already proficient (usually English). Continual fine-tuning (CFT) is the process of sequentially fine-tuning an LLM so that it can adapt to downstream tasks with varying data distributions and time shifts. This paper focuses on the language adaptability of LLMs through CFT. We study a two-phase CFT process in which an English-only, end-to-end fine-tuned LLM from Phase 1 (predominantly Task Ability) is sequentially fine-tuned on a multilingual dataset -- comprising task data in new languages -- in Phase 2 (predominantly Language Ability). We observe that the "similarity" of the Phase 2 tasks to those of Phase 1 determines the LLM's adaptability. When the phase-wise datasets are similar, the LLM after Phase 2 shows no deterioration in task ability; when they are not, its task ability deteriorates. We test our hypothesis on the open-source Mistral and Llama models with multiple phase-wise dataset pairs. To address the deterioration, we analyze tailored variants of two CFT methods, layer freezing and generative replay, and demonstrate their effectiveness in enhancing the language ability of LLMs while preserving task performance, relative to relevant baselines.
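Below is a minimal sketch of the two-phase CFT setup with layer freezing, one of the two variants named above. The toy PyTorch model, the choice to freeze the lower half of the layers in Phase 2, and the synthetic batches are illustrative assumptions, not the paper's actual configuration.

import torch
import torch.nn as nn

class ToyLM(nn.Module):
    # A tiny stand-in for an LLM: embedding, transformer layers, LM head.
    def __init__(self, vocab=100, dim=32, n_layers=4):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(dim, nhead=4, dim_feedforward=64,
                                       batch_first=True)
            for _ in range(n_layers))
        self.head = nn.Linear(dim, vocab)

    def forward(self, x):
        h = self.embed(x)
        for layer in self.layers:
            h = layer(h)
        return self.head(h)

def fine_tune(model, batches, steps=10):
    # Optimize only the parameters that are still trainable.
    opt = torch.optim.AdamW(
        (p for p in model.parameters() if p.requires_grad), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()
    for _, (x, y) in zip(range(steps), batches):
        loss = loss_fn(model(x).flatten(0, 1), y.flatten())
        opt.zero_grad(); loss.backward(); opt.step()

def fake_batches():
    # Placeholder for real Phase 1 (English task) / Phase 2 (multilingual) data.
    while True:
        x = torch.randint(0, 100, (8, 16))
        yield x, x  # dummy targets; a real pipeline would use shifted tokens

model = ToyLM()
fine_tune(model, fake_batches())           # Phase 1: all parameters trainable
for layer in model.layers[:len(model.layers) // 2]:
    for p in layer.parameters():           # Phase 2: freeze lower layers to
        p.requires_grad = False            # help preserve Phase 1 task ability
fine_tune(model, fake_batches())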
SLIP: Securing LLMs IP Using Weights Decomposition
Refael, Yehonathan, Hakim, Adam, Greenberg, Lev, Aviv, Tal, Lokam, Satya, Fishman, Ben, Seidman, Shachar
Large language models (LLMs) have recently seen widespread adoption in both academia and industry. As these models grow, they become valuable intellectual property (IP), reflecting enormous investments by their owners. Moreover, the high cost of cloud-based deployment has driven interest toward deploying LLMs on edge devices, yet this risks exposing valuable model parameters to theft and unauthorized use. Current methods for protecting models' IP on the edge fall short in practicality, accuracy, or suitability to deployment requirements. In this paper, we introduce a novel hybrid inference algorithm, named SLIP, designed to protect edge-deployed models from theft. SLIP is the first hybrid protocol that is both practical for real-world applications and provably secure, with zero accuracy degradation and minimal impact on latency. It partitions the model between two computing resources: one secure but expensive, the other cost-effective but vulnerable. Through matrix decomposition, the secure resource retains a maximally sensitive portion of the model's IP while performing a minimal amount of computation, and vice versa for the vulnerable resource. Importantly, the protocol includes security guarantees that prevent attackers from exploiting the partition to infer the secured information. Finally, we present experimental results demonstrating the robustness and effectiveness of our method, positioning it as a compelling solution for protecting LLMs.
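A minimal sketch of the weight-decomposition idea, assuming an SVD-based split in which the top singular directions of a weight matrix stay on the secure resource and the residual goes to the vulnerable edge device; the rank k and the single-matrix setting are illustrative, not the protocol's actual partitioning rule or its security mechanism.

import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))            # one model weight matrix
U, s, Vt = np.linalg.svd(W, full_matrices=False)

k = 4                                        # small but information-dense share
W_secure = (U[:, :k] * s[:k]) @ Vt[:k, :]    # top-k component: secure resource
W_edge = W - W_secure                        # residual: vulnerable edge device

def hybrid_matmul(x):
    # Each resource applies only its own share; summing the partial outputs
    # reproduces the full product exactly, so accuracy is unchanged.
    return x @ W_edge.T + x @ W_secure.T

x = rng.standard_normal((1, 64))
assert np.allclose(hybrid_matmul(x), x @ W.T)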
On the Query Complexity of Training Data Reconstruction in Private Learning
Mukherjee, Prateeti, Lokam, Satya
We analyze the number of queries that a whitebox adversary needs to make to a private learner in order to reconstruct its training data. For (ϵ, δ)-DP learners with training data drawn from any arbitrary compact metric space, we provide the first known lower bounds on the adversary's query complexity as a function of the learner's privacy parameters. Our results are minimax optimal for every ϵ ≥ 0, δ ∈ [0, 1], covering both ϵ-DP and (0, δ)-DP as corollaries. Beyond this, we obtain query complexity lower bounds for (α, ϵ)-Rényi DP learners that are valid for any α > 1, ϵ ≥ 0. Finally, we analyze data reconstruction attacks on locally compact metric spaces via the framework of Metric DP, a generalization of DP that accounts for the underlying metric structure of the data. In this setting, we provide the first known analysis of data reconstruction in unbounded, high-dimensional spaces and obtain query complexity lower bounds that are nearly tight modulo logarithmic factors.
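For reference, the standard (ϵ, δ)-DP guarantee that these privacy parameters refer to (a textbook definition, included only as context for the bounds above): a randomized learner M is (ϵ, δ)-DP if, for every pair of adjacent training sets D, D' and every measurable event S,

\Pr[M(D) \in S] \;\le\; e^{\epsilon} \, \Pr[M(D') \in S] + \delta ,

and (α, ϵ)-Rényi DP analogously requires D_α(M(D) ‖ M(D')) ≤ ϵ, where D_α is the Rényi divergence of order α.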