AITopics

Originally rooted in game theory, the Shapley Value (SV) has recently become an important tool in machine learning research. Perhaps most notably, it is used for feature attribution and data valuation in explainable artificial intelligence. Shapley Interactions (SIs) naturally extend the SV and address its limitations by assigning joint contributions to groups of entities, which enhances understanding of black-box machine learning models. Due to the exponential complexity of computing SVs and SIs, various methods have been proposed that exploit structural assumptions or yield probabilistic estimates given limited resources. In this work, we introduce shapiq, an open-source Python package that unifies state-of-the-art algorithms to efficiently compute SVs and any-order SIs in an application-agnostic framework. Moreover, it includes a benchmarking suite containing 11 machine learning applications of SIs with pre-computed games and ground-truth values to systematically assess computational performance across domains. For practitioners, shapiq is able to explain and visualize any-order feature interactions in predictions of models, including vision transformers and language models, as well as XGBoost and LightGBM with TreeSHAP-IQ. With shapiq, we extend shap beyond feature attributions and consolidate the application of SVs and SIs in machine learning, which facilitates future research. The source code and documentation are available at https://github.com/mmschlk/shapiq.
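To make the exponential cost mentioned in the abstract concrete, here is a minimal sketch of exact Shapley value computation by brute-force coalition enumeration. It does not use the shapiq API; the function names and the three-player `toy_game` are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values by enumerating every coalition (2^(n-1) per player)."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for size in range(n):
            # Standard Shapley weight for a coalition S with |S| = size:
            # |S|! * (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                # Marginal contribution of player i when joining coalition S.
                total += weight * (value(s | {i}) - value(s))
        phi[i] = total
    return phi


def toy_game(coalition):
    """Hypothetical game: additive payoffs plus a bonus when 'a' and 'b' cooperate."""
    base = {"a": 1.0, "b": 2.0, "c": 3.0}
    payoff = sum(base[p] for p in coalition)
    if {"a", "b"} <= coalition:
        payoff += 4.0  # pure pairwise interaction between a and b
    return payoff


phi = shapley_values(["a", "b", "c"], toy_game)
print(phi)
```

In this toy game the bonus of 4 is split evenly between `a` and `b`, so the attributions alone cannot show that it comes from their interaction. That is the kind of limitation the abstract says Shapley Interactions address.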

config, model evaluation, shapiq, (12 more...)

2410.01649

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > Poland > Masovia Province > Warsaw (0.04)
Europe > Netherlands (0.04)
(3 more...)

Genre:

Overview (0.92)
Research Report (0.82)

Industry: Leisure & Entertainment > Games (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.90)

ACE: A LLM-based Negotiation Coaching System

Shea, Ryan, Kallala, Aymen, Liu, Xin Lucy, Morris, Michael W., Yu, Zhou

The growing prominence of LLMs has led to an increase in the development of AI tutoring systems. These systems are crucial in providing underrepresented populations with improved access to valuable education. One important area of education that is unavailable to many learners is strategic bargaining related to negotiation. To address this, we develop an LLM-based Assistant for Coaching nEgotiation (ACE). ACE not only serves as a negotiation partner for users but also provides them with targeted feedback for improvement. To build our system, we collect a dataset of negotiation transcripts between MBA students. These transcripts come from trained negotiators and emulate realistic bargaining scenarios. We use the dataset, along with expert consultations, to design an annotation scheme for detecting negotiation mistakes. ACE employs this scheme to identify mistakes and provide targeted feedback to users. To test the effectiveness of ACE-generated feedback, we conducted a user experiment with two consecutive trials of negotiation and found that it improves negotiation performance significantly compared to a system that provides no feedback and to one that uses an alternative method of providing feedback.

negotiation, participant, scenario, (17 more...)

2410.01555

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > California (0.04)
(6 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)
(2 more...)

Industry:

Education > Curriculum > Subject-Specific Education (0.88)
Education > Educational Technology > Educational Software > Computer Based Training (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)

InfiniPot: Infinite Context Processing on Memory-Constrained LLMs

Kim, Minsoo, Shim, Kyuhong, Choi, Jungwook, Chang, Simyung

Handling long input contexts remains a significant challenge for Large Language Models (LLMs), particularly in resource-constrained environments such as mobile devices. Our work aims to address this limitation by introducing InfiniPot, a novel KV cache control framework designed to enable pre-trained LLMs to manage extensive sequences within fixed memory constraints efficiently, without requiring additional training. InfiniPot leverages Continual Context Distillation (CCD), an iterative process that compresses and retains essential information through novel importance metrics, effectively maintaining critical data even without access to future context. Our comprehensive evaluations indicate that InfiniPot significantly outperforms models trained for long contexts in various NLP tasks, establishing its efficacy and versatility. This work represents a substantial advancement toward making LLMs applicable to a broader range of real-world scenarios.

context length, infinipot, long context, (15 more...)

2410.01518

Country:

North America > United States > Minnesota (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)

Genre:

Research Report (1.00)
Overview (0.67)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Peeling Back the Layers: An In-Depth Evaluation of Encoder Architectures in Neural News Recommenders

Iana, Andreea, Glavaš, Goran, Paulheim, Heiko

Encoder architectures play a pivotal role in neural news recommenders by embedding the semantic and contextual information of news and users. Thus, research has heavily focused on enhancing the representational capabilities of news and user encoders to improve recommender performance. Despite the significant impact of encoder architectures on the quality of news and user representations, existing analyses of encoder designs focus only on the overall downstream recommendation performance. This offers a one-sided assessment of the encoders' similarity, ignoring more nuanced differences in their behavior, and potentially resulting in sub-optimal model selection. In this work, we perform a comprehensive analysis of encoder architectures in neural news recommender systems. We systematically evaluate the most prominent news and user encoder architectures, focusing on their (i) representational similarity, measured with Centered Kernel Alignment, (ii) overlap of generated recommendation lists, quantified with the Jaccard similarity, and (iii) the overall recommendation performance. Our analysis reveals that the complexity of certain encoding techniques is often empirically unjustified, highlighting the potential for simpler, more efficient architectures. By isolating the effects of individual components, we provide valuable insights for researchers and practitioners to make better informed decisions about encoder selection and avoid unnecessary complexity in the design of news recommenders.
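The two similarity measures named in the abstract can be sketched in a few lines of plain Python. This is an illustrative sketch only: it uses the linear variant of centered kernel alignment (CKA), which may differ from the variant used in the paper, and the item IDs and embedding values are made up.

```python
def jaccard(list_a, list_b):
    """Jaccard similarity of two recommendation lists, treated as sets."""
    a, b = set(list_a), set(list_b)
    if not a and not b:
        return 1.0  # convention: two empty lists are identical
    return len(a & b) / len(a | b)


def _center(rows):
    """Subtract the column mean from every column of a row-major matrix."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def _cross_frobenius_sq(x, y):
    """Squared Frobenius norm of X^T Y for row-major matrices X and Y."""
    total = 0.0
    for col_x in zip(*x):
        for col_y in zip(*y):
            dot = sum(a * b for a, b in zip(col_x, col_y))
            total += dot * dot
    return total


def linear_cka(x, y):
    """Linear CKA: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F) on centered data."""
    x, y = _center(x), _center(y)
    denom = _cross_frobenius_sq(x, x) ** 0.5 * _cross_frobenius_sq(y, y) ** 0.5
    return _cross_frobenius_sq(x, y) / denom


# Hypothetical top-4 lists from two recommenders: 2 shared items out of 6 distinct.
top_a = ["n1", "n2", "n3", "n4"]
top_b = ["n3", "n4", "n5", "n6"]
print(jaccard(top_a, top_b))

# Hypothetical embeddings of the same 4 items from two encoders (rows = items).
x = [[1.0, 2.0], [2.0, 0.5], [0.0, 1.0], [3.0, 3.0]]
# y is x rotated by 90 degrees and scaled by 2; linear CKA ignores both changes.
y = [[2.0 * b, -2.0 * a] for a, b in x]
print(linear_cka(x, y))
```

Because linear CKA is invariant to rotation and uniform scaling, two encoders can have a CKA close to 1 even when their raw embedding values look very different.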

architecture, recommendation, similarity, (14 more...)

2410.01470

Country:

Europe > Italy > Apulia > Bari (0.05)
Europe > Germany > Bavaria > Lower Franconia > Würzburg (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(3 more...)

Genre:

Overview (0.68)
Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Social Conjuring: Multi-User Runtime Collaboration with AI in Building Virtual 3D Worlds

Kobenova, Amina, DeVeaux, Cyan, Parajuli, Samyak, Banburski-Fahey, Andrzej, Fernandez, Judith Amores, Lanier, Jaron

Generative artificial intelligence has shown promise in prompting virtual worlds into existence, yet little attention has been given to understanding how this process unfolds as social interaction. We present Social Conjurer, a framework for AI-augmented dynamic 3D scene co-creation, where multiple users collaboratively build and modify virtual worlds in real-time. Through an expanded set of interactions, including social and tool-based engagements as well as spatial reasoning, our framework facilitates the creation of rich, diverse virtual environments. Findings from a preliminary user study (N=12) provide insight into the user experience of this approach, how social contexts shape the prompting of spatial environments, and perspective on social applications of prompt-based 3D co-creation. In addition to highlighting the potential of AI-supported multi-user world creation and offering new pathways for AI-augmented creative processes in VR, this article presents a set of implications for designing human-centered interfaces that incorporate AI models into 3D content generation.

manuscript, multi-user runtime collaboration, participant, (14 more...)

2410.00274

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(6 more...)

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (1.00)
Overview (0.92)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Education (1.00)
Information Technology (0.93)
Energy (0.67)

Technology:

Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (1.00)
Information Technology > Communications > Collaboration (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(4 more...)

Revisiting Optimism and Model Complexity in the Wake of Overparameterized Machine Learning

Patil, Pratik, Du, Jin-Hong, Tibshirani, Ryan J.

arXiv.org Machine Learning, Oct-2-2024

Common practice in modern machine learning involves fitting a large number of parameters relative to the number of observations. These overparameterized models can exhibit surprising generalization behavior, e.g., "double descent" in the prediction error curve when plotted against the raw number of model parameters, or another simplistic notion of complexity. In this paper, we revisit model complexity from first principles, by first reinterpreting and then extending the classical statistical concept of (effective) degrees of freedom. Whereas the classical definition is connected to fixed-X prediction error (in which prediction error is defined by averaging over the same, nonrandom covariate points as those used during training), our extension of degrees of freedom is connected to random-X prediction error (in which prediction error is averaged over a new, random sample from the covariate distribution). The random-X setting more naturally embodies modern machine learning problems, where highly complex models, even those complex enough to interpolate the training data, can still lead to desirable generalization performance under appropriate conditions. We demonstrate the utility of our proposed complexity measures through a mix of conceptual arguments, theory, and experiments, and illustrate how they can be used to interpret and compare arbitrary prediction models.
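As background for the classical (fixed-X) notion the abstract starts from: for a linear smoother whose fitted values are ŷ = S y, the effective degrees of freedom equal the trace of S. The sketch below illustrates only this classical definition, not the random-X extension proposed in the paper. The k-nearest-neighbor smoother and the sample points are hypothetical, and the inputs are assumed to be distinct.

```python
def knn_smoother_matrix(xs, k):
    """Smoother matrix S of k-nearest-neighbor averaging on 1-D inputs.

    Row i holds the weights that produce the fitted value at xs[i]:
    1/k on the k training points closest to xs[i] (including itself), else 0.
    Assumes the values in xs are distinct, so each point is its own nearest neighbor.
    """
    n = len(xs)
    matrix = []
    for i in range(n):
        nearest = sorted(range(n), key=lambda j: abs(xs[j] - xs[i]))[:k]
        matrix.append([1.0 / k if j in nearest else 0.0 for j in range(n)])
    return matrix


def degrees_of_freedom(smoother):
    """Classical effective degrees of freedom of a linear smoother: trace(S)."""
    return sum(smoother[i][i] for i in range(len(smoother)))


xs = [0.0, 1.0, 2.5, 4.0, 4.5, 6.0, 7.5, 9.0]  # 8 distinct training inputs
for k in (1, 4, 8):
    df = degrees_of_freedom(knn_smoother_matrix(xs, k))
    print(f"k={k}: df={df}")
```

Each diagonal entry is 1/k, so the trace is n/k. With k = 1 the smoother interpolates the training data and uses all n = 8 degrees of freedom, while k = n reduces to the global mean with a single degree of freedom.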

freedom, random-x degree, regression, (17 more...)

2410.01259

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Genre:

Research Report (0.50)
Overview (0.45)

Industry: Education (0.34)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)

Integrating Reasoning Systems for Trustworthy AI, Proceedings of the 4th Workshop on Logic and Practice of Programming (LPOP)

Nerode, Anil, Liu, Yanhong A.

Logical reasoning systems are essential for rigorous automatic reasoning. The focus of the 2024 Logic and Practice of Programming workshop is integrating reasoning systems for trustworthy AI, especially including integrating diverse models of programming with rules and constraints. Trustworthy AI requires programming with rules and constraints for expressing and solving knowledge-intensive inference and combinatorial problems. A wide range of programming models have been proposed, including but not limited to the following, and essentially all of them require or support imperative programming for use in practical applications.

large language model, logic & formal reasoning, machine learning, (17 more...)

2410.19738

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.28)
North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > Virginia > Albemarle County > Charlottesville (0.14)
(35 more...)

Genre:

Research Report (1.00)
Overview (1.00)
Instructional Material (0.93)

Industry:

Health & Medicine (1.00)
Education (1.00)
Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Recent Advances in Speech Language Models: A Survey

Cui, Wenqian, Yu, Dianzhi, Jiao, Xiaoqi, Meng, Ziqiao, Zhang, Guangyan, Wang, Qichao, Guo, Yiwen, King, Irwin

Large Language Models (LLMs) have recently garnered significant attention, primarily for their capabilities in text-based interactions. However, natural human interaction often relies on speech, necessitating a shift towards voice-based models. A straightforward approach to achieve this involves a pipeline of "Automatic Speech Recognition (ASR) + LLM + Text-to-Speech (TTS)", where input speech is transcribed to text, processed by an LLM, and then converted back to speech. Despite being straightforward, this method suffers from inherent limitations, such as information loss during modality conversion and error accumulation across the three stages. To address these issues, Speech Language Models (SpeechLMs) -- end-to-end models that generate speech without converting from text -- have emerged as a promising alternative. This survey paper provides the first comprehensive overview of recent methodologies for constructing SpeechLMs, detailing the key components of their architecture and the various training recipes integral to their development. Additionally, we systematically survey the various capabilities of SpeechLMs, categorize the evaluation metrics for SpeechLMs, and discuss the challenges and future research directions in this rapidly evolving field.

large language model, machine learning, natural language, (18 more...)

2410.03751

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Singapore (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
(12 more...)

Genre: Overview (1.00)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

PyRIT: A Framework for Security Risk Identification and Red Teaming in Generative AI System

Munoz, Gary D. Lopez, Minnich, Amanda J., Lutz, Roman, Lundeen, Richard, Dheekonda, Raja Sekhar Rao, Chikanov, Nina, Jagdagdorj, Bolor-Erdene, Pouliot, Martin, Chawla, Shiven, Maxwell, Whitney, Bullwinkel, Blake, Pratt, Katherine, de Gruyter, Joris, Siska, Charlotte, Bryan, Pete, Westerhoff, Tori, Kawaguchi, Chang, Seifert, Christian, Kumar, Ram Shankar Siva, Zunger, Yonatan

Generative Artificial Intelligence (GenAI) is becoming ubiquitous in our daily lives. The increase in computational power and data availability has led to a proliferation of both single- and multi-modal models. As the GenAI ecosystem matures, the need for extensible and model-agnostic risk identification frameworks is growing. To meet this need, we introduce the Python Risk Identification Toolkit (PyRIT), an open-source framework designed to enhance red teaming efforts in GenAI systems. PyRIT is a model- and platform-agnostic tool that enables red teamers to probe for and identify novel harms, risks, and jailbreaks in multimodal generative AI models. Its composable architecture facilitates the reuse of core building blocks and allows for extensibility to future models and modalities. This paper details the challenges specific to red teaming generative AI systems, the development and features of PyRIT, and its practical applications in real-world scenarios.

machine learning, natural language, password, (19 more...)

2410.02828

Genre:

Research Report (0.64)
Overview (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Unifying the Scope of Bridging Anaphora Types in English: Bridging Annotations in ARRAU and GUM

Levine, Lauren, Zeldes, Amir

Comparing bridging annotations across coreference resources is difficult, largely due to a lack of standardization across definitions and annotation schemas and narrow coverage of disparate text domains across resources. To alleviate domain coverage issues and consolidate schemas, we compare guidelines and use interpretable predictive models to examine the bridging instances annotated in the GUM, GENTLE and ARRAU corpora. Examining these cases, we find that there is a large difference in types of phenomena annotated as bridging. Beyond theoretical results, we release a harmonized, subcategorized version of the test sets of GUM, GENTLE and the ARRAU Wall Street Journal data to promote meaningful and reliable evaluation of bridging resolution across domains.

anaphor, annotation, classifier, (15 more...)

2410.01170

Country:

North America > United States > Maryland > Baltimore (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
North America > Dominican Republic (0.04)
(8 more...)

Genre:

Research Report (0.50)
Overview (0.46)

Industry:

Law (0.46)
Media > News (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.52)