MC-MKE: A Fine-Grained Multimodal Knowledge Editing Benchmark Emphasizing Modality Consistency
Zhang, Junzhe, Zhang, Huixuan, Yin, Xunjian, Huang, Baizhou, Zhang, Xu, Hu, Xinyu, Wan, Xiaojun
Multimodal large language models (MLLMs) are prone to non-factual or outdated knowledge issues, which can manifest as misreading and misrecognition errors due to the complexity of multimodal knowledge. Previous benchmarks have not systematically analyzed the performance of editing methods in correcting these two error types. To better represent and correct these errors, we decompose multimodal knowledge into its visual and textual components. Different error types correspond to different editing formats, each of which edits a distinct part of the multimodal knowledge. We present MC-MKE, a fine-grained Multimodal Knowledge Editing benchmark emphasizing Modality Consistency. Our benchmark facilitates independent correction of misreading and misrecognition errors by editing the corresponding knowledge component. We evaluate three multimodal knowledge editing methods on MC-MKE, revealing their limitations, particularly in terms of modality consistency. Our work highlights the challenges posed by multimodal knowledge editing and motivates further research into developing effective techniques for this task.
MEND: Meta dEmonstratioN Distillation for Efficient and Effective In-Context Learning
Li, Yichuan, Ma, Xiyao, Lu, Sixing, Lee, Kyumin, Liu, Xiaohu, Guo, Chenlei
Large language models (LLMs) have demonstrated impressive in-context learning (ICL) capabilities, where an LLM makes predictions for a given test input together with a few input-output pairs (demonstrations). Nevertheless, the inclusion of demonstrations leads to a quadratic increase in the computational overhead of the self-attention mechanism. Existing solutions attempt to distill lengthy demonstrations into compact vectors. However, they often require task-specific retraining or compromise the LLM's in-context learning performance. To mitigate these challenges, we present Meta dEmonstratioN Distillation (MEND), where a language model learns to distill any lengthy demonstrations into vectors without retraining for a new downstream task. We exploit knowledge distillation to enhance alignment between MEND and the LLM, achieving both efficiency and effectiveness simultaneously. MEND is endowed with the meta-knowledge of distilling demonstrations through a two-stage training process, which includes meta-distillation pretraining and fine-tuning. Comprehensive evaluations across seven diverse ICL task partitions using decoder-only (GPT-2) and encoder-decoder (T5) models attest to MEND's prowess. It not only matches but often outperforms Vanilla ICL as well as other state-of-the-art distillation models, while significantly reducing computational demands. This innovation promises enhanced scalability and efficiency for the practical deployment of large language models.
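The efficiency argument in the abstract is concrete enough to sketch: self-attention cost grows quadratically with context length, so replacing hundreds of demonstration tokens with a handful of distilled vectors shrinks the effective context dramatically. The sketch below is not MEND itself (which learns the distiller with a language model); it uses a mean-pooling stand-in and assumed sizes (`d_model=64`, `k=8`) purely to illustrate the budget arithmetic:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 64          # hidden size (assumed for illustration)
n_demo_tokens = 512   # tokens spent on raw demonstrations
k = 8                 # compact distillation vectors (assumed budget)
n_test_tokens = 32    # tokens in the test input

# Stand-in for the frozen LLM's embeddings of the demonstration tokens.
demo_embeds = rng.normal(size=(n_demo_tokens, d_model))

def distill(demo_embeds: np.ndarray, k: int) -> np.ndarray:
    """Toy distiller: mean-pool the demonstrations into k slots.
    (MEND learns this mapping; pooling is only a placeholder.)"""
    slots = np.array_split(demo_embeds, k)
    return np.stack([s.mean(axis=0) for s in slots])

distilled = distill(demo_embeds, k)

# Self-attention cost scales with the square of sequence length, so the
# distilled context is far cheaper than the raw demonstrations:
full_len = n_demo_tokens + n_test_tokens   # raw demos + test input
compact_len = k + n_test_tokens            # distilled vectors + test input
```

Prepending `distilled` in place of the demonstration embeddings is what lets inference cost depend on `k` rather than on the demonstration length.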
Cross-lingual Editing in Multilingual Language Models
Beniwal, Himanshu, D, Kowsik Nandagopan, Singh, Mayank
The training of large language models (LLMs) necessitates substantial data and computational resources, and updating outdated LLMs entails significant effort and resources. While numerous model editing techniques (METs) have emerged to efficiently update model outputs without retraining, their effectiveness in multilingual LLMs, where knowledge is stored in diverse languages, remains an underexplored research area. This research paper introduces the cross-lingual model editing (XME) paradigm, wherein a fact is edited in one language and the subsequent update propagation is observed across other languages. To investigate the XME paradigm, we conducted experiments using BLOOM, mBERT, and XLM-RoBERTa across two writing scripts: Latin (English, French, and Spanish) and Indic (Hindi, Gujarati, and Bengali). The results reveal notable performance limitations of state-of-the-art METs under the XME setting, mainly when the languages involved belong to two distinct script families. These findings highlight the need for further research and development of XME techniques to address these challenges. The dataset used in this research and the associated code are publicly available at https://github.com/lingo-iitgn/XME.
Massive Editing for Large Language Models via Meta Learning
Tan, Chenmien, Zhang, Ge, Fu, Jie
While large language models (LLMs) have enabled learning knowledge from the pre-training corpora, the acquired knowledge may be fundamentally incorrect or outdated over time, which necessitates rectifying the knowledge of the language model (LM) after training. A promising approach involves employing a hyper-network to generate parameter shifts, whereas existing hyper-networks suffer from inferior scalability in the number of synchronous editing operations. To mitigate the problem, we propose the MAssive Language Model Editing Network (MALMEN), which formulates the parameter shift aggregation as a least-squares problem, subsequently updating the LM parameters using the normal equation. To accommodate editing multiple facts simultaneously with limited memory budgets, we separate the computation on the hyper-network and the LM, enabling arbitrary batch sizes on both neural networks. Our method is evaluated by editing up to thousands of facts on LMs with different architectures, i.e., BERT-base, GPT-2, T5-XL (2.8B), and GPT-J (6B), across various knowledge-intensive NLP tasks, i.e., closed-book fact-checking and question answering. Remarkably, MALMEN is capable of editing hundreds of times more facts than strong baselines with the identical hyper-network architecture and outperforms editors specifically designed for GPT. Our code is available at https://github.com/ChenmienTan/malmen.
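The aggregation step described above — casting the combination of per-edit parameter shifts as a least-squares problem and solving it with the normal equation — can be illustrated at toy scale. Everything here (the dimensions, the key/value formulation, the small ridge term) is an assumption for illustration, not MALMEN's actual parameterization:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16        # parameter dimension of one weight vector (toy scale)
n_edits = 40  # facts edited simultaneously

# Hypothetical per-edit data: each edit i contributes a key k_i and a
# desired output change v_i; we seek one shift S minimizing
# sum_i (k_i^T S - v_i)^2, i.e. the least-squares problem K S ~= V.
K = rng.normal(size=(n_edits, d))
V = rng.normal(size=(n_edits, 1))

# Normal-equation solution: S = (K^T K)^{-1} K^T V.
# (A tiny ridge term keeps K^T K safely invertible at toy scale.)
ridge = 1e-6 * np.eye(d)
S = np.linalg.solve(K.T @ K + ridge, K.T @ V)

# S is the single aggregated shift applied to the LM parameters;
# the residual measures how well one shift serves all edits at once.
residual = np.linalg.norm(K @ S - V)
```

Solving one `d x d` linear system regardless of `n_edits` is what lets this formulation scale the number of simultaneous edits.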
TempAMLSI: Temporal Action Model Learning based on Grammar Induction
Grand, Maxence, Pellier, Damien, Fiorino, Humbert
Hand-encoding PDDL domains is generally accepted as difficult, tedious, and error-prone. The difficulty is even greater when temporal domains have to be encoded: actions have a duration, and their effects are not instantaneous. In this paper, we present TempAMLSI, an algorithm based on the AMLSI approach that is able to learn temporal domains. TempAMLSI builds on the classical assumption made in temporal planning that a non-temporal domain can be converted into a temporal domain. TempAMLSI is the first approach able to learn temporal domains with single hard envelopes and Cushing's intervals. We show experimentally that TempAMLSI learns accurate temporal domains, i.e., temporal domains that can be used directly to solve new planning problems, with different forms of action concurrency.
Fast Model Editing at Scale
Mitchell, Eric, Lin, Charles, Bosselut, Antoine, Finn, Chelsea, Manning, Christopher D.
While large pre-trained models have enabled impressive results on a variety of downstream tasks, the largest existing models still make errors, and even accurate predictions may become outdated over time. Because detecting all such failures at training time is impossible, enabling both developers and end users of such models to correct inaccurate outputs while leaving the model otherwise intact is desirable. However, the distributed, black-box nature of the representations learned by large neural networks makes producing such targeted edits difficult. If presented with only a single problematic input and new desired output, fine-tuning approaches tend to overfit; other editing algorithms are either computationally infeasible or simply ineffective when applied to very large models. To enable easy post-hoc editing at scale, we propose Model Editor Networks with Gradient Decomposition (MEND), a collection of small auxiliary editing networks that use a single desired input-output pair to make fast, local edits to a pre-trained model. MEND learns to transform the gradient obtained by standard fine-tuning, using a low-rank decomposition of the gradient to make the parameterization of this transformation tractable. MEND can be trained on a single GPU in less than a day even for 10 billion parameter models; once trained, MEND enables rapid application of new edits to the pre-trained model. Our experiments with T5, GPT, BERT, and BART models show that MEND is the only approach to model editing that produces effective edits for models with tens of millions to over 10 billion parameters.
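The low-rank structure this MEND exploits can be seen directly: for a linear layer, the fine-tuning gradient from a single input-output pair is a rank-1 outer product, so an editor can operate on the two small factors instead of the full weight-sized matrix. A toy sketch with random stand-ins for MEND's learned transforms (all sizes and the 0.1 scales are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in = 8, 12

# One desired input-output pair for a toy linear layer y = W @ x.
W = rng.normal(size=(d_out, d_in))
x = rng.normal(size=(d_in,))
delta = rng.normal(size=(d_out,))  # gradient of the loss w.r.t. y

# The fine-tuning gradient for a single example is the rank-1 outer
# product delta x^T: it is fully described by two small vectors.
grad = np.outer(delta, x)

def edit_factors(u: np.ndarray, v: np.ndarray):
    """Stand-in for MEND's editor networks: transform each factor with
    a small matrix (random here; learned in MEND)."""
    U = rng.normal(size=(len(u), len(u))) * 0.1
    V = rng.normal(size=(len(v), len(v))) * 0.1
    return U @ u, V @ v

u_e, v_e = edit_factors(delta, x)
edited_grad = np.outer(u_e, v_e)   # still rank 1, cheap to build
W_edited = W - 0.1 * edited_grad   # fast, local parameter update
```

Parameterizing the editor over the factors (`d_out + d_in` values) rather than the full gradient (`d_out * d_in` values) is what keeps the approach tractable at 10B-parameter scale.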
Marketing intelligence could mend a broken business
Data has become the most valuable currency in business. But without the right tools or intelligence, its true value will not be realised. According to a MiQ survey, 43 per cent of US and UK brand marketers think that the lack of measurement of business impact, such as sales or growth, is the main hurdle to investing more in data analytics. But if marketing metrics are not the same as business goals, why are campaigns measured against them? Marketing should align with the same goals as the rest of the company, in order to measure tangible business results.
The Fashion House Of Artificial Intelligence
We've been relying on computers, their analytics, and their algorithms to give us deeper, broader, and faster knowledge of what people want for a long time now. Surveys, browser cookies, user data, and sales trends all tell us an incredible amount of detail. Why not apply this same logic to fashion? Fashion subscription service Stitch Fix decided to try it last year, and the human-measured results are in: computers are really good designers. Stitch Fix's computers identified shirt cuts, patterns, and sleeve styles popular among the company's subscribers, and mashed them together with some human help to create three brand new shirts.