AITopics | transferred

Collaborating Authors

transferred

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

TensorProgramsV: TuningLargeNeuralNetworksvia Zero-ShotHyperparameterTransfer

Neural Information Processing SystemsFeb-9-2026, 20:37:40 GMT

Manypublished baselines are hard to compare to one another due to varying degrees of HP tuning.

machine learning, natural language, urlhttp, (19 more...)

Neural Information Processing Systems

Country:

Europe > Italy > Tuscany > Florence (0.04)
Europe > Italy > Sardinia (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Activation Space Interventions Can Be Transferred Between Large Language Models

Oozeer, Narmeen, Nathawani, Dhruv, Prakash, Nirmalendu, Lan, Michael, Harrasse, Abir, Abdullah, Amirali

arXiv.org Artificial IntelligenceMar-6-2025

The study of representation universality in AI models reveals growing convergence across domains, modalities, and architectures. However, the practical applications of representation universality remain largely unexplored. We bridge this gap by demonstrating that safety interventions can be transferred between models through learned mappings of their shared activation spaces. We demonstrate this approach on two well-established AI safety tasks: backdoor removal and refusal of harmful prompts, showing successful transfer of steering vectors that alter the models' outputs in a predictable way. Additionally, we propose a new task, \textit{corrupted capabilities}, where models are fine-tuned to embed knowledge tied to a backdoor. This tests their ability to separate useful skills from backdoors, reflecting real-world challenges. Extensive experiments across Llama, Qwen and Gemma model families show that our method enables using smaller models to efficiently align larger ones. Furthermore, we demonstrate that autoencoder mappings between base and fine-tuned models can serve as reliable ``lightweight safety switches", allowing dynamic toggling between model behaviors.

activation space intervention, completion, transferred, (11 more...)

arXiv.org Artificial Intelligence

2503.04429

Country:

North America > United States > Arizona (0.04)
North America > United States > New York (0.04)
Europe > Italy > Friuli Venezia Giulia > Trieste Province > Trieste (0.04)
(4 more...)

Genre:

Research Report > New Finding (1.00)
Personal (1.00)

Industry:

Media (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

KoopAGRU: A Koopman-based Anomaly Detection in Time-Series using Gated Recurrent Units

Yahia, Issam Ait, Berrada, Ismail

arXiv.org Artificial IntelligenceJan-31-2025

Anomaly detection in real-world time-series data is a challenging task due to the complex and nonlinear temporal dynamics involved. This paper introduces KoopAGRU, a new deep learning model designed to tackle this problem by combining Fast Fourier Transform (FFT), Deep Dynamic Mode Decomposition (DeepDMD), and Koopman theory. FFT allows KoopAGRU to decompose temporal data into time-variant and time-invariant components providing precise modeling of complex patterns. To better control these two components, KoopAGRU utilizes Gate Recurrent Unit (GRU) encoders to learn Koopman observables, enhancing the detection capability across multiple temporal scales. KoopAGRU is trained in a single process and offers fast inference times. Extensive tests on various benchmark datasets show that KoopAGRU outperforms other leading methods, achieving a new average F1-score of 90.88\% on the well-known anomalies detection task of times series datasets, and proves to be efficient and reliable in detecting anomalies in real-world scenarios.

artificial intelligence, data mining, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2501.17976

Country:

North America > United States > New York > New York County > New York City (0.04)
Africa > Middle East > Morocco (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Energy (0.93)
Information Technology (0.68)
Health & Medicine (0.67)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Transfer Learning of Surrogate Models: Integrating Domain Warping and Affine Transformations

Pan, Shuaiqun, Vermetten, Diederick, López-Ibáñez, Manuel, Bäck, Thomas, Wang, Hao

arXiv.org Artificial IntelligenceJan-30-2025

Surrogate models provide efficient alternatives to computationally demanding real-world processes but often require large datasets for effective training. A promising solution to this limitation is the transfer of pre-trained surrogate models to new tasks. Previous studies have investigated the transfer of differentiable and non-differentiable surrogate models, typically assuming an affine transformation between the source and target functions. This paper extends previous research by addressing a broader range of transformations, including linear and nonlinear variations. Specifically, we consider the combination of an unknown input warping, such as one modelled by the beta cumulative distribution function, with an unspecified affine transformation. Our approach achieves transfer learning by employing a limited number of data points from the target task to optimize these transformations, minimizing empirical loss on the transfer dataset. We validate the proposed method on the widely used Black-Box Optimization Benchmark (BBOB) testbed and a real-world transfer learning task from the automobile industry. The results underscore the significant advantages of the approach, revealing that the transferred surrogate significantly outperforms both the original surrogate and the one built from scratch using the transfer dataset, particularly in data-scarce scenarios.

artificial intelligence, machine learning, optimization problem, (16 more...)

arXiv.org Artificial Intelligence

2501.18344

Country:

Europe > Spain > Andalusia > Málaga Province > Málaga (0.05)
Europe > Netherlands > South Holland > Leiden (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
(8 more...)

Genre: Research Report > New Finding (1.00)

Industry: Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Add feedback

Echoes of Biases: How Stigmatizing Language Affects AI Performance

Liu, Yizhi, Wang, Weiguang, Gao, Guodong Gordon, Agarwal, Ritu

arXiv.org Artificial IntelligenceJun-12-2023

Electronic health records (EHRs) serve as an essential data source for the envisioned artificial intelligence (AI)-driven transformation in healthcare. However, clinician biases reflected in EHR notes can lead to AI models inheriting and amplifying these biases, perpetuating health disparities. This study investigates the impact of stigmatizing language (SL) in EHR notes on mortality prediction using a Transformer-based deep learning model and explainable AI (XAI) techniques. Our findings demonstrate that SL written by clinicians adversely affects AI performance, particularly so for black patients, highlighting SL as a source of racial disparity in AI model development. To explore an operationally efficient way to mitigate SL's impact, we investigate patterns in the generation of SL through a clinicians' collaborative network, identifying central clinicians as having a stronger impact on racial disparity in the AI model. We find that removing SL written by central clinicians is a more efficient bias reduction strategy than eliminating all SL in the entire corpus of data. This study provides actionable insights for responsible AI development and contributes to understanding clinician behavior and EHR note writing in healthcare.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2305.10201

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > Netherlands > South Holland > Leiden (0.04)
(9 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Addiction Disorder (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(7 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)

Add feedback

Multi-type Disentanglement without Adversarial Training

Sha, Lei, Lukasiewicz, Thomas

arXiv.org Artificial IntelligenceDec-16-2020

Controlling the style of natural language by disentangling the latent space is an important step towards interpretable machine learning. After the latent space is disentangled, the style of a sentence can be transformed by tuning the style representation without affecting other features of the sentence. Previous works usually use adversarial training to guarantee that disentangled vectors do not affect each other. However, adversarial methods are difficult to train. Especially when there are multiple features (e.g., sentiment, or tense, which we call style types in this paper), each feature requires a separate discriminator for extracting a disentangled style vector corresponding to that feature. In this paper, we propose a unified distribution-controlling method, which provides each specific style value (the value of style types, e.g., positive sentiment, or past tense) with a unique representation. This method contributes a solid theoretical basis to avoid adversarial training in multi-type disentanglement. We also propose multiple loss functions to achieve a style-content disentanglement as well as a disentanglement among multiple style types. In addition, we observe that if two different style types always have some specific style values that occur together in the dataset, they will affect each other when transferring the style values. We call this phenomenon training bias, and we propose a loss function to alleviate such training bias while disentangling multiple types. We conduct experiments on two datasets (Yelp service reviews and Amazon product reviews) to evaluate the style-disentangling effect and the unsupervised style transfer performance on two style types: sentiment and tense. The experimental results show the effectiveness of our model.

style type, style value, vector, (16 more...)

arXiv.org Artificial Intelligence

2012.08883

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > California > Ventura County > Thousand Oaks (0.04)
Europe > United Kingdom > England > Lancashire > Lancaster (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Google Reveals "What is being Transferred" in Transfer Learning

#artificialintelligenceAug-31-2020, 20:15:43 GMT

"Transfer Learning will be the next driver of Machine Learning Success"- Andrew NG Recently, researchers from Google proposed the solution of a very fundamental question in the machine learning community -- What is being transferred in Transfer Learning? They explained various tools and analyses to address the fundamental question. The ability to transfer the domain knowledge of one machine in which it is trained on to another where the data is usually scarce is one of the desired capabilities for machines. Researchers around the globe have been using transfer learning in various deep learning applications, including object detection, image classification, medical imaging tasks, among others. Despite these utilisations, there are cases found by several researchers where there is a nontrivial difference in visual forms between the source and the target domain.

artificial intelligence, machine learning, transfer learning, (8 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.58)

Add feedback