AITopics | conda

Collaborating Authors

conda

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Conda: Column-Normalized Adam for Training Large Language Models Faster

Wang, Junjie, Zhou, Pan, Dong, Yiming, Li, Huan, Li, Jia, Zhou, Xun, Lao, Qicheng, Fang, Cong, Lin, Zhouchen

arXiv.org Artificial IntelligenceOct-1-2025

Large language models (LLMs) have demonstrated impressive generalization and emergent capabilities, yet their pre-training remains computationally expensive and sensitive to optimization dynamics. While Adam-based optimizers offer fast convergence by adapting learning rates coordinate-wise, recent studies reveal that their updates often suffer from poor spectral conditioning and low-rank structures, hindering efficiency. Muon addresses this issue via global spectral normalization but lacks the per-coordinate adaptivity of Adam. In this work, we propose Column-Normalized Adam (Conda), a novel optimizer that bridges the strengths of both approaches. Conda projects updates into an orthogonal subspace and applies column-wise second moment normalization based on the projected gradients, thereby achieving both improved spectral conditioning and maintaining coordinate-wise adaptivity. This design alleviates the spectral pathologies of Adam while preserving its fast convergence behavior. Extensive experiments on the LLaMA and GPT-2 series show that Conda consistently outperforms AdamW, Muon, and other baselines in pre-training. Remarkably, on the LLaMA series, Conda achieves 2-2.5 the convergence speed of AdamW, measured in both training steps and training time. Further ablations demonstrate its robustness under diverse training setups. These results collectively highlight Conda as an effective and broadly applicable optimizer for large-scale LLM training. The code is released on https://github.com/jie040109/Conda

arxiv preprint arxiv, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2509.24218

Country: Asia > China (0.67)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Adaptive Concept Bottleneck for Foundation Models Under Distribution Shifts

Choi, Jihye, Raghuram, Jayaram, Li, Yixuan, Jha, Somesh

arXiv.org Artificial IntelligenceDec-18-2024

Advancements in foundation models (FMs) have led to a paradigm shift in machine learning. The rich, expressive feature representations from these pre-trained, large-scale FMs are leveraged for multiple downstream tasks, usually via lightweight fine-tuning of a shallow fully-connected network following the representation. However, the non-interpretable, black-box nature of this prediction pipeline can be a challenge, especially in critical domains such as healthcare, finance, and security. In this paper, we explore the potential of Concept Bottleneck Models (CBMs) for transforming complex, non-interpretable foundation models into interpretable decision-making pipelines using high-level concept vectors. Specifically, we focus on the test-time deployment of such an interpretable CBM pipeline "in the wild", where the input distribution often shifts from the original training distribution. We first identify the potential failure modes of such a pipeline under different types of distribution shifts. Then we propose an adaptive concept bottleneck framework to address these failure modes, that dynamically adapts the concept-vector bank and the prediction layer based solely on unlabeled data from the target domain, without access to the source (training) dataset. Empirical evaluations with various real-world distribution shifts show that our adaptation method produces concept-based interpretations better aligned with the test data and boosts post-deployment accuracy by up to 28%, aligning the CBM performance with that of non-interpretable classification.

artificial intelligence, machine learning, prediction, (14 more...)

arXiv.org Artificial Intelligence

2412.14097

Country: North America > United States > Wisconsin > Dane County > Madison (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Latent Conditional Diffusion-based Data Augmentation for Continuous-Time Dynamic Graph Mode

Tian, Yuxing, Qi, Yiyan, Jiang, Aiwen, Huang, Qi, Guo, Jian

arXiv.org Artificial IntelligenceJul-11-2024

Continuous-Time Dynamic Graph (CTDG) precisely models evolving real-world relationships, drawing heightened interest in dynamic graph learning across academia and industry. However, existing CTDG models encounter challenges stemming from noise and limited historical data. Graph Data Augmentation (GDA) emerges as a critical solution, yet current approaches primarily focus on static graphs and struggle to effectively address the dynamics inherent in CTDGs. Moreover, these methods often demand substantial domain expertise for parameter tuning and lack theoretical guarantees for augmentation efficacy. To address these issues, we propose Conda, a novel latent diffusion-based GDA method tailored for CTDGs. Conda features a sandwich-like architecture, incorporating a Variational Auto-Encoder (VAE) and a conditional diffusion model, aimed at generating enhanced historical neighbor embeddings for target nodes. Unlike conventional diffusion models trained on entire graphs via pre-training, Conda requires historical neighbor sequence embeddings of target nodes for training, thus facilitating more targeted augmentation. We integrate Conda into the CTDG model and adopt an alternating training strategy to optimize performance. Extensive experimentation across six widely used real-world datasets showcases the consistent performance improvement of our approach, particularly in scenarios with limited historical data.

ctdg model, dataset, node, (13 more...)

arXiv.org Artificial Intelligence

2407.085

Country:

Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.05)
Asia > China > Guangdong Province > Shenzhen (0.04)
North America > United States > New York > New York County > New York City (0.04)
(5 more...)

Genre:

Research Report (0.64)
Instructional Material (0.47)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)

Add feedback

ConDA: Contrastive Domain Adaptation for AI-generated Text Detection

Bhattacharjee, Amrita, Kumarage, Tharindu, Moraffah, Raha, Liu, Huan

arXiv.org Artificial IntelligenceSep-20-2023

Large language models (LLMs) are increasingly being used for generating text in a variety of use cases, including journalistic news articles. Given the potential malicious nature in which these LLMs can be used to generate disinformation at scale, it is important to build effective detectors for such AI-generated text. Given the surge in development of new LLMs, acquiring labeled training data for supervised detectors is a bottleneck. However, there might be plenty of unlabeled text data available, without information on which generator it came from. In this work we tackle this data problem, in detecting AI-generated news text, and frame the problem as an unsupervised domain adaptation task. Here the domains are the different text generators, i.e. LLMs, and we assume we have access to only the labeled source data and unlabeled target data. We develop a Contrastive Domain Adaptation framework, called ConDA, that blends standard domain adaptation techniques with the representation power of contrastive learning to learn domain invariant representations that are effective for the final unsupervised detection task. Our experiments demonstrate the effectiveness of our framework, resulting in average performance gains of 31.7% from the best performing baselines, and within 0.8% margin of a fully supervised detector. All our code and data is available at https://github.com/AmritaBh/ConDA-gen-text-detection.

ai-generated text detection, conda, contrastive domain adaptation

arXiv.org Artificial Intelligence

2309.03992

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Explaining and Adapting Graph Conditional Shift

Zhu, Qi, Jiao, Yizhu, Ponomareva, Natalia, Han, Jiawei, Perozzi, Bryan

arXiv.org Artificial IntelligenceJun-5-2023

Graph Neural Networks (GNNs) have shown remarkable performance on graph-structured data. However, recent empirical studies suggest that GNNs are very susceptible to distribution shift. There is still significant ambiguity about why graph-based models seem more vulnerable to these shifts. In this work we provide a thorough theoretical analysis on it by quantifying the magnitude of conditional shift between the input features and the output label. Our findings show that both graph heterophily and model architecture exacerbate conditional shifts, leading to performance degradation. To address this, we propose an approach that involves estimating and minimizing the conditional shift for unsupervised domain adaptation on graphs. In our controlled synthetic experiments, our algorithm demonstrates robustness towards distribution shift, resulting in up to 10% absolute ROC AUC improvement versus the second-best algorithm. Furthermore, comprehensive experiments on both node classification and graph classification show its robust performance under various distribution shifts.

artificial intelligence, conditional shift, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2306.03256

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)
Asia (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Add feedback

How to run Segment Anything on Linux

#artificialintelligenceApr-10-2023, 12:25:35 GMT

This post shows how to run Segment Anything Model (SAM) developed by Meta Research. SAM is explained like below. The Segment Anything Model (SAM) produces high quality object masks from input prompts such as points or boxes, and it can be used to generate masks for all objects in an image. It has been trained on a dataset of 11 million images and 1.1 billion masks, and has strong zero-shot performance on a variety of segmentation tasks. How SAM works is explained in the blog below.

download, sam, strawberry, (9 more...)

#artificialintelligence

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.05)

Technology:

Information Technology > Software > Programming Languages (0.40)
Information Technology > Artificial Intelligence (0.39)

Add feedback

ConDA: Unsupervised Domain Adaptation for LiDAR Segmentation via Regularized Domain Concatenation

Kong, Lingdong, Quader, Niamul, Liong, Venice Erin

arXiv.org Artificial IntelligenceApr-6-2023

Transferring knowledge learned from the labeled source domain to the raw target domain for unsupervised domain adaptation (UDA) is essential to the scalable deployment of autonomous driving systems. State-of-the-art methods in UDA often employ a key idea: utilizing joint supervision signals from both source and target domains for self-training. In this work, we improve and extend this aspect. We present ConDA, a concatenation-based domain adaptation framework for LiDAR segmentation that: 1) constructs an intermediate domain consisting of fine-grained interchange signals from both source and target domains without destabilizing the semantic coherency of objects and background around the ego-vehicle; and 2) utilizes the intermediate domain for self-training. To improve the network training on the source domain and self-training on the intermediate domain, we propose an anti-aliasing regularizer and an entropy aggregator to reduce the negative effect caused by the aliasing artifacts and noisy pseudo labels. Through extensive studies, we demonstrate that ConDA significantly outperforms prior arts in mitigating domain gaps.

artificial intelligence, computer vision, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2111.15242

Genre: Research Report > Promising Solution (0.34)

Industry:

Transportation > Ground > Road (0.55)
Transportation > Infrastructure & Services (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.67)

Add feedback

Managing Machine Learning Lifecycles with MLflow

#artificialintelligenceDec-18-2022, 19:55:10 GMT

Model development and experimentation is part of any machine learning lifecycle. However, without careful planning, keeping track of experiments can become tedious and challenging; especially given the number of configurations we typically deal with. MLflow is a machine learning lifecycle framework that allows ML engineers and teams to keep track of their experiments. In PART 1 of the series, we are going to focus on the first two steps -- tracking experiments and sharing code. PART 2 will be dedicated to model packaging, while PART 3 will show how the concepts outlined in the previous parts can be used in a React web application. For now, let's try to understand what MLflow is, and what it can do for us!

directory, experiment, mlflow, (14 more...)

#artificialintelligence

Industry:

Education (1.00)
Information Technology > Services (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

In-game Toxic Language Detection: Shared Task and Attention Residuals

Jia, Yuanzhe, Wu, Weixuan, Cao, Feiqi, Han, Soyeon Caren

arXiv.org Artificial IntelligenceNov-19-2022

In-game toxic language becomes the hot potato in the gaming industry and community. There have been several online game toxicity analysis frameworks and models proposed. However, it is still challenging to detect toxicity due to the nature of in-game chat, which has extremely short length. In this paper, we describe how the in-game toxic language shared task has been established using the real-world in-game chat data. In addition, we propose and introduce the model/framework for toxic language token tagging (slot filling) from the in-game chat. The data and code will be released.

machine learning, natural language, utterance, (12 more...)

arXiv.org Artificial Intelligence

2211.05995

Country: Oceania > Australia > New South Wales > Sydney (0.05)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games > Computer Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.58)

Add feedback

GitHub - RubensZimbres/best-of-ml-python: 🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.

#artificialintelligenceSep-9-2022, 19:20:40 GMT

A ranked list of awesome machine learning Python libraries. This curated list contains 830 awesome open-source projects with a total of 2.6M stars grouped into 32 categories. All projects are ranked by a project-quality score, which is calculated based on various metrics automatically collected from GitHub and different package managers. If you like to add or update projects, feel free to open an issue, submit a pull request, or directly edit the projects.yaml. Discover other best-of lists or create your own.

conda, github, pypi, (14 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback