AITopics | afd

afd

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Step-3 is Large yet Affordable: Model-system Co-design for Cost-effective Decoding

StepFun: Wang, Bin, Wang, Bojun, Wan, Changyi, Huang, Guanzhe, Hu, Hanpeng, Jia, Haonan, Nie, Hao, Li, Mingliang, Chen, Nuo, Chen, Siyu, Yuan, Song, Xie, Wuxun, Song, Xiaoniu, Chen, Xing, Yang, Xingping, Zhang, Xuelin, Yu, Yanbo, Wang, Yaoyu, Zhu, Yibo, Jiang, Yimin, Zhou, Yu, Lu, Yuanwei, Li, Houyi, Hu, Jingcheng, Lo, Ka Man, Huang, Ailin, Jiao, Binxing, Li, Bo, Chen, Boyu, Miao, Changxin, Lou, Chang, Hu, Chen, Xu, Chen, Yu, Chenfeng, Yao, Chengyuan, Lv, Daokuan, Shi, Dapeng, Sun, Deshan, Huang, Ding, Hu, Dingyuan, Pang, Dongqing, Liu, Enle, Zhang, Fajie, Wan, Fanqi, Yan, Gulin, Zhang, Han, Zhou, Han, Wu, Hanghao, Guo, Hangyu, Chen, Hanqi, Zhang, Hanshan, Wu, Hao, Zhang, Haocheng, Yan, Haolong, Lv, Haoran, Wei, Haoran, Zhou, Hebin, Wang, Heng, Wang, Heng, Li, Hongxin, Zhou, Hongyu, Wang, Hongyuan, Guo, Huiyong, Wang, Jia, Gong, Jiahao, Xie, Jialing, Zhou, Jian, Sun, Jianjian, Wu, Jiaoren, Zhang, Jiaran, Liu, Jiayu, Cheng, Jie, Luo, Jie, Yan, Jie, Yang, Jie, Hou, Jieyi, Zhang, Jinguang, Cao, Jinlan, Yin, Jisheng, Liu, Junfeng, Huang, Junhao, Lin, Junzhe, Tan, Kaijun, Li, Kaixiang, An, Kang, Lin, Kangheng, Liu, Kenkun, Yang, Lei, Zhao, Liang, Chen, Liangyu, Shi, Lieyu, Tan, Liguo, Lin, Lin, Zhang, Lin, Chen, Lina, Huang, Liwen, Shi, Liying, Gu, Longlong, Chen, Mei, Ren, Mengqiang, Li, Ming, Chen, Mingzhe, Wang, Na, Wu, Nan, Han, Qi, Zhao, Qian, Zhang, Qiang, Liu, Qianni, Chen, Qiaohui, Wu, Qiling, He, Qinglin, Tan, Qinyuan, Wang, Qiufeng, Wu, Qiuping, Liang, Qiuyan, Sun, Quan, Li, Rui, Miao, Ruihang, Wan, Ruosi, Guo, Ruyan, Zhong, Shangwu, Pang, Shaoliang, Fan, Shengjie, Shang, Shijie, Jiang, Shilei, Yang, Shiliang, Hao, Shiming, Gao, Shuli, Huang, Siming, Liu, Siqi, Cao, Tiancheng, Cheng, Tianhao, Peng, Tianhao, You, Wang, Ji, Wei, Sun, Wen, Deng, Wenjin, He, Wenqing, Zheng, Wenzhen, Chen, Xi, Kong, Xiangwen, Luo, Xianzhen, Yang, Xiaobo, Liu, Xiaojia, Ren, Xiaoxiao, Han, Xin, Li, Xin, Wu, Xin, Zhao, Xu, Wei, Yanan, Li, Yang, Li, Yangguang, Xu, Yangshijie, Xu, Yanming, Shi, Yaqiang, Shen, Yeqing, Yang, Yi, Yang, Yifei, Gong, Yifeng, Chen, Yihan, Yang, Yijing, Zhang, Yinmin, Zhou, Yizhuang, Ding, Yuanhao, Fan, Yuantao, Yang, Yuanzhen, Luo, Yuchu, Peng, Yue, Lu, Yufan, Deng, Yuhang, Yin, Yuhe, Liu, Yujie, Chen, Yukun, Zhao, Yuling, Mou, Yun, Li, Yunlong, Ju, Yunzhou, Li, Yusheng, Yang, Yuxiang, Zhang, Yuxiang, Chen, Yuyang, Weng, Zejia, Xie, Zhe, Ge, Zheng, Gong, Zheng, Lu, Zhenyi, Huang, Zhewei, Chang, Zhichao, Huang, Zhiguo, Wang, Zhirui, Yang, Zidong, Wang, Zili, Wang, Ziqi, Zhang, Zixin, Jiao, Binxing, Jiang, Daxin, Shum, Heung-Yeung, Zhang, Xiangyu

arXiv.org Artificial Intelligence | Jul-28-2025

Large language models (LLMs) face low hardware efficiency during decoding, especially for long-context reasoning tasks. This paper introduces Step-3, a 321B-parameter VLM with hardware-aware model-system co-design optimized for minimizing decoding costs. Step-3 innovates in two key dimensions: (1) A novel Multi-Matrix Factorization Attention (MFA) mechanism that significantly reduces both KV cache size and computation while maintaining high attention expressiveness, and (2) Attention-FFN Disaggregation (AFD), a distributed inference system that decouples attention and Feed-Forward Network (FFN) layers into specialized subsystems. This co-design achieves unprecedented cost efficiency: Step-3 significantly reduces theoretical decoding costs compared with models like DeepSeek-V3 and Qwen3 MoE 235B, with the gains widening at longer context. Step-3 achieves low cost while activating 38B parameters per token (more than DeepSeek-V3 and Qwen3 MoE 235B), demonstrating that hardware-aligned attention arithmetic intensity, MoE sparsity, and AFD are critical to cost-effectiveness. We perform a head-to-head comparison with DeepSeek-V3 in its favorable scenarios. Our implementation on Hopper GPUs achieves a decoding throughput of up to 4,039 tokens per second per GPU under 50ms TPOT SLA (4K context, FP8, no MTP). It is higher than DeepSeek-V3's 2,324 in the same setup and sets a new Pareto frontier for LLM decoding.
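As a quick arithmetic check on the throughput comparison quoted in the abstract, the sketch below computes the ratio between the two reported figures. Only the two throughput numbers come from the abstract; the function and constant names are illustrative assumptions.

```python
def relative_speedup(new_tps: float, baseline_tps: float) -> float:
    """Return the throughput ratio new / baseline."""
    if baseline_tps <= 0:
        raise ValueError("baseline throughput must be positive")
    return new_tps / baseline_tps


# Figures quoted in the abstract: tokens per second per GPU,
# under a 50 ms TPOT SLA (4K context, FP8, no MTP).
STEP3_TPS = 4039
DEEPSEEK_V3_TPS = 2324

ratio = relative_speedup(STEP3_TPS, DEEPSEEK_V3_TPS)
print(f"speedup: {ratio:.2f}x ({(ratio - 1) * 100:.0f}% higher)")
```

Under these figures the reported Step-3 throughput is roughly 1.74 times the DeepSeek-V3 figure in the same setup.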

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2507.19427

Genre: Workflow (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Alignment Helps Make the Most of Multimodal Data

Arnold, Christian, Küpfer, Andreas

arXiv.org Artificial Intelligence | Jul-8-2024

When studying political communication, combining the information from text, audio, and video signals promises to reflect the richness of human communication more comprehensively than confining it to individual modalities alone. However, the heterogeneity, connectedness, and interaction of such multimodal data are challenging to address when modeling it. We argue that aligning the respective modalities can be an essential step toward fully exploiting the potential of multimodal data because it informs the model with human understanding. Taking into account the data-generating process of multimodal data, our framework proposes four principles to organize alignment and, thus, address the challenges of multimodal data. We illustrate the utility of these principles by analyzing how German MPs address members of the far-right AfD in their speeches and by predicting the tone of video advertising in the context of the 2020 US presidential race. Our paper offers important insights to anyone keen to analyze multimodal data effectively.

alignment, modality, multimodal data, (15 more...)

arXiv.org Artificial Intelligence

2405.08454

Country:

Europe > Western Europe (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
(13 more...)

Genre: Research Report (1.00)

Industry: Government > Voting & Elections (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)


Inverse-RLignment: Inverse Reinforcement Learning from Demonstrations for LLM Alignment

Sun, Hao, van der Schaar, Mihaela

arXiv.org Artificial Intelligence | May-24-2024

Aligning Large Language Models (LLMs) is crucial for enhancing their safety and utility. However, existing methods, primarily based on preference datasets, face challenges such as noisy labels, high annotation costs, and privacy concerns. In this work, we introduce Alignment from Demonstrations (AfD), a novel approach leveraging high-quality demonstration data to overcome these challenges. We formalize AfD within a sequential decision-making framework, highlighting its unique challenge of missing reward signals. Drawing insights from forward and inverse reinforcement learning, we introduce divergence minimization objectives for AfD. Analytically, we elucidate the mass-covering and mode-seeking behaviors of various approaches, explaining when and why certain methods are superior. Practically, we propose a computationally efficient algorithm that extrapolates over a tailored reward model for AfD. We validate our key insights through experiments on the Harmless and Helpful tasks, demonstrating their strong empirical performance while maintaining simplicity.
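The mass-covering versus mode-seeking distinction mentioned in the abstract can be illustrated with a small numeric sketch. The abstract does not say which divergences the paper analyzes, so the forward and reverse KL divergences are assumed here as the textbook pair, and the distributions are hypothetical toy values.

```python
import math


def kl(p, q):
    """KL(p || q) for discrete distributions given as equal-length lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical bimodal "demonstration" distribution over four outcomes.
demo = [0.49, 0.01, 0.01, 0.49]
# Two hypothetical candidate policies: one spreads mass over everything,
# the other commits to a single mode of the demonstrations.
covering = [0.25, 0.25, 0.25, 0.25]
seeking = [0.97, 0.01, 0.01, 0.01]

for name, policy in [("covering", covering), ("seeking", seeking)]:
    forward = kl(demo, policy)  # KL(demo || policy): penalises missing demo mass
    reverse = kl(policy, demo)  # KL(policy || demo): penalises mass where demo has little
    print(f"{name}: forward={forward:.3f} reverse={reverse:.3f}")
```

In this toy setting the forward KL is lower for the spread-out policy, while the reverse KL is lower for the single-mode policy, which is the usual sense of "mass-covering" and "mode-seeking".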

arxiv preprint arxiv, dataset, reward model, (10 more...)

arXiv.org Artificial Intelligence

2405.15624

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Leisure & Entertainment > Games (1.00)
Information Technology > Security & Privacy (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)


Mitigating Feature Gap for Adversarial Robustness by Feature Disentanglement

Zhou, Nuoyan, Zhou, Dawei, Liu, Decheng, Gao, Xinbo, Wang, Nannan

arXiv.org Artificial Intelligence | Jan-26-2024

Deep neural networks are vulnerable to adversarial samples. Adversarial fine-tuning methods aim to enhance adversarial robustness through fine-tuning the naturally pre-trained model in an adversarial training manner. However, we identify that some latent features of adversarial samples are confused by adversarial perturbation and lead to an unexpectedly increasing gap between features in the last hidden layer of natural and adversarial samples. To address this issue, we propose a disentanglement-based approach to explicitly model and further remove the latent features that cause the feature gap. Specifically, we introduce a feature disentangler to separate out the latent features from the features of the adversarial samples, thereby boosting robustness by eliminating the latent features. Besides, we align features in the pre-trained model with features of adversarial samples in the fine-tuned model, to further benefit from the features from natural samples without confusion. Empirical evaluations on three benchmark datasets demonstrate that our approach surpasses existing adversarial fine-tuning methods and adversarial training baselines.

accuracy, adversarial sample, robustness, (13 more...)

arXiv.org Artificial Intelligence

2401.14707

Country:

Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Chongqing Province > Chongqing (0.04)

Genre:

Research Report (0.64)
Overview > Growing Problem (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)


Fault-Tolerant Offline Multi-Agent Path Planning

Okumura, Keisuke, Tixeuil, Sébastien

arXiv.org Artificial Intelligence | Nov-25-2022

We study a novel graph path planning problem for multiple agents that may crash at runtime and block part of the workspace. In our setting, agents can detect neighboring crashed agents and change the paths they follow at runtime. The objective is then to prepare a set of paths and switching rules for each agent, ensuring that all correct agents reach their destinations without collisions or deadlocks, despite unforeseen crashes of other agents. Such planning is attractive for building reliable multi-robot systems. We present a problem formalization, theoretical analysis such as computational complexity results, and methods for solving this offline planning problem.

agent, artificial intelligence, planning & scheduling, (18 more...)

arXiv.org Artificial Intelligence

2211.13908

Country:

Europe > France (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Africa > South Sudan > Equatoria > Central Equatoria > Juba (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)


Is Appearance Free Action Recognition Possible?

Ilic, Filip, Pock, Thomas, Wildes, Richard P.

arXiv.org Artificial Intelligence | Jul-13-2022

Intuition might suggest that motion and dynamic information are key to video-based action recognition. In contrast, there is evidence that state-of-the-art deep-learning video understanding architectures are biased toward static information available in single frames. Presently, a methodology and corresponding dataset to isolate the effects of dynamic information in video are missing. Their absence makes it difficult to understand how well contemporary architectures capitalize on dynamic vs. static information. We respond with a novel Appearance Free Dataset (AFD) for action recognition. AFD is devoid of static information relevant to action recognition in a single frame. Modeling of the dynamics is necessary for solving the task, as the action is only apparent through consideration of the temporal dimension. We evaluated 11 contemporary action recognition architectures on AFD as well as its related RGB video. Our results show a notable decrease in performance for all architectures on AFD compared to RGB. We also conducted a complementary study with humans that shows their recognition accuracy on AFD and RGB is very similar and much better than the evaluated architectures on AFD. Our results motivate a novel architecture that revives explicit recovery of optical flow, within a contemporary design for best performance on AFD and RGB.

architecture, computer vision, recognition, (11 more...)

arXiv.org Artificial Intelligence

2207.06261

Country:

Europe > Austria > Styria > Graz (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > Canada > Ontario (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)


Generating Gameplay-Relevant Art Assets with Transfer Learning

Gonzalez, Adrian, Guzdial, Matthew, Ramos, Felix

arXiv.org Artificial Intelligence | Oct-4-2020

In game development, designing compelling visual assets that convey gameplay-relevant features requires time and experience. Recent image generation methods that create high-quality content could reduce development costs, but these approaches do not consider game mechanics. We propose a Convolutional Variational Autoencoder (CVAE) system to modify and generate new game visuals based on their gameplay relevance. We test this approach with Pokémon sprites and Pokémon type information, since types are one of the game's core mechanics and they directly impact the game's visuals. Our experimental results indicate that adopting a transfer learning approach can help to improve visual quality and stability over unseen data.

dataset, Pokémon, type information, (15 more...)

arXiv.org Artificial Intelligence

2010.01681

Country:

North America > Canada > Alberta (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Mexico > Jalisco > Guadalajara (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


Feature-map-level Online Adversarial Knowledge Distillation

Chung, Inseop, Park, SeongUk, Kim, Jangho, Kwak, Nojun

arXiv.org Artificial Intelligence | Feb-5-2020

Feature maps contain rich information about image intensity and spatial correlation. However, previous online knowledge distillation methods only utilize the class probabilities. Thus, in this paper, we propose an online knowledge distillation method that transfers not only the knowledge of the class probabilities but also that of the feature map using the adversarial training framework. We train multiple networks simultaneously by employing discriminators to distinguish the feature map distributions of different networks. Each network has its corresponding discriminator, which classifies the network's own feature map as fake while classifying that of the other network as real. By training a network to fool the corresponding discriminator, it can learn the other network's feature map distribution. We show that our method performs better than conventional direct alignment methods such as L1 and is more suitable for online distillation. Also, we propose a novel cyclic learning scheme for training more than two networks together. We have applied our method to various network architectures on the classification task and found a significant performance improvement, especially when training a pair consisting of a small network and a large one.
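The real/fake labelling scheme described in the abstract can be sketched as a pair of loss functions. This is a minimal illustration, assuming a standard binary cross-entropy objective and scalar discriminator outputs; the abstract does not state the exact loss, and all function names here are hypothetical.

```python
import math


def bce(prob: float, target: float) -> float:
    """Binary cross-entropy for a single predicted probability."""
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def discriminator_loss(d_own: float, d_other: float) -> float:
    """Discriminator of network k: own feature map is fake (0), the peer's is real (1).

    d_own and d_other are the discriminator's "real" probabilities for the
    feature maps of network k and of the peer network, respectively.
    """
    return bce(d_own, 0.0) + bce(d_other, 1.0)


def network_adversarial_loss(d_own: float) -> float:
    """Network k tries to fool its discriminator into scoring its feature map as real."""
    return bce(d_own, 1.0)


# A discriminator that separates the two networks well has a low loss,
# while the network's adversarial loss is then high, and vice versa.
print(discriminator_loss(0.1, 0.9), network_adversarial_loss(0.1))
print(discriminator_loss(0.9, 0.1), network_adversarial_loss(0.9))
```

In an actual training loop these terms would be computed over batches of feature maps and combined with the usual classification and distillation losses; that wiring is omitted here.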

distillation, feature map, knowledge, (16 more...)

arXiv.org Artificial Intelligence

2002.01775

Country:

Asia > South Korea > Seoul > Seoul (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report (0.50)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Communications > Networks (0.89)
Information Technology > Sensing and Signal Processing > Image Processing (0.88)


Automatic State Abstraction from Demonstration

Cobo, Luis Carlos (Georgia Institute of Technology) | Zang, Peng (Georgia Institute of Technology) | Isbell, Charles Lee, Jr. (Georgia Institute of Technology) | Thomaz, Andrea Lockerd (Georgia Institute of Technology)

AAAI Conferences | Jul-19-2011

Learning from Demonstration (LfD) is a popular technique for building decision-making agents from human help. Traditional LfD methods use demonstrations as training examples for supervised learning, but complex tasks can require more examples than is practical to obtain. We present Abstraction from Demonstration (AfD), a novel form of LfD that uses demonstrations to infer state abstractions and reinforcement learning (RL) methods in those abstract state spaces to build a policy. Empirical results show that AfD is more than an order of magnitude more sample-efficient than just using demonstrations as training examples, and exponentially faster than RL alone.

afd, algorithm, demonstration, (14 more...)

AAAI Conferences

Twenty-Second International Joint Conference on Artificial Intelligence

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.05)
Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
