AITopics | kite

Collaborating Authors

kite

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

KITE: A Benchmark for Evaluating Korean Instruction-Following Abilities in Large Language Models

Kim, Dongjun, Park, Chanhee, Park, Chanjun, Lim, Heuiseok

arXiv.org Artificial IntelligenceOct-20-2025

The instruction-following capabilities of large language models (LLMs) are pivotal for numerous applications, from conversational agents to complex reasoning systems. However, current evaluations predominantly focus on English models, neglecting the linguistic and cultural nuances of other languages. Specifically, Korean, with its distinct syntax, rich morphological features, honorific system, and dual numbering systems, lacks a dedicated benchmark for assessing open-ended instruction-following capabilities. To address this gap, we introduce the Korean Instruction-following Task Evaluation (KITE), a comprehensive benchmark designed to evaluate both general and Korean-specific instructions. Unlike existing Korean benchmarks that focus mainly on factual knowledge or multiple-choice testing, KITE directly targets diverse, open-ended instruction-following tasks. Our evaluation pipeline combines automated metrics with human assessments, revealing performance disparities across models and providing deeper insights into their strengths and weaknesses. By publicly releasing the KITE dataset and code, we aim to foster further research on culturally and linguistically inclusive LLM development and inspire similar endeavors for other underrepresented languages.

artificial intelligence, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2510.15558

Country: Asia > Middle East > Israel (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Education (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Indigenous knowledge meets artificial intelligence

MIT Technology ReviewAug-15-2025, 10:00:00 GMT

Suzanne Kite's AI art installations, for example, model a Lakota framework of data sovereignty: intelligence that emerges only through reciprocal, consensual interaction. Unlike systems that assume user consent via opaque terms of service, her kinetic machines require the viewer's physical presence--and give something back in return. I know exactly what I did to train it. It's not a large model but a small and intimate one," Kite says. "I'm not particularly interested in making the most technologically advanced anything.

artist, extraction, indigenous knowledge meet artificial intelligence, (1 more...)

MIT Technology Review

Country:

North America > United States > New York (0.08)
North America > United States > California (0.08)

Technology: Information Technology > Artificial Intelligence (1.00)

Add feedback

Enhancing Generalization in Data-free Quantization via Mixup-class Prompting

Park, Jiwoong, Lee, Chaeun, Choi, Yongseok, Park, Sein, Hong, Deokki, Choi, Jungwook

arXiv.org Artificial IntelligenceJul-30-2025

Post-training quantization (PTQ) improves efficiency but struggles with limited calibration data, especially under privacy constraints. Data-free quantization (DFQ) mitigates this by generating synthetic images using generative models such as generative adversarial networks (GANs) and text-conditioned latent diffusion models (LDMs), while applying existing PTQ algorithms. However, the relationship between generated synthetic images and the generalizability of the quantized model during PTQ remains underexplored. Without investigating this relationship, synthetic images generated by previous prompt engineering methods based on single-class prompts suffer from issues such as polysemy, leading to performance degradation. We propose \textbf{mixup-class prompt}, a mixup-based text prompting strategy that fuses multiple class labels at the text prompt level to generate diverse, robust synthetic data. This approach enhances generalization, and improves optimization stability in PTQ. We provide quantitative insights through gradient norm and generalization error analysis. Experiments on convolutional neural networks (CNNs) and vision transformers (ViTs) show that our method consistently outperforms state-of-the-art DFQ methods like GenQ. Furthermore, it pushes the performance boundary in extremely low-bit scenarios, achieving new state-of-the-art accuracy in challenging 2-bit weight, 4-bit activation (W2A4) quantization.

artificial intelligence, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2507.21947

Country: Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

Foreign-Object Detection in High-Voltage Transmission Line Based on Improved YOLOv8m

Wang, Zhenyue, Yuan, Guowu, Zhou, Hao, Ma, Yi, Ma, Yutang

arXiv.org Artificial IntelligenceFeb-10-2025

The safe operation of high-voltage transmission lines ensures the power grid's security. Various foreign objects attached to the transmission lines, such as balloons, kites and nesting birds, can significantly affect the safe and stable operation of high-voltage transmission lines. With the advancement of computer vision technology, periodic automatic inspection of foreign objects is efficient and necessary. Existing detection methods have low accuracy because foreign objects at-tached to the transmission lines are complex, including occlusions, diverse object types, significant scale variations, and complex backgrounds. In response to the practical needs of the Yunnan Branch of China Southern Power Grid Co., Ltd., this paper proposes an improved YOLOv8m-based model for detecting foreign objects on transmission lines. Experiments are conducted on a dataset collected from Yunnan Power Grid. The proposed model enhances the original YOLOv8m by in-corporating a Global Attention Module (GAM) into the backbone to focus on occluded foreign objects, replacing the SPPF module with the SPPCSPC module to augment the model's multiscale feature extraction capability, and introducing the Focal-EIoU loss function to address the issue of high- and low-quality sample imbalances. These improvements accelerate model convergence and enhance detection accuracy. The experimental results demonstrate that our proposed model achieves a 2.7% increase in mAP_0.5, a 4% increase in mAP_0.5:0.95, and a 6% increase in recall.

artificial intelligence, machine learning, transmission line, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.3390/app132312775

2502.07175

Country:

Asia > China > Yunnan Province > Kunming (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(9 more...)

Genre: Research Report > New Finding (0.66)

Industry: Energy > Power Industry (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Harvesting energy from turbulent winds with Reinforcement Learning

Basile, Lorenzo, Berni, Maria Grazia, Celani, Antonio

arXiv.org Artificial IntelligenceDec-18-2024

Airborne Wind Energy (AWE) is an emerging technology designed to harness the power of high-altitude winds, offering a solution to several limitations of conventional wind turbines. AWE is based on flying devices (usually gliders or kites) that, tethered to a ground station and driven by the wind, convert its mechanical energy into electrical energy by means of a generator. Such systems are usually controlled by manoeuvering the kite so as to follow a predefined path prescribed by optimal control techniques, such as model-predictive control. These methods are strongly dependent on the specific model at use and difficult to generalize, especially in unpredictable conditions such as the turbulent atmospheric boundary layer. Our aim is to explore the possibility of replacing these techniques with an approach based on Reinforcement Learning (RL). Unlike traditional methods, RL does not require a predefined model, making it robust to variability and uncertainty. Our experimental results in complex simulated environments demonstrate that AWE agents trained with RL can effectively extract energy from turbulent flows, relying on minimal local information about the kite orientation and speed relative to the wind.

kite, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2412.13961

Country: Europe > Italy (0.14)

Genre: Research Report (0.82)

Industry:

Energy > Renewable > Wind (0.89)
Energy > Oil & Gas > Upstream (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Retrieval-Augmented Semantic Parsing: Using Large Language Models to Improve Generalization

Zhang, Xiao, Meng, Qianru, Bos, Johan

arXiv.org Artificial IntelligenceDec-13-2024

Open-domain semantic parsing remains a challenging task, as models often rely on heuristics and struggle to handle unseen concepts. In this paper, we investigate the potential of large language models (LLMs) for this task and introduce Retrieval-Augmented Semantic Parsing (RASP), a simple yet effective approach that integrates external lexical knowledge into the parsing process. Our experiments not only show that LLMs outperform previous encoder-decoder baselines for semantic parsing, but that RASP further enhances their ability to predict unseen concepts, nearly doubling the performance of previous models on out-of-distribution concepts. These findings highlight the promise of leveraging large language models and retrieval mechanisms for robust and open-domain semantic parsing.

computational linguistic, large language model, natural language, (15 more...)

arXiv.org Artificial Intelligence

2412.10207

Country:

Asia > Singapore (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > Italy > Liguria > Genoa (0.04)
(16 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Add feedback

SAGA: A Participant-specific Examination of Story Alternatives and Goal Applicability for a Deeper Understanding of Complex Events

Vallurupalli, Sai, Erk, Katrin, Ferraro, Francis

arXiv.org Artificial IntelligenceAug-11-2024

Interpreting and assessing goal driven actions is vital to understanding and reasoning over complex events. It is important to be able to acquire the knowledge needed for this understanding, though doing so is challenging. We argue that such knowledge can be elicited through a participant achievement lens. We analyze a complex event in a narrative according to the intended achievements of the participants in that narrative, the likely future actions of the participants, and the likelihood of goal success. We collect 6.3K high quality goal and action annotations reflecting our proposed participant achievement lens, with an average weighted Fleiss-Kappa IAA of 80%. Our collection contains annotated alternate versions of each narrative. These alternate versions vary minimally from the "original" story, but can license drastically different inferences. Our findings suggest that while modern large language models can reflect some of the goal-based knowledge we study, they find it challenging to fully capture the design and intent behind concerted actions, even when the model pretraining included the data from which we extracted the goal knowledge. We show that smaller models fine-tuned on our dataset can achieve performance surpassing larger models.

alternative story, annotation, participant, (16 more...)

arXiv.org Artificial Intelligence

2408.05793

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
Oceania > New Zealand (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(15 more...)

Genre: Research Report > New Finding (0.86)

Industry:

Health & Medicine (0.93)
Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)

Add feedback

Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs

Yang, Rui, Ding, Ruomeng, Lin, Yong, Zhang, Huan, Zhang, Tong

arXiv.org Artificial IntelligenceJun-14-2024

Reward models trained on human preference data have been proven to be effective for aligning Large Language Models (LLMs) with human intent within the reinforcement learning from human feedback (RLHF) framework. However, the generalization capabilities of current reward models to unseen prompts and responses are limited. This limitation can lead to an unexpected phenomenon known as reward over-optimization, where excessive optimization of rewards results in a decline in actual performance. While previous research has advocated for constraining policy optimization, our study proposes a novel approach to enhance the reward model's generalization ability against distribution shifts by regularizing the hidden states. Specifically, we retain the base model's language model head and incorporate a suite of text-generation losses to preserve the hidden states' text generation capabilities, while concurrently learning a reward head behind the same hidden states. Our experimental results demonstrate that the introduced regularization technique markedly improves the accuracy of learned reward models across a variety of out-of-distribution (OOD) tasks and effectively alleviate the over-optimization issue in RLHF, offering a more reliable and robust preference learning paradigm.

arxiv preprint arxiv, reward model, selamat intere, (14 more...)

arXiv.org Artificial Intelligence

2406.10216

Country:

North America > United States > Tennessee > Davidson County > Nashville (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)
North America > United States > California (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Media > Music (1.00)
Leisure & Entertainment (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Dataflow-Aware PIM-Enabled Manycore Architecture for Deep Learning Workloads

Sharma, Harsh, Narang, Gaurav, Doppa, Janardhan Rao, Ogras, Umit, Pande, Partha Pratim

arXiv.org Artificial IntelligenceMar-27-2024

Processing-in-memory (PIM) has emerged as an enabler for the energy-efficient and high-performance acceleration of deep learning (DL) workloads. Resistive random-access memory (ReRAM) is one of the most promising technologies to implement PIM. However, as the complexity of Deep convolutional neural networks (DNNs) grows, we need to design a manycore architecture with multiple ReRAM-based processing elements (PEs) on a single chip. Existing PIM-based architectures mostly focus on computation while ignoring the role of communication. ReRAM-based tiled manycore architectures often involve many Processing Elements (PEs), which need to be interconnected via an efficient on-chip communication infrastructure. Simply allocating more resources (ReRAMs) to speed up only computation is ineffective if the communication infrastructure cannot keep up with it. In this paper, we highlight the design principles of a dataflow-aware PIM-enabled manycore platform tailor-made for various types of DL workloads. We consider the design challenges with both 2.5D interposer- and 3D integration-enabled architectures.

architecture, chiplet, fabrication cost, (14 more...)

arXiv.org Artificial Intelligence

2403.19073

Country:

North America > United States > Wisconsin > Dane County > Madison (0.14)
North America > United States > Washington > Whitman County > Pullman (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

KITE: Keypoint-Conditioned Policies for Semantic Manipulation

Sundaresan, Priya, Belkhale, Suneel, Sadigh, Dorsa, Bohg, Jeannette

arXiv.org Artificial IntelligenceOct-11-2023

While natural language offers a convenient shared interface for humans and robots, enabling robots to interpret and follow language commands remains a longstanding challenge in manipulation. A crucial step to realizing a performant instruction-following robot is achieving semantic manipulation, where a robot interprets language at different specificities, from high-level instructions like "Pick up the stuffed animal" to more detailed inputs like "Grab the left ear of the elephant." To tackle this, we propose Keypoints + Instructions to Execution (KITE), a two-step framework for semantic manipulation which attends to both scene semantics (distinguishing between different objects in a visual scene) and object semantics (precisely localizing different parts within an object instance). KITE first grounds an input instruction in a visual scene through 2D image keypoints, providing a highly accurate object-centric bias for downstream action inference. Provided an RGB-D scene observation, KITE then executes a learned keypoint-conditioned skill to carry out the instruction. The combined precision of keypoints and parameterized skills enables fine-grained manipulation with generalization to scene and object variations. Empirically, we demonstrate KITE in 3 real-world environments: long-horizon 6-DoF tabletop manipulation, semantic grasping, and a high-precision coffee-making task. In these settings, KITE achieves a 75%, 70%, and 71% overall success rate for instruction-following, respectively. KITE outperforms frameworks that opt for pre-trained visual language models over keypoint-based grounding, or omit skills in favor of end-to-end visuomotor control, all while being trained from fewer or comparable amounts of demonstrations. Supplementary material, datasets, code, and videos can be found on our website: http://tinyurl.com/kite-site.

artificial intelligence, keypoint-conditioned policy, semantic manipulation, (1 more...)

arXiv.org Artificial Intelligence

2306.16605

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Add feedback