AITopics | context extraction

Collaborating Authors

context extraction

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Grounded AI for Code Review: Resource-Efficient Large-Model Serving in Enterprise Pipelines

Mandal, Sayan, Jiang, Hua

arXiv.org Artificial IntelligenceOct-14-2025

Automated code review adoption lags in compliance-heavy settings, where static analyzers produce high-volume, low-rationale outputs, and naive LLM use risks hallucination and incurring cost overhead. We present a production system for grounded, PR-native review that pairs static-analysis findings with AST-guided context extraction and a single-GPU, on-demand serving stack (quantized open-weight model, multi-tier caching) to deliver concise explanations and remediation guidance. Evaluated on safety-oriented C/C++ standards, the approach achieves sub-minute median first-feedback (offline p50 build+LLM 59.8s) while maintaining competitive violation reduction and lower violation rates versus larger proprietary models. The architecture is decoupled: teams can adopt the grounding/prompting layer or the serving layer independently. A small internal survey (n=8) provides directional signals of reduced triage effort and moderate perceived grounding, with participants reporting fewer human review iterations. We outline operational lessons and limitations, emphasizing reproducibility, auditability, and pathways to broader standards and assisted patching.

artificial intelligence, large language model, natural language, (19 more...)

arXiv.org Artificial Intelligence

2510.1029

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Persistent Pre-Training Poisoning of LLMs

Zhang, Yiming, Rando, Javier, Evtimov, Ivan, Chi, Jianfeng, Smith, Eric Michael, Carlini, Nicholas, Tramèr, Florian, Ippolito, Daphne

arXiv.org Artificial IntelligenceOct-17-2024

Large language models are pre-trained on uncurated text datasets consisting of trillions of tokens scraped from the Web. Prior work has shown that: (1) web-scraped pre-training datasets can be practically poisoned by malicious actors; and (2) adversaries can compromise language models after poisoning fine-tuning datasets. Our work evaluates for the first time whether language models can also be compromised during pre-training, with a focus on the persistence of pre-training attacks after models are fine-tuned as helpful and harmless chatbots (i.e., after SFT and DPO). We pre-train a series of LLMs from scratch to measure the impact of a potential poisoning adversary under four different attack objectives (denial-of-service, belief manipulation, jailbreaking, and prompt stealing), and across a wide range of model sizes (from 600M to 7B). Our main result is that poisoning only 0.1% of a model's pre-training dataset is sufficient for three out of four attacks to measurably persist through post-training. Moreover, simple attacks like denial-of-service persist through post-training with a poisoning rate of only 0.001%.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2410.13722

Country:

North America > United States > California (0.05)
North America > United States > Texas (0.04)
Europe > Russia (0.04)
(8 more...)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Bandhakavi

AAAI ConferencesFeb-8-2022, 09:57:55 GMT

Sentiment analysis is the computational study of opinionated text and is becoming increasing important to online commercial applications. However, the majority of current approaches determine sentiment by attempting to detect the overall polarity of a sentence, paragraph, or text window, but without any knowledge about the entities mentioned (e.g.

bandhakavi, classification performance, context extraction, (2 more...)

AAAI Conferences

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.76)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.76)

Add feedback