A Hierarchical Framework for Measuring Scientific Paper Innovation via Large Language Models
Tan, Hongming, Zhan, Shaoxiong, Jia, Fengwei, Zheng, Hai-Tao, Chan, Wai Kin
Measuring scientific paper innovation is both important and challenging. Existing content-based methods often overlook the full-paper context, fail to capture the full scope of innovation, and lack generalization. We propose HSPIM, a hierarchical and training-free framework based on large language models (LLMs). It introduces a Paper-to-Sections-to-QAs decomposition to assess innovation. We segment the text by section titles and use zero-shot LLM prompting to implement section classification, question-answering (QA) augmentation, and weighted innovation scoring. The generated QA pairs focus on section-level innovation and serve as additional context to improve LLM scoring. For each chunk, the LLM outputs a novelty score and a confidence score. We use the confidence scores as weights to aggregate the novelty scores into a paper-level innovation score. To further improve performance, we propose a two-layer question structure consisting of common and section-specific questions, and apply a genetic algorithm to optimize the question-prompt combinations. Furthermore, under this fine-grained view of innovation, we extend HSPIM to HSPIM$^+$, which generates novelty, contribution, and feasibility scores with respective confidence scores. Comprehensive experiments on scientific conference paper datasets show that HSPIM outperforms baseline methods in effectiveness, generalization, and interpretability. Demo code is available at https://github.com/Jasaxion/HSPIM.
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
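The confidence-weighted aggregation step described in the HSPIM abstract above can be sketched in a few lines. This is a hypothetical illustration only; the function name and the novelty scale are assumptions, not the authors' implementation:

```python
def aggregate_innovation(section_scores):
    """Combine per-section (novelty, confidence) pairs into a paper-level score.

    Confidence scores act as weights, as described in the abstract; the
    novelty scale used here (0-10) is an assumption for illustration.
    """
    total_conf = sum(conf for _, conf in section_scores)
    if total_conf == 0:
        return 0.0
    return sum(nov * conf for nov, conf in section_scores) / total_conf

# Three section chunks scored (novelty, confidence) by the LLM:
paper_score = aggregate_innovation([(7, 0.9), (5, 0.4), (9, 0.8)])  # ~7.38
```

HSPIM$^+$ would apply the same weighting separately to its novelty, contribution, and feasibility scores, each with its own confidence weights.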
We thank the reviewers for very helpful comments. This letter addresses the major questions raised by the reviewers.

1. Learning rates. To address the reviewers' comments on learning rates, we will add results with easy-to-implement ... More specifically, this requires two changes: (1) the epoch length needs to keep increasing (i.e., at the end of every Q-update) and (2) choosing δ to be sufficiently small. We will add this in the revision.

Proof of Theorem 5. We sketch the proof for the piecewise choice (1), which follows easily from our Theorem 1. We will clarify this in the revision to avoid confusion. Given that |S||A| is often enormous in practice, our theory potentially leads to a notable improvement.

"...": See the response above on "learning rates".
We thank all reviewers for very helpful comments. This letter addresses the major questions raised by the reviewers. Please see the responses below for "distribution assumptions," "global null and group of coefficients," and "more discussions." We will correct our references and typos in the table, and we shall elaborate more in our revised version to make these points clearer. To address this issue, we provided high-probability guarantees in Sections 2 and 3. We will also elaborate more in our revised version on the case when the eigen-spectra are not as nicely behaved.
DRBench: A Realistic Benchmark for Enterprise Deep Research
Abaskohi, Amirhossein, Chen, Tianyi, Muñoz-Mármol, Miguel, Fox, Curtis, Ramesh, Amrutha Varshini, Marcotte, Étienne, Lù, Xing Han, Chapados, Nicolas, Gella, Spandana, Pal, Christopher, Drouin, Alexandre, Laradji, Issam H.
We introduce DRBench, a benchmark for evaluating AI agents on complex, open-ended deep research tasks in enterprise settings. Unlike prior benchmarks that focus on simple questions or web-only queries, DRBench evaluates agents on multi-step queries (for example, "What changes should we make to our product roadmap to ensure compliance with this standard?") that require identifying supporting facts from both the public web and private company knowledge base. Each task is grounded in realistic user personas and enterprise context, spanning a heterogeneous search space that includes productivity software, cloud file systems, emails, chat conversations, and the open web. Tasks are generated through a carefully designed synthesis pipeline with human-in-the-loop verification, and agents are evaluated on their ability to recall relevant insights, maintain factual accuracy, and produce coherent, well-structured reports. We release 15 deep research tasks across 10 domains, such as Sales, Cybersecurity, and Compliance. We demonstrate the effectiveness of DRBench by evaluating diverse DR agents across open- and closed-source models (such as GPT, Llama, and Qwen) and DR strategies, highlighting their strengths, weaknesses, and the critical path for advancing enterprise deep research. Code is available at https://github.com/ServiceNow/drbench.
- North America > United States (0.14)
- Oceania > Australia > Victoria > Bass Strait (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- North America > Canada > British Columbia (0.04)
- Research Report > New Finding (1.00)
- Workflow (0.93)
- Retail (1.00)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)
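The DRBench abstract above says agents are evaluated on their ability to recall relevant insights. As a toy illustration of that kind of metric, here is a naive substring-based insight-recall check; this is not DRBench's actual scorer, and all names are hypothetical:

```python
def insight_recall(report_text, gold_insights):
    """Fraction of gold insight phrases that appear verbatim in the report.

    A deliberately naive, case-insensitive substring matcher; a real
    evaluator would use semantic matching rather than exact phrases.
    """
    if not gold_insights:
        return 0.0
    report = report_text.lower()
    found = sum(1 for insight in gold_insights if insight.lower() in report)
    return found / len(gold_insights)

recall = insight_recall(
    "Revenue grew 10% in APAC, driven by enterprise contracts.",
    ["revenue grew 10%", "churn dropped in EMEA"],
)  # 0.5: one of the two gold insights is present
```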
We will add a series of numerical experiments to demonstrate the minimax optimality of the model-...
We thank all reviewers for very helpful comments. This letter addresses several major questions raised by the reviewers. Indeed, reward perturbation is introduced merely to facilitate analysis; take Section 4.3 of the arXiv version as an example. We will elucidate the motivation and intuition of reward perturbation earlier on in the revised paper. We understand from the reviewer's comment that there might be confusion in our presentation; this will be made clear in the final paper.
... some specific questions, but will incorporate all feedback in the final version.
We thank the reviewers for their careful reading and insightful comments. We will add this in the final version, and we will consider Transformer-based models to further shrink the search space. Reviewer question: "The number of nodes in the graphs seems to be quite low (~200 for GNMT). Is there some manual grouping operation performed on the computational graph?"
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.37)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.32)
'Do not pet': Why are robot dogs patrolling Mar-A-Lago?
Video of Spot strutting around the property has gone viral on TikTok - where reactions range from calling them cool and cute, to creepy - and become fodder for jokes on American late night television. But its mission is no laughing matter. "Safeguarding the president-elect is a top priority," said Anthony Guglielmi, US Secret Service chief of communications, in a statement to the BBC. In the months leading up to the US presidential election, Trump was the target of two apparent assassination attempts. The first took place at a July rally in Butler, Pennsylvania and the other occurred at the Mar-a-Lago golf course in September.
- North America > United States > Florida > Palm Beach County > Palm Beach (0.66)
- North America > United States > Pennsylvania (0.30)
- Information Technology > Communications > Social Media (0.92)
- Information Technology > Artificial Intelligence > Robots (0.62)
Are VLMs Really Blind?
Singh, Ayush, Gupta, Mansi, Garg, Shivank
Vision Language Models excel at a wide range of complex tasks, including Optical Character Recognition (OCR), Visual Question Answering (VQA), and advanced geometric reasoning. However, these models fail to perform well on low-level basic visual tasks that are especially easy for humans. Our goal in this work is to determine whether these models are truly "blind" to geometric reasoning or whether there are ways to enhance their capabilities in this area. We present a novel automatic pipeline designed to extract key information from images in response to specific questions. Instead of relying on direct VQA alone, we use question-derived keywords to create a caption that highlights important details in the image related to the question. This caption is then used by a language model to provide a precise answer to the question without requiring external fine-tuning.
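The caption-then-answer pipeline in the abstract above can be sketched as follows. The model calls (`vlm_caption`, `llm_answer`) are placeholders for real VLM/LLM APIs, and the keyword extractor is a deliberately simple stand-in:

```python
STOPWORDS = {"what", "is", "the", "a", "an", "of", "in", "how", "many", "are", "there"}

def extract_keywords(question):
    """Derive focus keywords from the question by dropping common stopwords."""
    words = [w.strip("?.,").lower() for w in question.split()]
    return [w for w in words if w and w not in STOPWORDS]

def answer(image, question, vlm_caption, llm_answer):
    """Caption the image with question-derived keywords, then answer from the caption.

    `vlm_caption(image, focus=...)` and `llm_answer(caption, question)` are
    hypothetical callables standing in for the VLM and LLM, respectively.
    """
    keywords = extract_keywords(question)
    caption = vlm_caption(image, focus=keywords)  # caption highlights question-relevant details
    return llm_answer(caption, question)          # the LLM answers from the caption alone
```

Because the language model only ever sees the focused caption, no external fine-tuning of either model is required, matching the claim in the abstract.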
Question-Answering Based Summarization of Electronic Health Records using Retrieval Augmented Generation
Saba, Walid, Wendelken, Suzanne, Shanahan, James
Summarization of electronic health records (EHRs) can substantially reduce 'screen time' for both patients and medical personnel. In recent years, summarization of EHRs has employed machine learning pipelines using state-of-the-art neural models. However, these models have produced less than adequate results, which is attributed to the difficulty of obtaining sufficient annotated data for training. Moreover, the requirement to consider the entire content of an EHR has resulted in poor performance, because the attention mechanisms in modern large language models (LLMs) add quadratic complexity in the size of the input. We propose here a method that mitigates these shortcomings by combining semantic search, retrieval augmented generation (RAG), and question answering using the latest LLMs. In our approach, summarization is the extraction of answers to specific questions that are deemed important by subject-matter experts (SMEs). Our approach is quite efficient; it requires minimal to no training, does not suffer from the 'hallucination' problem of LLMs, and ensures diversity, since the summary contains not repeated content but diverse answers to specific questions.
- North America > United States > Maine > Cumberland County > Portland (0.06)
- Asia > Middle East > Israel (0.05)
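The question-driven RAG summarization described in the EHR abstract above can be sketched as retrieve-then-generate per SME question. This is a minimal illustration, not the authors' system: `embed` and `generate` are placeholder callables for an embedding model and an LLM, and the retrieval is plain cosine similarity:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def summarize_ehr(ehr_chunks, questions, embed, generate, top_k=3):
    """For each SME question, retrieve the top-k relevant chunks and answer from them.

    `embed(text) -> vector` and `generate(question, context) -> answer` are
    hypothetical stand-ins for an embedding model and an LLM.
    """
    chunk_vecs = [embed(chunk) for chunk in ehr_chunks]
    summary = {}
    for question in questions:
        qv = embed(question)
        ranked = sorted(range(len(ehr_chunks)),
                        key=lambda i: cosine(qv, chunk_vecs[i]),
                        reverse=True)
        context = "\n".join(ehr_chunks[i] for i in ranked[:top_k])
        summary[question] = generate(question, context)
    return summary
```

Since each answer is generated only from the chunks retrieved for its own question, the resulting summary naturally avoids repeated content across questions, which is the diversity property the abstract claims.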