AITopics | query group

Collaborating Authors

query group

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Referring Human Pose and Mask Estimation in the Wild Bo Miao

Neural Information Processing SystemsFeb-12-2026, 22:31:07 GMT

This new task holds significant potential for human-centric applications such as assistive robotics and sports analysis.

large language model, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Oceania > Australia > Western Australia (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
(2 more...)

Add feedback

Referring Human Pose and Mask Estimation in the Wild Bo Miao

Neural Information Processing SystemsOct-10-2025, 02:13:45 GMT

This new task holds significant potential for human-centric applications such as assistive robotics and sports analysis.

estimation, pose estimation, segmentation, (14 more...)

Neural Information Processing Systems

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Oceania > Australia > Western Australia (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
(2 more...)

Add feedback

JurisTCU: A Brazilian Portuguese Information Retrieval Dataset with Query Relevance Judgments

Fernandes, Leandro Carísio, Ribeiro, Leandro dos Santos, de Castro, Marcos Vinícius Borela, Pacheco, Leonardo Augusto da Silva, Sandes, Edans Flávius de Oliveira

arXiv.org Artificial IntelligenceMar-11-2025

This paper introduces JurisTCU, a Brazilian Portuguese dataset for legal information retrieval (LIR). The dataset is freely available and consists of 16,045 jurisprudential documents from the Brazilian Federal Court of Accounts, along with 150 queries annotated with relevance judgments. It addresses the scarcity of Portuguese-language LIR datasets with query relevance annotations. The queries are organized into three groups: real user keyword-based queries, synthetic keyword-based queries, and synthetic question-based queries. Relevance judgments were produced through a hybrid approach combining LLM-based scoring with expert domain validation. We used JurisTCU in 14 experiments using lexical search (document expansion methods) and semantic search (BERT-based and OpenAI embeddings). We show that the document expansion methods significantly improve the performance of standard BM25 search on this dataset, with improvements exceeding 45% in P@10, R@10, and nDCG@10 metrics when evaluating short keyword-based queries. Among the embedding models, the OpenAI models produced the best results, with improvements of approximately 70% in P@10, R@10, and nDCG@10 metrics for short keyword-based queries, suggesting that these dense embeddings capture semantic relationships in this domain, surpassing the reliance on lexical terms. Besides offering a dataset for the Portuguese-language IR research community, suitable for evaluating search systems, the results also contribute to enhancing a search system highly relevant to Brazilian citizens.

bm25, dataset, query, (16 more...)

arXiv.org Artificial Intelligence

2503.08379

Country:

South America > Brazil > Federal District > Brasília (0.04)
South America > Brazil > Rio Grande do Sul (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry:

Law (1.00)
Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.47)

Add feedback

Question-Answering System Extracts Information on Injection Drug Use from Clinical Notes

Mahbub, Maria, Goethert, Ian, Danciu, Ioana, Knight, Kathryn, Srinivasan, Sudarshan, Tamang, Suzanne, Rozenberg-Ben-Dror, Karine, Solares, Hugo, Martins, Susana, Trafton, Jodie, Begoli, Edmon, Peterson, Gregory

arXiv.org Artificial IntelligenceDec-28-2023

Background: Injection drug use (IDU) is a dangerous health behavior that increases mortality and morbidity. Identifying IDU early and initiating harm reduction interventions can benefit individuals at risk. However, extracting IDU behaviors from patients' electronic health records (EHR) is difficult because there is no International Classification of Disease (ICD) code and the only place IDU information can be indicated is unstructured free-text clinical notes. Although natural language processing can efficiently extract this information from unstructured data, there are no validated tools. Methods: To address this gap in clinical information, we design and demonstrate a question-answering (QA) framework to extract information on IDU from clinical notes. Our framework involves two main steps: (1) generating a gold-standard QA dataset and (2) developing and testing the QA model. We utilize 2323 clinical notes of 1145 patients sourced from the VA Corporate Data Warehouse to construct the gold-standard dataset for developing and evaluating the QA model. We also demonstrate the QA model's ability to extract IDU-related information on temporally out-of-distribution data. Results: Here we show that for a strict match between gold-standard and predicted answers, the QA model achieves 51.65% F1 score. For a relaxed match between the gold-standard and predicted answers, the QA model obtains 78.03% F1 score, along with 85.38% Precision and 79.02% Recall scores. Moreover, the QA model demonstrates consistent performance when subjected to temporally out-of-distribution data. Conclusions: Our study introduces a QA framework designed to extract IDU information from clinical notes, aiming to enhance the accurate and efficient detection of people who inject drugs, extract relevant information, and ultimately facilitate informed patient care.

clinical note, information, qa model, (15 more...)

arXiv.org Artificial Intelligence

2305.08777

Country:

North America > United States > Tennessee > Knox County > Knoxville (0.14)
North America > United States > Tennessee > Anderson County > Oak Ridge (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(10 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Addiction Disorder (1.00)
Health & Medicine > Health Care Technology > Medical Record (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Pretrained deep models outperform GBDTs in Learning-To-Rank under label scarcity

Hou, Charlie, Thekumparampil, Kiran Koshy, Shavlovsky, Michael, Fanti, Giulia, Dattatreya, Yesh, Sanghavi, Sujay

arXiv.org Artificial IntelligenceNov-9-2023

While deep learning (DL) models are state-of-the-art in text and image domains, they have not yet consistently outperformed Gradient Boosted Decision Trees (GBDTs) on tabular Learning-To-Rank (LTR) problems. Most of the recent performance gains attained by DL models in text and image tasks have used unsupervised pretraining, which exploits orders of magnitude more unlabeled data than labeled data. To the best of our knowledge, unsupervised pretraining has not been applied to the LTR problem, which often produces vast amounts of unlabeled data. In this work, we study whether unsupervised pretraining of deep models can improve LTR performance over GBDTs and other non-pretrained models. By incorporating simple design choices--including SimCLR-Rank, an LTR-specific pretraining loss--we produce pretrained deep learning models that consistently (across datasets) outperform GBDTs (and other non-pretrained rankers) in the case where there is more unlabeled data than labeled data. This performance improvement occurs not only on average but also on outlier queries. We base our empirical conclusions off of experiments on (1) public benchmark tabular LTR datasets, and (2) a large industry-scale proprietary ranking dataset. Code is provided at https://anonymous.4open.science/r/ltr-pretrain-0DAD/README.md.

dataset, query group, simclr-rank, (12 more...)

arXiv.org Artificial Intelligence

2308.00177

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Retail (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Toward Understanding Privileged Features Distillation in Learning-to-Rank

Yang, Shuo, Sanghavi, Sujay, Rahmanian, Holakou, Bakus, Jan, Vishwanathan, S. V. N.

arXiv.org Artificial IntelligenceSep-19-2022

In learning-to-rank problems, a privileged feature is one that is available during model training, but not available at test time. Such features naturally arise in merchandised recommendation systems; for instance, "user clicked this item" as a feature is predictive of "user purchased this item" in the offline data, but is clearly not available during online serving. Another source of privileged features is those that are too expensive to compute online but feasible to be added offline. Privileged features distillation (PFD) refers to a natural idea: train a "teacher" model using all features (including privileged ones) and then use it to train a "student" model that does not use the privileged features. In this paper, we first study PFD empirically on three public ranking datasets and an industrial-scale ranking problem derived from Amazon's logs. We show that PFD outperforms several baselines (no-distillation, pretraining-finetuning, self-distillation, and generalized distillation) on all these datasets. Next, we analyze why and when PFD performs well via both empirical ablation studies and theoretical analysis for linear models. Both investigations uncover an interesting non-monotone behavior: as the predictive power of a privileged feature increases, the performance of the resulting student model initially increases but then decreases. We show the reason for the later decreasing performance is that a very predictive privileged teacher produces predictions with high variance, which lead to high variance student estimates and inferior testing performance.

artificial intelligence, machine learning, privileged feature, (17 more...)

arXiv.org Artificial Intelligence

2209.08754

Genre: Research Report > New Finding (0.46)

Industry: Education (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Add feedback

Group-based Query Learning for rapid diagnosis in time-critical situations

Bellala, Gowtham, Bhavnani, Suresh, Scott, Clayton

arXiv.org Machine LearningNov-24-2009

In query learning, the goal is to identify an unknown object while minimizing the number of "yes or no" questions (queries) posed about that object. We consider three extensions of this fundamental problem that are motivated by practical considerations in real-world, time-critical identification tasks such as emergency response. First, we consider the problem where the objects are partitioned into groups, and the goal is to identify only the group to which the object belongs. Second, we address the situation where the queries are partitioned into groups, and an algorithm may suggest a group of queries to a human user, who then selects the actual query. Third, we consider the problem of query learning in the presence of persistent query noise, and relate it to group identification. To address these problems we show that a standard algorithm for query learning, known as the splitting algorithm or generalized binary search, may be viewed as a generalization of Shannon-Fano coding. We then extend this result to the group-based settings, leading to new algorithms. The performance of our algorithms is demonstrated on simulated data and on a database used by first responders for toxic chemical identification.

artificial intelligence, machine learning, optimization problem, (18 more...)

arXiv.org Machine Learning

0911.4511

Country: North America > United States > Michigan (0.28)

Genre: Research Report (0.63)

Industry: