AITopics | data item

Faster Algorithms for User-Level Private Stochastic Convex Optimization

Neural Information Processing SystemsMar-22-2026, 01:38:43 GMT

We study private stochastic convex optimization (SCO) under user-level differential privacy (DP) constraints. In this setting, there are $n$ users (e.g., cell phones), each possessing $m$ data items (e.g., text messages), and we need to protect the privacy of each user's entire collection of data items. Existing algorithms for user-level DP SCO are impractical in many large-scale machine learning scenarios because: (i) they make restrictive assumptions on the smoothness parameter of the loss function and require the number of users to grow polynomially with the dimension of the parameter space; or (ii) they are prohibitively slow, requiring at least $(mn)^{3/2}$ gradient computations for smooth losses and $(mn)^3$ computations for non-smooth losses. To address these limitations, we provide novel user-level DP algorithms with state-of-the-art excess risk and runtime guarantees, without stringent assumptions. First, we develop a linear-time algorithm with state-of-the-art excess risk (for a non-trivial linear-time algorithm) under a mild smoothness assumption. Our second algorithm applies to arbitrary smooth losses and achieves optimal excess risk in $\approx (mn)^{9/8}$ gradient computations. Third, for non-smooth loss functions, we obtain optimal excess risk in $n^{11/8} m^{5/4}$ gradient computations. Moreover, our algorithms do not require the number of users to grow polynomially with the dimension.

artificial intelligence, machine learning, proceedings, (13 more...)

Neural Information Processing Systems

Technology:

Information Technology > Communications > Mobile (0.59)
Information Technology > Artificial Intelligence > Machine Learning (0.39)

Add feedback

BourGAN: Generative Networks with Metric Embeddings

Chang Xiao, Peilin Zhong, Changxi Zheng

Neural Information Processing SystemsMar-15-2026, 20:34:54 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, arxiv preprint arxiv, machine learning, (13 more...)

Neural Information Processing Systems

Country: North America (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

Add feedback

BourGAN: Generative Networks with Metric Embeddings

Chang Xiao, Peilin Zhong, Changxi Zheng

Neural Information Processing SystemsFeb-13-2026, 14:36:21 GMT

Neural Information Processing Systems http://nips.cc/

arxiv preprint arxiv, latent space, neural information processing system, (11 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
Asia > Middle East > Israel (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

Add feedback

8cf04c64d1734e5f7e63418a2a4d49de-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing SystemsFeb-10-2026, 16:45:20 GMT

benchmark, dataset, ontologue, (16 more...)

Neural Information Processing Systems

Country: North America > United States > Washington > King County > Seattle (0.04)

Genre: Research Report (0.93)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Add feedback

07a4e20a7bbeeb7a736682b26b16ebe8-Paper.pdf

Neural Information Processing SystemsFeb-7-2026, 08:56:55 GMT

algorithm, barycenter, ensemble, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Nevada (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Asia > Middle East > Jordan (0.04)
(13 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)

Add feedback

Corpus Frequencies in Morphological Inflection: Do They Matter?

Sourada, Tomáš, Straková, Jana

arXiv.org Artificial IntelligenceOct-28-2025

The traditional approach to morphological inflection (the task of modifying a base word (lemma) to express grammatical categories) has been, for decades, to consider lexical entries of lemma-tag-form triples uniformly, lacking any information about their frequency distribution. However, in production deployment, one might expect the user inputs to reflect a real-world distribution of frequencies in natural texts. With future deployment in mind, we explore the incorporation of corpus frequency information into the task of morphological inflection along three key dimensions during system development: (i) for train-dev-test split, we combine a lemma-disjoint approach, which evaluates the model's generalization capabilities, with a frequency-weighted strategy to better reflect the realistic distribution of items across different frequency bands in training and test sets; (ii) for evaluation, we complement the standard type accuracy (often referred to simply as accuracy), which treats all items equally regardless of frequency, with token accuracy, which assigns greater weight to frequent words and better approximates performance on running text; (iii) for training data sampling, we introduce a method novel in the context of inflection, frequency-aware training, which explicitly incorporates word frequency into the sampling process. We show that frequency-aware training outperforms uniform sampling in 26 out of 43 languages.

accuracy, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2510.23131

Country:

Europe (1.00)
North America > Canada (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

07a4e20a7bbeeb7a736682b26b16ebe8-Paper.pdf

Neural Information Processing SystemsOct-1-2025, 23:58:29 GMT

artificial intelligence, data mining, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Long Beach (0.04)
Asia > Middle East > Jordan (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(14 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)

Add feedback

8cf04c64d1734e5f7e63418a2a4d49de-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing SystemsAug-16-2025, 21:41:12 GMT

benchmark, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: North America > United States > Washington > King County > Seattle (0.14)

Genre: Research Report (0.93)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
(3 more...)

Add feedback

Understanding Human Daily Experience Through Continuous Sensing: ETRI Lifelog Dataset 2024

Oh, Se Won, Jeong, Hyuntae, Chung, Seungeun, Lim, Jeong Mook, Noh, Kyoung Ju, Lee, Sunkyung, Jung, Gyuwon

arXiv.org Artificial IntelligenceAug-7-2025

--Improving human health and well-being requires an accurate and effective understanding of an individual's physical and mental state throughout daily life. T o support this goal, we utilized smartphones, smartwatches, and sleep sensors to collect data passively and continuously for 24 hours a day, with minimal interference to participants' usual behavior, enabling us to gather quantitative data on daily behaviors and sleep activities across multiple days. Additionally, we gathered subjective self-reports of participants' fatigue, stress, and sleep quality through surveys conducted immediately before and after sleep. This comprehensive lifelog dataset is expected to provide a foundational resource for exploring meaningful insights into human daily life and lifestyle patterns, and a portion of the data has been anonymized and made publicly available for further research. In this paper, we introduce the ETRI Lifelog Dataset 2024, detailing its structure and presenting potential applications, such as using machine learning models to predict sleep quality and stress. Human daily life consists of a complex interrelation of different activities and physiological states, spanning daytime behavior and nighttime sleep.

artificial intelligence, etri lifelog dataset 2024, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2508.03698

Country: Asia (0.28)

Genre:

Research Report > New Finding (0.69)
Research Report > Experimental Study (0.47)

Industry: Health & Medicine > Consumer Health (1.00)

Technology:

Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

Expediting data extraction using a large language model (LLM) and scoping review protocol: a methodological study within a complex scoping review

Stewart-Evans, James, Wilson, Emma, Langley, Tessa, Prayle, Andrew, Hands, Angela, Exley, Karen, Leonardi-Bee, Jo

arXiv.org Artificial IntelligenceJul-10-2025

The data extraction stages of reviews are resource-intensive, and researchers may seek to expediate data extraction using online (large language models) LLMs and review protocols. Claude 3.5 Sonnet was used to trial two approaches that used a review protocol to prompt data extraction from 10 evidence sources included in a case study scoping review. A protocol-based approach was also used to review extracted data. Limited performance evaluation was undertaken which found high accuracy for the two extraction approaches (83.3% and 100%) when extracting simple, well-defined citation details; accuracy was lower (9.6% and 15.8%) when extracting more complex, subjective data items. Considering all data items, both approaches had precision >90% but low recall (<25%) and F1 scores (<40%). The context of a complex scoping review, open response types and methodological approach likely impacted performance due to missed and misattributed data. LLM feedback considered the baseline extraction accurate and suggested minor amendments: four of 15 (26.7%) to citation details and 8 of 38 (21.1%) to key findings data items were considered to potentially add value. However, when repeating the process with a dataset featuring deliberate errors, only 2 of 39 (5%) errors were detected. Review-protocol-based methods used for expediency require more robust performance evaluation across a range of LLMs and review contexts with comparison to conventional prompt engineering approaches. We recommend researchers evaluate and report LLM performance if using them similarly to conduct data extraction or review extracted data. LLM feedback contributed to protocol adaptation and may assist future review protocol drafting.

extraction, large language model, natural language, (18 more...)

arXiv.org Artificial Intelligence

2507.06623

Genre: