AITopics | challenge

Country:

Europe > Switzerland > Vaud > Lausanne (0.04)
Europe > Italy > Lombardy > Milan (0.04)
Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Sports (0.69)
Leisure & Entertainment > Games (0.69)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)

Neural Information Processing Systems | Dec-24-2025, 14:51:50 GMT

Federated Hyperparameter Tuning: Challenges, Baselines, and Connections to Weight-Sharing

Tuning hyperparameters is a crucial but arduous part of the machine learning pipeline. Hyperparameter optimization is even more challenging in federated learning, where models are learned over a distributed network of heterogeneous devices; here, the need to keep data on device and perform local training makes it difficult to efficiently train and evaluate configurations. In this work, we investigate the problem of federated hyperparameter tuning. We first identify key challenges and show how standard approaches may be adapted to form baselines for the federated setting. Then, by making a novel connection to the neural architecture search technique of weight-sharing, we introduce a new method, FedEx, to accelerate federated hyperparameter tuning that is applicable to widely used federated optimization methods such as FedAvg and recent variants. Theoretically, we show that a FedEx variant correctly tunes the on-device learning rate in the setting of online convex optimization across devices. Empirically, we show that FedEx can outperform natural baselines for federated hyperparameter tuning by several percentage points on the Shakespeare, FEMNIST, and CIFAR-10 benchmarks, obtaining higher accuracy using the same training budget.
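
As a rough illustration of the kind of baseline the abstract describes (a standard search over configurations adapted to FedAvg), the sketch below tunes a single on-device learning rate by running a few FedAvg rounds per candidate on a toy scalar regression problem. It is not FedEx and not the authors' code. The names `local_train`, `fedavg_round`, and `tune_lr`, the toy data, and the shortcut of evaluating on the clients' own data instead of held-out data are all assumptions made for brevity.

```python
from typing import Dict, List, Tuple

# One client = the (xs, ys) samples that stay on a single device (toy setup).
Client = Tuple[List[float], List[float]]


def local_train(w: float, client: Client, lr: float, steps: int) -> float:
    """Run a few gradient steps on one client's data for the model y = w * x."""
    xs, ys = client
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w


def fedavg_round(w: float, clients: List[Client], lr: float, steps: int) -> float:
    """One FedAvg round: local training, then a sample-weighted average."""
    total = sum(len(xs) for xs, _ in clients)
    return sum(len(c[0]) / total * local_train(w, c, lr, steps) for c in clients)


def mse(w: float, clients: List[Client]) -> float:
    """Mean squared error over all client samples (assumption: no held-out split)."""
    n = sum(len(xs) for xs, _ in clients)
    return sum((w * x - y) ** 2 for xs, ys in clients for x, y in zip(xs, ys)) / n


def tune_lr(clients: List[Client], candidates: List[float],
            rounds: int = 20, steps: int = 2) -> Tuple[float, Dict[float, float]]:
    """Train one federated model per candidate learning rate and keep the best."""
    losses = {}
    for lr in candidates:
        w = 0.0
        for _ in range(rounds):
            w = fedavg_round(w, clients, lr, steps)
        losses[lr] = mse(w, clients)
    return min(losses, key=losses.get), losses


# Hypothetical toy clients whose data all follow y = 2x.
clients = [([1.0, 2.0], [2.0, 4.0]), ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])]
best_lr, losses = tune_lr(clients, [0.001, 0.05, 1.0])
print(best_lr, losses)
```

Each candidate here costs a full federated training run, which is the expense that motivates methods sharing work across configurations.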

connection, federated hyperparameter tuning, name change

Country: North America > United States > Virginia (0.07)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Communications of the ACM | Sep-24-2025, 20:50:40 GMT

Computing in the Arab World: Innovations, Challenges, and Advances amidst a Rich Mosaic of Scientific Activity

It is with great pleasure that we present this Communications of the ACM Regional Special Section of the Arab World. In this second edition, we highlight some of the region's exciting, innovative, and socially relevant advances in computing and its applications. The Arab world is home to a rich mosaic of cultures, histories, and geographies, stretching from the Atlantic Ocean to the Gulf.

arab world, cacm, communications


Country:

Atlantic Ocean (0.25)
Asia > Middle East > Qatar (0.06)
Africa > Middle East > Egypt > Cairo Governorate > Cairo (0.06)

Industry:

Media (0.33)
Information Technology (0.33)

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > Communications > Social Media (0.84)

Neural Information Processing Systems | May-27-2025, 18:51:00 GMT

Most Influential Subset Selection: Challenges, Promises, and Beyond

How can we attribute the behaviors of machine learning models to their training data? While the classic influence function sheds light on the impact of individual samples, it often fails to capture the more complex and pronounced collective influence of a set of samples. To tackle this challenge, we study the Most Influential Subset Selection (MISS) problem, which aims to identify a subset of training samples with the greatest collective influence. We conduct a comprehensive analysis of the prevailing approaches in MISS, elucidating their strengths and weaknesses. Our findings reveal that influence-based greedy heuristics, a dominant class of algorithms in MISS, can provably fail even in linear regression.
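
To make the problem statement concrete, the sketch below compares a simple top-k heuristic against exhaustive search on a tiny linear regression through the origin. The heuristic ranks samples by their individual leave-one-out effect on a test prediction, used here in place of the influence-function approximation. The function names, the toy data, and the choice of "increase in the test prediction" as the target are assumptions for illustration, not the paper's setup.

```python
from itertools import combinations
from typing import List, Set, Tuple

Point = Tuple[float, float]


def fit(data: List[Point]) -> float:
    """Least-squares slope for the model y = w * x (no intercept)."""
    return sum(x * y for x, y in data) / sum(x * x for x, _ in data)


def effect(data: List[Point], removed: Set[int], x_test: float) -> float:
    """Change in the prediction at x_test after removing the given samples."""
    kept = [p for i, p in enumerate(data) if i not in removed]
    return (fit(kept) - fit(data)) * x_test


def top_k_individual(data: List[Point], k: int, x_test: float) -> Set[int]:
    """Heuristic: pick the k samples with the largest individual effects."""
    scores = {i: effect(data, {i}, x_test) for i in range(len(data))}
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def brute_force(data: List[Point], k: int, x_test: float) -> Set[int]:
    """Exact answer: try every size-k subset (only feasible for tiny data)."""
    best = max(combinations(range(len(data)), k),
               key=lambda s: effect(data, set(s), x_test))
    return set(best)


# Hypothetical toy data.
data = [(1.0, 1.0), (2.0, 2.5), (3.0, 2.0), (4.0, 5.0), (5.0, 4.0)]
heuristic_set = top_k_individual(data, 2, x_test=1.0)
exact_set = brute_force(data, 2, x_test=1.0)
print(heuristic_set, effect(data, heuristic_set, 1.0))
print(exact_set, effect(data, exact_set, 1.0))
```

The exhaustive search can never do worse than the heuristic, since it also considers the heuristic's subset. Any gap between the two printed effects is collective influence that individual scores miss.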

challenge, collective influence, influential subset selection

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing Systems | May-27-2025, 13:04:49 GMT

A Taxonomy of Challenges to Curating Fair Datasets

Despite extensive efforts to create fairer machine learning (ML) datasets, there remains a limited understanding of the practical aspects of dataset curation. Drawing from interviews with 30 ML dataset curators, we present a comprehensive taxonomy of the challenges and trade-offs encountered throughout the dataset curation lifecycle. Our findings underscore overarching issues within the broader fairness landscape that impact data curation. We conclude with recommendations aimed at fostering systemic changes to better facilitate fair dataset curation practices.

challenge, curating fair dataset, taxonomy

Genre: Research Report > New Finding (0.71)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.51)

Neural Information Processing Systems | May-27-2025, 12:03:36 GMT

When Your AIs Deceive You: Challenges of Partial Observability in Reinforcement Learning from Human Feedback

Past analyses of reinforcement learning from human feedback (RLHF) assume that the human evaluators fully observe the environment. What happens when human feedback is based only on partial observations? We formally define two failure cases: deceptive inflation and overjustification. Modeling the human as Boltzmann-rational w.r.t. a belief over trajectories, we prove conditions under which RLHF is guaranteed to result in policies that deceptively inflate their performance, overjustify their behavior to make an impression, or both. Under the new assumption that the human's partial observability is known and accounted for, we then analyze how much information the feedback process provides about the return function.
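
One common way to write down a Boltzmann-rational evaluator who sees only observations is sketched below: the human holds a belief over which trajectory produced each observation, scores each option by its expected return under that belief, and prefers option A with probability given by a logistic function of the difference. The trajectory names, return values, beliefs, and the `beta` parameter are hypothetical, and the paper's formal definitions may differ in detail.

```python
import math
from typing import Dict

Belief = Dict[str, float]  # trajectory name -> probability given the observation


def expected_return(belief: Belief, returns: Dict[str, float]) -> float:
    """Return the human expects, averaging over trajectories they cannot tell apart."""
    return sum(p * returns[traj] for traj, p in belief.items())


def preference_prob(belief_a: Belief, belief_b: Belief,
                    returns: Dict[str, float], beta: float = 1.0) -> float:
    """Probability that a Boltzmann-rational human prefers option A over B."""
    diff = expected_return(belief_a, returns) - expected_return(belief_b, returns)
    return 1.0 / (1.0 + math.exp(-beta * diff))


# Hypothetical true returns of three trajectories.
returns = {"fixed_bug": 1.0, "hid_bug": -1.0, "reported_failure": 0.0}

# Observation "no error shown" is consistent with fixing or hiding the bug.
belief_clean_log = {"fixed_bug": 0.8, "hid_bug": 0.2}
# Observation "error shown" reveals the trajectory exactly.
belief_error_log = {"reported_failure": 1.0}

p = preference_prob(belief_clean_log, belief_error_log, returns)
print(p)
```

In this toy setup the human tends to prefer the clean-looking log even when the underlying trajectory hid the bug, which is the kind of incentive the abstract calls deceptive inflation.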

human feedback, partial observability, reinforcement learning

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence | Mar-14-2025

Human Digital Twins in Personalized Healthcare: An Overview and Future Perspectives

Mokhtari, Melvin

This evolution indicates an expansion from industrial uses into diverse fields, including healthcare [61], [59]. The core functionalities of digital twins include an accurate mirroring of their physical counterparts, capturing all associated processes in a data-driven manner, maintaining a continuous connection that synchronizes with the real-time state of their physical twins, and simulating physical behavior for predictive analysis [85]. In the context of healthcare, a novel extension of this technology manifests in the form of Human Digital Twins (HDTs), designed to provide a comprehensive digital mirror of individual patients. HDTs not only represent physical attributes but also integrate dynamic changes across molecular, physiological, and behavioral dimensions. This advancement is aligned with a shift toward personalized healthcare (PH) paradigms, enabling tailored treatment strategies based on a patient's unique health profile, thereby enhancing preventive, diagnostic, and therapeutic processes in clinical settings [44], [50]. The personalization aspect of HDTs underscores their potential to revolutionize healthcare by facilitating precise and individualized treatment plans that optimize patient outcomes [72]. Although the potential of digital twins in healthcare has garnered much attention, practical applications are still emerging, and the critical literature notes that many implementations remain at an exploratory stage [59]. Notably, institutions like the IEEE Computer Society and Gartner recognize this technology as a pivotal component in the ongoing evolution of healthcare systems that emphasize both precision and personalization [31], [89].

digital twin, healthcare, survey article

2503.11944

Country: North America > Canada > Ontario > Hamilton (0.14)

Genre:

Overview (1.00)
Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Industry:

Information Technology > Services (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Security & Privacy (1.00)
Information Technology > Internet of Things (1.00)

van der Wal, Oskar, Lesci, Pietro, Muller-Eberstein, Max, Saphra, Naomi, Schoelkopf, Hailey, Zuidema, Willem, Biderman, Stella

PolyPythias: Stability and Outliers across Fifty Language Model Pre-Training Runs

arXiv.org Artificial Intelligence | Mar-12-2025

The stability of language model pre-training and its effects on downstream performance are still understudied. Prior work shows that the training process can yield significantly different results in response to slight variations in initial conditions, e.g., the random seed. Crucially, the research community still lacks sufficient resources and tools to systematically investigate pre-training stability, particularly for decoder-only language models. We introduce the PolyPythias, a set of 45 new training runs for the Pythia model suite: 9 new seeds across 5 model sizes, from 14M to 410M parameters, resulting in about 7k new checkpoints that we release. Using these new 45 training runs, in addition to the 5 already available, we study the effects of different initial conditions determined by the seed -- i.e., parameters' initialisation and data order -- on (i) downstream performance, (ii) learned linguistic representations, and (iii) emergence of training phases. In addition to common scaling behaviours, our analyses generally reveal highly consistent training dynamics across both model sizes and initial conditions. Further, the new seeds for each model allow us to identify outlier training runs and delineate their characteristics. Our findings show the potential of using these methods to predict training stability.
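
With ten seeds per model size (nine new plus the one already available), a first pass at spotting outlier runs could be as simple as the z-score check sketched below. This is not the paper's analysis. The function name, the threshold, and the per-seed scores are made-up illustrative values.

```python
import statistics
from typing import Dict, List


def flag_outlier_seeds(scores: Dict[int, float], z_threshold: float = 2.5) -> List[int]:
    """Return seeds whose score lies more than z_threshold std devs from the mean."""
    values = list(scores.values())
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return []  # all runs identical, nothing to flag
    return [seed for seed, s in scores.items()
            if abs(s - mean) / spread > z_threshold]


# Hypothetical downstream accuracy for ten seeds of one model size.
scores = {0: 0.50, 1: 0.51, 2: 0.49, 3: 0.50, 4: 0.52,
          5: 0.50, 6: 0.49, 7: 0.51, 8: 0.50, 9: 0.30}
outliers = flag_outlier_seeds(scores)
print(outliers)
```

A single extreme run inflates the standard deviation it is measured against, so a robust statistic such as the median absolute deviation may be a better choice with this few seeds.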

co eleutherai, computational linguistic, huggingface

2503.09543

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Hawaii (0.14)
North America > Canada (0.14)

Genre: Research Report > New Finding (0.68)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Tabarsi, Benyamin, Reichert, Heidi, Limke, Ally, Kuttal, Sandeep, Barnes, Tiffany

LLMs' Reshaping of People, Processes, Products, and Society in Software Development: A Comprehensive Exploration with Early Adopters

arXiv.org Artificial Intelligence | Mar-6-2025

Large language models (LLMs) like OpenAI ChatGPT, Google Gemini, and GitHub Copilot are rapidly gaining traction in the software industry, but their full impact on software engineering remains insufficiently explored. Despite their growing adoption, there is a notable lack of formal, qualitative assessments of how LLMs are applied in real-world software development contexts. To fill this gap, we conducted semi-structured interviews with sixteen early-adopter professional developers to explore their use of LLMs throughout various stages of the software development life cycle. Our investigation examines four dimensions: people (how LLMs affect individual developers and teams); process (how LLMs alter software engineering workflows); product (LLM impact on software quality and innovation); and society (the broader socioeconomic and ethical implications of LLM adoption). Thematic analysis of our data reveals that while LLMs have not fundamentally revolutionized the development process, they have substantially enhanced routine coding tasks, including code generation, refactoring, and debugging. Developers reported the most effective outcomes when providing LLMs with clear, well-defined problem statements, indicating that LLMs excel with decomposed problems and specific requirements. Furthermore, these early adopters identified that LLMs offer significant value for personal and professional development, aiding in learning new languages and concepts. The early adopters, highly skilled in software engineering and knowledgeable about how LLMs work, identified early and persisting challenges for software engineering, such as inaccuracies in generated content and the need for careful manual review before integrating LLM outputs into production environments. Our study provides a nuanced understanding of how LLMs are shaping the landscape of software development, with their benefits, limitations, and ongoing implications.

chatgpt, participant, survey article

2503.05012

Country: North America > United States > North Carolina (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Education > Educational Setting > Higher Education (0.87)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.49)

Ghanizadeh, Mohammad Amin, Dousti, Mohammad Javad

Towards Data-Efficient Language Models: A Child-Inspired Approach to Language Learning

arXiv.org Artificial Intelligence | Mar-6-2025

In this work, we explain our approach to the BabyLM Challenge, which uses various methods of training language models (LMs) with significantly less data than traditional large language models (LLMs) and is inspired by how human children learn. While a human child is exposed to far less linguistic input than an LLM, they still achieve remarkable language understanding and generation abilities. To this end, we develop a model trained on a curated dataset consisting of 10 million words, primarily sourced from child-directed transcripts. The 2024 BabyLM Challenge initial dataset of 10M words is filtered to 8.5M. Next, it is supplemented with a randomly selected subset of the TVR dataset consisting of 1.5M words of television dialogues. The latter dataset ensures that, similar to children, the model is also exposed to language through media. Furthermore, we reduce the vocabulary size to 32,000 tokens, aligning it with the limited vocabulary of children in the early stages of language acquisition. We use curriculum learning and are able to match the baseline on certain benchmarks while surpassing it on others. Additionally, incorporating common LLM training datasets, such as MADLAD-400, degrades performance. These findings underscore the importance of dataset selection, vocabulary scaling, and curriculum learning in creating more data-efficient language models that better mimic human learning processes.
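
The abstract does not say how the curriculum orders the data, so the sketch below only shows the general shape of curriculum learning: rank examples by a difficulty proxy and feed them to training in stages from easy to hard. The token-count proxy, the function names, and the example sentences are all assumptions.

```python
from typing import Iterator, List


def difficulty(sentence: str) -> int:
    """Hypothetical difficulty proxy: number of whitespace-separated tokens."""
    return len(sentence.split())


def curriculum_stages(corpus: List[str], n_stages: int) -> Iterator[List[str]]:
    """Yield the corpus in n_stages chunks, ordered from easiest to hardest."""
    ordered = sorted(corpus, key=difficulty)
    if not ordered:
        return
    size = -(-len(ordered) // n_stages)  # ceiling division
    for start in range(0, len(ordered), size):
        yield ordered[start:start + size]


# Hypothetical child-directed utterances.
corpus = [
    "look at the big red ball over there",
    "hi",
    "where is the dog",
    "good job",
]
stages = list(curriculum_stages(corpus, n_stages=2))
for i, stage in enumerate(stages):
    print(i, stage)
```

A training loop would then iterate over `stages` in order, optionally mixing earlier stages back in so that easy examples are not forgotten.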

curriculum, dataset, language model

2503.04611

Country: Asia > Middle East > Iran (0.15)

Genre: Research Report > New Finding (1.00)

Industry: Education > Curriculum > Subject-Specific Education (0.41)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)