
Collaborating Authors: Roberts


the MCMC perspective, we could treat these (already learned) models as proposals for the approximate MH-algorithm

Neural Information Processing Systems

We thank all of the reviewers for their valuable feedback and detailed comments. That is, "improvement and justification of any implicit sampler". We know that in practice, even state-of-the-art generative models yield "unrealistic" samples and are hence biased (Algorithm 3). Based on our theoretical analysis, we derive different losses for the discriminator (Table 1 in the paper).
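The snippet above proposes treating already-learned generative models as proposals for an approximate Metropolis-Hastings algorithm. A minimal sketch of that idea is an independence MH sampler whose proposal is the pretrained model; here the "model" is a stand-in Gaussian with a known density (in the implicit-sampler setting of the paper, the density ratio would instead come from a trained discriminator), and the target is a hypothetical unnormalized density:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins (hypothetical): a "learned" Gaussian proposal q and a target p.
def propose():            # draw from the pretrained generative model q
    return rng.normal(0.0, 1.5)

def q_pdf(x):             # proposal density (known here; implicit in practice)
    return np.exp(-x**2 / (2 * 1.5**2)) / (1.5 * np.sqrt(2 * np.pi))

def p_pdf(x):             # unnormalized target density (standard normal)
    return np.exp(-x**2 / 2)

def mh_with_model_proposal(n_steps=5000):
    x = propose()
    samples = []
    for _ in range(n_steps):
        x_new = propose()
        # Independence-sampler acceptance ratio min(1, p(x')q(x) / (p(x)q(x'))).
        # With an implicit model, the ratio p/q would be estimated by a
        # discriminator rather than evaluated in closed form.
        a = min(1.0, (p_pdf(x_new) * q_pdf(x)) / (p_pdf(x) * q_pdf(x_new)))
        if rng.uniform() < a:
            x = x_new
        samples.append(x)
    return np.array(samples)

samples = mh_with_model_proposal()
```

Because the acceptance step corrects for the mismatch between proposal and target, even a biased generative model yields asymptotically unbiased samples, which is the point of the rebuttal.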


Bilevel Learning via Inexact Stochastic Gradient Descent

Salehi, Mohammad Sadegh, Mukherjee, Subhadip, Roberts, Lindon, Ehrhardt, Matthias J.

arXiv.org Artificial Intelligence

Bilevel optimization is a central tool in machine learning for high-dimensional hyperparameter tuning. Its applications are vast; for instance, in imaging it can be used for learning data-adaptive regularizers and optimizing forward operators in variational regularization. These problems are large in many ways: a lot of data is usually available to train a large number of parameters, calling for stochastic gradient-based algorithms. However, exact gradients with respect to parameters (so-called hypergradients) are not available, and their precision is usually linearly related to computational cost. Hence, algorithms must solve the problem efficiently without unnecessary precision. The design of such methods is still not fully understood, especially regarding how accuracy requirements and step size schedules affect theoretical guarantees and practical performance. Existing approaches introduce stochasticity at both the upper level (e.g., in sampling or mini-batch estimates) and the lower level (e.g., in solving the inner problem) to improve generalization, but they typically fix the number of lower-level iterations, which conflicts with asymptotic convergence assumptions. In this work, we advance the theory of inexact stochastic bilevel optimization. We prove convergence and establish rates under decaying accuracy and step size schedules, showing that with optimal configurations convergence occurs at an $\mathcal{O}(k^{-1/4})$ rate in expectation. Experiments on image denoising and inpainting with convex ridge regularizers and input-convex networks confirm our analysis: decreasing step sizes improve stability, accuracy scheduling is more critical than step size strategy, and adaptive preconditioning (e.g., Adam) further boosts performance. These results bridge theory and practice, providing convergence guarantees and practical guidance for large-scale imaging problems.
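The scheme described above, inexact hypergradients computed from an approximate lower-level solve, combined with decaying step sizes and an increasing-accuracy schedule, can be sketched on a toy 1-D denoising problem with a ridge weight theta. All problem data here are hypothetical, and the schedules are illustrative choices rather than the paper's tuned ones:

```python
import numpy as np

d, y_target = 2.0, 1.0   # hypothetical data: noisy observation and desired output

def inner_solve(theta, n_iters):
    """Approximately minimize g(y) = 0.5*(y - d)**2 + 0.5*theta*y**2 by
    gradient descent; a conservative step keeps the solve genuinely inexact."""
    y = 0.0
    step = 0.5 / (1.0 + theta)
    for _ in range(n_iters):
        y -= step * ((1.0 + theta) * y - d)
    return y

def hypergradient(theta, n_iters):
    """Inexact hypergradient of f(theta) = 0.5*(y*(theta) - y_target)**2.
    By implicit differentiation, dy*/dtheta = -y*/(1 + theta); inexactness
    enters because y*(theta) is replaced by the approximate inner solution."""
    y = inner_solve(theta, n_iters)
    return (y - y_target) * (-y / (1.0 + theta))

theta = 0.5
for k in range(1, 201):
    alpha_k = 0.5 / np.sqrt(k)        # decaying upper-level step size
    n_k = 5 + k // 10                 # accuracy schedule: tighter inner solves later
    theta -= alpha_k * hypergradient(theta, n_k)

# Closed form: y*(theta) = d / (1 + theta), so the optimum is theta = 1,
# where y*(1) = y_target; the iterate should approach it.
```

Tightening the inner solve over time mirrors the decaying-accuracy schedules analyzed in the paper, in contrast to the fixed lower-level iteration counts the abstract criticizes.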


The GOP Civil War Over Nick Fuentes Has Just Begun

WIRED

Tucker Carlson's friendly interview with white nationalist Nick Fuentes has led to a major reckoning in the Republican party. Nick Fuentes, a white nationalist known for his deeply antisemitic, racist, and misogynist worldview, just might be tearing the Republican party apart. The schism was triggered last Tuesday when former Fox News host Tucker Carlson released an in-depth interview with Fuentes, the leader of the so-called America First movement who has denied the Holocaust, praised Hitler, and shared deeply misogynistic views. During the interview, Fuentes waxed antisemitic about the threat apparently posed by "organized Jewry" in America, while Carlson slammed figures like senator Ted Cruz and former president George W. Bush as being "Christian Zionists" who have been "seized by this brain virus." Carlson was criticized by, among others, US Ambassador to Israel Mike Huckabee for giving Fuentes a platform, and the argument kicked into overdrive after Kevin Roberts, president of ...


US Investment in Spyware Is Skyrocketing

WIRED

A new report warns that the number of US investors in powerful commercial spyware rose sharply in 2024 and names new countries linked to the dangerous technology. The United States has emerged as the largest investor in commercial spyware, a global industry that has enabled the covert surveillance of journalists, human rights defenders, politicians, diplomats, and others, posing grave threats to human rights and national security. In 2024, 20 new US-based spyware investors were identified, bringing the total number of American backers of this technology to 31. This growth has largely outpaced other major investing countries such as Israel, Italy, and the United Kingdom, according to a new report published today by the Atlantic Council. The study surveyed 561 entities across 46 countries between 1992 and 2024, identifying 34 new investors.


Comparing representations of long clinical texts for the task of patient note-identification

Alsaidi, Safa, Vincent, Marc, Boyer, Olivia, Garcelon, Nicolas, Couceiro, Miguel, Coulet, Adrien

arXiv.org Artificial Intelligence

In this paper, we address the challenge of patient-note identification, which involves accurately matching an anonymized clinical note to its corresponding patient, represented by a set of related notes. This task has broad applications, including duplicate records detection and patient similarity analysis, which require robust patient-level representations. We explore various embedding methods, including Hierarchical Attention Networks (HAN), three-level Hierarchical Transformer Networks (HTN), LongFormer, and advanced BERT-based models, focusing on their ability to process medium-to-long clinical texts effectively. Additionally, we evaluate different pooling strategies (mean, max, and mean_max) for aggregating word-level embeddings into patient-level representations, and we examine the impact of sliding windows on model performance. Our results indicate that BERT-based embeddings outperform traditional and hierarchical models, particularly in processing lengthy clinical notes and capturing nuanced patient representations. Among the pooling strategies, mean_max pooling consistently yields the best results, highlighting its ability to capture critical features from clinical notes. Furthermore, the reproduction of our results on both the MIMIC dataset and the Necker hospital data warehouse illustrates the generalizability of these approaches to real-world applications, emphasizing the importance of both embedding methods and aggregation strategies in optimizing patient-note identification and enhancing patient-level modeling.
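The three pooling strategies the abstract compares (mean, max, and mean_max) can be sketched as follows; the token embeddings here are random placeholders standing in for the word-level output of any of the encoders mentioned (HAN, HTN, LongFormer, or BERT variants):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical word-level embeddings for one clinical note: (n_tokens, dim).
token_embeddings = rng.normal(size=(128, 8))

def pool(emb, strategy="mean_max"):
    """Aggregate token-level embeddings into a single note-level vector."""
    if strategy == "mean":
        return emb.mean(axis=0)                                     # (dim,)
    if strategy == "max":
        return emb.max(axis=0)                                      # (dim,)
    if strategy == "mean_max":
        # Concatenate mean and max pooling, doubling the dimension.
        return np.concatenate([emb.mean(axis=0), emb.max(axis=0)])  # (2*dim,)
    raise ValueError(f"unknown strategy: {strategy}")

note_vec = pool(token_embeddings)   # shape (16,) for dim=8

# A patient-level representation can then aggregate that patient's note
# vectors in the same way; which combination works best is exactly what
# the paper evaluates.
```

mean_max pooling keeps both the average signal and the strongest per-dimension activations, which is consistent with the abstract's finding that it captures critical features best.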


A Survey of QUD Models for Discourse Processing

Fu, Yingxue

arXiv.org Artificial Intelligence

Question Under Discussion (QUD), originally a linguistic analytic framework, has gained increasing attention in the natural language processing community over the years. Various models have been proposed for implementing QUD in discourse processing. This survey summarizes these models, with a focus on their application to written texts, and examines studies that explore the relationship between QUD and mainstream discourse frameworks, including RST, PDTB, and SDRT. Some questions that may require further study are suggested.


Billion-dollar video game: is this the most expensive piece of entertainment ever made?

The Guardian

How much does it cost to make a video game? The development expenses of blockbuster games are closely guarded business secrets, but they have been climbing ever higher over the years towards big Hollywood-style spending. Industry leaks have exposed how the budgets of major video games are spiralling upwards: $100m, $200m, even more. One of the bestselling franchises, Call of Duty, saw costs balloon to $700m (£573m), a number only revealed recently when a reporter dug into court filings. There is, however, one game with a budget that is anything but secret.


OpenAI Messed With the Wrong Mega-Popular Parenting Forum

WIRED

Think of any topic imaginable that is even vaguely related to raising kids, and there's probably a post about it on Mumsnet, the long-running, enormously popular, controversy-spurring UK-based parenting forum for mothers. Over its more than two-decade-long history, Mumsnet has amassed an archive of more than six billion words written by its highly engaged user base, on topics such as dirty diapers and lazy husbands. This spring, after Mumsnet discovered that AI companies were scraping its data, the company says it decided to try to strike licensing deals with some of the major players in the space, including OpenAI, which initially expressed willingness to explore an arrangement after Mumsnet first reached out. After talks with OpenAI fell apart, Mumsnet in July announced its intention to pursue legal action. According to Mumsnet, during those early conversations, an OpenAI strategic partnership lead told the company that datasets over 1 billion words were of interest to the AI giant.


The IRS Finally Has an Answer to TurboTax

The Atlantic - Technology

During the torture ritual that was doing my taxes this year, I was surprised to find myself giddy after reading these words: "You are now chatting with IRS Representative-1004671045." I had gotten stuck trying to parse my W-2, which, under "Box 14: Other," contained a mysterious $389.70 deduction from my overall pay last year. I tapped the chat button on my tax software for help, expecting to be sucked into customer-service hell. Instead, a real IRS employee answered my question in less than two minutes. The program is not TurboTax, or any one of its many competitors that will give you the white-glove treatment only after you pony up. It is Direct File, a new pilot program made by the IRS.