 Ontario


Deep Neural Networks for Doubly Robust Estimation with Nonprobability Survey Samples

arXiv.org Machine Learning

Integrating probability and nonprobability survey samples is an important problem in modern survey sampling. Nonprobability samples often contain rich outcome information but may lack population representativeness, whereas probability samples provide design-based auxiliary information but may not contain the study variable. We propose a deep neural network (DNN)-assisted doubly robust framework for estimating the finite population mean from these two data sources. The proposed method models the logit sampling score for the nonprobability sample as an unknown nonparametric function and estimates it by maximizing a pseudo-likelihood that combines information from the nonprobability sample and a reference probability sample. The DNN parameters are optimized using the ADAM algorithm. The resulting DNN-estimated sampling scores are incorporated into a DNN-assisted inverse-probability weighted estimator and a deep doubly robust estimator. We establish consistency and convergence rates under regularity conditions and evaluate the finite-sample performance of the proposed estimators through simulation studies and an empirical application using Pew Research Center and Behavioral Risk Factor Surveillance System data. The results suggest that the proposed estimators can improve robustness to parametric propensity-score misspecification, especially when the true selection mechanism is nonlinear.
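The abstract above does not give the estimators' formulas, so the following is only a minimal sketch of the standard inverse-probability weighted and doubly robust forms commonly used for combining a nonprobability sample with a reference probability sample. It assumes the population size is known and that the sampling scores and outcome predictions have already been estimated (in the paper, by a DNN). The function and argument names (`ipw_mean`, `doubly_robust_mean`, `pi_a`, `m_a`, `m_b`, `d_b`, `n_pop`) are hypothetical and not taken from the paper.

```python
from typing import Sequence


def ipw_mean(y_a: Sequence[float], pi_a: Sequence[float], n_pop: float) -> float:
    """Inverse-probability weighted estimate of the population mean.

    y_a:   outcomes observed in the nonprobability sample
    pi_a:  estimated sampling scores for those units (assumed already fitted)
    n_pop: population size (assumed known in this sketch)
    """
    return sum(y / p for y, p in zip(y_a, pi_a)) / n_pop


def doubly_robust_mean(
    y_a: Sequence[float],
    m_a: Sequence[float],
    pi_a: Sequence[float],
    m_b: Sequence[float],
    d_b: Sequence[float],
    n_pop: float,
) -> float:
    """Textbook doubly robust estimate of the population mean.

    m_a: outcome-model predictions for the nonprobability sample
    m_b: outcome-model predictions for the probability sample
    d_b: design weights of the probability sample
    """
    # Weighted residuals from the nonprobability sample correct the outcome model.
    residual_term = sum((y - m) / p for y, m, p in zip(y_a, m_a, pi_a))
    # Design-weighted predictions from the probability sample estimate the total.
    prediction_term = sum(d * m for d, m in zip(d_b, m_b))
    return (residual_term + prediction_term) / n_pop
```

In this form, setting every outcome prediction to zero reduces the doubly robust estimate to the inverse-probability weighted one, while a perfect outcome model makes the residual term vanish. That is one way to see why the estimate can remain reasonable if either of the two models is adequate.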


Causal Risk Minimization for High-Dimensional Treatments

arXiv.org Machine Learning

Predicting the effect of interventions with many possible variations, e.g., therapeutic content that affects mental health outcomes or an earnings call transcript that drives movement in share price, is useful across several domains. However, classical causal estimators tend to assume that all possible interventions are observed, which is infeasible when interventions vary widely, for instance, in the space of all text strings. We adapt the well-known approach of recasting causal inference as a learning problem to address high-dimensional treatment spaces. Specifically, under standard assumptions like no unobserved confounding, we show that causal error decomposes into a series of moment-balancing errors of increasing order, and design objectives that directly improve causal estimation. We also show how to project the effect of a high-dimensional treatment onto lower-dimensional treatment attributes, which allows a single model to answer several causal questions without additional attribute-specific training. We empirically evaluate our estimators in settings with high-dimensional continuous, discrete, and text treatments, the last of which uses a semi-synthetic dataset of Amazon Reviews. Our experiments demonstrate the benefit of higher-order balance error optimization and the competitive performance of projected causal estimates relative to attribute-specific estimators.


Paul Anka tells Bill Maher crime has gone 'through the roof' in Canada amid recent immigration

FOX News

Paul Anka says Toronto's crime rate has spiked amid the arrival of over 400,000 new immigrants, telling Bill Maher that Canada was homogeneous until recently.


On London's streets, facial recognition tests the balance between security and liberty

The Japan Times

London - Tourists, shoppers and office workers on a busy London street on an ordinary weekday found themselves part of a digital identity check as live facial recognition cameras scanned faces against a police watchlist. The operation was an example of a technology the Metropolitan Police say is transforming policing, helping officers arrest around 2,500 wanted people since the start of 2024, including suspects accused of violent and sexual offences. Critics, however, say live facial recognition undermines the presumption of innocence underpinning British law by treating every passerby as a potential suspect.


SurvivalPFN: Amortizing Survival Prediction via In-Context Bayesian Inference

arXiv.org Machine Learning

Survival analysis provides a powerful statistical framework for modeling time-to-event outcomes in the presence of censoring. However, selecting an appropriate estimator from the many specialized survival approaches often requires substantial methodological and domain expertise. We introduce SurvivalPFN, a prior-data fitted network that amortizes Bayesian inference for censored observations through in-context learning. SurvivalPFN is pretrained on a diverse family of synthetic, identifiable, and right-censored data-generating processes, enabling it to amortize survival analysis in a single forward pass during inference. As a result, the model adapts to the effective complexity of each dataset without task-specific training or hyperparameter tuning, avoids restrictive parametric assumptions, and produces calibrated survival distributions. In a large-scale benchmark spanning 61 datasets, 21 methods, and 5 evaluation metrics, SurvivalPFN achieves strong predictive performance and often improves upon established survival models. These results suggest that SurvivalPFN offers a principled and practical foundation model for survival analysis, with potential applications in high-impact domains such as healthcare, finance, and engineering (https://github.com/rgklab/SurvivalPFN).


I'm an exorcist... Here's the chilling encounter that made me believe UFOs are the work of Satan

Daily Mail - Science & tech

A Catholic priest and exorcist who has encountered demons first-hand has revealed a chilling experience that led him to believe UFOs are actually the work of the devil. Father Carlos Martins, an Ontario-born priest who has performed exorcisms around the world, believes the UFO phenomenon is part of a larger spiritual deception designed to undermine Christianity and cast doubt on the Bible. The priest said a longtime friend who later converted to Christianity once witnessed a gigantic spacecraft hovering silently over a suburban park before it shot away 'instantly to the speed of a bullet.'


The Elon Musk v Sam Altman battle is a distraction | Karen Hao

The Guardian

'If OpenAI lost its footing as the AI industry frontrunner, another barely distinguishable competitor - Musk's xAI or other - would simply replace it.' If it wasn't already clear, Elon Musk and Sam Altman hate each other. While the two men were once cofounders of OpenAI, they're now locked in a vicious feud, playing out in all its theatrics in front of a judge and jury in a California courtroom. Musk is suing, alleging that Altman and OpenAI president Greg Brockman tricked him into forming and funding the organization as a non-profit before they subsequently restructured it to have a for-profit entity.


When Can Digital Personas Reliably Approximate Human Survey Findings?

arXiv.org Machine Learning

Digital personas powered by Large Language Models (LLMs) are increasingly proposed as substitutes for human survey respondents, yet it remains unclear when they can reliably approximate human survey findings. We answer this question using the LISS panel, constructing personas from respondents' background variables and pre-2023 survey histories, then testing them against the same respondents' held-out post-cutoff answers. Across four persona architectures, three LLMs, and two prediction tasks, we assess performance at the question, respondent, distributional, equity, and clustering levels. Digital personas improve alignment with human response distributions, especially in domains tied to stable attributes and values, but remain limited for individual prediction and fail to recover multivariate respondent structure. Retrieval-augmented architectures provide the clearest gains, but performance depends more on human response structure than on model choice: personas perform best for low-variability questions and common respondent patterns, and worst for subjective, heterogeneous, or rare responses. Our results provide practical guidance on when digital personas could be appropriate for survey research and when human validation remains necessary.


Robust and Fast Training via Per-Sample Clipping

arXiv.org Machine Learning

We propose a robust gradient estimator based on per-sample gradient clipping and analyze its properties both theoretically and empirically. We show that the resulting method, per-sample clipped SGD (PS-Clip-SGD), achieves optimal in-expectation convergence rates for non-convex optimization problems under heavy-tailed gradient noise. Moreover, we establish high-probability convergence guarantees that match the in-expectation rates up to polylogarithmic factors in the failure probability. We complement our theoretical results with multiple numerical experiments. In particular, we demonstrate that PS-Clip-SGD outperforms both vanilla SGD with momentum and standard gradient clipping when training AlexNet on the CIFAR-100 dataset, even after accounting for the additional computational time caused by per-sample clipping. We also empirically show that, in the presence of gradient accumulation, applying clipping at the mini-batch level can improve training performance while incurring virtually no additional computational cost. This finding is particularly interesting, as it contradicts the common practice of applying clipping only after all accumulation steps have been completed.
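The difference between per-sample clipping and standard clipping of the averaged mini-batch gradient can be illustrated with a small sketch. This is a generic version of the two operations and is not the paper's exact PS-Clip-SGD algorithm: the helper names (`clip_to_norm`, `per_sample_clipped_mean`, `batch_clipped_mean`) and the threshold `c` are hypothetical, and gradients are represented as plain lists of floats.

```python
import math
from typing import List, Sequence


def clip_to_norm(grad: Sequence[float], c: float) -> List[float]:
    """Rescale grad so its Euclidean norm is at most c."""
    norm = math.sqrt(sum(v * v for v in grad))
    scale = min(1.0, c / norm) if norm > 0 else 1.0
    return [scale * v for v in grad]


def per_sample_clipped_mean(grads: Sequence[Sequence[float]], c: float) -> List[float]:
    """Clip each per-sample gradient first, then average the clipped gradients."""
    clipped = [clip_to_norm(g, c) for g in grads]
    n = len(clipped)
    return [sum(col) / n for col in zip(*clipped)]


def batch_clipped_mean(grads: Sequence[Sequence[float]], c: float) -> List[float]:
    """Average the per-sample gradients first, then clip the average."""
    n = len(grads)
    mean = [sum(col) / n for col in zip(*grads)]
    return clip_to_norm(mean, c)


# One large (heavy-tailed) gradient and one small gradient.
example_grads = [[3.0, 4.0], [0.0, 1.0]]
per_sample = per_sample_clipped_mean(example_grads, c=1.0)
batch_level = batch_clipped_mean(example_grads, c=1.0)
```

With per-sample clipping, the large gradient is shrunk before averaging, so the small gradient keeps its full contribution. With clipping after averaging, the large gradient still dominates the direction of the update.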