
Collaborating Authors: Wahba


Grace Wahba awarded the 2025 International Prize in Statistics

AIHub

The International Prize in Statistics Foundation has awarded Grace Wahba the 2025 prize for "her groundbreaking work on smoothing splines, which has transformed data analysis and machine learning". Professor Wahba was among the earliest to pioneer the use of nonparametric regression modeling. Recent advances in computing and the availability of large data sets have further popularized these models, especially under the guise of machine learning algorithms such as gradient boosting and neural networks. Nevertheless, the use of smoothing splines remains a mainstay of nonparametric regression. In seminal research that began in the early 1970s, Wahba developed theoretical foundations and computational algorithms for fitting smoothing splines to noisy data.
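As a minimal illustration of the idea (a modern library sketch, not Wahba's original algorithms), a smoothing spline can be fit to noisy data with SciPy's `UnivariateSpline`, where the smoothing parameter `s` controls the fidelity-roughness tradeoff:

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(0)
x = np.linspace(0, 2 * np.pi, 50)
y = np.sin(x) + rng.normal(scale=0.2, size=x.size)  # noisy observations

# s bounds the residual sum of squares; s=0 would interpolate exactly,
# larger s gives a smoother curve. Here s is set near n * noise_variance.
spline = UnivariateSpline(x, y, s=x.size * 0.2**2)
y_hat = spline(x)
```

Choosing `s` well is exactly the model-selection problem that Wahba's generalized cross-validation (GCV) was designed to solve automatically.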


Special Unitary Parameterized Estimators of Rotation

Chandrasekhar, Akshay

arXiv.org Artificial Intelligence

This paper explores rotation estimation from the perspective of special unitary matrices. First, multiple solutions to Wahba's problem are derived through special unitary matrices, providing linear constraints on quaternion rotation parameters. Next, from these constraints, closed-form solutions to the problem are presented for minimal cases. Finally, motivated by these results, we investigate new representations for learning rotations in neural networks. Numerous experiments validate the proposed methods.
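Wahba's problem itself, finding the rotation that best aligns two sets of unit vectors, has a classical closed-form solution via the SVD; the quaternion and special unitary formulations studied in the paper are alternative routes to the same answer. A sketch of the SVD method (the function name `wahba_svd` is illustrative):

```python
import numpy as np

def wahba_svd(body, ref, weights=None):
    """Solve Wahba's problem: find the rotation R minimizing the weighted
    sum of squared residuals ||ref_i - R @ body_i||^2, via the SVD."""
    if weights is None:
        weights = np.ones(len(body))
    # Attitude profile matrix B = sum_i w_i * ref_i @ body_i^T
    B = sum(w * np.outer(r, b) for w, b, r in zip(weights, body, ref))
    U, _, Vt = np.linalg.svd(B)
    # Force det(R) = +1 so the result is a proper rotation, not a reflection.
    M = np.diag([1.0, 1.0, np.linalg.det(U) * np.linalg.det(Vt)])
    return U @ M @ Vt

# Recover a known rotation from noiseless vector observations.
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
body = [np.array([1.0, 0.0, 0.0]),
        np.array([0.0, 1.0, 0.0]),
        np.array([0.0, 0.0, 1.0])]
ref = [R_true @ b for b in body]
R_est = wahba_svd(body, ref)
```

With noiseless observations the true rotation is recovered exactly; with noisy vectors the same code returns the least-squares optimal attitude.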


Cities could face 100 million 'new poor' in post-pandemic world

The Japan Times

BOGOTA – About 100 million people living in cities worldwide will likely fall into poverty due to the coronavirus pandemic, urban experts said on Wednesday, calling for mapping tools to identify vulnerable communities and investment focusing on slums. Densely populated cities are at the front line of the contagious outbreak. People living in poverty with little or no running water, sewage systems or health care access have been hit especially hard, said experts at the World Bank, the World Resources Institute (WRI) and other groups studying urban issues. "Within cities we need to focus on those who need help the most; the poor and the vulnerable have been very seriously affected," said Sameh Wahba, global director for the World Bank's urban, disaster risk management, resilience and land global practice. "Our estimate is that there will be possibly upward of 100 million so-called new poor on account of losses of jobs and livelihoods and income," Wahba told a webinar with members of the media.


Optimal tuning for divide-and-conquer kernel ridge regression with massive data

Xu, Ganggang, Shang, Zuofeng, Cheng, Guang

arXiv.org Machine Learning

We propose the first data-driven tuning procedure for divide-and-conquer kernel ridge regression (Zhang et al., 2015). While the proposed criterion is computationally scalable for massive data sets, it is also shown to be asymptotically optimal under mild conditions. The effectiveness of our method is illustrated by extensive simulations and an application to the Million Song Dataset. Key words: distributed GCV, divide-and-conquer, kernel ridge regression, optimal tuning.
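The divide-and-conquer strategy being tuned can be sketched as follows: split the data into m shards, fit kernel ridge regression on each, and average the m predictors. This is a hedged illustration only (the function name `dc_krr_predict` and all parameter values are illustrative, and the paper's actual contribution, a distributed GCV criterion for choosing the ridge parameter, is not reproduced here):

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

def dc_krr_predict(X, y, X_test, n_splits=4, alpha=0.1):
    """Divide-and-conquer kernel ridge regression:
    fit KRR on each shard, then average the shard predictions."""
    idx_all = np.random.default_rng(0).permutation(len(X))
    preds = []
    for idx in np.array_split(idx_all, n_splits):
        model = KernelRidge(kernel="rbf", alpha=alpha).fit(X[idx], y[idx])
        preds.append(model.predict(X_test))
    return np.mean(preds, axis=0)

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(400, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=400)
X_test = np.linspace(-3, 3, 50)[:, None]
y_hat = dc_krr_predict(X, y, X_test)
```

Each shard's fit costs only O((n/m)^3) instead of O(n^3), which is what makes the approach attractive for massive data; the open question the paper addresses is how to pick `alpha` when no single machine sees all the data.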


Graph Matching for Shape Retrieval

Huet, Benoit, Cross, Andrew D. J., Hancock, Edwin R.

Neural Information Processing Systems



The Bias-Variance Tradeoff and the Randomized GACV

Wahba, Grace, Lin, Xiwu, Gao, Fangyu, Xiang, Dong, Klein, Ronald, Klein, Barbara

Neural Information Processing Systems

We propose a new in-sample cross-validation-based method (randomized GACV) for choosing smoothing or bandwidth parameters that govern the bias-variance or fit-complexity tradeoff in 'soft' classification. Soft classification refers to a learning procedure which estimates the probability that an example with a given attribute vector is in class 1 vs class 0. The target for optimizing the tradeoff is the Kullback-Leibler distance between the estimated probability distribution and the 'true' probability distribution, representing knowledge of an infinite population. The method uses a randomized estimate of the trace of a Hessian and mimics cross validation at the cost of a single relearning with perturbed outcome data.
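The "randomized estimate of the trace" in the abstract is, in spirit, a Hutchinson-type estimator: for a random probe vector eps with independent +/-1 entries, E[eps^T A eps] = tr(A), so a handful of matrix-vector products yield an unbiased trace estimate without forming A. A sketch on an explicit matrix (the GACV application applies this to an implicitly defined Hessian via one relearning with perturbed outcomes, which is not reproduced here):

```python
import numpy as np

def hutchinson_trace(matvec, dim, n_samples=200, rng=None):
    """Estimate tr(A) using only matrix-vector products v -> A @ v,
    with Rademacher (+/-1) probe vectors: E[eps^T A eps] = tr(A)."""
    if rng is None:
        rng = np.random.default_rng(0)
    total = 0.0
    for _ in range(n_samples):
        eps = rng.choice([-1.0, 1.0], size=dim)
        total += eps @ matvec(eps)
    return total / n_samples

# Check against a matrix whose trace we can compute directly.
rng = np.random.default_rng(1)
A = rng.normal(size=(30, 30))
A = A @ A.T  # symmetric positive semidefinite
est = hutchinson_trace(lambda v: A @ v, 30)
```

The payoff is that each probe costs one matrix-vector product, so a trace that would require n separate solves to compute exactly can be approximated with a few randomized ones.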

