 good solution





Scientists prepare for the next Carrington Event

Popular Science

'Should such an event occur, there are no good solutions.' Governmental disaster preparedness isn't limited to crises that originate here on Earth. In fact, experts know that some of the most disruptive and unpredictable occurrences begin on the surface of the sun.



Extended Factorization Machine Annealing for Rapid Discovery of Transparent Conducting Materials

arXiv.org Artificial Intelligence

The development of novel transparent conducting materials (TCMs) is essential for enhancing the performance and reducing the cost of next-generation devices such as solar cells and displays. In this research, we focus on the (Al$_x$Ga$_y$In$_z$)$_2$O$_3$ system and extend the FMA framework, which combines a Factorization Machine (FM) and annealing, to search for optimal compositions and crystal structures with high accuracy and low cost. The proposed method introduces (i) the binarization of continuous variables, (ii) the utilization of good solutions using a Hopfield network, (iii) the activation of global search through adaptive random flips, and (iv) fine-tuning via a bit-string local search. Validation using the (Al$_x$Ga$_y$In$_z$)$_2$O$_3$ data from the Kaggle "Nomad2018 Predicting Transparent Conductors" competition demonstrated that our method achieves faster and more accurate searches than Bayesian optimization and genetic algorithms. Furthermore, its application to multi-objective optimization showed its capability in designing materials by simultaneously considering both the band gap and formation energy. These results suggest that applying our method to larger, more complex search problems and diverse material designs that reflect realistic experimental conditions is expected to contribute to the further advancement of materials informatics.


Average Sensitivity of Euclidean k-Clustering

Neural Information Processing Systems

Given a set of $n$ points in $\mathbb{R}^d$, the goal of Euclidean $(k,\ell)$-clustering is to find $k$ centers that minimize the sum of the $\ell$-th powers of the Euclidean distance of each point to the closest center. In practical situations, the clustering result must be stable against points missing in the input data so that we can make trustworthy and consistent decisions. To address this issue, we consider the average sensitivity of Euclidean $(k,\ell)$-clustering, which measures the stability of the output in total variation distance against deleting a random point from the input data. We first show that a popular algorithm, $k$-means++, and its variant called $D^\ell$-sampling have low average sensitivity. Next, we show that any approximation algorithm for Euclidean $(k,\ell)$-clustering can be transformed to an algorithm with low average sensitivity while almost preserving the approximation guarantee.
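
As a rough illustration of the seeding rule this abstract refers to, here is a minimal sketch of distance-power sampling: each new center is drawn with probability proportional to the $\ell$-th power of its distance to the nearest center chosen so far, and $\ell = 2$ corresponds to the usual $k$-means++ seeding. The function names `d_ell_sampling` and `clustering_cost` and the fixed seed are assumptions made for this example, not taken from the paper, and the sketch does not reproduce the paper's sensitivity analysis.

```python
import math
import random


def d_ell_sampling(points, k, ell=2, rng=None):
    """Pick k initial centers from `points` (a list of coordinate tuples).

    Illustrative sketch only: the first center is uniform at random, and each
    later center is sampled with probability proportional to
    dist(point, nearest chosen center) ** ell.
    """
    rng = rng or random.Random(0)  # fixed seed is an assumption, for repeatability
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(math.dist(p, c) for c in centers) ** ell for p in points]
        if sum(weights) == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def clustering_cost(points, centers, ell=2):
    """Sum of ell-th powers of each point's distance to its closest center."""
    return sum(min(math.dist(p, c) for c in centers) ** ell for p in points)


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.1, 0.0), (10.0, 10.0), (10.1, 10.0)]
    seeds = d_ell_sampling(data, k=2)
    print(seeds, clustering_cost(data, seeds))
```

Points far from every existing center carry most of the sampling weight, which is why this seeding tends to spread centers across well-separated groups.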


Genetic algorithm: Learning from nature to solve complex optimization problems.

#artificialintelligence

It's a method for solving both constrained and unconstrained optimization problems based on a natural selection process that mimics biological evolution. I know, it's even worse, but keep reading. Natural selection is the process by which individual organisms with favorable traits are more likely to survive and reproduce, said Charles Darwin. Also expressed as "the survival of the fittest", it means that if you suit the conditions and environment you live in, you're more likely to survive and reproduce, so your traits can be passed on to the next generations. To sum up: we keep individuals with particular traits that make them good for a particular task and get rid of the bad ones.
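
To make the "keep the good ones, drop the bad ones" idea concrete, here is a minimal sketch of a genetic algorithm on bit strings. The function name `genetic_algorithm`, the parameter defaults, and the choice of truncation selection with one-point crossover are assumptions picked for brevity; real implementations vary widely.

```python
import random


def genetic_algorithm(fitness, n_bits, pop_size=30, generations=50,
                      mutation_rate=0.02, seed=0):
    """Maximize `fitness` over bit lists of length n_bits (n_bits >= 2)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Selection: keep the fitter half of the population as parents.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: splice the two parents at a random cut point.
            cut = rng.randrange(1, n_bits)
            child = a[:cut] + b[cut:]
            # Mutation: flip each bit with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = parents + children
        best = max(population + [best], key=fitness)
    return best


if __name__ == "__main__":
    # Toy task: find the bit string with the most ones.
    winner = genetic_algorithm(sum, n_bits=16)
    print(winner, sum(winner))
```

Because the parents survive unchanged into the next generation, the best fitness seen never decreases from one generation to the next.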


Tackling Morpion Solitaire with AlphaZero-like Ranked Reward Reinforcement Learning

arXiv.org Artificial Intelligence

Morpion Solitaire is a popular single-player game, played with paper and pencil. Due to its large state space (on the order of the game of Go), traditional search algorithms, such as MCTS, have not been able to find good solutions. A later algorithm, Nested Rollout Policy Adaptation, was able to find a new record of 82 steps, albeit with large computational resources. After achieving this record, to the best of our knowledge, there has been no further progress reported for about a decade. In this paper we take the recent impressive performance of deep self-learning reinforcement learning approaches from AlphaGo/AlphaZero as inspiration to design a searcher for Morpion Solitaire. A challenge of Morpion Solitaire is that the state space is sparse: there are few win/loss signals. Instead, we use an approach known as ranked reward to create a reinforcement learning self-play framework for Morpion Solitaire. This enables us to find medium-quality solutions with reasonable computational effort. Our record is a 67-step solution, which is very close to the human best (68), without any adaptation to the problem other than using ranked reward. We list many further avenues for potential improvement.
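
The abstract does not spell out how ranked reward works, so the following is only a sketch of the general idea as it is usually described: a single-player score is turned into a win/loss signal by comparing it with a percentile of the agent's own recent scores. The class name `RankedReward`, the buffer size, the percentile, and the random tie-break are all assumptions for illustration, not the authors' settings.

```python
import random
from collections import deque


class RankedReward:
    """Convert raw episode scores into +1 / -1 rewards (illustrative sketch)."""

    def __init__(self, buffer_size=250, percentile=0.75, seed=0):
        self.scores = deque(maxlen=buffer_size)  # recent episode scores
        self.percentile = percentile
        self.rng = random.Random(seed)

    def threshold(self):
        """Score at the configured percentile of the recent-score buffer."""
        ordered = sorted(self.scores)
        index = min(len(ordered) - 1, int(self.percentile * len(ordered)))
        return ordered[index]

    def reshape(self, score):
        """Return +1 if `score` beats the threshold, -1 if below, random on ties."""
        if not self.scores:
            self.scores.append(score)
            return 1
        cutoff = self.threshold()
        self.scores.append(score)
        if score > cutoff:
            return 1
        if score < cutoff:
            return -1
        return self.rng.choice([1, -1])


if __name__ == "__main__":
    ranker = RankedReward(buffer_size=10, percentile=0.5)
    for steps in [50, 52, 54, 56, 58]:
        ranker.reshape(steps)
    print(ranker.reshape(67))  # a 67-step game beats the recent median
```

The effect is that the agent is effectively playing against its own recent performance, which supplies the win/loss signal a single-player game otherwise lacks.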


High-Dimensional Robust Mean Estimation via Gradient Descent

arXiv.org Machine Learning

We study the problem of high-dimensional robust mean estimation in the presence of a constant fraction of adversarial outliers. A recent line of work has provided sophisticated polynomial-time algorithms for this problem with dimension-independent error guarantees for a range of natural distribution families. In this work, we show that a natural non-convex formulation of the problem can be solved directly by gradient descent. Our approach leverages a novel structural lemma, roughly showing that any approximate stationary point of our non-convex objective gives a near-optimal solution to the underlying robust estimation task. Our work establishes an intriguing connection between algorithmic high-dimensional robust statistics and non-convex optimization, which may have broader applications to other robust estimation tasks.