Media
Preventing Gradient Explosions in Gated Recurrent Units
A gated recurrent unit (GRU) is a successful recurrent neural network architecture for time-series data. The GRU is typically trained using a gradient-based method, which is subject to the exploding gradient problem in which the gradient increases significantly. This problem is caused by an abrupt change in the dynamics of the GRU due to a small variation in the parameters. In this paper, we find a condition under which the dynamics of the GRU changes drastically and propose a learning method to address the exploding gradient problem. Our method constrains the dynamics of the GRU so that it does not drastically change. We evaluated our method in experiments on language modeling and polyphonic music modeling. Our experiments showed that our method can prevent the exploding gradient problem and improve modeling accuracy.
Interactive Submodular Bandit
In many machine learning applications, submodular functions have been used as a model for evaluating the utility or payoff of a set such as news items to recommend, sensors to deploy in a terrain, nodes to influence in a social network, to name a few. At the heart of all these applications is the assumption that the underlying utility/payoff function is known a priori, hence maximizing it is in principle possible. In real life situations, however, the utility function is not fully known in advance and can only be estimated via interactions. For instance, whether a user likes a movie or not can be reliably evaluated only after it was shown to her. Or, the range of influence of a user in a social network can be estimated only after she is selected to advertise the product.
The Human Skill That Eludes AI
Why can't language models write well? I n a certain, strange way, generative AI peaked with OpenAI's GPT-2 seven years ago. Little known to anyone outside of tech circles, GPT-2 excelled at producing unexpected answers. "You could be like, 'Continue this story:,' and GPT-2 would be like, ','" Katy Gero, a poet and computer scientist who has been experimenting with language models since 2017, told me. "The models won't do that anymore." AI leaders boast about their models' superhuman technical abilities.
Blind Regression: Nonparametric Regression for Latent Variable Models via Collaborative Filtering
We introduce the framework of {\em blind regression} motivated by {\em matrix completion} for recommendation systems: given $m$ users, $n$ movies, and a subset of user-movie ratings, the goal is to predict the unobserved user-movie ratings given the data, i.e., to complete the partially observed matrix. Following the framework of non-parametric statistics, we posit that user $u$ and movie $i$ have features $x_1(u)$ and $x_2(i)$ respectively, and their corresponding rating $y(u,i)$ is a noisy measurement of $f(x_1(u), x_2(i))$ for some unknown function $f$. In contrast with classical regression, the features $x = (x_1(u), x_2(i))$ are not observed, making it challenging to apply standard regression methods to predict the unobserved ratings. Inspired by the classical Taylor's expansion for differentiable functions, we provide a prediction algorithm that is consistent for all Lipschitz functions. In fact, the analysis through our framework naturally leads to a variant of collaborative filtering, shedding insight into the widespread success of collaborative filtering in practice. Assuming each entry is sampled independently with probability at least $\max(m^{-1+\delta},n^{-1/2+\delta})$ with $\delta > 0$, we prove that the expected fraction of our estimates with error greater than $\epsilon$ is less than $\gamma^2 / \epsilon^2$ plus a polynomially decaying term, where $\gamma^2$ is the variance of the additive entry-wise noise term. Experiments with the MovieLens and Netflix datasets suggest that our algorithm provides principled improvements over basic collaborative filtering and is competitive with matrix factorization methods.
A photo of Iran's bombed schoolgirl graveyard went around the world. Was it real, or AI?
Graves being prepared for the victims of an airstrike on a school in Minab in southern Iran, 2 March 2026. Graves being prepared for the victims of an airstrike on a school in Minab in southern Iran, 2 March 2026. A photo of Iran's bombed schoolgirl graveyard went around the world. T he graves, freshly dug, lie in neat rows of 20 across. More than 60 have already been carved out of the earth, with a few clusters of people standing gathered around them.