Transfer of Deep Reactive Policies for MDP Planning
Aniket (Nick) Bajpai, Sankalp Garg, Mausam
Domain-independent probabilistic planners take as input an MDP description in a factored representation language such as PPDDL or RDDL, and exploit the specifics of the representation for faster planning. Traditional algorithms operate on each problem instance independently, and good methods for transferring experience from policies of other instances of a domain to a new instance do not exist. Recently, researchers have begun exploring the use of deep reactive policies, trained via deep reinforcement learning (RL), for MDP planning domains. One advantage of deep reactive policies is that they are more amenable to transfer learning. In this paper, we present the first domain-independent transfer algorithm for MDP planning domains expressed in an RDDL representation. Our architecture exploits the symbolic state configuration and transition function of the domain (available via RDDL) to learn a shared embedding space for states and state-action pairs across all problem instances of a domain. We then learn an RL agent in the embedding space, making near zero-shot transfer possible, i.e., without much training on the new instance, and without using the domain simulator at all. Experiments on three benchmark domains underscore the value of our transfer algorithm. Compared against planning from scratch and a state-of-the-art RL transfer algorithm, our transfer solution has significantly superior learning curves.
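The core idea is that a policy trained in an instance-independent embedding space can be reused on new instances of the same domain. The following is a minimal sketch of that idea only, not the authors' architecture (which builds the embedding from the RDDL state configuration and transition function); all module names, dimensions, and the mean-pooling choice are illustrative assumptions.

```python
# Sketch: a policy whose input is a fixed-size embedding shared across
# instances of one domain. Instance-specific states of varying size are
# pooled into a common space, so the policy head transfers across instances.
import torch
import torch.nn as nn

class SharedEmbedder(nn.Module):
    """Maps per-object state features (num_objects varies per instance)
    into one fixed-size embedding via mean pooling."""
    def __init__(self, obj_feat_dim: int, embed_dim: int):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(obj_feat_dim, embed_dim), nn.ReLU())

    def forward(self, obj_feats: torch.Tensor) -> torch.Tensor:
        # obj_feats: (num_objects, obj_feat_dim); pooling over objects
        # makes the output size independent of the instance size.
        return self.enc(obj_feats).mean(dim=0)

class ReactivePolicy(nn.Module):
    """Policy head trained in the shared embedding space; because its input
    dimension is instance-independent, the same weights can be applied to a
    new, larger instance without retraining."""
    def __init__(self, embed_dim: int, num_action_templates: int):
        super().__init__()
        self.embed = SharedEmbedder(obj_feat_dim=8, embed_dim=embed_dim)
        self.head = nn.Linear(embed_dim, num_action_templates)

    def forward(self, obj_feats: torch.Tensor) -> torch.Tensor:
        return torch.softmax(self.head(self.embed(obj_feats)), dim=-1)

policy = ReactivePolicy(embed_dim=32, num_action_templates=4)
probs_small = policy(torch.randn(10, 8))  # an instance with 10 objects
probs_large = policy(torch.randn(25, 8))  # a larger instance, same policy
```

Because the policy never sees instance-specific dimensions, applying it to an unseen instance requires no simulator interaction, which is the sense in which transfer can be near zero-shot.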
Hot methane seeps could support life beneath Antarctica's ice sheet
Microbes living beneath Antarctica's ice sheet may survive on methane generated by geothermal heat rising from deep below Earth's surface. The discovery could have implications for assessing the potential for life to survive on icy worlds beyond Earth. "These could be hotspots for microbes that are adapted to live in these areas," says Gavin Piccione at Brown University in Rhode Island. We already know that there is methane beneath Antarctica's ice sheet.
Brown University student angers non-faculty employees by asking 'what do you do all day,' faces punishment
Alex Shieh is a student at Brown University. He is making waves and facing charges for asking the school's non-faculty employees what they do all day. A sophomore at Brown University is facing the school's wrath after he sent a DOGE-like email to non-faculty employees asking them what they do all day to try to figure out why the elite school's tuition has gotten so expensive. "The inspiration for this is the rising cost of tuition," Alex Shieh told Fox News Digital in an interview. "Next year, it's set to be $93,064 to go to Brown," Shieh said of the Ivy League university.
Anthropic can now track the bizarre inner workings of a large language model
It's no secret that large language models work in mysterious ways. Few, if any, mass-market technologies have ever been so little understood. That makes figuring out what makes them tick one of the biggest open challenges in science. Shedding some light on how these models work would expose their weaknesses, revealing why they make stuff up and can be tricked into going off the rails. It would help resolve deep disputes about exactly what these models can and can't do.
Directional Pruning of Deep Neural Networks
In light of the fact that stochastic gradient descent (SGD) often finds a flat minimum valley in the training loss, we propose a novel directional pruning method which searches for a sparse minimizer in or close to that flat region. The proposed pruning method requires neither retraining nor expert knowledge of the sparsity level.
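As a toy stand-in for that idea only (this is not the paper's algorithm): one crude proxy for "directions in which the loss is flat" is the first-order sensitivity |w * dL/dw|, and a sketch can simply zero out the least sensitive weights.

```python
# Toy sketch: prune weights along directions where the loss surface is
# locally flat, approximated by first-order sensitivity |w * dL/dw|.
# Assumes `loss` was computed from a trained torch model.
import torch

def flat_direction_prune(model, loss, sparsity=0.5):
    loss.backward()
    # Per-weight sensitivity; small values ~ locally flat directions.
    sens = torch.cat([(p * p.grad).abs().flatten()
                      for p in model.parameters() if p.grad is not None])
    threshold = torch.quantile(sens, sparsity)
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                continue
            mask = (p * p.grad).abs() > threshold
            p.mul_(mask)  # zero the least sensitive weights in place
```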
Learning to Navigate Wikipedia by Taking Random Walks
Kenneth Marino, John Schultz
A fundamental ability of an intelligent web-based agent is seeking out and acquiring new information. Internet search engines reliably find the correct vicinity, but the top results may be a few links away from the desired target. A complementary approach is navigation via hyperlinks, employing a policy that comprehends local content and selects a link that moves it closer to the target. In this paper, we show that behavioral cloning of randomly sampled trajectories is sufficient to learn an effective link-selection policy. We demonstrate the approach on a graph version of Wikipedia with 38M nodes and 387M edges. The model is able to efficiently navigate between nodes 5 and 20 steps apart 96% and 92% of the time, respectively. We then use the resulting embeddings and policy in downstream fact-verification and question-answering tasks where, in combination with basic TF-IDF search and ranking methods, they achieve results competitive with state-of-the-art methods.
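A minimal sketch of the behavioral-cloning recipe, under assumptions of our own (random synthetic graph, small sizes, dot-product scorer; not the paper's model): sample random walks, treat each walk's endpoint as the navigation target, and train a policy to predict the walk's actual next hop from the current node and the target.

```python
# Behavioral cloning of random walks on a toy graph (illustrative only).
import random
import torch
import torch.nn as nn

num_nodes, dim = 100, 32
adj = {v: random.sample(range(num_nodes), 5) for v in range(num_nodes)}

emb = nn.Embedding(num_nodes, dim)
query = nn.Linear(2 * dim, dim)  # builds a query from (current, target)
opt = torch.optim.Adam(list(emb.parameters()) + list(query.parameters()), lr=1e-3)

for step in range(1000):
    # Sample a short random walk; its endpoint serves as the target.
    walk = [random.randrange(num_nodes)]
    for _ in range(5):
        walk.append(random.choice(adj[walk[-1]]))
    cur, target = walk[0], walk[-1]
    nbrs = adj[cur]
    label = nbrs.index(walk[1])  # clone the walk's actual next hop
    q = query(torch.cat([emb(torch.tensor(cur)), emb(torch.tensor(target))]))
    logits = emb(torch.tensor(nbrs)) @ q  # score each neighbor
    loss = nn.functional.cross_entropy(logits.unsqueeze(0),
                                       torch.tensor([label]))
    opt.zero_grad(); loss.backward(); opt.step()
```

The appeal of this recipe is that the training data is free: random walks need no human navigation traces, yet cloning them still teaches the scorer which neighbor lies in the direction of a given target.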
Learning Object Placement Programs for Indoor Scene Synthesis with Iterative Self Training
Chang, Adrian, Wang, Kai, Li, Yuanbo, Savva, Manolis, Chang, Angel X., Ritchie, Daniel
Data-driven, autoregressive indoor scene synthesis systems generate indoor scenes automatically by suggesting and then placing objects one at a time. Empirical observations show that current systems tend to produce incomplete next-object location distributions. We introduce a system which addresses this problem. We design a Domain Specific Language (DSL) that specifies functional constraints. Programs from our language take as input a partial scene and an object to place. Upon execution they predict possible object placements. We design a generative model which writes these programs automatically. Available 3D scene datasets do not contain programs to train on, so we build upon previous work in unsupervised program induction to introduce a new program bootstrapping algorithm. To quantify our empirical observations, we introduce a new evaluation procedure which captures how well a system models per-object location distributions. We ask human annotators to label all the possible places an object can go in a scene and show that our system produces per-object location distributions more consistent with human annotators. Our system also generates indoor scenes of comparable quality to previous systems, and while previous systems degrade in performance when training data is sparse, ours does not degrade to the same degree.
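To make the "program over functional constraints" idea concrete, here is a toy illustration (not the paper's DSL; the scene representation, grid discretisation, and constraint set are all invented for this sketch): a program is a conjunction of constraints, and executing it intersects their feasibility masks to yield possible placements.

```python
# Toy placement-DSL sketch: constraints map a partial scene to a boolean
# feasibility mask over a discretised floor plan; a program intersects them.
import numpy as np

GRID = 32  # candidate positions on a GRID x GRID floor plan

def near(scene, anchor_type, radius=4):
    """Feasible within `radius` cells of any object of `anchor_type`."""
    mask = np.zeros((GRID, GRID), dtype=bool)
    ys, xs = np.mgrid[0:GRID, 0:GRID]
    for obj in scene:
        if obj["type"] == anchor_type:
            mask |= (ys - obj["y"]) ** 2 + (xs - obj["x"]) ** 2 <= radius ** 2
    return mask

def against_wall(scene, margin=2):
    """Feasible only within `margin` cells of the room boundary."""
    mask = np.zeros((GRID, GRID), dtype=bool)
    mask[:margin, :] = True
    mask[-margin:, :] = True
    mask[:, :margin] = True
    mask[:, -margin:] = True
    return mask

def execute(program, scene):
    """Run a program (a list of constraints) on a partial scene, producing
    the predicted set of possible placements for the next object."""
    mask = np.ones((GRID, GRID), dtype=bool)
    for constraint in program:
        mask &= constraint(scene)
    return mask

scene = [{"type": "bed", "x": 16, "y": 16}]
nightstand_program = [lambda s: near(s, "bed"), against_wall]
placements = execute(nightstand_program, scene)  # boolean feasibility map
```

In the paper's setting, such programs are not hand-written: a generative model proposes them, and the bootstrapping loop self-trains it from scene data that contains no programs at all.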
Handwritten Text Recognition: A Survey
Garrido-Munoz, Carlos, Rios-Vila, Antonio, Calvo-Zaragoza, Jorge
Handwritten Text Recognition (HTR) has become an essential field within pattern recognition and machine learning, with applications spanning historical document preservation to modern data entry and accessibility solutions. The complexity of HTR lies in the high variability of handwriting, which makes it challenging to develop robust recognition systems. This survey examines the evolution of HTR models, tracing their progression from early heuristic-based approaches to contemporary state-of-the-art neural models, which leverage deep learning techniques. The scope of the field has also expanded, with models initially capable of recognizing only word-level content progressing to recent end-to-end document-level approaches. Our paper categorizes existing work into two primary levels of recognition: (1) up to line-level, encompassing word and line recognition, and (2) beyond line-level, addressing paragraph- and document-level challenges. We provide a unified framework that examines research methodologies, recent advances in benchmarking, key datasets in the field, and a discussion of the results reported in the literature. Finally, we identify pressing research challenges and outline promising future directions, aiming to equip researchers and practitioners with a roadmap for advancing the field.
Cross-Encoder Rediscovers a Semantic Variant of BM25
Lu, Meng, Chen, Catherine, Eickhoff, Carsten
Neural Ranking Models (NRMs) have rapidly advanced state-of-the-art performance on information retrieval tasks. In this work, we investigate a Cross-Encoder variant of MiniLM to determine which relevance features it computes and where they are stored. We find that it employs a semantic variant of the traditional BM25 in an interpretable manner, featuring localized components: (1) Transformer attention heads that compute soft term frequency while controlling for term saturation and document length effects, and (2) a low-rank component of its embedding matrix that encodes inverse document frequency information for the vocabulary. This suggests that the Cross-Encoder uses the same fundamental mechanisms as BM25, but further leverages their capacity to capture semantics for improved retrieval performance. This granular understanding lays the groundwork for model editing to enhance model transparency, address safety concerns, and improve scalability in training and real-world applications.
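For reference, the classical BM25 score whose components the paper localises inside the network (this is the standard Okapi/Lucene-style formula, not code from the paper): the saturating term-frequency factor is what the attention heads are found to compute softly, and the IDF term is what the low-rank embedding component encodes.

```python
# Classical BM25 scoring, for comparison with the abstract's components.
import math
from collections import Counter

def bm25(query_terms, doc_terms, doc_freq, num_docs, avgdl, k1=1.2, b=0.75):
    tf = Counter(doc_terms)
    score = 0.0
    for t in query_terms:
        if t not in tf:
            continue
        # Inverse document frequency (the low-rank embedding component's role).
        idf = math.log((num_docs - doc_freq[t] + 0.5) / (doc_freq[t] + 0.5) + 1)
        # Saturating term frequency with document-length normalisation
        # (the role the paper attributes to specific attention heads).
        norm_tf = (tf[t] * (k1 + 1)
                   / (tf[t] + k1 * (1 - b + b * len(doc_terms) / avgdl)))
        score += idf * norm_tf
    return score
```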