 hamlet


HAMLET: Switch your Vision-Language-Action Model into a History-Aware Policy

arXiv.org Artificial Intelligence

Inherently, robotic manipulation tasks are history-dependent: leveraging past context could be beneficial. However, most existing Vision-Language-Action models (VLAs) have been designed without considering this aspect, i.e., they rely solely on the current observation, ignoring preceding context. In this paper, we propose HAMLET, a scalable framework to adapt VLAs to attend to the historical context during action prediction. Specifically, we introduce moment tokens that compactly encode perceptual information at each timestep. Their representations are initialized with time-contrastive learning, allowing them to better capture temporally distinctive aspects. Next, we employ a lightweight memory module that integrates the moment tokens across past timesteps into memory features, which are then leveraged for action prediction. Through empirical evaluation, we show that HAMLET successfully transforms a state-of-the-art VLA into a history-aware policy, especially demonstrating significant improvements on long-horizon tasks that require historical context. In particular, on top of GR00T N1.5, HAMLET achieves an average success rate of 76.4% on history-dependent real-world tasks, surpassing the baseline performance by 47.2%. Furthermore, HAMLET pushes prior art performance from 64.1% to 66.4% on RoboCasa Kitchen (100-demo setup) and from 95.6% to 97.7% on LIBERO, highlighting its effectiveness even under generic robot-manipulation benchmarks.
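The abstract above says the moment tokens are initialized with time-contrastive learning but does not give the loss. As a rough illustration only, the sketch below shows a generic InfoNCE-style time-contrastive loss in plain Python: an anchor embedding is pulled toward a temporally nearby embedding (the positive) and pushed away from temporally distant ones (the negatives). The function names, the cosine similarity, and the `temperature` value are assumptions for illustration, not HAMLET's actual formulation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def time_contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss (hypothetical helper, not taken from the paper).

    anchor:    embedding at timestep t
    positive:  embedding from a nearby timestep
    negatives: embeddings from temporally distant timesteps
    """
    # The positive logit comes first, followed by one logit per negative.
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    # Negative log-probability of picking the positive among all candidates.
    return log_denominator - logits[0]
```

The loss is small when the anchor is closer to its temporal neighbour than to any distant frame, and grows when a distant frame looks more similar than the neighbour does.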


Topic Identification in LLM Input-Output Pairs through the Lens of Information Bottleneck

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are prone to critical failure modes, including intrinsic faithfulness hallucinations (also known as confabulations), where a response deviates semantically from the provided context. Frameworks designed to detect this, such as Semantic Divergence Metrics (SDM), rely on identifying latent topics shared between prompts and responses, typically by applying geometric clustering to their sentence embeddings. This creates a disconnect, as the topics are optimized for spatial proximity, not for the downstream information-theoretic analysis. In this paper, we bridge this gap by developing a principled topic identification method grounded in the Deterministic Information Bottleneck (DIB) for geometric clustering. Our key contribution is to transform the DIB method into a practical algorithm for high-dimensional data by substituting its intractable KL divergence term with a computationally efficient upper bound. The resulting method, which we dub UDIB, can be interpreted as an entropy-regularized and robustified version of K-means that inherently favors a parsimonious number of informative clusters. By applying UDIB to the joint clustering of LLM prompt and response embeddings, we generate a shared topic representation that is not merely spatially coherent but is fundamentally structured to be maximally informative about the prompt-response relationship. This provides a superior foundation for the SDM framework and offers a novel, more sensitive tool for detecting confabulations.
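To make the phrase "entropy-regularized version of K-means" more concrete, the following is a minimal sketch of one generic way such a scheme can look; it is not the UDIB algorithm, and the paper's KL upper bound is not reproduced here. Each point is hard-assigned to the cluster minimizing `beta * squared_distance - log(cluster_weight)`, so lightly populated clusters are penalized and empty ones are dropped, which tends to shrink the number of clusters. All function names and the `beta` parameter are assumptions for illustration.

```python
import math


def sq_dist(p, c):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, c))


def assign(points, centers, weights, beta):
    """Hard-assign each point to the cluster with the lowest penalized cost."""
    labels = []
    for p in points:
        # Distance term plus a -log(weight) penalty that disfavors small clusters.
        costs = [beta * sq_dist(p, c) - math.log(w)
                 for c, w in zip(centers, weights)]
        labels.append(costs.index(min(costs)))
    return labels


def entropy_regularized_kmeans(points, init_centers, beta=1.0, n_iter=20):
    """Illustrative entropy-penalized K-means (hypothetical helper, not UDIB)."""
    centers = [list(c) for c in init_centers]
    weights = [1.0 / len(centers)] * len(centers)
    for _ in range(n_iter):
        labels = assign(points, centers, weights, beta)
        new_centers, new_weights = [], []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if not members:
                continue  # an empty cluster is removed for good
            new_centers.append([sum(col) / len(members) for col in zip(*members)])
            new_weights.append(len(members) / len(points))
        centers, weights = new_centers, new_weights
    # Final labels index into the surviving centers.
    return centers, weights, assign(points, centers, weights, beta)
```

Starting with more initial centers than the data supports, the surplus centers receive no points and are discarded, leaving a smaller set of populated clusters.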


Towards a Holistic and Automated Evaluation Framework for Multi-Level Comprehension of LLMs in Book-Length Contexts

arXiv.org Artificial Intelligence

We introduce HAMLET, a holistic and automated framework for evaluating the long-context comprehension of large language models (LLMs). HAMLET structures source texts into a three-level key-fact hierarchy at root-, branch-, and leaf-levels, and employs query-focused summarization to evaluate how well models recall and faithfully represent information at each level. To validate the reliability of our fully automated pipeline, we conduct a systematic human study, showing that our automatic evaluation achieves over 90% agreement with expert human judgments, while reducing the cost by up to 25 times. HAMLET reveals that LLMs struggle with fine-grained comprehension, especially at the leaf level, and are sensitive to positional effects like the lost-in-the-middle. Analytical queries pose greater challenges than narrative ones, and consistent performance gaps emerge between open-source and proprietary models, as well as across model scales. Our code and dataset are publicly available at https://github.com/DISL-Lab/HAMLET.


Violent and lewd! Not Grand Theft Auto, Shakespeare's Macbeth

The Guardian

Last week, the Guardian spoke to the team behind Lili, a video game retelling of Macbeth, shown at the Cannes film festival. The headline quote from the piece was "Shakespeare would be writing for games today", which I have heard many times, and does make a lot of sense. Shakespeare worked in the Elizabethan theatre, a period in which plays were considered popularist entertainment hardly worthy of analysis or preservation – just like video games today! The authorities were also concerned about the lewd and violent nature of plays and the effect they may have on the impressionable masses – ditto! But if we agree that a 21st-century Shakespeare would be making games, what sort would he be making?


HAMLET: Healthcare-focused Adaptive Multilingual Learning Embedding-based Topic Modeling

arXiv.org Artificial Intelligence

Traditional topic models often struggle with contextual nuances and fail to adequately handle polysemy and rare words. This limitation typically results in topics that lack coherence and quality. Large Language Models (LLMs) can mitigate this issue by generating an initial set of topics. However, these raw topics frequently lack refinement and representativeness, which leads to redundancy without lexical similarity and reduced interpretability. This paper introduces HAMLET, a graph-driven architecture for cross-lingual healthcare topic modeling that uses LLMs. The proposed approach leverages neural-enhanced semantic fusion to refine the embeddings of topics generated by the LLM. Instead of relying solely on statistical co-occurrence or human interpretation to extract topics from a document corpus, this method introduces a topic embedding refinement that uses Bidirectional Encoder Representations from Transformers (BERT) and Graph Neural Networks (GNN). After topic generation, a hybrid technique that involves BERT and Sentence-BERT (SBERT) is employed for embedding. The topic representations are further refined using a GNN, which establishes connections between documents, topics, words, similar topics, and similar words. A novel method is introduced to compute similarities. Consequently, the topic embeddings are refined, and the top k topics are extracted. Experiments were conducted using two healthcare datasets, one in English and one in French, from which six sets were derived. The results demonstrate the effectiveness of HAMLET.


How Hamlet found a virtual stage in Grand Theft Auto

BBC News

Young cast member Nora has benefited from this opportunity. She openly thanks those in game for giving her the opportunity to act and express herself freely, particularly as someone going through a gender transition. "It's amazing that her first production experience of Shakespeare, beyond studying in school, was in Grand Theft Auto," Grylls says. "That's what kept us going really, the fact people kept coming back because they wanted to." Grylls, Crane and Oosterveen's committed madness has paid off.


The Morning After: This is Tesla's robotaxi, the Cybercab

Engadget

At Tesla's We, Robot event at Warner Bros. Discovery's studio in California, the company finally unveiled its robotaxi. The car is expected to go into production before 2027, but even Musk caveated that, saying he was "highly optimistic with timeframes." The Cybercab doesn't have a steering wheel and, according to Elon Musk (so pinch of salt!), could be very cheap to run. The Tesla boss said the operating cost of the robotaxi would be 20 cents a mile, 30 to 40 cents with taxes. He also confirmed people can buy one and that Tesla expects to sell the Cybercab for below $30,000.


Mash-up of Grand Theft Auto and Hamlet is coming to theaters in the US

Engadget

Mubi has secured the US rights and global SVOD rights to Grand Theft Hamlet. In this documentary, two out-of-work actors attempt to stage an entire production of William Shakespeare's tragedy Hamlet within the game world of Grand Theft Auto Online during the Covid-19 pandemic. According to The Hollywood Reporter, Mubi plans to give the film a release in early 2025, and Mubi's own posts on X say that it will be in "US theaters and streaming globally." The movie is composed of more than 300 hours of GTA footage. Sam Crane and Mark Oosterveen might be the main drivers of making the play the thing, but they looped in other random players through in-game auditions to fill out the cast.


Royal Reveals: LiDAR Mapping of Kronborg Castle, Echoes of Hamlet's Halls

arXiv.org Artificial Intelligence

This paper presents a large-scale dataset from a meticulous 360-degree LiDAR (Light Detection and Ranging) scan conducted on Kronborg Castle, a renowned Renaissance fortress located in Elsinore (Helsingør), Denmark, famously associated with Shakespeare's "Hamlet." Utilising a vertically mounted, gimbal-stabilised, 16-channel, 360-degree Velodyne VLP-16 LiDAR scanner paired with an Intel RealSense L515 depth camera, this research offers an unparalleled digital representation of the castle's intricate architectural details and structural nuances, enabling fellow researchers to conduct experiments utilising the data for SLAM (Simultaneous Localisation and Mapping) as well as floorplan generation.


BEYOND DIALOGUE: A Profile-Dialogue Alignment Framework Towards General Role-Playing Language Model

arXiv.org Artificial Intelligence

The rapid advancement of large language models (LLMs) has revolutionized role-playing, enabling the development of general role-playing models. However, current role-playing training has two significant issues: (I) Using a predefined role profile to prompt dialogue training for specific scenarios usually leads to inconsistencies and even conflicts between the dialogue and the profile, resulting in training biases. (II) The model learns to imitate the role based solely on the profile, neglecting profile-dialogue alignment at the sentence level. In this work, we propose a simple yet effective framework called BEYOND DIALOGUE, designed to overcome these hurdles. This framework innovatively introduces "beyond dialogue" tasks to align dialogue with profile traits based on each specific scenario, thereby eliminating biases during training. Furthermore, by adopting an innovative prompting mechanism that generates reasoning outcomes for training, the framework allows the model to achieve fine-grained alignment between profile and dialogue at the sentence level. The aforementioned methods are fully automated and low-cost. Additionally, the integration of automated dialogue and objective evaluation methods forms a comprehensive framework, paving the way for general role-playing. Experimental results demonstrate that our model excels in adhering to and reflecting various dimensions of role profiles, outperforming most proprietary general and specialized role-playing baselines. All code and datasets are available at https://github.com/yuyouyu32/BeyondDialogue.