curiosity
Commodore 64 Ultimate Review: An Astonishing Remake
The reborn Commodore 64 is an astonishing remake--but daunting if you weren't there the first time around. "Digital detox" approach is compelling. It's hard to overstate just how seismic an impact the Commodore 64 had on home computing. Launched in 1982, the 8-bit machine--iconic in its beige plastic shell with integrated keyboard--went on to become the best-selling personal computer of all time. Despite the success, manufacturer Commodore International folded in 1994, with rights to the name floating around for years.
- North America > United States > California (0.04)
- Europe > Slovakia (0.04)
- Europe > Czechia (0.04)
Curriculum-guided Hindsight Experience Replay
In off-policy deep reinforcement learning, it is usually hard to collect sufficient successful experiences with sparse rewards to learn from. Hindsight experience replay (HER) enables an agent to learn from failures by treating the achieved state of a failed experience as a pseudo goal. However, not all the failed experiences are equally useful at different learning stages, so it is not efficient to replay all of them or uniform samples of them. In this paper, we propose to 1) adaptively select the failed experiences for replay according to the proximity to the true goals and the curiosity of exploration over diverse pseudo goals, and 2) gradually change the proportion of the goal-proximity and the diversity-based curiosity in the selection criteria: we adopt a human-like learning strategy that enforces more curiosity in earlier stages and shifts toward greater goal-proximity later. The resulting method, Curriculum-guided HER (CHER), adaptively and dynamically controls the exploration-exploitation trade-off during the learning process via hindsight experience selection. We show that CHER improves the state of the art in challenging robotics environments.
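The selection idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes goal proximity is the negative Euclidean distance from a pseudo goal to the true goal, uses a simple coverage score as a stand-in for diversity-based curiosity, and picks experiences greedily. All function names and the geometric weight schedule are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two goal vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def diversity(selected, candidates):
    """Coverage score: how well the selected pseudo goals represent all candidates.

    Each candidate is credited with its similarity to the closest selected
    pseudo goal (similarity = 1 / (1 + distance)). This is an assumed stand-in.
    """
    if not selected:
        return 0.0
    return sum(
        max(1.0 / (1.0 + distance(c, s)) for s in selected) for c in candidates
    )


def select_experiences(pseudo_goals, true_goal, k, proximity_weight):
    """Greedily pick k pseudo-goal indices, trading off proximity and diversity."""
    chosen = []
    remaining = list(range(len(pseudo_goals)))
    while remaining and len(chosen) < k:
        def score(i):
            picked = [pseudo_goals[j] for j in chosen + [i]]
            proximity = -sum(distance(g, true_goal) for g in picked)
            return proximity_weight * proximity + diversity(picked, pseudo_goals)

        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return chosen


def proximity_weight_at(epoch, initial=0.1, growth=1.5):
    """Hypothetical curriculum: the weight on goal proximity grows each epoch."""
    return initial * growth ** epoch
```

With a small weight the selection spreads over distant pseudo goals (more curiosity); with a large weight it concentrates on pseudo goals near the true goal, which mirrors the early-to-late shift the abstract describes.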
Exploring through Random Curiosity with General Value Functions
Efficient exploration in reinforcement learning is a challenging problem commonly addressed through intrinsic rewards. Recent prominent approaches are based on state novelty or variants of artificial curiosity. However, directly applying them to partially observable environments can be ineffective and lead to premature dissipation of intrinsic rewards. Here we propose random curiosity with general value functions (RC-GVF), a novel intrinsic reward function that draws upon connections between these distinct approaches. Instead of using only the current observation's novelty or a curiosity bonus for failing to predict precise environment dynamics, RC-GVF derives intrinsic rewards through predicting temporally extended general value functions. We demonstrate that this improves exploration in a hard-exploration diabolical lock problem. Furthermore, RC-GVF significantly outperforms previous methods in the absence of ground-truth episodic counts in the partially observable MiniGrid environments. Panoramic observations on MiniGrid further boost RC-GVF's performance such that it is competitive with baselines exploiting privileged information in the form of episodic counts.
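As a rough illustration of what "predicting temporally extended general value functions" can mean, the sketch below computes discounted sums of a pseudo-reward sequence and uses the squared prediction error as a curiosity bonus. This is a simplified assumption, not the RC-GVF algorithm itself: the function names are hypothetical, the targets are plain Monte Carlo returns, and how the pseudo-rewards and predictions are produced is left out.

```python
def gvf_targets(cumulants, gamma):
    """Discounted sums of a pseudo-reward (cumulant) sequence, one per time step."""
    targets = [0.0] * len(cumulants)
    running = 0.0
    # Work backwards so each target reuses the one that follows it.
    for t in reversed(range(len(cumulants))):
        running = cumulants[t] + gamma * running
        targets[t] = running
    return targets


def intrinsic_rewards(predicted, targets):
    """Squared prediction error per time step, used here as the curiosity bonus."""
    return [(p - g) ** 2 for p, g in zip(predicted, targets)]
```

Because each target summarizes future pseudo-rewards rather than only the current observation, the error can stay informative when a single observation does not reveal the full state.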
- North America > Canada (0.04)
- Europe > Italy (0.04)
A Supplemental Details
Here, we formally define all intrinsic rewards evaluated in the paper. All algorithms use the same network architectures. Observations and changes counts are based on egocentric and panoramic views, respectively. In MiniGrid, egocentric views are 147-dimensional while panoramic views are 588-dimensional. Gradients are clipped to have maximum norm 40.
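The two numeric details above can be made concrete with a short sketch. The view layout is an assumption (a 7x7 egocentric grid with 3 values per cell, and a panoramic view made of 4 such views), chosen because it reproduces the stated 147 and 588 dimensions. The clipping function is a generic global-norm clip with a hypothetical name, not the paper's code.

```python
import math

# Assumed MiniGrid layout: a 7x7 egocentric grid with 3 values per cell,
# and a panoramic view built from 4 such views (one per facing direction).
EGOCENTRIC_DIM = 7 * 7 * 3
PANORAMIC_DIM = 4 * EGOCENTRIC_DIM


def clip_by_global_norm(grads, max_norm=40.0):
    """Rescale a flat list of gradient values so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm or norm == 0.0:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]
```

For example, a gradient `[30.0, 40.0]` has norm 50 and is scaled by 0.8, while `[3.0, 4.0]` (norm 5) is returned unchanged.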
A Code
Input preprocessing: We convert all images to grayscale and resize them to 84x84. [...] It is a convolutional neural network with fixed random weights. In Atari, we use 128 parallel environments, and in Habitat, we use 1 environment, as it does not support multithreading. We use the same hyperparameters as in large scale curiosity: a learning rate of 0.0001 for all models, a discount factor [...]

Future prediction and multimodal association can be complementary forms of curiosity. Further work could explore other ways of combining intrinsic rewards, such as switching between the complementary forms.
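The input preprocessing mentioned above (grayscale conversion and resizing to 84x84) can be sketched in plain Python. The luminance weights and nearest-neighbor sampling are assumptions, since the text does not say which conversion or interpolation is used, and the function names are hypothetical.

```python
def to_grayscale(image):
    """Convert an H x W image of (r, g, b) tuples to luminance values.

    Uses the common 0.299 / 0.587 / 0.114 weights (an assumption).
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in image]


def resize_nearest(image, out_h=84, out_w=84):
    """Resize a 2D image with nearest-neighbor sampling (an assumption)."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]
```

Applied to a 210x160 RGB frame, `resize_nearest(to_grayscale(frame))` yields an 84x84 grid of single-channel values.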
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Nebraska (0.04)
- North America > Canada (0.04)
Generative Medical Event Models Improve with Scale
Waxler, Shane, Blazek, Paul, White, Davis, Sneider, Daniel, Chung, Kevin, Nagarathnam, Mani, Williams, Patrick, Voeller, Hank, Wong, Karen, Swanhorst, Matthew, Zhang, Sheng, Usuyama, Naoto, Wong, Cliff, Naumann, Tristan, Poon, Hoifung, Loza, Andrew, Meeker, Daniella, Hain, Seth, Shah, Rahul
Realizing personalized medicine at scale calls for methods that distill insights from longitudinal patient journeys, which can be viewed as a sequence of medical events. Foundation models pretrained on large-scale medical event data represent a promising direction for scaling real-world evidence generation and generalizing to diverse downstream tasks. Using Epic Cosmos, a dataset with medical events from de-identified longitudinal health records for 16.3 billion encounters over 300 million unique patient records from 310 health systems, we introduce the Curiosity models, a family of decoder-only transformer models pretrained on 118 million patients representing 115 billion discrete medical events (151 billion tokens). We present the largest scaling-law study of medical event data, establishing a methodology for pretraining and revealing power-law scaling relationships for compute, tokens, and model size. Consequently, we pretrained a series of compute-optimal models with up to 1 billion parameters. Conditioned on a patient's real-world history, Curiosity autoregressively predicts the next medical event to simulate patient health timelines. We studied 78 real-world tasks, including diagnosis prediction, disease prognosis, and healthcare operations. Remarkably for a foundation model with generic pretraining and simulation-based inference, Curiosity generally outperformed or matched task-specific supervised models on these tasks, without requiring task-specific fine-tuning or few-shot examples. Curiosity's predictive power consistently improves as the model and pretraining scale. Our results show that Curiosity, a generative medical event foundation model, can effectively capture complex clinical dynamics, providing an extensible and generalizable framework to support clinical decision-making, streamline healthcare operations, and improve patient outcomes.
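A power-law scaling relationship of the kind mentioned above has the form y = a * x^b, which becomes a straight line in log-log space. The sketch below fits such a line by ordinary least squares. It is only an illustration: the data points are synthetic and made up, the function name is hypothetical, and nothing here reflects the study's actual measurements or fitting procedure.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**b by least squares on (log x, log y); returns (a, b)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    slope = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly)) / sum(
        (u - mean_x) ** 2 for u in lx
    )
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), slope


# Synthetic, made-up points that follow loss = 5 * compute**-0.1 exactly.
compute = [1e15, 1e16, 1e17, 1e18]
loss = [5.0 * c ** -0.1 for c in compute]
a, b = fit_power_law(compute, loss)
```

On these synthetic points the fit recovers a coefficient close to 5 and an exponent close to -0.1; real measurements would be noisy and the fit correspondingly approximate.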
- North America > United States > Alaska (0.04)
- North America > Canada (0.04)
- Asia > Middle East > Saudi Arabia (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area > Rheumatology (1.00)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
Why Did Apple Fall To The Ground: Evaluating Curiosity In Large Language Model
Wang, Haoyu, Jiang, Sihang, Chen, Yuyan, Wang, Yitong, Xiao, Yanghua
Curiosity serves as a pivotal conduit for human beings to discover and learn new knowledge. Recent advancements of large language models (LLMs) in natural language processing have sparked discussions regarding whether these models possess the capability for curiosity-driven learning akin to humans. In this paper, starting from the human curiosity assessment questionnaire Five-Dimensional Curiosity Scale Revised (5DCR), we design a comprehensive evaluation framework that covers dimensions such as Information Seeking, Thrill Seeking, and Social Curiosity to assess the extent of curiosity exhibited by LLMs. The results demonstrate that LLMs exhibit a stronger thirst for knowledge than humans but still tend to make conservative choices when faced with uncertain environments. We further investigated the relationship between curiosity and thinking in LLMs, confirming that curious behaviors can enhance the model's reasoning and active learning abilities. These findings suggest that LLMs have the potential to exhibit curiosity similar to that of humans, providing experimental support for the future development of learning capabilities and innovative research in LLMs.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Asia > Middle East > Jordan (0.04)