rts
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States > Colorado > Denver County > Denver (0.04)
- Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
- Research Report > Experimental Study (0.93)
- Research Report > New Finding (0.68)
Rank-Then-Score: Enhancing Large Language Models for Automated Essay Scoring
Cai, Yida, Liang, Kun, Lee, Sanwoo, Wang, Qinghan, Wu, Yunfang
In recent years, large language models (LLMs) have achieved remarkable success across a variety of tasks. However, their potential in the domain of Automated Essay Scoring (AES) remains largely underexplored. Moreover, compared to English, methods for Chinese AES are not well developed. In this paper, we propose Rank-Then-Score (RTS), a fine-tuning framework based on large language models to enhance their essay scoring capabilities. Specifically, we fine-tune a ranking model (Ranker) with feature-enriched data, and then feed the ranking model's output, in the form of a candidate score set, together with the essay content into a scoring model (Scorer) to produce the final score. Experimental results on two benchmark datasets, HSK and ASAP, demonstrate that RTS consistently outperforms the direct prompting (Vanilla) method in average QWK across all LLMs and datasets, and achieves the best performance on Chinese essay scoring using the HSK dataset.
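The two-stage pipeline the abstract describes reduces to a simple data flow. A minimal sketch, where `rank_essays` and `score_essay` are hypothetical stand-ins for the fine-tuned Ranker and Scorer LLMs, not the paper's actual API:

```python
# Sketch of the Rank-Then-Score data flow: the Ranker produces a
# candidate score set, the Scorer sees the essay plus that set and
# picks the final score. Both callables are illustrative stubs.

from typing import Callable, List


def rank_then_score(
    essay: str,
    reference_essays: List[str],
    rank_essays: Callable[[str, List[str]], List[int]],
    score_essay: Callable[[str, List[int]], int],
) -> int:
    """Stage 1: rank the essay among references to get candidate
    scores; Stage 2: score the essay given that candidate set."""
    candidate_scores = rank_essays(essay, reference_essays)
    return score_essay(essay, candidate_scores)


if __name__ == "__main__":
    # Toy stubs: the Ranker proposes a band, the Scorer takes its midpoint.
    stub_ranker = lambda essay, refs: [3, 4, 5]
    stub_scorer = lambda essay, cands: cands[len(cands) // 2]
    print(rank_then_score("An essay...", ["ref A", "ref B"], stub_ranker, stub_scorer))
```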
RTify: Aligning Deep Neural Networks with Human Behavioral Decisions
Cheng, Yu-Ang, Rodriguez, Ivan Felipe, Chen, Sixuan, Kar, Kohitij, Watanabe, Takeo, Serre, Thomas
Here, we introduce a novel computational framework to model the dynamics of human behavioral choices by learning to align the temporal dynamics of a recurrent neural network (RNN) to human reaction times (RTs). We describe an approximation that allows us to use human RTs to constrain the number of time steps an RNN takes to solve a task. The approach is evaluated extensively against various psychophysics experiments. We also show that the approximation can be used to optimize an "ideal-observer" RNN model to achieve an optimal tradeoff between speed and accuracy without human data. The resulting model is found to account well for human RT data. Finally, we use the approximation to train a deep learning implementation of the popular Wong-Wang decision-making model. The model is integrated with a convolutional neural network (CNN) model of visual processing and evaluated using both artificial and natural image stimuli. Overall, we present a novel framework that helps align current vision models with human behavior, bringing us closer to an integrated model of human vision.
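The alignment target can be illustrated with a toy, hard-threshold version: treat the model's "reaction time" as the first step at which accumulated evidence crosses a threshold, and penalize its mismatch with a human RT. The paper's actual contribution is a differentiable approximation of this quantity; everything below is an illustrative assumption:

```python
# Toy, non-differentiable illustration of what gets aligned: the
# model RT is the first threshold-crossing step of cumulative
# evidence, and the loss is its squared mismatch with a human RT.

import numpy as np


def model_rt(evidence_per_step: np.ndarray, threshold: float) -> int:
    """First time step at which cumulative evidence exceeds the
    threshold (the model's discrete 'reaction time')."""
    cumulative = np.cumsum(evidence_per_step)
    crossed = np.nonzero(cumulative >= threshold)[0]
    return int(crossed[0]) + 1 if crossed.size else len(evidence_per_step)


def rt_alignment_loss(evidence_per_step: np.ndarray,
                      human_rt_steps: int,
                      threshold: float = 1.0) -> float:
    """Squared mismatch between model and human reaction times."""
    return float((model_rt(evidence_per_step, threshold) - human_rt_steps) ** 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    evidence = rng.uniform(0.0, 0.3, size=20)  # per-step evidence from an RNN
    print(rt_alignment_loss(evidence, human_rt_steps=8))
```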
I wish Blizzard loved Warcraft as much as I do
Blizzard's first real-time strategy games had a profound impact on me as a young immigrant to Canada in 1994 and '95. Warcraft: Orcs & Humans and Warcraft II: Tides of Darkness helped me learn how to read and write in English, and formed the basis for some of my oldest friendships in a brand-new country. Suffice it to say, I have a lot of love for these old RTS games -- maybe more than Blizzard itself. So you can imagine my excitement at the remaster rumors for Warcraft II and its expansion, Beyond the Dark Portal. When Blizzard aired its Warcraft Direct last week, not only were those rumors confirmed, but it announced that the original Warcraft would receive the same treatment, and both would be sold alongside Warcraft III: Reforged (itself a remaster) as part of a new battle chest.
Reviews: ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games
The main proposal of the paper is a real-time strategy simulator specifically designed for reinforcement learning purposes. The paper presents the architecture of the simulator in some detail, along with how games are run on it and some experiments with several RL techniques implemented in the software. Although I think there is real value in building such software for research, I don't think that NIPS is the right forum for presenting technical papers on it. The Machine Learning Open Source Software (MLOSS) track of JMLR or a relevant workshop would be much more appropriate for that. In the current case, a publication at the IEEE Computational Intelligence and Games (IEEE-CIG) conference might be an even better fit.
Enhancing Biomedical Knowledge Retrieval-Augmented Generation with Self-Rewarding Tree Search and Proximal Policy Optimization
Hu, Minda, Zong, Licheng, Wang, Hongru, Zhou, Jingyan, Li, Jingjing, Gao, Yichen, Wong, Kam-Fai, Li, Yu, King, Irwin
Large Language Models (LLMs) have shown great potential in the biomedical domain with the advancement of retrieval-augmented generation (RAG). However, existing retrieval-augmented approaches face challenges in addressing diverse queries and documents, particularly for medical knowledge queries, resulting in sub-optimal performance. To address these limitations, we propose a novel plug-and-play LLM-based retrieval method called Self-Rewarding Tree Search (SeRTS) based on Monte Carlo Tree Search (MCTS) and a self-rewarding paradigm. By combining the reasoning capabilities of LLMs with the effectiveness of tree search, SeRTS boosts the zero-shot performance of retrieving high-quality and informative results for RAG. We further enhance retrieval performance by fine-tuning LLMs with Proximal Policy Optimization (PPO) objectives using the trajectories collected by SeRTS as feedback. Controlled experiments using the BioASQ-QA dataset with GPT-3.5-Turbo and LLama2-7b demonstrate that our method significantly improves the performance of the BM25 retriever and surpasses the strong baseline of self-reflection in both efficiency and scalability. Moreover, SeRTS generates higher-quality feedback for PPO training than self-reflection. Our proposed method effectively adapts LLMs to document retrieval tasks, enhancing their ability to retrieve highly relevant documents for RAG in the context of medical knowledge queries. This work presents a significant step forward in leveraging LLMs for accurate and comprehensive biomedical question answering.
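The search loop suggested by the abstract follows the standard MCTS phases (selection, expansion, simulation, backpropagation) over query reformulations. A minimal sketch, where `propose`, `retrieve`, and `self_reward` are hypothetical stand-ins for the LLM proposal step, the retriever, and the self-rewarding scorer:

```python
# Minimal MCTS over query reformulations: nodes are queries, an LLM
# stub proposes children, and a self-reward stub scores retrieved
# documents. Illustrative only; not the paper's implementation.

import math
import random
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    query: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)
    visits: int = 0
    value: float = 0.0


def uct(node: Node, c: float = 1.4) -> float:
    """Upper-confidence bound for tree search; unvisited nodes win."""
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + c * math.sqrt(
        math.log(node.parent.visits) / node.visits
    )


def search(root_query: str, propose, retrieve, self_reward, iters: int = 20) -> Node:
    root = Node(root_query)
    for _ in range(iters):
        node = root
        while node.children:                        # selection
            node = max(node.children, key=uct)
        node.children = [Node(q, parent=node)       # expansion
                         for q in propose(node.query)]
        child = random.choice(node.children)
        reward = self_reward(retrieve(child.query))  # simulation / scoring
        while child:                                 # backpropagation
            child.visits += 1
            child.value += reward
            child = child.parent
    return max(root.children, key=lambda n: n.visits)
```

With stubs such as `propose = lambda q: [q + " (rephrased)", q + " (expanded)"]`, a constant `retrieve`, and a random `self_reward`, the loop runs end to end and returns the most-visited child query.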
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Asia > Singapore (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- (5 more...)
- Health & Medicine (1.00)
- Leisure & Entertainment > Games (0.68)
A Self-feedback Knowledge Elicitation Approach for Chemical Reaction Predictions
Liu, Pengfei, Tao, Jun, Ren, Zhixiang
The task of chemical reaction predictions (CRPs) plays a pivotal role in advancing drug discovery and material science. However, its effectiveness is constrained by the vast and uncertain chemical reaction space and by the challenge of capturing reaction selectivity, particularly because existing methods are limited in exploiting the knowledge inherent in the data. To address these challenges, we introduce a data-curated self-feedback knowledge elicitation approach. This method begins with iterative optimization of molecular representations, which facilitates the extraction of knowledge on chemical reaction types (RTs). We then employ adaptive prompt learning to infuse this prior knowledge into the large language model (LLM). As a result, we achieve significant enhancements: a 14.2% increase in retrosynthesis prediction accuracy, a 74.2% rise in reagent prediction accuracy, and an expansion of the model's capability to handle multi-task chemical reactions. This research offers a novel paradigm for knowledge elicitation in scientific research and showcases the untapped potential of LLMs in CRPs.
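The abstract combines two mechanisms that can be sketched independently: a self-feedback loop that keeps a molecular representation only when a feedback score improves, and a prompt carrying the elicited reaction-type knowledge. The `mutate` and `score` callables and the prompt template below are illustrative assumptions, not the paper's procedure:

```python
# (1) Greedy self-feedback over representations; (2) a reaction-type-
# aware prompt. Both are illustrative sketches with user-supplied stubs.

from typing import Callable


def self_feedback_optimize(representation: str,
                           mutate: Callable[[str], str],
                           score: Callable[[str], float],
                           steps: int = 50) -> str:
    """Accept a mutated representation only if the feedback score
    improves; otherwise keep the current best."""
    best, best_score = representation, score(representation)
    for _ in range(steps):
        candidate = mutate(best)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best


def build_prompt(reactants: str, reaction_type: str) -> str:
    """Infuse the elicited reaction-type (RT) knowledge into the prompt."""
    return (f"Reaction type: {reaction_type}\n"
            f"Reactants: {reactants}\n"
            f"Predict the product:")
```

In the paper the feedback signal comes from the model itself; here it is whatever scoring function the caller supplies.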
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- North America > United States > Michigan (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Asia > China > Guangdong Province > Guangzhou (0.04)
- Research Report (1.00)
- Overview (1.00)
- Information Technology > Knowledge Management > Knowledge Engineering (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Analyzing Wrap-Up Effects through an Information-Theoretic Lens
Meister, Clara, Pimentel, Tiago, Clark, Thomas Hikaru, Cotterell, Ryan, Levy, Roger
Numerous analyses of reading time (RT) data have been conducted -- all in an effort to better understand the cognitive processes driving reading comprehension. However, data measured on words at the end of a sentence -- or even at the end of a clause -- are often omitted due to the confounding factors introduced by so-called "wrap-up effects," which manifest as a skewed distribution of RTs for these words. Consequently, the understanding of the cognitive processes that might be involved in these wrap-up effects is limited. In this work, we attempt to learn more about these processes by examining the relationship between wrap-up effects and information-theoretic quantities, such as word and context surprisals. We find that the distribution of information in prior contexts is often predictive of sentence- and clause-final RTs (but not of sentence-medial RTs). This lends support to several prior hypotheses about the processes involved in wrap-up effects.
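The key quantity here is surprisal, s(w_t) = -log2 p(w_t | w_<t): the less predictable a word, the higher its information content in bits. A minimal sketch, assuming next-word probabilities are available from some language model; the toy bigram table is purely illustrative:

```python
# Surprisal in bits from conditional next-word probabilities.
# The bigram table is a made-up illustration, not a trained model.

import math

# Hypothetical conditional probabilities p(word | previous word).
toy_lm = {
    ("the", "cat"): 0.10,
    ("cat", "sat"): 0.05,
    ("sat", "down"): 0.40,  # a predictable, low-surprisal continuation
}


def surprisal(prev_word: str, word: str, lm=toy_lm) -> float:
    """Surprisal in bits: -log2 p(word | prev_word)."""
    return -math.log2(lm[(prev_word, word)])


if __name__ == "__main__":
    for (prev, w) in toy_lm:
        print(f"{prev} -> {w}: {surprisal(prev, w):.2f} bits")
```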
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (4 more...)
On the Effect of Anticipation on Reading Times
Pimentel, Tiago, Meister, Clara, Wilcox, Ethan G., Levy, Roger, Cotterell, Ryan
Over the past two decades, numerous studies have demonstrated how less predictable (i.e., higher surprisal) words take more time to read. In general, these studies have implicitly assumed the reading process is purely responsive: Readers observe a new word and allocate time to process it as required. We argue that prior results are also compatible with a reading process that is at least partially anticipatory: Readers could make predictions about a future word and allocate time to process it based on their expectation. In this work, we operationalize this anticipation as a word's contextual entropy. We assess the effect of anticipation on reading by comparing how well surprisal and contextual entropy predict reading times on four naturalistic reading datasets: two self-paced and two eye-tracking. Experimentally, across datasets and analyses, we find substantial evidence for effects of contextual entropy over surprisal on a word's reading time (RT): in fact, entropy is sometimes better than surprisal in predicting a word's RT. Spillover effects, however, are generally not captured by entropy, but only by surprisal. Further, we hypothesize four cognitive mechanisms through which contextual entropy could impact RTs, and we are able to design experiments to analyze three of them. Overall, our results support a view of reading that is not just responsive, but also anticipatory.
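Contextual entropy differs from surprisal in that it is computed before the word is observed: H(W_t | w_<t) = -Σ_w p(w | w_<t) log2 p(w | w_<t), the uncertainty of the entire next-word distribution rather than the cost of the word actually seen. A minimal sketch over an arbitrary distribution:

```python
# Contextual entropy in bits of a next-word distribution.

import numpy as np


def contextual_entropy(next_word_probs: np.ndarray) -> float:
    """Entropy in bits: -sum p * log2 p over nonzero probabilities."""
    p = next_word_probs[next_word_probs > 0]
    return float(-(p * np.log2(p)).sum())


if __name__ == "__main__":
    peaked = np.array([0.9, 0.05, 0.05])     # confident prediction, low entropy
    flat = np.array([1 / 3, 1 / 3, 1 / 3])   # maximal uncertainty
    print(contextual_entropy(peaked), contextual_entropy(flat))
```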
- Europe > Switzerland > Zürich > Zürich (0.04)
- North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
- (12 more...)
Improving Training and Inference of Face Recognition Models via Random Temperature Scaling
Shang, Lei, Huang, Mouxiao, Shi, Wu, Liu, Yuchen, Liu, Yang, Wang, Fei, Sun, Baigui, Xie, Xuansong, Qiao, Yu
Data uncertainty is commonly observed in images used for face recognition (FR). However, deep learning algorithms often make predictions with high confidence even for uncertain or irrelevant inputs. Intuitively, FR algorithms can benefit from both the estimation of uncertainty and the detection of out-of-distribution (OOD) samples. Taking a probabilistic view of the current classification model, the temperature scalar is exactly the scale of the uncertainty noise implicitly added in the softmax function. Meanwhile, the uncertainty of images in a dataset should follow a prior distribution. Based on these observations, we propose Random Temperature Scaling (RTS), a unified framework for uncertainty modeling and FR, to learn a reliable FR algorithm. The benefits of RTS are two-fold. (1) In the training phase, it can adjust the learning strength of clean and noisy samples for stability and accuracy. (2) In the test phase, it can provide a confidence score to detect uncertain, low-quality and even OOD samples, without training on extra labels. Extensive experiments on FR benchmarks demonstrate that the magnitude of the variance in RTS, which serves as an OOD detection metric, is closely related to the uncertainty of the input image. RTS achieves top performance on both the FR and OOD detection tasks. Moreover, the model trained with RTS performs robustly on datasets with noise. The proposed module is lightweight and adds only negligible computation cost to the model.
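The mechanism the abstract describes, treating the softmax temperature as a random variable with a prior so that the spread of the resulting outputs signals uncertainty, can be sketched as follows. The gamma prior and the variance statistic are illustrative assumptions, not the paper's exact formulation:

```python
# Temperature-as-random-variable sketch: sample temperatures from an
# assumed gamma prior, average the softmax outputs, and use the
# variance of the max-probability across samples as a rough
# confidence / OOD signal.

import numpy as np


def softmax(logits: np.ndarray, temperature: float) -> np.ndarray:
    z = logits / temperature
    z -= z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()


def rts_scores(logits: np.ndarray, n_samples: int = 100, seed: int = 0):
    """Return (mean softmax output, variance of max-probability)."""
    rng = np.random.default_rng(seed)
    temps = rng.gamma(shape=2.0, scale=0.5, size=n_samples)  # assumed prior
    probs = np.stack([softmax(logits, t) for t in temps])
    return probs.mean(axis=0), float(probs.max(axis=1).var())


if __name__ == "__main__":
    confident = np.array([5.0, 1.0, 0.5])   # in-distribution-like logits
    ambiguous = np.array([1.1, 1.0, 0.9])   # uncertain-input-like logits
    print(rts_scores(confident)[1], rts_scores(ambiguous)[1])
```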
- Asia > China > Shanghai > Shanghai (0.04)
- North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Asia > China > Beijing > Beijing (0.04)