keystroke
ImageTalk: Designing a Multimodal AAC Text Generation System Driven by Image Recognition and Natural Language Generation
Yang, Boyin, Jiang, Puming, Kristensson, Per Ola
People living with Motor Neuron Disease (plwMND) frequently encounter speech and motor impairments that necessitate a reliance on augmentative and alternative communication (AAC) systems. This paper tackles the main challenge that traditional symbol-based AAC systems offer a limited vocabulary, while text entry solutions tend to exhibit low communication rates. To help plwMND articulate their needs about the system efficiently and effectively, we iteratively design and develop a novel multimodal text generation system called ImageTalk through a tailored proxy-user-based and an end-user-based design phase. The system demonstrates pronounced keystroke savings of 95.6%, coupled with consistent performance and high user satisfaction. We distill three design guidelines for AI-assisted text generation systems design and outline four user requirement levels tailored for AAC purposes, guiding future research in this field.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > New York > New York County > New York City (0.05)
- Europe > United Kingdom > England > Greater London > London (0.04)
- (8 more...)
Typing Reinvented: Towards Hands-Free Input via sEMG
Lee, Kunwoo, Sreedhar, Dhivya, Saraf, Pushkar, Lee, Chaeeun, Shapovalenko, Kateryna
We explore surface electromyography (sEMG) as a non-invasive input modality for mapping muscle activity to keyboard inputs, targeting immersive typing in next-generation human-computer interaction (HCI). This is especially relevant for spatial computing and virtual reality (VR), where traditional keyboards are impractical. Using attention-based architectures, we significantly outperform the existing convolutional baselines, reducing online generic CER from 24.98% -> 20.34% and offline personalized CER from 10.86% -> 10.10%, while remaining fully causal. We further incorporate a lightweight decoding pipeline with language-model-based correction, demonstrating the feasibility of accurate, real-time muscle-driven text input for future wearable and spatial interfaces.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > Mexico > Gulf of Mexico (0.04)
- Europe > Bulgaria > Sofia City Province > Sofia (0.04)
SplashNet: Split-and-Share Encoders for Accurate and Efficient Typing with Surface Electromyography
Hadidi, Nima, Chan, Jason, Feghhi, Ebrahim, Kao, Jonathan C.
Surface electromyography (sEMG) at the wrists could enable natural, keyboard-free text entry, yet the state-of-the-art emg2qwerty baseline still misrecognizes $51.8\%$ of characters in the zero-shot setting on unseen users and $7.0\%$ after user-specific fine-tuning. We trace many of these errors to mismatched cross-user signal statistics, fragile reliance on high-order feature dependencies, and the absence of architectural inductive biases aligned with the bilateral nature of typing. To address these issues, we introduce three simple modifications: (i) Rolling Time Normalization, which adaptively aligns input distributions across users; (ii) Aggressive Channel Masking, which encourages reliance on low-order feature combinations more likely to generalize across users; and (iii) a Split-and-Share encoder that processes each hand independently with weight-shared streams to reflect the bilateral symmetry of the neuromuscular system. Combined with a five-fold reduction in spectral resolution ($33\!\rightarrow\!6$ frequency bands), these components yield a compact Split-and-Share model, SplashNet-mini, which uses only $\tfrac14$ the parameters and $0.6\times$ the FLOPs of the baseline while reducing character-error rate (CER) to $36.4\%$ zero-shot and $5.9\%$ after fine-tuning. An upscaled variant, SplashNet ($\tfrac12$ the parameters, $1.15\times$ the FLOPs of the baseline), further lowers error to $35.7\%$ and $5.5\%$, representing relative improvements of $31\%$ and $21\%$ in the zero-shot and fine-tuned settings, respectively. SplashNet therefore establishes a new state of the art without requiring additional data.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Europe > Croatia > Primorje-Gorski Kotar County > Rijeka (0.04)
- Asia > China > Guangxi Province > Nanning (0.04)
- Africa > Guinea > Kankan Region > Kankan Prefecture > Kankan (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)
Cross-dataset Multivariate Time-series Model for Parkinson's Diagnosis via Keyboard Dynamics
Francesconi, Arianna, Cappetta, Donato, Rebecchi, Fabio, Soda, Paolo, Guarrasi, Valerio, Sicilia, Rosa
Parkinson's disease (PD) presents a growing global challenge, affecting over 10 million individuals, with prevalence expected to double by 2040. Early diagnosis remains difficult due to the late emergence of motor symptoms and limitations of traditional clinical assessments. In this study, we propose a novel pipeline that leverages keystroke dynamics as a non-invasive and scalable biomarker for remote PD screening and telemonitoring. Our methodology involves three main stages: (i) preprocessing of data from four distinct datasets, extracting four temporal signals and addressing class imbalance through the comparison of three methods; (ii) pre-training eight state-of-the-art deep-learning architectures on the two largest datasets, optimizing temporal windowing, stride, and other hyperparameters; (iii) fine-tuning on an intermediate-sized dataset and performing external validation on a fourth, independent cohort. Our results demonstrate that hybrid convolutional-recurrent and transformer-based models achieve strong external validation performance, with AUC-ROC scores exceeding 90% and F1-Score over 70%. Notably, a temporal convolutional model attains an AUC-ROC of 91.14% in external validation, outperforming existing methods that rely solely on internal validation. These findings underscore the potential of keystroke dynamics as a reliable digital biomarker for PD, offering a promising avenue for early detection and continuous monitoring.
The Behavioural Translation Style Space: Towards simulating the temporal dynamics of affect, behaviour, and cognition in human translation production
Carl, Michael, Mizowaki, Takanori, Ray, Aishvarya, Yamada, Masaru, Bandaru, Devi Sri, Ren, Xinyue
The paper introduces a novel behavioural translation style space (BTSS) that describes possible behavioural translation patterns. The suggested BTSS is organized as a hierarchical structure that entails various embedded processing layers. We posit that observable translation behaviour - i.e. eye and finger movements - is fundamental when executing the physical act of translation but it is caused and shaped by higher-order cognitive processes and affective translation states. We analyse records of keystrokes and gaze data as indicators of the hidden mental processing structure and organize the behavioural patterns as a multi-layered embedded BTSS. We develop a perspective in which the BTSS serves as the basis for a computational translation agent to simulate the temporal dynamics of affect, behavioural routines and cognition during human translation production.
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- North America > United States > New York (0.04)
- Europe > Denmark > Capital Region > Copenhagen (0.04)
- (9 more...)
The 5 best mechanical keyboards for 2025
Your keyboard is one of the few pieces of technology you'll use for hours at a time, so why not make it something that brings you joy? Sure, the people who gush over mechanical keyboards can be a bit much, but the enhanced comfort, durability and customizability that comes with the best of them is real. If you're interested in making the switch (ahem), we've tested dozens of mechanical keyboards over the past year and rounded up our favorites below. We've also broken down what to look for as you shop. The first thing to decide with any keyboard is what size and layout you want. Full-size layouts have all the keys you'd ever need -- a number pad, a full function row, arrow keys, etc. -- but they also have the largest physical footprint. A 96-percent or "1800" keyboard is similar, but crunches the navigation cluster (Page Up, Home, etc.), numpad and arrow keys closer together to save space. Tenkeyless (TKL) or 80-percent keyboards omit the number pad entirely; they're often considered the best blend of size and functionality. It gets more and more minimal from there. The smallest popular layout is the 60 percent keyboard, which removes the arrow keys, function row, numpad and navigation cluster. This kind of design can be particularly useful for gaming, as it opens up a ton of desk space to swing your mouse around. It typically relies on shortcuts to make up for its missing keys, but it comes with a learning curve as a result. Even more compact options exist beyond that. These can be adorable, but they usually involve removing the number row, which is a step too far for most people.
- Leisure & Entertainment > Games > Computer Games (0.95)
- Materials (0.68)
- Information Technology > Hardware (0.47)
- Information Technology > Software (0.46)
- Information Technology > Artificial Intelligence (0.34)
ScholaWrite: A Dataset of End-to-End Scholarly Writing Process
Wang, Linghe, Lee, Minhwa, Volkov, Ross, Chau, Luan Tuyen, Kang, Dongyeop
Writing is a cognitively demanding task involving continuous decision-making, heavy use of working memory, and frequent switching between multiple activities. Scholarly writing is particularly complex as it requires authors to coordinate many pieces of multiform knowledge. To fully understand writers' cognitive thought process, one should fully decode the end-to-end writing data (from individual ideas to final manuscript) and understand their complex cognitive mechanisms in scholarly writing. We introduce ScholaWrite dataset, a first-of-its-kind keystroke corpus of an end-to-end scholarly writing process for complete manuscripts, with thorough annotations of cognitive writing intentions behind each keystroke. Our dataset includes LaTeX-based keystroke data from five preprints with nearly 62K total text changes and annotations across 4 months of paper writing. ScholaWrite shows promising usability and applications (e.g., iterative self-writing), demonstrating the importance of collection of end-to-end writing data, rather than the final manuscript, for the development of future writing assistants to support the cognitive thinking process of scientists. Our de-identified data examples and code are available on our project page.
- Asia > Thailand > Bangkok > Bangkok (0.04)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
- North America > United States > Virginia (0.04)
- (3 more...)
- Research Report > New Finding (0.67)
- Research Report > Experimental Study (0.48)
- Education (1.00)
- Information Technology > Security & Privacy (0.68)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Communications (0.93)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Assessing Human Editing Effort on LLM-Generated Texts via Compression-Based Edit Distance
Devatine, Nicolas, Abraham, Louis
Assessing the extent of human edits on texts generated by Large Language Models (LLMs) is crucial to understanding the human-AI interactions and improving the quality of automated text generation systems. Existing edit distance metrics, such as Levenshtein, BLEU, ROUGE, and TER, often fail to accurately measure the effort required for post-editing, especially when edits involve substantial modifications, such as block operations. In this paper, we introduce a novel compression-based edit distance metric grounded in the Lempel-Ziv-77 algorithm, designed to quantify the amount of post-editing applied to LLM-generated texts. Our method leverages the properties of text compression to measure the informational difference between the original and edited texts. Through experiments on real-world human edits datasets, we demonstrate that our proposed metric is highly correlated with actual edit time and effort. We also show that LLMs exhibit an implicit understanding of editing speed, that aligns well with our metric. Furthermore, we compare our metric with existing ones, highlighting its advantages in capturing complex edits with linear computational efficiency. Our code and data are available at: https://github.com/NDV-tiime/CompressionDistance
- North America > United States > Pennsylvania (0.04)
- North America > United States > New Jersey (0.04)
- North America > United States > Michigan (0.04)
- (9 more...)
ChatBCI: A P300 Speller BCI Leveraging Large Language Models for Improved Sentence Composition in Realistic Scenarios
Hong, Jiazhen, Wang, Weinan, Najafizadeh, Laleh
P300 speller BCIs allow users to compose sentences by selecting target keys on a GUI through the detection of P300 component in their EEG signals following visual stimuli. Most P300 speller BCIs require users to spell words letter by letter, or the first few initial letters, resulting in high keystroke demands that increase time, cognitive load, and fatigue. This highlights the need for more efficient, user-friendly methods for faster sentence composition. In this work, we introduce ChatBCI, a P300 speller BCI that leverages the zero-shot learning capabilities of large language models (LLMs) to suggest words from user-spelled initial letters or predict the subsequent word(s), reducing keystrokes and accelerating sentence composition. ChatBCI retrieves word suggestions through remote queries to the GPT-3.5 API. A new GUI, displaying GPT-3.5 word suggestions as extra keys is designed. SWLDA is used for the P300 classification. Seven subjects completed two online spelling tasks: 1) copy-spelling a self-composed sentence using ChatBCI, and 2) improvising a sentence using ChatBCI's word suggestions. Results demonstrate that in Task 1, on average, ChatBCI outperforms letter-by-letter BCI spellers, reducing time and keystrokes by 62.14% and 53.22%, respectively, and increasing information transfer rate by 198.96%. In Task 2, ChatBCI achieves 80.68% keystroke savings and a record 8.53 characters/min for typing speed. Overall, ChatBCI, by employing remote LLM queries, enhances sentence composition in realistic scenarios, significantly outperforming traditional spellers without requiring local model training or storage. ChatBCI's (multi-) word predictions, combined with its new GUI, pave the way for developing next-generation speller BCIs that are efficient and effective for real-time communication, especially for users with communication and motor disabilities.
- Research Report > New Finding (0.48)
- Research Report > Experimental Study (0.46)
Human-Robot Cooperative Piano Playing with Learning-Based Real-Time Music Accompaniment
Wang, Huijiang, Zhang, Xiaoping, Iida, Fumiya
Recent advances in machine learning have paved the way for the development of musical and entertainment robots. However, human-robot cooperative instrument playing remains a challenge, particularly due to the intricate motor coordination and temporal synchronization. In this paper, we propose a theoretical framework for human-robot cooperative piano playing based on non-verbal cues. First, we present a music improvisation model that employs a recurrent neural network (RNN) to predict appropriate chord progressions based on the human's melodic input. Second, we propose a behavior-adaptive controller to facilitate seamless temporal synchronization, allowing the cobot to generate harmonious acoustics. The collaboration takes into account the bidirectional information flow between the human and robot. We have developed an entropy-based system to assess the quality of cooperation by analyzing the impact of different communication modalities during human-robot collaboration. Experiments demonstrate that our RNN-based improvisation can achieve a 93\% accuracy rate. Meanwhile, with the MPC adaptive controller, the robot could respond to the human teammate in homophony performances with real-time accompaniment. Our designed framework has been validated to be effective in allowing humans and robots to work collaboratively in the artistic piano-playing task.
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
- Health & Medicine > Therapeutic Area > Neurology (0.46)
- Energy > Oil & Gas > Downstream (0.34)