 score fusion


RingGesture: A Ring-Based Mid-Air Gesture Typing System Powered by a Deep-Learning Word Prediction Framework

arXiv.org Artificial Intelligence

Text entry is a critical capability for any modern computing experience, and lightweight augmented reality (AR) glasses are no exception. Because they are designed for all-day wearability, lightweight AR glasses cannot easily include the multiple cameras needed for a wide hand-tracking field of view. This constraint underscores the need for an additional input device. We propose a system to address this gap: RingGesture, a ring-based mid-air gesture typing technique that uses electrodes to mark the start and end of gesture trajectories and inertial measurement unit (IMU) sensors for hand tracking. This method offers an intuitive experience similar to the raycast-based mid-air gesture typing found in VR headsets, allowing for a seamless translation of hand movements into cursor navigation. To enhance both accuracy and input speed, we propose a novel deep-learning word prediction framework, Score Fusion, comprising three key components: a) a word-gesture decoding model, b) a spatial spelling correction model, and c) a lightweight contextual language model. The framework fuses the scores from the three models to predict the most likely words with higher precision. We conduct comparative and longitudinal studies that demonstrate two key findings. First, RingGesture is effective overall, achieving an average text entry speed of 27.3 words per minute (WPM) and a peak performance of 47.9 WPM. Second, the Score Fusion framework outperforms a conventional word prediction framework, Naive Correction: it offers a 28.2% improvement in uncorrected Character Error Rate, leading to a 55.2% improvement in text entry speed for RingGesture. Additionally, RingGesture received a System Usability Score of 83, signifying excellent usability.
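The abstract does not say how the three scores are combined. As a minimal sketch, assuming each model outputs a probability per candidate word and that fusion is a weighted sum of log-probabilities, the idea can be illustrated as follows. The function name `fuse_scores`, the model keys, the weights, and the example probabilities are all hypothetical and are not taken from the paper.

```python
import math

def fuse_scores(candidates, weights):
    """Rank candidate words by a weighted sum of per-model log-probabilities.

    candidates: dict mapping word -> dict of {model_name: probability}.
    weights:    dict mapping model_name -> fusion weight (assumed, not from the paper).
    """
    fused = {}
    for word, scores in candidates.items():
        # Clamp probabilities to avoid log(0).
        fused[word] = sum(
            weights[name] * math.log(max(p, 1e-12)) for name, p in scores.items()
        )
    # Highest fused score first.
    return sorted(fused, key=fused.get, reverse=True)

# Hypothetical scores from the three components named in the abstract.
candidates = {
    "hello": {"gesture": 0.50, "spelling": 0.60, "language": 0.30},
    "jello": {"gesture": 0.55, "spelling": 0.30, "language": 0.01},
}
equal_weights = {"gesture": 1.0, "spelling": 1.0, "language": 1.0}
gesture_only = {"gesture": 1.0, "spelling": 0.0, "language": 0.0}

ranking = fuse_scores(candidates, equal_weights)
ranking_gesture_only = fuse_scores(candidates, gesture_only)
```

In this toy example the gesture decoder alone slightly prefers "jello", while the fused ranking prefers "hello" because the spelling and language scores outweigh the small gesture margin.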


Class-Incremental Learning with Strong Pre-trained Models

arXiv.org Artificial Intelligence

Class-incremental learning (CIL) has been widely studied under the setting of starting from a small number of classes (base classes). Instead, we explore an understudied real-world setting of CIL that starts with a strong model pre-trained on a large number of base classes. We hypothesize that a strong base model can provide a good representation for novel classes and that incremental learning can be done with small adaptations. We propose a two-stage training scheme: i) feature augmentation -- cloning part of the backbone and fine-tuning it on the novel data, and ii) fusion -- combining the base and novel classifiers into a unified classifier. Experiments show that the proposed method significantly outperforms state-of-the-art CIL methods on the large-scale ImageNet dataset (e.g., +10% overall accuracy over the best existing method). We also propose and analyze understudied practical CIL scenarios, such as base-novel overlap with distribution shift. Our proposed method is robust and generalizes to all analyzed CIL settings. Code is available at https://github.com/amazon-research/sp-cil.
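The fusion stage can be pictured with a deliberately simple sketch. It assumes the unified classifier is formed by concatenating the logits of a base head (fed by the original backbone features) and a novel head (fed by the cloned, fine-tuned branch) and taking the argmax. That concatenation rule, the helper names, and the toy weights are assumptions for illustration; the abstract does not specify the actual fusion procedure.

```python
def linear(weights, features):
    """Apply a bias-free linear head: one logit per weight row."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]

def fused_predict(base_feat, novel_feat, w_base, w_novel):
    """Predict over base + novel classes by concatenating both heads' logits.

    Class indices 0..len(w_base)-1 are base classes; the rest are novel classes.
    """
    logits = linear(w_base, base_feat) + linear(w_novel, novel_feat)
    return max(range(len(logits)), key=logits.__getitem__)

# Toy heads: two base classes, one novel class (hypothetical values).
w_base = [[1.0, 0.0], [0.0, 1.0]]
w_novel = [[1.0, 1.0]]

# A sample whose novel-branch features respond strongly -> novel class (index 2).
pred_novel = fused_predict([0.2, 0.1], [0.9, 0.8], w_base, w_novel)
# A sample whose base features respond strongly -> base class 0.
pred_base = fused_predict([2.0, 0.1], [0.1, 0.1], w_base, w_novel)
```

A practical system would likely also need to calibrate the two heads so that their logits are comparable; this sketch omits that step.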


Score Fusion Based Authorship Attribution of Ancient Arabic Texts

AAAI Conferences

In this paper, we investigate the authorship of several short historical texts written by ten ancient Arabic travelers. This Arabic dataset, collected by the authors in 2011 and called the AAAT (Authorship Attribution of Ancient Arabic Texts) corpus, is considered a reference dataset in Arabic. Several authorship attribution experiments are conducted using different features, namely characters, character n-grams, and lexical features such as words, word n-grams, and rare words. Different classifiers are also employed: statistical distances, Multi-Layer Perceptron (MLP), Support Vector Machines (SVM), and Linear Regression (LR). In this investigation, a new fusion technique, called Score Based Fusion (SBF), is proposed to enhance the overall performance of the classifiers. Results show good attribution performance, with the best individual scores between 80% and 90% correct authorship attribution. The proposed fusion technique raised this score to 100% correct authorship attribution. Moreover, this comparative survey has revealed interesting results concerning the Arabic language, particularly for short texts.
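The abstract does not describe how SBF combines classifier outputs. One common way to fuse scores from heterogeneous classifiers, shown here only as an assumed illustration, is to min-max normalize each classifier's scores per text and sum them per candidate author. The function names, the author labels, and the score values are hypothetical, and all scores are assumed to be oriented so that higher means a better match.

```python
def min_max(scores):
    """Rescale one classifier's scores to [0, 1] so classifiers are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {author: 0.0 for author in scores}
    return {author: (s - lo) / (hi - lo) for author, s in scores.items()}

def score_based_fusion(per_classifier_scores):
    """Sum normalized scores across classifiers and return the top author."""
    fused = {}
    for scores in per_classifier_scores:
        for author, s in min_max(scores).items():
            fused[author] = fused.get(author, 0.0) + s
    return max(fused, key=fused.get)

# Hypothetical scores for one text from three classifiers (higher = better match).
svm_scores = {"author_A": 0.2, "author_B": 0.9, "author_C": 0.1}
mlp_scores = {"author_A": 5.0, "author_B": 4.0, "author_C": 1.0}
distance_scores = {"author_A": 0.3, "author_B": 0.6, "author_C": 0.5}

predicted = score_based_fusion([svm_scores, mlp_scores, distance_scores])
```

Here the MLP alone would pick author_A, but the fused decision is author_B because two of the three classifiers rank it first and the third ranks it a close second.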