 Accuracy


LipLearner: Customizable Silent Speech Interactions on Mobile Devices

arXiv.org Artificial Intelligence

Silent speech interfaces are a promising technology that enables private communication in natural language. However, previous approaches only support a small and inflexible vocabulary, which leads to limited expressiveness. We leverage contrastive learning to learn efficient lipreading representations, enabling few-shot command customization with minimal user effort. Our model exhibits high robustness to different lighting, posture, and gesture conditions on an in-the-wild dataset. For 25-command classification, an F1-score of 0.8947 is achievable using only one shot, and performance can be further boosted by adaptively learning from more data. This generalizability allowed us to develop a mobile silent speech interface empowered with on-device fine-tuning and visual keyword spotting. A user study demonstrated that with LipLearner, users could define their own commands with high reliability guaranteed by an online incremental learning scheme. Subjective feedback indicated that our system provides essential functionalities for customizable silent speech interactions with high usability and learnability.


On the Capacity Limits of Privileged ERM

arXiv.org Machine Learning

We study the supervised learning paradigm called Learning Using Privileged Information, first suggested by Vapnik and Vashist (2009). In this paradigm, in addition to the examples and labels, additional (privileged) information is provided only for training examples. The goal is to use this information to improve the classification accuracy of the resulting classifier, where this classifier can only use the non-privileged information of new example instances to predict their label. We study the theory of privileged learning with the zero-one loss under the natural Privileged ERM algorithm proposed in Pechyony and Vapnik (2010a). We provide a counterexample to a claim made in that work regarding the VC dimension of the loss class induced by this problem, and conclude that the claim is incorrect. We then provide a correct VC dimension analysis which gives both lower and upper bounds on the capacity of the Privileged ERM loss class. We further show, via a generalization analysis, that worst-case guarantees for Privileged ERM cannot improve over standard non-privileged ERM unless the capacity of the privileged information is similar to or smaller than that of the non-privileged information. This result points to an important limitation of the Privileged ERM approach. In our closing discussion, we suggest another way in which Privileged ERM might still be helpful, even when the capacity of the privileged information is large.


Investigating Group Distributionally Robust Optimization for Deep Imbalanced Learning: A Case Study of Binary Tabular Data Classification

arXiv.org Artificial Intelligence

Owing to increased data availability, novel learning architectures and accessibility to commodity computational hardware devices, deep neural networks (DNNs) have become the de facto tool for a wide range of machine learning (ML) tasks in recent times; leading to state-of-the-art performance in several computer vision, natural language processing and speech recognition tasks. DNNs are characterized by several layers of hidden units that enable learning of useful representations of a given data for improved model performance [1, 2]. This alleviates the need for domain experts and hand-engineered features, a common prerequisite for traditional ML methods.
Oversampling and undersampling are two common data resampling approaches used in DNN. However, the susceptibility of the former to noise and overfitting due to added samples [23] as well as the characteristic loss of valuable information peculiar with the latter [3] remain major drawbacks of this category of imbalance methods. On the other hand, the core idea behind the cost sensitive methods is to assign different misclassification cost/weights to the training samples to scale up/down the misclassification errors depending on the class they belong [17, 24]. While there are several implementations of this method, the most commonly used cost sensitive approach in imbalanced deep learning research is reweighting
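
The cost-sensitive idea described in this excerpt, scaling each misclassification by a class-dependent weight, can be illustrated with a minimal sketch. The inverse-frequency ("balanced") weighting used here is an assumption chosen for illustration, not necessarily the scheme studied in the paper, and the function names are hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class; rarer classes receive larger weights.

    Assumed heuristic: weight_c = n_samples / (n_classes * count_c).
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def weighted_error(y_true, y_pred, weights):
    """Misclassification rate where each sample counts by its true-class weight."""
    total = sum(weights[y] for y in y_true)
    wrong = sum(weights[y] for y, p in zip(y_true, y_pred) if y != p)
    return wrong / total


# Toy imbalanced data: 8 majority samples, 2 minority samples.
y_true = [0] * 8 + [1] * 2
weights = inverse_frequency_weights(y_true)

# A classifier that always predicts the majority class has 20% plain error,
# but the weighted error exposes that it misses the whole minority class.
always_majority = [0] * 10
print(weights)
print(weighted_error(y_true, always_majority, weights))
```

With these weights, every class contributes the same total mass to the error, so ignoring the minority class is no longer cheap.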


Deep Attention Recognition for Attack Identification in 5G UAV scenarios: Novel Architecture and End-to-End Evaluation

arXiv.org Artificial Intelligence

Despite the robust security features inherent in the 5G framework, attackers will still discover ways to disrupt 5G unmanned aerial vehicle (UAV) operations and decrease UAV control communication performance in Air-to-Ground (A2G) links. Operating under the assumption that the 5G UAV communications infrastructure will never be entirely secure, we propose Deep Attention Recognition (DAtR) as a solution to identify attacks based on a small deep network embedded in authenticated UAVs. In the tested scenarios, a number of attackers are located in random positions, while their power is varied in each simulation. Moreover, terrestrial users are included in the network to impose additional complexity on attack detection. To improve the system's overall performance in the attack scenarios, we propose complementing the deep network decision with two mechanisms based on data manipulation and majority voting techniques. We compare several performance parameters in our proposed deep network: the impact of Long Short-Term Memory (LSTM) and Attention layers on overall accuracy, the effect of window size, and the accuracy when only partial data is available in the training process. Finally, we benchmark our deep network against six widely used classifiers regarding classification accuracy. Our algorithm's accuracy exceeds that of the eXtreme Gradient Boosting (XGB) classifier by 4% in the LoS condition and by around 3% in the short-distance NLoS condition. Considering the proposed deep network, all other classifiers present lower accuracy than XGB.
UAVs will play a crucial role in emergency response [1, 2], package delivery in the logistics industry, and in temporal events [2]. UAVs are becoming more common and reliable [3] due to technological advancements [4, 5] and to improvements in energy-efficient UAV trajectory optimization algorithms that account for the dynamics of the UAV as a parametrized method and are feasible in practice [6, 7, 8]. Integrating UAVs into 5G and 6G networks will therefore increase telecommunication coverage and reduce costs for businesses willing to invest in this technology. However, UAVs can easily be hacked by malicious users [9] through their wireless communication channels, which might divert delivery packets from their destinations. This can have disastrous consequences in unfortunate climate events where UAVs are transporting people to hospitals, or in cases of criminal investigations.
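
The abstract mentions complementing the deep network decision with majority voting but gives no details. As a rough illustration only, the sketch below smooths a stream of per-window classifier decisions with a sliding-window majority vote. The function name, the window length, and the tie-breaking rule (the oldest label in the window wins) are all assumptions, not the authors' design.

```python
from collections import Counter, deque


def majority_vote_stream(decisions, window=3):
    """Replace each decision with the most frequent label among the
    last `window` decisions (hypothetical smoothing step)."""
    recent = deque(maxlen=window)  # keeps only the newest `window` labels
    smoothed = []
    for d in decisions:
        recent.append(d)
        # most_common(1) yields the label with the highest count; on a tie,
        # the label encountered first (the oldest in the window) is returned.
        smoothed.append(Counter(recent).most_common(1)[0][0])
    return smoothed


# 0 = "no attack", 1 = "attack"; the isolated 1 at index 2 is treated as noise.
raw = [0, 0, 1, 0, 0, 1, 1, 1]
print(majority_vote_stream(raw, window=3))
```

The trade-off in such a scheme is latency: a larger window suppresses more spurious detections but delays the reaction to a real attack.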


Domain adaptation using optimal transport for invariant learning using histopathology datasets

arXiv.org Artificial Intelligence

Histopathology is critical for the diagnosis of many diseases, including cancer. Current protocols typically require pathologists to manually evaluate slides under a microscope, which is time-consuming and subjective, leading to interest in machine learning to automate analysis. However, computational techniques are limited by batch effects, where technical factors like differences in preparation protocol or scanners can alter the appearance of slides, causing models trained on one institution to fail when generalizing to others. Here, we propose a domain adaptation method that improves the generalization of histopathological models to data from unseen institutions, without the need for labels or retraining in these new settings. Our approach introduces an optimal transport (OT) loss that extends adversarial methods, which penalize models if images from different institutions can be distinguished in their representation space. Unlike previous methods, which operate on single samples, our loss accounts for distributional differences between batches of images. We show that on the Camelyon17 dataset, while both methods can adapt to global differences in color distribution, only our OT loss can reliably classify a cancer phenotype unseen during training. Together, our results suggest that OT improves generalization on rare but critical phenotypes that may only make up a small fraction of the total tiles and variation in a slide.


Towards Adversarial Realism and Robust Learning for IoT Intrusion Detection and Classification

arXiv.org Artificial Intelligence

The Internet of Things (IoT) faces tremendous security challenges. Machine learning models can be used to tackle the growing number of cyber-attack variations targeting IoT systems, but the increasing threat posed by adversarial attacks restates the need for reliable defense strategies. This work describes the types of constraints required for a realistic adversarial cyber-attack example and proposes a methodology for a trustworthy adversarial robustness analysis with a realistic adversarial evasion attack vector. The proposed methodology was used to evaluate three supervised algorithms, Random Forest (RF), Extreme Gradient Boosting (XGB), and Light Gradient Boosting Machine (LGBM), and one unsupervised algorithm, Isolation Forest (IFOR). Constrained adversarial examples were generated with the Adaptative Perturbation Pattern Method (A2PM), and evasion attacks were performed against models created with regular and adversarial training. Even though RF was the least affected in binary classification, XGB consistently achieved the highest accuracy in multi-class classification. The obtained results evidence the inherent susceptibility of tree-based algorithms and ensembles to adversarial evasion attacks and demonstrate the benefits of adversarial training and a security-by-design approach for more robust IoT network intrusion detection and cyber-attack classification.


Mapping Wordnets on the Fly with Permanent Sense Keys

arXiv.org Artificial Intelligence

Most of the major databases on the semantic web have links to Princeton WordNet (PWN) synonym set (synset) identifiers, which differ for each PWN release, and are thus incompatible between versions. On the other hand, both PWN and the more recent Open English Wordnet (OEWN) provide permanent word sense identifiers (the sense keys), which can solve this interoperability problem. We present an algorithm that runs in linear time to automatically derive a synset mapping between any pair of Wordnet versions that use PWN sense keys. This makes it possible to update old WordNet links and to interoperate seamlessly with newer English Wordnet versions for which no prior mapping exists. By applying the proposed algorithm on the fly, at load time, we combine the Open Multilingual Wordnet (OMW 1.4, which uses old PWN 3.0 identifiers) with OEWN Edition 2021, and obtain almost perfect precision and recall. We compare the results of our approach when using synset offsets versus the Collaborative InterLingual Index (CILI version 1.0) as synset identifiers, and find that the synset offsets perform better than CILI 1.0 in all cases, except for a few ties.
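
The core idea of the abstract above, deriving a synset mapping from sense keys that stay stable across versions, can be sketched in a single pass over the old index. This is a minimal illustration under stated assumptions, not the paper's algorithm: each version is assumed to be available as a plain dict from sense key to synset identifier, the majority rule for handling split synsets is an assumption, and the synset identifiers in the example are made-up toy values.

```python
from collections import Counter, defaultdict


def map_synsets(old_index, new_index):
    """Map old synset ids to new synset ids via shared sense keys.

    old_index / new_index: dict of sense key -> synset id (hypothetical
    input format). Runs in time linear in the number of sense keys.
    """
    votes = defaultdict(Counter)
    for sense_key, old_synset in old_index.items():
        new_synset = new_index.get(sense_key)
        if new_synset is not None:  # sense key survived into the new version
            votes[old_synset][new_synset] += 1
    # Assumed rule: an old synset maps to the new synset that holds
    # the largest number of its sense keys.
    return {old: c.most_common(1)[0][0] for old, c in votes.items()}


# Toy data: sense keys follow the PWN format; synset ids are invented.
old_index = {
    "dog%1:05:00::": "old-0001-n",
    "domestic_dog%1:05:00::": "old-0001-n",
    "cat%1:05:00::": "old-0002-n",
    "gone%1:00:00::": "old-0003-n",  # sense key absent from the new version
}
new_index = {
    "dog%1:05:00::": "new-0101-n",
    "domestic_dog%1:05:00::": "new-0101-n",
    "cat%1:05:00::": "new-0102-n",
}
print(map_synsets(old_index, new_index))
```

Old synsets whose sense keys all disappeared simply get no entry in the mapping, which leaves it to the caller to decide how to treat unmapped links.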


Backdoor Attacks and Defenses in Federated Learning: Survey, Challenges and Future Research Directions

arXiv.org Artificial Intelligence

Federated learning (FL) is a machine learning (ML) approach that allows the use of distributed data without compromising personal privacy. However, the heterogeneous distribution of data among clients in FL can make it difficult for the orchestration server to validate the integrity of local model updates, making FL vulnerable to various threats, including backdoor attacks. Backdoor attacks involve the insertion of malicious functionality into a targeted model through poisoned updates from malicious clients. These attacks can cause the global model to misbehave on specific inputs while appearing normal in other cases. Backdoor attacks have received significant attention in the literature due to their potential to impact real-world deep learning applications. However, they have not been thoroughly studied in the context of FL. In this work, we provide a comprehensive survey of current backdoor attack strategies and defenses in FL, including an analysis of the different approaches. We also discuss the challenges and potential future directions for attacks and defenses in the context of FL.


Unsupervised Recycled FPGA Detection Using Symmetry Analysis

arXiv.org Artificial Intelligence

Recycled field-programmable gate arrays (FPGAs) have recently come to pose a significant hardware security problem due to the proliferation of the semiconductor supply chain. Ring oscillator (RO) based frequency analysis is one of the popular detection techniques, but most studies rely on known fresh FPGAs (KFFs) for machine learning-based detection, which is not a realistic approach. In this paper, we present a novel recycled FPGA detection method that examines the symmetry information of the RO frequencies using an unsupervised anomaly detection method. Due to the symmetrical array structure of the FPGA, some adjacent logic blocks on an FPGA have comparable RO frequencies; hence, our method simply analyzes the RO frequencies of those blocks to determine how similar they are. The proposed approach efficiently categorizes recycled FPGAs by utilizing direct density ratio estimation through outlier detection. Experiments using Xilinx Artix-7 FPGAs demonstrate that the proposed method accurately classifies recycled FPGAs from 10 fresh FPGAs by x fewer computations compared with the conventional method.


Intelligent O-RAN Traffic Steering for URLLC Through Deep Reinforcement Learning

arXiv.org Artificial Intelligence

The goal of Next-Generation Networks is to improve upon the current networking paradigm, especially in providing higher data rates, near-real-time latencies, and near-perfect quality of service. However, existing radio access network (RAN) architectures lack sufficient flexibility and intelligence to meet those demands. Open RAN (O-RAN) is a promising paradigm for building a virtualized and intelligent RAN architecture. This paper presents a Machine Learning (ML)-based Traffic Steering (TS) scheme to predict network congestion and then proactively steer O-RAN traffic to avoid it and reduce the expected queuing delay. To achieve this, we propose an optimized setup focusing on safeguarding both latency and reliability to serve URLLC applications. The proposed solution consists of a two-tiered ML strategy based on a Naive Bayes classifier and deep Q-learning. Our solution is evaluated against traditional reactive TS approaches that are offered as xApps in O-RAN and shows an average decrease of 15.81 percent in queuing delay across all deployed SFCs.