 Bucharest


Tactile-based Object Retrieval From Granular Media

arXiv.org Artificial Intelligence

We introduce GEOTACT, a robotic manipulation method capable of retrieving objects buried in granular media. This is a challenging task due to the need to interact with granular media, and to do so based exclusively on tactile feedback, since a buried object can be completely hidden from vision. Tactile feedback is in itself challenging in this context, due to ubiquitous contact with the surrounding media and the inherent noise in the tactile readings. To address these challenges, we use a learning method trained end-to-end with simulated sensor noise. We show that our problem formulation leads to the natural emergence of learned pushing behaviors that the manipulator uses to reduce uncertainty and funnel the object to a stable grasp despite spurious and noisy tactile readings. We also introduce a training curriculum that enables learning these behaviors in simulation, followed by zero-shot transfer to real hardware. To the best of our knowledge, GEOTACT is the first method to reliably retrieve a number of different objects from a granular environment, doing so on real hardware and with integrated tactile sensing. Videos and additional information can be found at https://jxu.ai/geotact.


How Can Generative AI Enhance the Well-being of Blind?

arXiv.org Artificial Intelligence

This paper examines how generative AI can improve the well-being of blind or visually impaired people. It draws on a current example, the Be My Eyes app, into which the Be My AI feature, based on GPT-4 from OpenAI, was integrated in 2023. The author's tests are described and evaluated, followed by an ethical and social discussion. The power of the tool, which can analyze still images in an amazing way, is demonstrated. Those affected gain a new independence and a new perception of their environment. At the same time, they are dependent on the world view and morality of the provider or developer, who prescribe or deny them certain descriptions. An outlook makes it clear that the analysis of moving images will mean a further leap forward. It is fair to say that generative AI can fundamentally improve the well-being of blind and visually impaired people and will change it in various ways.


Improving Large-Scale k-Nearest Neighbor Text Categorization with Label Autoencoders

arXiv.org Artificial Intelligence

In this paper, we introduce a multi-label lazy learning approach to deal with automatic semantic indexing in large document collections in the presence of complex and structured label vocabularies with high inter-label correlation. The proposed method is an evolution of the traditional k-Nearest Neighbors algorithm which uses a large autoencoder trained to map the large label space to a reduced-size latent space and to regenerate the predicted labels from this latent space. We have evaluated our proposal on a large portion of the MEDLINE biomedical document collection, which uses the Medical Subject Headings (MeSH) thesaurus as a controlled vocabulary. In our experiments, we propose and evaluate several document representation approaches and different label autoencoder configurations.


Deep-Learning Channel Estimation for IRS-Assisted Integrated Sensing and Communication System

arXiv.org Artificial Intelligence

Integrated sensing and communication (ISAC) and intelligent reflecting surfaces (IRS) are envisioned as revolutionary technologies to enhance spectral and energy efficiency in next-generation wireless systems. This paper is the first to address the channel estimation problem in an IRS-assisted ISAC system. This problem is challenging due to the lack of signal processing capacity in the passive IRS, as well as the presence of mutual interference between sensing and communication (SAC) signals in ISAC systems. A three-stage approach is proposed to decouple the estimation problem into sub-problems: the estimation of the direct SAC channels in the first stage, the reflected communication channel in the second stage, and the reflected sensing channel in the third stage. The proposed three-stage approach is based on a deep-learning framework, which involves two different convolutional neural network (CNN) architectures to estimate the channels at the full-duplex ISAC base station. Furthermore, two types of input-output pairs to train the CNNs are carefully designed, which affect the estimation performance under various signal-to-noise ratio conditions and system parameters. Simulation results validate the superiority of the proposed estimation approach compared to the least-squares baseline scheme, and its computational complexity is also analyzed.
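For orientation, the least-squares baseline mentioned in the abstract can be illustrated with a deliberately simplified toy. The sketch below assumes a single scalar channel observed through known pilot symbols, y_i = h * x_i + n_i; the function name, pilot values, and channel value are hypothetical and are not taken from the paper, whose system involves multiple channels and an IRS.

```python
def ls_channel_estimate(pilots, received):
    """Least-squares estimate of a scalar channel h from y_i = h * x_i + n_i.

    Minimising sum |y_i - h * x_i|^2 over h gives
    h_hat = sum(conj(x_i) * y_i) / sum(|x_i|^2).
    """
    numerator = sum(x.conjugate() * y for x, y in zip(pilots, received))
    energy = sum(abs(x) ** 2 for x in pilots)
    return numerator / energy


# Hypothetical pilot sequence and channel, chosen only for illustration.
pilots = [1 + 0j, -1 + 0j, 1j, -1j]
true_h = 0.6 - 0.8j

# Noiseless observations: the LS estimate should recover the channel.
received = [true_h * x for x in pilots]
h_hat = ls_channel_estimate(pilots, received)
print(h_hat)
```

With noise added to `received`, the same estimator returns the noisy average that learned estimators such as the CNNs described above aim to improve on.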


Extreme Learning Machine-based Channel Estimation in IRS-Assisted Multi-User ISAC System

arXiv.org Artificial Intelligence

Multi-user integrated sensing and communication (ISAC) assisted by an intelligent reflecting surface (IRS) has recently been investigated as a way to provide transmission with high spectral and energy efficiency. This paper proposes, for the first time, a practical channel estimation approach for an IRS-assisted multi-user ISAC system. The estimation problem in such a system is challenging since the sensing and communication (SAC) signals interfere with each other, and the passive IRS lacks signal processing ability. A two-stage approach is proposed to decompose the overall estimation problem into sub-problems, successively estimating the direct and the reflected channels. Based on this scheme, the ISAC base station (BS) estimates all the SAC channels associated with the target and uplink users, while each downlink user estimates the downlink communication channels individually. Considering the low-cost requirements of the ISAC BS and downlink users, the proposed two-stage approach is realized by an efficient neural network (NN) framework that contains two different extreme learning machine (ELM) structures to estimate the above SAC channels. Moreover, two types of input-output pairs to train the ELMs are carefully devised, which impact the estimation accuracy and computational complexity under different system parameters. Simulation results reveal a substantial performance improvement achieved by the proposed ELM-based approach over the least-squares and NN-based benchmarks, with reduced training complexity and faster training speed.
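The fast training attributed to ELMs comes from their general construction: hidden-layer weights are drawn at random and frozen, and only the output weights are fitted, in closed form, by regularised least squares. The following is a minimal sketch of that generic idea on a toy one-dimensional regression task. It is not the paper's channel estimator; the function names, the tanh activation, the ridge term, and the sine target are all assumptions made for illustration.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def hidden(x, w, b):
    """Random-feature hidden layer: tanh(w_j * x + b_j) for each hidden unit."""
    return [math.tanh(wj * x + bj) for wj, bj in zip(w, b)]


def train_elm(xs, ys, n_hidden=12, ridge=1e-3, seed=0):
    """Fit only the output weights; the hidden weights stay random and fixed."""
    rng = random.Random(seed)
    w = [rng.uniform(-2.0, 2.0) for _ in range(n_hidden)]
    b = [rng.uniform(-2.0, 2.0) for _ in range(n_hidden)]
    feats = [hidden(x, w, b) for x in xs]
    # Ridge-regularised normal equations: (H^T H + ridge * I) beta = H^T y.
    gram = [[sum(h[i] * h[j] for h in feats) + (ridge if i == j else 0.0)
             for j in range(n_hidden)] for i in range(n_hidden)]
    rhs = [sum(h[i] * y for h, y in zip(feats, ys)) for i in range(n_hidden)]
    return w, b, solve(gram, rhs)


def predict(x, model):
    w, b, beta = model
    return sum(bj * hj for bj, hj in zip(beta, hidden(x, w, b)))


# Toy data (hypothetical): samples of sin(3x) on [-1, 1].
xs = [i / 20 for i in range(-20, 21)]
ys = [math.sin(3 * x) for x in xs]
model = train_elm(xs, ys)

mse = sum((predict(x, model) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
baseline = sum(y * y for y in ys) / len(ys)  # error of always predicting 0
print(mse, baseline)
```

Training reduces to one linear solve, with no gradient-based iteration, which is the property behind the reduced training complexity that the abstract reports relative to NN-based benchmarks.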


PILOT: Legal Case Outcome Prediction with Case Law

arXiv.org Artificial Intelligence

Machine learning shows promise in predicting the outcome of legal cases, but most research has concentrated on civil law cases rather than case law systems. We identify two unique challenges in making legal case outcome predictions with case law. First, it is crucial to identify relevant precedent cases that serve as fundamental evidence for judges during decision-making. Second, it is necessary to consider the evolution of legal principles over time, as early cases may adhere to different legal contexts. In this paper, we propose a new model named PILOT (PredictIng Legal case OuTcome) for case outcome prediction. It comprises two modules for relevant case retrieval and temporal pattern handling, respectively. To benchmark the performance of existing legal case outcome prediction models, we curate a dataset from a large-scale case law database. We demonstrate the importance of accurately identifying precedent cases and mitigating the temporal shift when making predictions for case law, as our method shows a significant improvement over the prior methods that focus on civil law case outcome predictions.


CCNETS: A Novel Brain-Inspired Approach for Enhanced Pattern Recognition in Imbalanced Datasets

arXiv.org Artificial Intelligence

This study introduces CCNETS (Causal Learning with Causal Cooperative Nets), a novel generative model-based classifier designed to tackle the challenge of generating data for imbalanced datasets in pattern recognition. CCNETS is uniquely crafted to emulate brain-like information processing and comprises three main components: Explainer, Producer, and Reasoner. Each component is designed to mimic specific brain functions, which aids in generating high-quality datasets and enhancing classification performance. The model is particularly focused on addressing the common and significant challenge of handling imbalanced datasets in machine learning. CCNETS's effectiveness is demonstrated through its application to a "fraud dataset," where normal transactions significantly outnumber fraudulent ones (99.83% vs. 0.17%). Traditional methods often struggle with such imbalances, leading to skewed performance metrics. However, CCNETS exhibits superior classification ability, as evidenced by its performance metrics. Specifically, it achieved an F1-score of 0.7992, outperforming traditional models like Autoencoders and Multi-layer Perceptrons (MLP) in the same context. This performance indicates CCNETS's proficiency in more accurately distinguishing between normal and fraudulent patterns. The innovative structure of CCNETS enhances the coherence between generative and classification models, helping to overcome the limitations of pattern recognition that rely solely on generative models. This study emphasizes CCNETS's potential in diverse applications, especially where quality data generation and pattern recognition are key. It proves effective in machine learning, particularly for imbalanced datasets. CCNETS overcomes current challenges in these datasets and advances machine learning with brain-inspired approaches.
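To see why performance metrics become skewed at a 99.83% vs. 0.17% class split, and why F1 is the reported figure, consider the small worked example below. The confusion-matrix counts are hypothetical and chosen only to match the class ratio; they are not results from the paper.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all transactions classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall on the positive (fraud) class."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical sample of 10,000 transactions with the 0.17% fraud rate.
TOTAL = 10_000
FRAUD = 17
NORMAL = TOTAL - FRAUD

# A degenerate classifier that labels every transaction as normal.
always_normal = {"tp": 0, "fp": 0, "fn": FRAUD, "tn": NORMAL}

# A hypothetical detector: catches 12 of 17 frauds with 3 false alarms.
detector = {"tp": 12, "fp": 3, "fn": 5, "tn": NORMAL - 3}

for name, cm in [("always_normal", always_normal), ("detector", detector)]:
    acc = accuracy(**cm)
    f1 = f1_score(cm["tp"], cm["fp"], cm["fn"])
    print(f"{name}: accuracy={acc:.4f} f1={f1:.4f}")
```

Both classifiers score above 99.8% accuracy, yet the one that never flags fraud has an F1 of zero, so accuracy alone says little about fraud detection on such data.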


Explainability-Driven Leaf Disease Classification Using Adversarial Training and Knowledge Distillation

arXiv.org Artificial Intelligence

This work focuses on plant leaf disease classification and explores three crucial aspects: adversarial training, model explainability, and model compression. The models' robustness against adversarial attacks is enhanced through adversarial training, ensuring accurate classification even in the presence of threats. Leveraging explainability techniques, we gain insights into the model's decision-making process, improving trust and transparency. Additionally, we explore model compression techniques to optimize computational efficiency while maintaining classification performance. Through our experiments, we determine that, on a benchmark dataset, robustness can come at the price of classification accuracy, with performance reductions of 3%-20% on regular tests and gains of 50%-70% on adversarial attack tests. We also demonstrate that a student model, distilling the knowledge of more complex models, can be 15-25 times more computationally efficient for a slight performance reduction.
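The abstract does not say which distillation objective is used. As an illustration only, the sketch below shows one widely used formulation, in which the student matches the teacher's temperature-softened class probabilities in addition to the true label. The function names, the temperature, the weighting `alpha`, and the example logits are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, true_class,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target KL term and a hard-label cross-entropy."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the softened distributions.
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0.0)
    # Standard cross-entropy of the student against the true label.
    ce = -math.log(softmax(student_logits)[true_class])
    # The T^2 factor keeps the soft-target term on a comparable scale.
    return alpha * temperature ** 2 * kl + (1.0 - alpha) * ce


# Hypothetical logits for a three-class leaf-disease problem.
teacher = [3.0, 1.0, 0.0]
close_student = [2.5, 1.0, 0.2]
far_student = [0.0, 0.0, 0.0]

print(distillation_loss(close_student, teacher, true_class=0))
print(distillation_loss(far_student, teacher, true_class=0))
```

A small student trained with such a loss inherits the teacher's ranking over classes, which is one way a compact model can retain most of a larger model's accuracy.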


Advancements in eHealth Data Analytics through Natural Language Processing and Deep Learning

arXiv.org Artificial Intelligence

The healthcare environment is commonly referred to as "information-rich" but also "knowledge-poor". Healthcare systems collect huge amounts of data from various sources: lab reports, medical letters, logs of medical tools or programs, medical prescriptions, etc. These massive sets of data can provide knowledge and information that improve medical services and the healthcare domain overall, for example through disease prediction by analyzing a patient's symptoms, or disease prevention by facilitating the discovery of behavioral factors for diseases. Unfortunately, only a relatively small volume of the textual eHealth data is processed and interpreted, an important factor being the difficulty of efficiently performing Big Data operations. In the medical field, detecting domain-specific multi-word terms is a crucial task, as they can define an entire concept with a few words. A term can be defined as a linguistic structure or a concept, and it is composed of one or more words with a specific meaning in a domain. All the terms of a domain create its terminology. This chapter offers a critical study of the current, most performant solutions for analyzing unstructured (image and textual) eHealth data. This study also provides a comparison of the current Natural Language Processing and Deep Learning techniques in the eHealth context. Finally, we examine and discuss some of the current issues, and we define a set of research directions in this area.


DeepEdit: Knowledge Editing as Decoding with Constraints

arXiv.org Artificial Intelligence

We develop a new perspective on knowledge editing for large language models (LLMs) as decoding with constraints. We propose DeepEdit (Depth-first Search based Progressive Decoding for Knowledge Editing), a neuro-symbolic method that improves knowledge editing with better coherence of reasoning, relevance to the question, and awareness of updated knowledge. DeepEdit can be flexibly applied to all black-box LLMs: it does not require any access to the model parameters, representations, or output vocabulary distributions. DeepEdit progressively produces high-quality reasoning steps towards effective knowledge editing. It utilizes a depth-first search to revise the LLMs' output, which improves the output's informativeness with respect to the input question and its awareness of the updated knowledge. Qualitatively, DeepEdit effectively controls LLMs to produce more succinct reasoning in accord with knowledge editing. Quantitatively, DeepEdit yields significant gains on MQuaKE, a challenging multi-hop question-answering dataset with knowledge editing. We release the source code at https://github.com/wangywUST/DeepEdit.