

Natural language processing on customer note data

arXiv.org Artificial Intelligence

Automatic analysis of customer data is an area of interest to companies. Business-to-business (B2B) data is rarely studied in academia due to the sensitive nature of such information. Applying natural language processing can speed up the analysis of prohibitively large sets of data. This paper addresses this subject and applies sentiment analysis, topic modelling and keyword extraction to a B2B data set. We show that accurate sentiment can be extracted from the notes automatically and the notes can be sorted by relevance into different topics. We see that, without clear separation, topics can lack relevance to a business context.


A Survey on Efficient Training of Transformers

arXiv.org Artificial Intelligence

Recent advances in Transformers have come with a huge requirement on computing resources, highlighting the importance of developing efficient training techniques to make Transformer training faster, at lower cost, and to higher accuracy by the efficient use of computation and memory resources. This survey provides the first systematic overview of the efficient training of Transformers, covering the recent progress in acceleration arithmetic and hardware, with a focus on the former. We analyze and compare methods that save computation and memory costs for intermediate tensors during training, together with techniques on hardware/algorithm co-design. We finally discuss challenges and promising areas for future research.


Interpretable Scientific Discovery with Symbolic Regression: A Review

arXiv.org Artificial Intelligence

Symbolic Regression (SR) is a rapidly growing subfield within machine learning (ML) that infers symbolic mathematical expressions from data [1, 2]. Interest in SR is driven by the observation that accurate predictive models alone are not sufficient; it is often necessary that the learned models also be interpretable [3]. A model is interpretable if the relationship between the input and output of the model can be logically or mathematically traced in a succinct manner. In other words, learnable models are interpretable if expressed as mathematical equations. As "disciplines" become increasingly data-rich and adopt ML techniques, the demand for interpretable models is likely to grow. For example, in the natural sciences (e.g., physics), mathematical models derived from first principles make it possible to reason about the underlying phenomenon in a way that is not possible with predictive models like deep neural networks. In critical disciplines like healthcare, non-interpretable models may never be allowed to be deployed, however accurate they may be [4].


Safe Autonomous Driving in Adverse Weather: Sensor Evaluation and Performance Monitoring

arXiv.org Artificial Intelligence

The vehicle's perception sensors (radar, lidar, and camera), which must work continuously and without restriction, especially with regard to automated/autonomous driving, can lose performance due to unfavourable weather conditions. This paper analyzes the sensor signals of these three sensor technologies under rain and fog as well as day and night. A data set of a driving test vehicle as an object target under different weather conditions was recorded in a controlled environment with adjustable, defined, and reproducible weather conditions. Based on the sensor performance evaluation, a method has been developed to detect sensor degradation, including determining the affected data areas and estimating how severely they are affected. Through this sensor monitoring, measures can be taken in subsequent algorithms to reduce the influences or to take them into account in safety and assistance systems to avoid malfunctions.


Stars Are All You Need: A Distantly Supervised Pyramid Network for Document-Level End-to-End Sentiment Analysis

arXiv.org Artificial Intelligence

In this paper, we propose document-level end-to-end sentiment analysis to efficiently understand aspect and review sentiment expressed in online reviews in a unified manner. In particular, we assume that star rating labels are a "coarse-grained synthesis" of aspect ratings across the review. We propose a Distantly Supervised Pyramid Network (DSPN) to efficiently perform Aspect-Category Detection, Aspect-Category Sentiment Analysis, and Rating Prediction using only document star rating labels for training. By performing these three related sentiment subtasks in an end-to-end manner, DSPN can extract aspects mentioned in the review, identify the corresponding sentiments, and predict the star rating labels. We evaluate DSPN on multi-aspect review datasets in English and Chinese and find that with only star rating labels for supervision, DSPN can perform comparably well to a variety of benchmark models. We also demonstrate the interpretability of DSPN's outputs on reviews to show the pyramid structure inherent in document-level end-to-end sentiment analysis.


Analysis of different temporal graph neural network configurations on dynamic graphs

arXiv.org Artificial Intelligence

In recent years, there has been an increasing interest in the use of graph neural networks (GNNs) for analyzing dynamic graphs, which are graphs that evolve over time. However, there is still a lack of understanding of how different temporal graph neural network (TGN) configurations can impact the accuracy of predictions on dynamic graphs. Moreover, the hunt for benchmark datasets for these TGN models is still ongoing. Recently, PyTorch Geometric Temporal introduced a few benchmark datasets, but most of these datasets have not been analyzed with different TGN models to establish the state of the art. Therefore, this project aims to address this gap in the literature by performing a qualitative analysis of spatial-temporal dependence structure learning on dynamic graphs, as well as a comparative study of the effectiveness of selected TGNs on node and edge prediction tasks. Additionally, an extensive ablation study will be conducted on different variants of the best-performing TGN to identify the key factors contributing to its performance. By achieving these objectives, this project will provide valuable insights into the design and optimization of TGNs for dynamic graph analysis, with potential applications in areas such as disease spread prediction, social network analysis, traffic prediction, and more. Moreover, an attempt is made to convert snapshot-based data to an event-based dataset and make it compatible with the SOTA model, namely TGN, to perform a node regression task.
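
The snapshot-to-event conversion mentioned at the end of this abstract can be illustrated with a minimal sketch. This is not the authors' procedure, which the abstract does not describe. It assumes that each snapshot is a list of edges, that the snapshot index serves as the event timestamp, and that every edge in a snapshot becomes one event. The function name `snapshots_to_events` and the tuple layout are hypothetical.

```python
from typing import Iterable, List, Tuple

Edge = Tuple[int, int]         # (source node, destination node)
Event = Tuple[int, int, int]   # (timestamp, source node, destination node)


def snapshots_to_events(snapshots: Iterable[Iterable[Edge]]) -> List[Event]:
    """Flatten a sequence of graph snapshots into a time-ordered event list.

    Assumption (illustrative only): the position of a snapshot in the
    sequence is used as the timestamp of every edge it contains.
    """
    events: List[Event] = []
    for t, edges in enumerate(snapshots):
        # Sort edges so the output order is deterministic within a timestamp.
        for src, dst in sorted(edges):
            events.append((t, src, dst))
    return events


# Two toy snapshots: the first has two edges, the second has one.
snapshots = [[(1, 2), (0, 1)], [(0, 2)]]
events = snapshots_to_events(snapshots)
```

An event-based model consumes such a list in timestamp order. Node features and regression targets would have to be carried along separately, and this sketch omits them.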


SemEval 2023 Task 6: LegalEval - Understanding Legal Texts

arXiv.org Artificial Intelligence

In populous countries, pending legal cases have been growing exponentially. There is a need for developing NLP-based techniques for processing and automatically understanding legal documents. To promote research in the area of Legal NLP, we organized the shared task LegalEval - Understanding Legal Texts at SemEval 2023. The LegalEval task has three sub-tasks: Task-A (Rhetorical Roles Labeling) is about automatically structuring legal documents into semantically coherent units, Task-B (Legal Named Entity Recognition) deals with identifying relevant entities in a legal document, and Task-C (Court Judgement Prediction with Explanation) explores the possibility of automatically predicting the outcome of a legal case along with providing an explanation for the prediction. In total, 26 teams (approx. 100 participants spread across the world) submitted system papers. In each of the sub-tasks, the proposed systems outperformed the baselines; however, there is a lot of scope for improvement. This paper describes the tasks and analyzes the techniques proposed by various teams.


Robustified Learning for Online Optimization with Memory Costs

arXiv.org Artificial Intelligence

Online optimization with memory costs has many real-world applications, where sequential actions are made without knowing the future input. Nonetheless, the memory cost couples the actions over time, adding substantial challenges. Conventionally, this problem has been approached by various expert-designed online algorithms with the goal of achieving bounded worst-case competitive ratios, but the resulting average performance is often unsatisfactory. On the other hand, emerging machine learning (ML) based optimizers can improve the average performance, but suffer from the lack of worst-case performance robustness. In this paper, we propose a novel expert-robustified learning (ERL) approach, achieving both good average performance and robustness. More concretely, for robustness, ERL introduces a novel projection operator that robustifies ML actions by utilizing an expert online algorithm; for average performance, ERL trains the ML optimizer based on a recurrent architecture by explicitly considering downstream expert robustification. We prove that, for any $\lambda\geq1$, ERL can be $\lambda$-competitive against the expert algorithm and $\lambda\cdot C$-competitive against the optimal offline algorithm (where $C$ is the expert's competitive ratio). Additionally, we extend our analysis to a novel setting of multi-step memory costs. Finally, our analysis is supported by empirical experiments for an energy scheduling application.
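
The second guarantee in this abstract follows from the first by chaining two inequalities. A brief sketch, using notation that is assumed here and not taken from the paper: $\mathrm{cost}(\cdot)$ denotes the total cost of an algorithm on a given input sequence, $\mathrm{OPT}$ is the optimal offline algorithm, and additive constants are ignored.

$$
\mathrm{cost}(\mathrm{ERL}) \;\leq\; \lambda\cdot\mathrm{cost}(\mathrm{Expert}) \;\leq\; \lambda\cdot C\cdot\mathrm{cost}(\mathrm{OPT})
$$

The first inequality is $\lambda$-competitiveness against the expert. The second uses the expert's own competitive ratio, $\mathrm{cost}(\mathrm{Expert}) \leq C\cdot\mathrm{cost}(\mathrm{OPT})$.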


A Comprehensive Survey on Pretrained Foundation Models: A History from BERT to ChatGPT

arXiv.org Artificial Intelligence

Pretrained Foundation Models (PFMs) are regarded as the foundation for various downstream tasks with different data modalities. A PFM (e.g., BERT, ChatGPT, and GPT-4) is trained on large-scale data, which provides a reasonable parameter initialization for a wide range of downstream applications. BERT learns bidirectional encoder representations from Transformers, which are trained on large datasets as contextual language models. Similarly, the generative pretrained transformer (GPT) method employs Transformers as the feature extractor and is trained using an autoregressive paradigm on large datasets. Recently, ChatGPT has shown promising success among large language models; it applies an autoregressive language model with zero-shot or few-shot prompting. The remarkable achievements of PFMs have brought significant breakthroughs to various fields of AI. Numerous studies have proposed different methods, raising the demand for an updated survey. This study provides a comprehensive review of recent research advancements, challenges, and opportunities for PFMs in text, image, graph, as well as other data modalities. The review covers the basic components and existing pretraining methods used in natural language processing, computer vision, and graph learning. Additionally, it explores advanced PFMs used for different data modalities and unified PFMs that consider data quality and quantity. The review also discusses research related to the fundamentals of PFMs, such as model efficiency and compression, security, and privacy. Finally, the study provides key implications, future research directions, challenges, and open problems in the field of PFMs. Overall, this survey aims to shed light on the research of PFMs on scalability, security, logical reasoning ability, cross-domain learning ability, and user-friendly interactive ability for artificial general intelligence.


AI-Assisted Ethics? Considerations of AI Simulation for the Ethical Assessment and Design of Assistive Technologies

arXiv.org Artificial Intelligence

Current ethical debates on the use of artificial intelligence (AI) in health care treat AI as a product of technology in three ways: first, by assessing risks and potential benefits of currently developed AI-enabled products with ethical checklists; second, by proposing ex ante lists of ethical values seen as relevant for the design and development of assisting technology; and third, by promoting AI technology to use moral reasoning as part of the automation process. Subsequently, we propose a fourth approach to AI, namely as a methodological tool to assist ethical reflection. We provide a concept of an AI-simulation informed by three separate elements: 1) stochastic human behavior models based on behavioral data for simulating realistic settings, 2) qualitative empirical data on value statements regarding internal policy, and 3) visualization components that aid in understanding the impact of changes in these variables. The potential of this approach is to inform an interdisciplinary field about anticipated ethical challenges or ethical trade-offs in concrete settings and, hence, to spark a re-evaluation of design and implementation plans. This may be particularly useful for applications that deal with extremely complex values and behavior or with limitations on the communication resources of affected persons (e.g., in dementia care or in the care of persons with cognitive impairment). Simulation does not replace ethical reflection but does allow for detailed, context-sensitive analysis during the design process and prior to implementation. Finally, we discuss the inherently quantitative methods of analysis afforded by stochastic simulations as well as the potential for ethical discussions and how simulations with AI can improve traditional forms of thought experiments and future-oriented technology assessment.