Goto

Collaborating Authors

 Edmonton


adaptNMT: an open-source, language-agnostic development environment for Neural Machine Translation

arXiv.org Artificial Intelligence

adaptNMT streamlines all processes involved in the development and deployment of RNN and Transformer neural translation models. As an open-source application, it is designed for both technical and non-technical users who work in the field of machine translation. Built upon the widely-adopted OpenNMT ecosystem, the application is particularly useful for new entrants to the field since the setup of the development environment and creation of train, validation and test splits is greatly simplified. Graphing, embedded within the application, illustrates the progress of model training, and SentencePiece is used for creating subword segmentation models. Hyperparameter customization is facilitated through an intuitive user interface, and a single-click model development approach has been implemented. Models developed by adaptNMT can be evaluated using a range of metrics, and deployed as a translation service within the application. To support eco-friendly research in the NLP space, a green report also flags the power consumption and kgCO$_{2}$ emissions generated during model development. The application is freely available.


Enhancing Neural Machine Translation of Low-Resource Languages: Corpus Development, Human Evaluation and Explainable AI Architectures

arXiv.org Artificial Intelligence

In the current machine translation (MT) landscape, the Transformer architecture stands out as the gold standard, especially for high-resource language pairs. This research delves into its efficacy for low-resource language pairs including both the English$\leftrightarrow$Irish and English$\leftrightarrow$Marathi language pairs. Notably, the study identifies the optimal hyperparameters and subword model type to significantly improve the translation quality of Transformer models for low-resource language pairs. The scarcity of parallel datasets for low-resource languages can hinder MT development. To address this, gaHealth was developed, the first bilingual corpus of health data for the Irish language. Focusing on the health domain, models developed using this in-domain dataset exhibited very significant improvements in BLEU score when compared with models from the LoResMT2021 Shared Task. A subsequent human evaluation using the multidimensional quality metrics error taxonomy showcased the superior performance of the Transformer system in reducing both accuracy and fluency errors compared to an RNN-based counterpart. Furthermore, this thesis introduces adaptNMT and adaptMLLM, two open-source applications streamlined for the development, fine-tuning, and deployment of neural machine translation models. These tools considerably simplify the setup and evaluation process, making MT more accessible to both developers and translators. Notably, adaptNMT, grounded in the OpenNMT ecosystem, promotes eco-friendly natural language processing research by highlighting the environmental footprint of model development. Fine-tuning of MLLMs by adaptMLLM demonstrated advancements in translation performance for two low-resource language pairs: English$\leftrightarrow$Irish and English$\leftrightarrow$Marathi, compared to baselines from the LoResMT2021 Shared Task.


A Hybrid Model for Traffic Incident Detection based on Generative Adversarial Networks and Transformer Model

arXiv.org Artificial Intelligence

In addition to enhancing traffic safety and facilitating prompt emergency response, traffic incident detection plays an indispensable role in intelligent transportation systems by providing real-time traffic status information. This enables the realization of intelligent traffic control and management. Previous research has identified that apart from employing advanced algorithmic models, the effectiveness of detection is also significantly influenced by challenges related to acquiring large datasets and addressing dataset imbalances. A hybrid model combining transformer and generative adversarial networks (GANs) is proposed to address these challenges. Experiments are conducted on four real datasets to validate the superiority of the transformer in traffic incident detection. Additionally, GANs are utilized to expand the dataset and achieve a balanced ratio of 1:4, 2:3, and 1:1. The proposed model is evaluated against the baseline model. The results demonstrate that the proposed model enhances the dataset size, balances the dataset, and improves the performance of traffic incident detection in various aspects.


Optimal Robot Formations: Balancing Range-Based Observability and User-Defined Configurations

arXiv.org Artificial Intelligence

This paper introduces a set of customizable and novel cost functions that enable the user to easily specify desirable robot formations, such as a ``high-coverage'' infrastructure-inspection formation, while maintaining high relative pose estimation accuracy. The overall cost function balances the need for the robots to be close together for good ranging-based relative localization accuracy and the need for the robots to achieve specific tasks, such as minimizing the time taken to inspect a given area. The formations found by minimizing the aggregated cost function are evaluated in a coverage path planning task in simulation and experiment, where the robots localize themselves and unknown landmarks using a simultaneous localization and mapping algorithm based on the extended Kalman filter. Compared to an optimal formation that maximizes ranging-based relative localization accuracy, these formations significantly reduce the time to cover a given area with minimal impact on relative pose estimation accuracy.


A Synergistic Approach to Wildfire Prevention and Management Using AI, ML, and 5G Technology in the United States

arXiv.org Artificial Intelligence

Over the past few years, wildfires have become a worldwide environmental emergency, resulting in substantial harm to natural habitats and playing a part in the acceleration of climate change. Wildfire management methods involve prevention, response, and recovery efforts. Despite improvements in detection techniques, the rising occurrence of wildfires demands creative solutions for prompt identification and effective control. This research investigates proactive methods for detecting and handling wildfires in the United States, utilizing Artificial Intelligence (AI), Machine Learning (ML), and 5G technology. The specific objective of this research covers proactive detection and prevention of wildfires using advanced technology; Active monitoring and mapping with remote sensing and signaling leveraging on 5G technology; and Advanced response mechanisms to wildfire using drones and IOT devices. This study was based on secondary data collected from government databases and analyzed using descriptive statistics. In addition, past publications were reviewed through content analysis, and narrative synthesis was used to present the observations from various studies. The results showed that developing new technology presents an opportunity to detect and manage wildfires proactively. Utilizing advanced technology could save lives and prevent significant economic losses caused by wildfires. Various methods, such as AI-enabled remote sensing and 5G-based active monitoring, can enhance proactive wildfire detection and management. In addition, super intelligent drones and IOT devices can be used for safer responses to wildfires. This forms the core of the recommendation to the fire Management Agencies and the government.


LLM-DA: Data Augmentation via Large Language Models for Few-Shot Named Entity Recognition

arXiv.org Artificial Intelligence

Despite the impressive capabilities of large language models (LLMs), their performance on information extraction tasks is still not entirely satisfactory. However, their remarkable rewriting capabilities and extensive world knowledge offer valuable insights to improve these tasks. In this paper, we propose $LLM-DA$, a novel data augmentation technique based on LLMs for the few-shot NER task. To overcome the limitations of existing data augmentation methods that compromise semantic integrity and address the uncertainty inherent in LLM-generated text, we leverage the distinctive characteristics of the NER task by augmenting the original data at both the contextual and entity levels. Our approach involves employing 14 contextual rewriting strategies, designing entity replacements of the same type, and incorporating noise injection to enhance robustness. Extensive experiments demonstrate the effectiveness of our approach in enhancing NER model performance with limited data. Furthermore, additional analyses provide further evidence supporting the assertion that the quality of the data we generate surpasses that of other existing methods.


Investigating the Histogram Loss in Regression

arXiv.org Machine Learning

It is becoming increasingly common in regression to train neural networks that model the entire distribution even if only the mean is required for prediction. This additional modeling often comes with performance gain and the reasons behind the improvement are not fully known. This paper investigates a recent approach to regression, the Histogram Loss, which involves learning the conditional distribution of the target variable by minimizing the cross-entropy between a target distribution and a flexible histogram prediction. We design theoretical and empirical analyses to determine why and when this performance gain appears, and how different components of the loss contribute to it. Our results suggest that the benefits of learning distributions in this setup come from improvements in optimization rather than learning a better representation. We then demonstrate the viability of the Histogram Loss in common deep learning applications without a need for costly hyperparameter tuning.


Boosting of Thoughts: Trial-and-Error Problem Solving with Large Language Models

arXiv.org Artificial Intelligence

The reasoning performance of Large Language Models (LLMs) on a wide range of problems critically relies on chain-of-thought prompting, which involves providing a few chain of thought demonstrations as exemplars in prompts. Recent work, e.g., Tree of Thoughts, has pointed out the importance of exploration and self-evaluation in reasoning step selection for complex problem solving. In this paper, we present Boosting of Thoughts (BoT), an automated prompting framework for problem solving with LLMs by iteratively exploring and self-evaluating many trees of thoughts in order to acquire an ensemble of trial-and-error reasoning experiences, which will serve as a new form of prompting to solve the complex problem. Starting from a simple prompt without requiring examples, BoT iteratively explores and evaluates a large collection of reasoning steps, and more importantly, uses error analysis obtained from the LLM on them to explicitly revise prompting, which in turn enhances reasoning step generation, until a final answer is attained. Our experiments with GPT-4 and Llama2 across extensive complex mathematical problems demonstrate that BoT consistently achieves higher or comparable problem-solving rates than other advanced prompting approaches.


What to Do When Your Discrete Optimization Is the Size of a Neural Network?

arXiv.org Artificial Intelligence

Oftentimes, machine learning applications using neural networks involve solving discrete optimization problems, such as in pruning, parameter-isolation-based continual learning and training of binary networks. Still, these discrete problems are combinatorial in nature and are also not amenable to gradient-based optimization. Additionally, classical approaches used in discrete settings do not scale well to large neural networks, forcing scientists and empiricists to rely on alternative methods. Among these, two main distinct sources of top-down information can be used to lead the model to good solutions: (1) extrapolating gradient information from points outside of the solution set (2) comparing evaluations between members of a subset of the valid solutions. We take continuation path (CP) methods to represent using purely the former and Monte Carlo (MC) methods to represent the latter, while also noting that some hybrid methods combine the two. The main goal of this work is to compare both approaches. For that purpose, we first overview the two classes while also discussing some of their drawbacks analytically. Then, on the experimental section, we compare their performance, starting with smaller microworld experiments, which allow more fine-grained control of problem variables, and gradually moving towards larger problems, including neural network regression and neural network pruning for image classification, where we additionally compare against magnitude-based pruning.


Statistical and Machine Learning Models for Predicting Fire and Other Emergency Events

arXiv.org Artificial Intelligence

Emergency events in a city cause considerable economic loss to individuals, their families, and the community. Accurate and timely prediction of events can help the emergency fire and rescue services in preparing for and mitigating the consequences of emergency events. In this paper, we present a systematic development of predictive models for various types of emergency events in the City of Edmonton, Canada. We present methods for (i) data collection and dataset development; (ii) descriptive analysis of each event type and its characteristics at different spatiotemporal levels; (iii) feature analysis and selection based on correlation coefficient analysis and feature importance analysis; and (iv) development of prediction models for the likelihood of occurrence of each event type at different temporal and spatial resolutions. We analyze the association of event types with socioeconomic and demographic data at the neighborhood level, identify a set of predictors for each event type, and develop predictive models with negative binomial regression. We conduct evaluations at neighborhood and fire station service area levels. Our results show that the models perform well for most of the event types with acceptable prediction errors for weekly and monthly periods. The evaluation shows that the prediction accuracy is consistent at the level of the fire station, so the predictions can be used in management by fire rescue service departments for planning resource allocation for these time periods. We also examine the impact of the COVID-19 pandemic on the occurrence of events and on the accuracy of event predictor models. Our findings show that COVID-19 had a significant impact on the performance of the event prediction models.