AITopics | Calgary

Evaluating Dependencies in Fact Editing for Language Models: Specificity and Implication Awareness

Li, Zichao, Arous, Ines, Reddy, Siva, Cheung, Jackie C. K.

arXiv.org Artificial Intelligence, Dec-4-2023

The potential of using a large language model (LLM) as a knowledge base (KB) has sparked significant interest. To manage the knowledge acquired by LLMs, we need to ensure that the editing of learned facts respects internal logical constraints, which are known as dependency of knowledge. Existing work on editing LLMs has partially addressed the issue of dependency, when the editing of a fact should apply to its lexical variations without disrupting irrelevant ones. However, they neglect the dependency between a fact and its logical implications. We propose an evaluation protocol with an accompanying question-answering dataset, DepEdit, that provides a comprehensive assessment of the editing process considering the above notions of dependency. Our protocol involves setting up a controlled environment in which we edit facts and monitor their impact on LLMs, along with their implications based on If-Then rules. Extensive experiments on DepEdit show that existing knowledge editing methods are sensitive to the surface form of knowledge, and that they have limited performance in inferring the implications of edited facts.

editing, implication, knowledge, (17 more...)

arXiv: 2312.01858

Country:

North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.05)
North America > United States > New York (0.04)
North America > United States > Indiana > Johnson County > Franklin (0.04)
(8 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

End-User Puppeteering of Expressive Movements

Wang, Hongyu, Martelaro, Nikolas

arXiv.org Artificial Intelligence, Dec-3-2023

The end-user programming of social robot behavior is usually limited by a predefined set of movements. We are proposing a puppeteering robotic interface that provides a more intuitive method of programming robot expressive movements. As the user manipulates the puppet of a robot, the actual robot replicates the movements, providing real-time visual feedback. Through this proposed interface, even with limited training, a novice user can design and program expressive movements efficiently. We present our preliminary user study results in this extended abstract.

expressive movement, participant, robot, (14 more...)

arXiv: 2207.12544

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.04)
Asia > Japan (0.04)

Genre:

Research Report (0.82)
Questionnaire & Opinion Survey (0.71)

Technology: Information Technology > Artificial Intelligence > Robots > Robots in the Home (0.37)

Local monotone operator learning using non-monotone operators: MnM-MOL

John, Maneesh, Chand, Jyothi Rikhab, Jacob, Mathews

arXiv.org Artificial Intelligence, Dec-1-2023

The recovery of magnetic resonance (MR) images from undersampled measurements is a key problem that has seen extensive research in recent years. Unrolled approaches, which rely on end-to-end training of convolutional neural network (CNN) blocks within iterative reconstruction algorithms, offer state-of-the-art performance. These algorithms require a large amount of memory during training, making them difficult to employ in high-dimensional applications. Deep equilibrium (DEQ) models and the recent monotone operator learning (MOL) approach were introduced to eliminate the need for unrolling, thus reducing the memory demand during training. Both approaches require a Lipschitz constraint on the network to ensure that the forward and backpropagation iterations converge. Unfortunately, the constraint often results in reduced performance compared to unrolled methods. The main focus of this work is to relax the constraint on the CNN block in two different ways. Inspired by convex-non-convex regularization strategies, we now impose the monotone constraint on the sum of the gradient of the data term and the CNN block, rather than constrain the CNN itself to be a monotone operator. This approach enables the CNN to learn possibly non-monotone score functions, which can translate to improved performance. In addition, we only restrict the operator to be monotone in a local neighborhood around the image manifold. Our theoretical results show that the proposed algorithm is guaranteed to converge to the fixed point and that the solution is robust to input perturbations, provided that it is initialized close to the true solution. Our empirical results show that the relaxed constraints translate to improved performance and that the approach enjoys robustness to input perturbations similar to MOL.

algorithm, operator, perturbation, (14 more...)

arXiv: 2312.00386

Country:

North America > United States > Iowa > Johnson County > Iowa City (0.14)
North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.04)

Genre: Research Report > New Finding (0.54)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.46)
Health & Medicine > Health Care Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Addressing Membership Inference Attack in Federated Learning with Model Compression

Németh, Gergely Dániel, Lozano, Miguel Ángel, Quadrianto, Novi, Oliver, Nuria

arXiv.org Artificial Intelligence, Nov-29-2023

Federated Learning (FL) has been proposed as a privacy-preserving solution for machine learning. However, recent works have shown that Federated Learning can leak private client data through membership attacks. In this paper, we show that the effectiveness of these attacks on the clients negatively correlates with the size of the client datasets and model complexity. Based on this finding, we propose model-agnostic Federated Learning as a privacy-enhancing solution because it enables the use of models of varying complexity in the clients. To this end, we present MaPP-FL, a novel privacy-aware FL approach that leverages model compression on the clients while keeping a full model on the server. We compare the performance of MaPP-FL against state-of-the-art model-agnostic FL methods on the CIFAR-10, CIFAR-100, and FEMNIST vision datasets. Our experiments show the effectiveness of MaPP-FL in preserving the clients' and the server's privacy while achieving competitive classification accuracies.

dataset, mapp-fl, server, (13 more...)

arXiv: 2311.17750

Country:

Europe > Spain > Valencian Community > Alicante Province > Alicante (0.04)
North America > United States > Virginia (0.04)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Genre:

Research Report > Promising Solution (0.66)
Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.88)

Increasing Coverage and Precision of Textual Information in Multilingual Knowledge Graphs

Conia, Simone, Li, Min, Lee, Daniel, Minhas, Umar Farooq, Ilyas, Ihab, Li, Yunyao

arXiv.org Artificial Intelligence, Nov-27-2023

Recent work in Natural Language Processing and Computer Vision has been using textual information -- e.g., entity names and descriptions -- available in knowledge graphs to ground neural models to high-quality structured data. However, when it comes to non-English languages, the quantity and quality of textual information are comparatively scarce. To address this issue, we introduce the novel task of automatic Knowledge Graph Enhancement (KGE) and perform a thorough investigation on bridging the gap in both the quantity and quality of textual information between English and non-English languages. More specifically, we: i) bring to light the problem of increasing multilingual coverage and precision of entity names and descriptions in Wikidata; ii) demonstrate that state-of-the-art methods, namely, Machine Translation (MT), Web Search (WS), and Large Language Models (LLMs), struggle with this task; iii) present M-NTA, a novel unsupervised approach that combines MT, WS, and LLMs to generate high-quality textual information; and, iv) study the impact of increasing multilingual coverage and precision of non-English textual information in Entity Linking, Knowledge Graph Completion, and Question Answering. As part of our effort towards better multilingual knowledge graphs, we also introduce WikiKGE-10, the first human-curated benchmark to evaluate KGE approaches in 10 languages across 7 language families.

entity name, information, knowledge graph, (14 more...)

arXiv: 2311.15781

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(15 more...)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.48)

Industry: Leisure & Entertainment > Sports (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

ChatGPT Application In Summarizing An Evolution Of Deep Learning Techniques In Imaging: A Qualitative Study

Sarraf, Arman, Abbaspour, Amirabbas

arXiv.org Artificial Intelligence, Nov-26-2023

Text summarization is a pivotal application of NLP that condenses lengthy documents or articles into shorter, coherent representations while retaining the essential information. Through various algorithms and techniques, NLP models identify significant sentences, key phrases, or essential concepts within the text to generate concise summaries. Extractive summarization involves selecting and stitching together important segments directly from the original text, often based on relevance, importance, or frequency of occurrence. On the other hand, abstractive summarization goes beyond extraction, generating novel sentences that convey the core meaning while potentially rephrasing and restructuring the content. NLP-powered summarization systems play a crucial role in information retrieval, aiding in quick comprehension and accessibility of vast amounts of text across diverse domains such as news articles, research papers, and legal documents. ChatGPT boasts impressive text summarization capabilities, harnessing its advanced Natural Language Processing (NLP) architecture to distill lengthy conversations, articles, or documents into concise, coherent summaries. Leveraging its vast understanding of language semantics, context, and syntax, ChatGPT effectively identifies key points, essential information, and significant passages within the text. Its summarization prowess encompasses extractive and abstractive techniques, allowing it to select important segments directly from the input while generating novel, coherent sentences that capture the essence of the content.
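
The extractive approach described in this abstract (selecting important segments by frequency of occurrence) can be illustrated with a minimal sketch. This is not the method used in the paper or by ChatGPT: the function names, the tiny stop-word list, and the average-word-frequency scoring rule are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it", "on"}


def split_sentences(text):
    """Split on sentence-ending punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(text):
    """Lowercase word tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def extractive_summary(text, max_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        # Average frequency, so long sentences are not favored automatically.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)


if __name__ == "__main__":
    print(extractive_summary("Cats purr. Cats and dogs play. Birds sing.", 1))
```

Abstractive summarization, by contrast, generates new sentences and cannot be reduced to a selection rule like this one.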

alzheimer, application, classification, (15 more...)

arXiv: 2312.03723

Country:

North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.28)
North America > United States > Arkansas (0.04)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.37)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Spectro-ViT: A Vision Transformer Model for GABA-edited MRS Reconstruction Using Spectrograms

Dias, Gabriel, Berto, Rodrigo Pommot, Oliveira, Mateus, Ueda, Lucas, Dertkigil, Sergio, Costa, Paula D. P., Shamaei, Amirmohammad, Souza, Roberto, Harris, Ashley, Rittner, Leticia

arXiv.org Artificial Intelligence, Nov-26-2023

Purpose: To investigate the use of a Vision Transformer (ViT) to reconstruct/denoise GABA-edited magnetic resonance spectroscopy (MRS) from a quarter of the typically acquired number of transients using spectrograms. Theory and Methods: A quarter of the typically acquired number of transients collected in GABA-edited MRS scans are pre-processed and converted to a spectrogram image representation using the Short-Time Fourier Transform (STFT). The image representation of the data allows the adaptation of a pre-trained ViT for reconstructing GABA-edited MRS spectra (Spectro-ViT). The Spectro-ViT is fine-tuned and then tested using in vivo GABA-edited MRS data. The Spectro-ViT performance is compared against other models in the literature using spectral quality metrics and estimated metabolite concentration values. Results: The Spectro-ViT model significantly outperformed all other models in four out of five quantitative metrics (mean squared error, shape score, GABA+/water fit error, and full width at half maximum). The metabolite concentrations estimated (GABA+/water, GABA+/Cr, and Glx/water) were consistent with the metabolite concentrations estimated using typical GABA-edited MRS scans reconstructed with the full amount of typically collected transients. Conclusion: The proposed Spectro-ViT model achieved state-of-the-art results in reconstructing GABA-edited MRS, and the results indicate these scans could be up to four times faster.

magnetic resonance, pipeline, resonance, (13 more...)

arXiv: 2311.15386

Country:

North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.15)
South America > Brazil > São Paulo > Campinas (0.04)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.94)
Health & Medicine > Diagnostic Medicine > Imaging (0.94)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Directional Privacy for Deep Learning

Faustini, Pedro, Fernandes, Natasha, Tonni, Shakila, McIver, Annabelle, Dras, Mark

arXiv.org Artificial Intelligence, Nov-26-2023

Differentially Private Stochastic Gradient Descent (DP-SGD) is a key method for applying privacy in the training of deep learning models. It applies isotropic Gaussian noise to gradients during training, which can perturb these gradients in any direction, damaging utility. Metric DP, however, can provide alternative mechanisms based on arbitrary metrics that might be more suitable for preserving utility. In this paper, we apply directional privacy, via a mechanism based on the von Mises-Fisher (VMF) distribution, to perturb gradients in terms of angular distance so that gradient direction is broadly preserved. We show that this provides both ε-DP and εd-privacy for deep learning training, rather than the (ε, δ)-privacy of the Gaussian mechanism. Experiments on key datasets then indicate that the VMF mechanism can outperform the Gaussian in the utility-privacy trade-off. In particular, our experiments provide a direct empirical comparison of privacy between the two approaches in terms of their ability to defend against reconstruction and membership inference.
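
For readers unfamiliar with the Gaussian baseline mentioned in this abstract, the following is a minimal sketch of the usual DP-SGD gradient step: clip each per-example gradient to an L2 bound, sum, add isotropic Gaussian noise, and average. It does not implement the paper's VMF mechanism. The function names and the parameters `max_norm` and `noise_multiplier` are assumptions for illustration, and no privacy accounting is performed.

```python
import math
import random


def clip_to_norm(grad, max_norm):
    """Scale grad down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_gradient(per_example_grads, max_norm, noise_multiplier, rng=None):
    """Clip per-example gradients, sum, add isotropic Gaussian noise, average."""
    if rng is None:
        rng = random.Random()
    clipped = [clip_to_norm(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    # Same standard deviation in every coordinate: the noise is isotropic,
    # so it can push the gradient in any direction.
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(clipped) for v in noisy]


if __name__ == "__main__":
    grads = [[3.0, 4.0], [0.6, 0.8]]
    print(dp_sgd_gradient(grads, max_norm=1.0, noise_multiplier=1.0,
                          rng=random.Random(0)))
```

Because the noise has no preferred direction, a large `noise_multiplier` can rotate the averaged gradient arbitrarily, which is the utility cost the abstract refers to.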

gradient, mechanism, privacy, (17 more...)

arXiv: 2211.04686

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Indiana > Monroe County > Bloomington (0.04)
(9 more...)

Genre: Research Report > Experimental Study (0.34)

Industry: Information Technology > Security & Privacy (0.94)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Multi-channel Speech Separation Using Spatially Selective Deep Non-linear Filters

Tesch, Kristina, Gerkmann, Timo

arXiv.org Artificial Intelligence, Nov-21-2023

In a multi-channel separation task with multiple speakers, we aim to recover all individual speech signals from the mixture. In contrast to single-channel approaches, which rely on the different spectro-temporal characteristics of the speech signals, multi-channel approaches should additionally utilize the different spatial locations of the sources for a more powerful separation especially when the number of sources increases. To enhance the spatial processing in a multi-channel source separation scenario, in this work, we propose a deep neural network (DNN) based spatially selective filter (SSF) that can be spatially steered to extract the speaker of interest by initializing a recurrent neural network layer with the target direction. We compare the proposed SSF with a common end-to-end direct separation (DS) approach trained using utterance-wise permutation invariant training (PIT), which only implicitly learns to perform spatial filtering. We show that the SSF has a clear advantage over a DS approach with the same underlying network architecture when there are more than two speakers in the mixture, which can be attributed to a better use of the spatial information. Furthermore, we find that the SSF generalizes much better to additional noise sources that were not seen during training and to scenarios with speakers positioned at a similar angle.
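
The utterance-wise permutation invariant training (PIT) objective used for the direct separation baseline can be sketched as follows: the loss is evaluated for every assignment of network outputs to reference speakers, and the smallest value is kept. The mean-squared-error criterion and the function names here are assumptions for illustration; the abstract does not state which signal-level loss the authors use.

```python
from itertools import permutations


def mse(a, b):
    """Mean squared error between two equal-length signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def upit_loss(estimates, targets):
    """Return (loss, permutation) for the best output-to-speaker assignment.

    permutation[i] is the index of the target matched to estimates[i].
    One permutation is chosen for the whole utterance, not per frame.
    """
    n = len(targets)
    best_loss, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        loss = sum(mse(estimates[i], targets[p]) for i, p in enumerate(perm)) / n
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


if __name__ == "__main__":
    targets = [[1.0, 1.0], [0.0, 0.0]]
    estimates = [[0.0, 0.0], [1.0, 1.0]]  # correct signals, swapped order
    print(upit_loss(estimates, targets))
```

The exhaustive search grows factorially with the number of speakers, which is acceptable for the small speaker counts typical of such experiments.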

architecture, information, separation, (12 more...)

doi: 10.1109/TASLP.2023.3334101

arXiv: 2304.12023

Country:

Europe > Germany > Hamburg (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
(6 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Approximating Two-Layer Feedforward Networks for Efficient Transformers

Csordás, Róbert, Irie, Kazuki, Schmidhuber, Jürgen

arXiv.org Artificial Intelligence, Nov-21-2023

How to reduce compute and memory requirements of neural networks (NNs) without sacrificing performance? Many recent works use sparse Mixtures of Experts (MoEs) to build resource-efficient large language models (LMs). Here we introduce several novel perspectives on MoEs, presenting a general framework that unifies various methods to approximate two-layer NNs (e.g., feedforward blocks of Transformers), including product-key memories (PKMs). Leveraging insights from this framework, we propose methods to improve both MoEs and PKMs. Unlike prior work that compares MoEs with dense baselines under the compute-equal condition, our evaluation condition is parameter-equal, which is crucial to properly evaluate LMs. We show that our MoEs are competitive with the dense Transformer-XL on both the WikiText-103 and enwiki8 datasets at two different scales, while being much more resource efficient. This demonstrates that MoEs are relevant not only to extremely large LMs but also to any-scale resource-efficient LMs. Our code is public.
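
The general idea of approximating a two-layer feedforward block with a sparse mixture of experts can be sketched as below: the hidden units are partitioned into equal-sized groups (experts), a selector scores the groups, and only the top-k groups are evaluated. This is a simplified illustration, not the authors' method. The linear `selector`, the unweighted sum of expert outputs, and all function names are assumptions.

```python
def matvec(matrix, vec):
    """Multiply a row-major matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]


def relu(vec):
    return [max(0.0, v) for v in vec]


def dense_ffn(w1, w2, x):
    """Dense two-layer block: y = W2 relu(W1 x)."""
    return matvec(w2, relu(matvec(w1, x)))


def moe_ffn(w1, w2, selector, x, n_experts, k):
    """Evaluate only the k highest-scoring groups of hidden units.

    Assumes the number of hidden units is divisible by n_experts.
    """
    size = len(w1) // n_experts
    scores = matvec(selector, x)  # one score per expert
    top = sorted(range(n_experts), key=lambda e: scores[e], reverse=True)[:k]
    out = [0.0] * len(w2)
    for e in top:
        units = range(e * size, (e + 1) * size)
        hidden = relu([sum(a * b for a, b in zip(w1[u], x)) for u in units])
        for r in range(len(w2)):
            out[r] += sum(w2[r][u] * h for u, h in zip(units, hidden))
    return out


if __name__ == "__main__":
    w1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.5]]  # 4 hidden units
    w2 = [[1.0, 2.0, 3.0, 4.0]]                              # 1 output
    selector = [[1.0, 0.0], [0.0, 1.0]]                      # 2 experts
    x = [2.0, 1.0]
    print(dense_ffn(w1, w2, x), moe_ffn(w1, w2, selector, x, 2, 1))
```

With `k` equal to `n_experts` the sketch reproduces the dense output; smaller `k` trades accuracy for compute, since only `k * size` hidden units are evaluated.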

renorm, softmax, transformer, (17 more...)

arXiv: 2310.10837

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(12 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)