Isahagian, Vatche
OptiSeq: Optimizing Example Ordering for In-Context Learning
Bhope, Rahul Atul, Venkateswaran, Praveen, Jayaram, K. R., Isahagian, Vatche, Muthusamy, Vinod, Venkatasubramanian, Nalini
The use of in-context learning (ICL) with large language models (LLMs) has become a popular approach to achieve impressive performance in many NLP tasks (Raffel et al., 2020; Radford et al., 2019). In ICL, models are prompted during inference with task-specific examples that help condition the generated output. Unlike fine-tuning, it does not require updates to the model parameters, which offers many benefits with ever-increasing … A common approach to selecting examples at inference time is to generate embeddings of candidate examples using a model like Sentence-BERT (Reimers, 2019) and retrieve the top-k most similar examples for a given test instance, ranking them based on distance or similarity. However, there is a distinction between ranking examples (determining how relevant they are to our test case) and ordering them (deciding how to arrange them in the prompt).
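The retrieve-then-arrange pipeline described in the abstract can be sketched in plain Python. The toy vectors and the `rank_examples`/`order_examples` helpers are illustrative stand-ins (a real pipeline would use Sentence-BERT embeddings), and the most-similar-last arrangement shown is just one of the orderings whose effect on ICL performance is at issue:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def rank_examples(test_emb, candidates, k=3):
    """RANKING: select the k candidates most similar to the test instance."""
    scored = sorted(candidates, key=lambda c: cosine(test_emb, c["emb"]), reverse=True)
    return scored[:k]

def order_examples(ranked):
    """ORDERING: arrange the selected examples in the prompt.
    Here: least similar first, most similar last (closest to the query)."""
    return list(reversed(ranked))
```

Ranking fixes *which* examples enter the prompt; ordering fixes *where* they appear, and the two choices are independent.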
TaskDiff: A Similarity Metric for Task-Oriented Conversations
Bhaumik, Ankita, Venkateswaran, Praveen, Rizk, Yara, Isahagian, Vatche
The popularity of conversational digital assistants has resulted in the availability of large amounts of conversational data which can be utilized for improved user experience and personalized response generation. Building these assistants using popular large language models like ChatGPT also requires additional emphasis on prompt engineering and evaluation methods. Textual similarity metrics are a key ingredient for such analysis and evaluations. While many similarity metrics have been proposed in the literature, they have not proven effective for task-oriented conversations as they do not take advantage of unique conversational features. To address this gap, we present TaskDiff, a novel conversational similarity metric that utilizes different dialogue components (utterances, intents, and slots) and their distributions to compute similarity. Extensive experimental evaluation of TaskDiff on a benchmark dataset demonstrates its superior performance and improved robustness over other related approaches.
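As an illustration of the distributional idea (not TaskDiff's actual formulation, which combines utterances, intents, and slots), one could compare the intent histograms of two conversations. The helper names and the total-variation-based score below are assumptions made for the sketch:

```python
from collections import Counter

def intent_distribution(conversation):
    """Normalized histogram of intents in a conversation (a list of turns)."""
    counts = Counter(turn["intent"] for turn in conversation)
    total = sum(counts.values())
    return {intent: n / total for intent, n in counts.items()}

def distribution_similarity(conv_a, conv_b):
    """Similarity in [0, 1]: one minus the total-variation distance
    between the two intent distributions."""
    pa, pb = intent_distribution(conv_a), intent_distribution(conv_b)
    intents = set(pa) | set(pb)
    tv = 0.5 * sum(abs(pa.get(i, 0.0) - pb.get(i, 0.0)) for i in intents)
    return 1.0 - tv
```

Two conversations that pursue the same intents in the same proportions score 1.0 regardless of surface wording, which is exactly the kind of signal plain textual metrics miss.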
DiSTRICT: Dialogue State Tracking with Retriever Driven In-Context Tuning
Venkateswaran, Praveen, Duesterwald, Evelyn, Isahagian, Vatche
Dialogue State Tracking (DST), a key component of task-oriented conversation systems, represents user intentions by determining the values of pre-defined slots in an ongoing dialogue. Existing approaches use hand-crafted templates and additional slot information to fine-tune and prompt large pre-trained language models and elicit slot values from the dialogue context. Significant manual effort and domain knowledge are required to design effective prompts, limiting the generalizability of these approaches to new domains and tasks. In this work, we propose DiSTRICT, a generalizable in-context tuning approach for DST that retrieves highly relevant training examples for a given dialogue to fine-tune the model without any hand-crafted templates. Experiments with the MultiWOZ benchmark datasets show that DiSTRICT outperforms existing approaches in various zero-shot and few-shot settings using a much smaller model, thereby providing an important advantage for real-world deployments that often have limited resource availability.
ProtoNER: Few shot Incremental Learning for Named Entity Recognition using Prototypical Networks
Kumar, Ritesh, Goyal, Saurabh, Verma, Ashish, Isahagian, Vatche
Key-value pair (KVP) extraction, or Named Entity Recognition (NER), from visually rich documents has been an active area of research in the document understanding and data extraction domain. Several transformer-based models such as LayoutLMv2, LayoutLMv3, and LiLT have emerged, achieving state-of-the-art results. However, adding even a single new class to an existing model requires (a) re-annotating the entire training dataset to include the new class and (b) retraining the model. Both of these issues significantly slow down the deployment of an updated model.

We present ProtoNER: a Prototypical Network-based end-to-end KVP extraction model that allows new classes to be added to an existing model while requiring a minimal number of newly annotated training samples. The key contributions of our model are: (1) no dependency on the dataset used for the model's initial training, which alleviates the need to retain the original training dataset for long durations as well as the very time-consuming task of data re-annotation; (2) no intermediate synthetic data generation, which tends to add noise and degrade the model's performance; and (3) a hybrid loss function that allows the model to retain knowledge of older classes while learning the newly added ones.

Experimental results show that ProtoNER fine-tuned with just 30 samples achieves results for the newly added classes similar to those of a regular model fine-tuned with 2,600 samples.
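The prototypical-network mechanism underlying ProtoNER can be sketched in a few lines. The helper names are hypothetical, and real prototypes would be computed from transformer token embeddings rather than the toy vectors used here; the core idea is that each class is represented by the mean of its support-set embeddings, so a new class only needs a handful of annotated samples:

```python
import math

def prototype(embeddings):
    """Class prototype: the mean of the class's support-set embeddings."""
    dim = len(embeddings[0])
    return [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]

def classify(query, prototypes):
    """Assign the query embedding to the nearest prototype (Euclidean)."""
    def dist(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return min(prototypes, key=lambda label: dist(query, prototypes[label]))
```

Because adding a class only means computing one more prototype, no retraining on the original dataset is needed, which is what makes the 30-sample few-shot setting plausible.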
FedGen: Generalizable Federated Learning for Sequential Data
Venkateswaran, Praveen, Isahagian, Vatche, Muthusamy, Vinod, Venkatasubramanian, Nalini
Existing federated learning models that follow the standard risk minimization paradigm of machine learning often fail to generalize in the presence of spurious correlations in the training data. In many real-world distributed settings, spurious correlations exist due to biases and data sampling issues on distributed devices or clients that can erroneously influence models. Current generalization approaches are designed for centralized training and attempt to identify features that have an invariant causal relationship with the target, thereby reducing the effect of spurious features. However, such invariant risk minimization approaches rely on a priori knowledge of training data distributions, which is hard to obtain in many applications. In this work, we present a generalizable federated learning framework called FedGen, which allows clients to identify and distinguish between spurious and invariant features in a collaborative manner without prior knowledge of training distributions. We evaluate our approach on real-world datasets from different domains and show that FedGen results in models that achieve significantly better generalization and can outperform the accuracy of current federated learning approaches by over 24%.
A Case for Business Process-Specific Foundation Models
Rizk, Yara, Venkateswaran, Praveen, Isahagian, Vatche, Muthusamy, Vinod
The inception of large language models has helped advance state-of-the-art performance on numerous natural language tasks. This has also opened the door for the development of foundation models for other domains and data modalities such as images, code, and music. In this paper, we argue that business process data representations have unique characteristics that warrant the development of a new class of foundation models to handle tasks like process mining, optimization, and decision making. These models should also tackle the unique challenges of applying AI to business processes which include data scarcity, multi-modal representations, domain specific terminology, and privacy concerns.
Extending LIME for Business Process Automation
Upadhyay, Sohini, Isahagian, Vatche, Muthusamy, Vinod, Rizk, Yara
AI business process applications automate high-stakes business decisions, where there is an increasing demand to justify or explain the rationale behind algorithmic decisions. Business process applications have orderings or constraints on tasks and feature values that cause existing lightweight, model-agnostic explanation methods like LIME to fail. In response, we propose a local explanation framework that extends LIME to explain AI business process applications. Empirical evaluation of our extension underscores the advantage of our approach in the business process setting.
From Robotic Process Automation to Intelligent Process Automation: Emerging Trends
Chakraborti, Tathagata, Isahagian, Vatche, Khalaf, Rania, Khazaeni, Yasaman, Muthusamy, Vinod, Rizk, Yara, Unuvar, Merve
In this survey, we study how recent advances in machine intelligence are disrupting the world of business processes. Over the last decade, there has been steady progress towards the automation of business processes under the umbrella of "robotic process automation" (RPA). However, we are currently at an inflection point in this evolution, as a new paradigm called "Intelligent Process Automation" (IPA) emerges, bringing machine learning (ML) and artificial intelligence (AI) technologies to bear in order to improve business process outcomes. The purpose of this paper is to provide a survey of this emerging theme and identify key open research challenges at the intersection of AI and business processes. We hope that this emerging theme will spark engaging conversations at the RPA Forum.
A Conversational Digital Assistant for Intelligent Process Automation
Rizk, Yara, Isahagian, Vatche, Boag, Scott, Khazaeni, Yasaman, Unuvar, Merve, Muthusamy, Vinod, Khalaf, Rania
Robotic process automation (RPA) has emerged as the leading approach to automate tasks in business processes. Moving away from back-end automation, RPA automates mouse clicks on user interfaces; this outside-in approach reduces the overhead of updating legacy software. However, its many shortcomings, namely its lack of accessibility to business users, have prevented its widespread adoption in highly regulated industries. In this work, we explore interactive automation in the form of a conversational digital assistant. It allows business users to interact with and customize their automation solutions through natural language. The framework, which creates such assistants, relies on a multi-agent orchestration model and conversational wrappers for autonomous agents, including RPAs. We demonstrate the effectiveness of our proposed approach on a loan approval business process and a travel preapproval business process.