Collaborating Authors

 Dasgupta, Koustuv


Parameter-Efficient Instruction Tuning of Large Language Models For Extreme Financial Numeral Labelling

arXiv.org Artificial Intelligence

We study the problem of automatically annotating relevant numerals (GAAP metrics) occurring in financial documents with their corresponding XBRL tags. Unlike prior work, we investigate the feasibility of solving this extreme classification problem using a generative paradigm through instruction tuning of Large Language Models (LLMs). To this end, we leverage metric metadata information to frame our target outputs while proposing a parameter-efficient solution for the task using LoRA. We perform experiments on two recently released financial numeric labeling datasets. Our proposed model, FLAN-FinXC, achieves new state-of-the-art performance on both datasets, outperforming several strong baselines. We explain the better scores of our proposed model by demonstrating its capability on zero-shot tags as well as the least frequently occurring tags. Also, even when we fail to predict the XBRL tags correctly, our generated output has substantial overlap with the ground truth in the majority of cases.


FinRED: A Dataset for Relation Extraction in Financial Domain

arXiv.org Artificial Intelligence

Relation extraction models trained on a source domain cannot be applied to a different target domain due to the mismatch between relation sets. In the current literature, there is no extensive open-source relation extraction dataset specific to the finance domain. In this paper, we release FinRED, a relation extraction dataset curated from financial news and earnings call transcripts, containing relations from the finance domain. FinRED has been created by mapping Wikidata triplets using the distant supervision method. We manually annotate the test data to ensure proper evaluation. We also experiment with various state-of-the-art relation extraction models on this dataset to create the benchmark. We see a significant drop in their performance on FinRED compared to general relation extraction datasets, which indicates that we need better models for financial relation extraction.
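
The distant supervision idea mentioned above can be illustrated with a minimal sketch. This is not the FinRED pipeline: the function name `distant_label`, the plain substring matching, and the toy triple and sentences are all assumptions made for illustration. A sentence receives a relation label whenever both entities of a knowledge-base triple appear in it.

```python
def distant_label(sentences, triples):
    """Label sentences with relations from knowledge-base triples.

    sentences: list of strings.
    triples:   list of (head, relation, tail) tuples, e.g. taken from a
               knowledge base such as Wikidata (toy data in this sketch).
    Returns a list of (sentence, head, relation, tail) tuples.
    """
    labeled = []
    for sentence in sentences:
        lowered = sentence.lower()
        for head, relation, tail in triples:
            # Distant supervision assumption: if both entities co-occur
            # in a sentence, the sentence expresses their relation.
            if head.lower() in lowered and tail.lower() in lowered:
                labeled.append((sentence, head, relation, tail))
    return labeled


# Toy inputs, purely illustrative.
toy_triples = [("Apple", "chief executive officer", "Tim Cook")]
toy_sentences = [
    "Tim Cook said Apple will expand its services business.",
    "Microsoft reported quarterly earnings.",
]
toy_labels = distant_label(toy_sentences, toy_triples)
```

Because co-occurrence does not guarantee that the relation is actually expressed, labels produced this way are noisy, which is consistent with the abstract's choice to annotate the test data manually.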


CAPReS: Context Aware Persona Based Recommendation for Shoppers

AAAI Conferences

Nowadays, brick-and-mortar stores are finding it extremely difficult to retain their customers due to the ever-increasing competition from online stores. One of the key reasons for this is the lack of a personalized shopping experience offered by brick-and-mortar stores. This work considers the problem of persona-based shopping recommendation for such stores to maximize the value for money of the shoppers. For this problem, it proposes a non-polynomial time-complexity optimal dynamic program and a polynomial time-complexity non-optimal heuristic for making top-k recommendations, taking into account the shopper's persona and her time and budget constraints. In our empirical evaluations with a mix of real-world and simulated data, the performance of the heuristic in terms of persona-based recommendations (quantified by similarity scores and items recommended) closely matched that of the dynamic program (differing by only 8% on each measure), while the heuristic ran at least twice as fast as the dynamic program.
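
To make the combination of time and budget constraints concrete, here is a minimal sketch of a two-constraint 0/1 knapsack dynamic program. It is not the dynamic program from the paper: the function name `recommend`, the integer time and price units, and the single persona-match score per item are all hypothetical simplifications, and the sketch returns one best bundle rather than top-k recommendations.

```python
def recommend(items, time_limit, budget):
    """Pick the item set with maximum total score under two limits.

    items: list of (name, score, time_needed, price) with integer
           time_needed and price (hypothetical units).
    Returns (best_score, tuple_of_chosen_names).
    """
    # best[t][b] holds the best (score, names) using at most t time
    # units and at most b budget units.
    best = [[(0, ()) for _ in range(budget + 1)]
            for _ in range(time_limit + 1)]
    for name, score, time_needed, price in items:
        # Iterate capacities downwards so each item is used at most once.
        for t in range(time_limit, time_needed - 1, -1):
            for b in range(budget, price - 1, -1):
                prev_score, prev_names = best[t - time_needed][b - price]
                if prev_score + score > best[t][b][0]:
                    best[t][b] = (prev_score + score, prev_names + (name,))
    return best[time_limit][budget]


# Toy catalogue: (name, persona-match score, minutes, price).
toy_items = [("shoes", 6, 2, 50), ("shirt", 5, 1, 30), ("watch", 9, 3, 80)]
toy_score, toy_bundle = recommend(toy_items, time_limit=3, budget=80)
```

The table has `(time_limit + 1) * (budget + 1)` cells, so the running time grows with the numeric size of the limits rather than only with the number of items.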


PISCES: Participatory Incentive Strategies for Effective Community Engagement in Smart Cities

AAAI Conferences

A key challenge in participatory sensing systems has been the design of incentive mechanisms that motivate individuals to contribute data to consuming applications. Emerging trends in urban development and smart city planning indicate the use of citizen reports to gather insights and identify areas for transformation. Consumers of these reports (e.g., city agencies) typically associate non-uniform utility (or value) with different reports based on the spatio-temporal context of the reports. For example, a report indicating traffic congestion near an airport in the early morning hours would tend to have much higher utility than a similar report from a sparse residential area. In such cases, the incentive mechanism must motivate participants, via appropriate rewards (or payments), to provide higher-utility reports rather than less valued ones. The main challenge in designing such an incentive scheme is two-fold: (i) lack of prior knowledge of participants in terms of their availability (i.e., who is in the vicinity) and reporting behaviour (i.e., what rewards are expected); and (ii) minimizing payments to the reporters while ensuring that the desired number of reports is collected. In this paper, we propose STOC-PISCES, an algorithm that guarantees a stochastic optimal solution in the generalized setting of an unknown set of participants, with non-deterministic availabilities and stochastically rational reporting behaviour. The superior performance of STOC-PISCES in experimental settings, based on real-world data, endorses its adoption as an incentive strategy in participatory sensing applications like smart city management.


Adaptive Performance Optimization over Crowd Labor Channels

AAAI Conferences

We describe a system which monitors the performance of labor channels within a crowdsourcing platform in an online manner. This allows us to automatically determine if and when to switch between labor channels in order to improve overall performance of crowd tasks.


Post It or Not: Viewership Based Posting of Crowdsourced Tasks

AAAI Conferences

We propose an online scheduling algorithm for posting crowdsourcing tasks which maximizes a novel metric called task viewership. This metric is computed using a stochastic model based on a coverage process, and it measures the likelihood that a task is viewed by multiple crowd workers, which is correlated with the likelihood that it will be selected and completed.


TRACCS: A Framework for Trajectory-Aware Coordinated Urban Crowd-Sourcing

AAAI Conferences

We investigate the problem of large-scale mobile crowd-tasking, where a large pool of citizen crowd-workers is used to perform a variety of location-specific urban logistics tasks. Current approaches to such mobile crowd-tasking are very decentralized: a crowd-tasking platform usually provides each worker a set of available tasks close to the worker's current location; each worker then independently chooses which tasks she wants to accept and perform. In contrast, we propose TRACCS, a more coordinated task assignment approach, where the crowd-tasking platform assigns a sequence of tasks to each worker, taking into account their expected location trajectory over a wider time horizon, as opposed to just their instantaneous location. We formulate such task assignment as an optimization problem that seeks to maximize the total payoff from all assigned tasks, subject to a maximum bound on the detour (from the expected path) that a worker will experience to complete her assigned tasks. We develop credible, computationally efficient heuristics to address this optimization problem (whose exact solution requires solving a complex integer linear program), and show, via simulations with realistic topologies and commuting patterns, that a specific heuristic (called Greedy-ILS) increases the fraction of assigned tasks by more than 20% and reduces the average detour overhead by more than 60%, compared to the current decentralized approach.
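
The shape of the optimization problem (maximize payoff subject to a per-worker detour bound) can be sketched with a plain greedy rule. This is not Greedy-ILS or any heuristic from the paper: the function name `greedy_assign` and the data layout are hypothetical, and the sketch assumes detours simply add up per worker, whereas the detour in a real trajectory depends on the order in which tasks are visited.

```python
def greedy_assign(tasks, workers, detour):
    """Greedily assign tasks to workers under a detour bound.

    tasks:   {task_id: payoff}
    workers: {worker_id: maximum total detour allowed}
    detour:  {(worker_id, task_id): extra distance to perform the task}
             (assumed additive per worker, a simplification)
    Returns {task_id: worker_id} for the tasks that could be assigned.
    """
    remaining = dict(workers)  # detour budget left per worker
    assignment = {}
    # Consider the most valuable tasks first.
    for task in sorted(tasks, key=tasks.get, reverse=True):
        feasible = [
            w for w in remaining
            if (w, task) in detour and detour[(w, task)] <= remaining[w]
        ]
        if not feasible:
            continue  # no worker can take this task within the bound
        # Prefer the worker for whom the task is the smallest detour.
        chosen = min(feasible, key=lambda cand: detour[(cand, task)])
        assignment[task] = chosen
        remaining[chosen] -= detour[(chosen, task)]
    return assignment


# Toy instance, purely illustrative.
toy_tasks = {"t1": 10, "t2": 8, "t3": 5}
toy_workers = {"a": 3, "b": 2}
toy_detour = {
    ("a", "t1"): 2, ("b", "t1"): 1,
    ("a", "t2"): 2, ("b", "t2"): 2,
    ("a", "t3"): 2, ("b", "t3"): 1,
}
toy_assignment = greedy_assign(toy_tasks, toy_workers, toy_detour)
```

A greedy pass like this gives no optimality guarantee; the exact formulation, as the abstract notes, requires solving an integer linear program.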


CrowdUtility: A Recommendation System for Crowdsourcing Platforms

AAAI Conferences

Crowd workers exhibit varying work patterns, expertise, and quality, leading to wide variability in the performance of crowdsourcing platforms. The onus of choosing a suitable platform to post tasks is mostly on the requester, often leading to poor guarantees and unmet requirements due to the dynamism in the performance of crowd platforms. To this end, we demonstrate CrowdUtility, a statistical-modelling-based tool for evaluating multiple crowdsourcing platforms and recommending a platform that best suits the requirements of the requester. CrowdUtility uses an online Multi-Armed Bandit framework to schedule tasks while optimizing platform performance. We demonstrate an end-to-end system, from requirements specification to platform recommendation to real-time monitoring.
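
The abstract does not say which bandit algorithm CrowdUtility uses. As one possible illustration of an online multi-armed bandit choosing among platforms, the sketch below uses the standard UCB1 rule; treating each platform as an arm, the names `ucb1_select` and `run_bandit`, and the deterministic toy reward function are all assumptions.

```python
import math


def ucb1_select(counts, totals, round_number):
    """Return the index of the arm (platform) to try next under UCB1."""
    # Try every arm once before using the confidence bound.
    for arm, pulls in enumerate(counts):
        if pulls == 0:
            return arm
    scores = [
        totals[arm] / counts[arm]
        + math.sqrt(2 * math.log(round_number) / counts[arm])
        for arm in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def run_bandit(reward_fn, n_arms, rounds):
    """Post one task per round; return how often each arm was chosen.

    reward_fn(arm) is a hypothetical callback returning the observed
    performance (in [0, 1]) of posting a task on that platform.
    """
    counts = [0] * n_arms
    totals = [0.0] * n_arms
    for round_number in range(1, rounds + 1):
        arm = ucb1_select(counts, totals, round_number)
        counts[arm] += 1
        totals[arm] += reward_fn(arm)
    return counts


# Toy platforms with fixed quality levels, purely illustrative.
toy_quality = [0.2, 0.5, 0.9]
toy_counts = run_bandit(lambda arm: toy_quality[arm], n_arms=3, rounds=500)
```

In this toy run the highest-quality platform receives most of the tasks, while the others are still sampled occasionally, which is what lets such a scheme react if platform performance changes over time.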