AITopics | Gupta, Swati

Collaborating Authors

Gupta, Swati

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Navigating the Social Welfare Frontier: Portfolios for Multi-objective Reinforcement Learning

Kim, Cheol Woo, Moondra, Jai, Verma, Shresth, Pollack, Madeleine, Kong, Lingkai, Tambe, Milind, Gupta, Swati

arXiv.org Artificial IntelligenceFeb-13-2025

In this paper, we study a reinforcement learning (RL) setting where a deployed policy impacts multiple stakeholders in different ways. Each stakeholder is associated with a unique reward function, and the goal is to train a policy that adequately aggregates their preferences. This setting, which is often modeled using multi-objective reinforcement learning (MORL), arises in many RL applications, such as fair resource allocation in healthcare [32], cloud computing [27, 18] and communication networks [36, 7]. Recently, with the rise of large language models (LLMs), reinforcement learning from human feedback (RLHF) techniques that reflect the preferences of heterogeneous individuals have also been explored [6, 38, 26]. Preference aggregation in such scenarios is often achieved by choosing a social welfare function, which takes the utilities of multiple stakeholders as input and outputs a scalar value representing the overall welfare [37, 9, 32, 13, 38, 26, 6]. However, selecting the appropriate social welfare function is a nontrivial task, as each function embodies a different notion of social welfare and can lead to vastly different outcomes for the involved stakeholders. In this work, we focus on a class of social welfare functions known as generalized p-means, a widely used class of social welfare functions in algorithmic fairness and social choice theory.

machine learning, portfolio, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2502.09724

Country:

North America > United States (0.14)
Europe > Netherlands (0.14)

Genre: Research Report (1.00)

Industry:

Health & Medicine (1.00)
Transportation > Passenger (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

Add feedback

Reducing the Filtering Effect in Public School Admissions: A Bias-aware Analysis for Targeted Interventions

Faenza, Yuri, Gupta, Swati, Vuorinen, Aapeli, Zhang, Xuan

arXiv.org Artificial IntelligenceJul-15-2024

Problem definition: Traditionally, New York City's top 8 public schools have selected candidates solely based on their scores in the Specialized High School Admissions Test (SHSAT). These scores are known to be impacted by socioeconomic status of students and test preparation received in middle schools, leading to a massive filtering effect in the education pipeline. The classical mechanisms for assigning students to schools do not naturally address problems like school segregation and class diversity, which have worsened over the years. The scientific community, including policymakers, have reacted by incorporating group-specific quotas and proportionality constraints, with mixed results. The problem of finding effective and fair methods for broadening access to top-notch education is still unsolved. Methodology/results: We take an operations approach to the problem different from most established literature, with the goal of increasing opportunities for students with high economic needs. Using data from the Department of Education (DOE) in New York City, we show that there is a shift in the distribution of scores obtained by students that the DOE classifies as "disadvantaged" (following criteria mostly based on economic factors). We model this shift as a "bias" that results from an underestimation of the true potential of disadvantaged students. We analyze the impact this bias has on an assortative matching market. We show that centrally planned interventions can significantly reduce the impact of bias through scholarships or training, when they target the segment of disadvantaged students with average performance.

artificial intelligence, mistreatment, student, (18 more...)

arXiv.org Artificial Intelligence

2004.10846

Country: North America > United States > New York (0.55)

Genre: Research Report (0.82)

Industry:

Education > Educational Setting > K-12 Education (0.87)
Education > Operations > Student Enrollment (0.72)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Add feedback

Walking in the Shadow: A New Perspective on Descent Directions for Constrained Minimization

Mortagy, Hassan, Gupta, Swati, Pokutta, Sebastian

arXiv.org Artificial IntelligenceAug-30-2023

Descent directions such as movement towards Descent directions, including movement towards Frank-Wolfe vertices, away-steps, in-face away-steps and pairwise directions, have been an important design consideration in conditional gradient descent (CGD) variants. In this work, we attempt to demystify the impact of the movement in these directions towards attaining constrained minimizers. The optimal local direction of descent is the directional derivative (i.e., shadow) of the projection of the negative gradient. We show that this direction is the best away-step possible, and the continuous-time dynamics of moving in the shadow is equivalent to the dynamics of projected gradient descent (PGD), although it's non-trivial to discretize. We also show that Frank-Wolfe (FW) vertices correspond to projecting onto the polytope using an "infinite" step in the direction of the negative gradient, thus providing a new perspective on these steps. We combine these insights into a novel Shadow-CG method that uses FW and shadow steps, while enjoying linear convergence, with a rate that depends on the number of breakpoints in its projection curve, rather than the pyramidal width. We provide a linear bound on the number of breakpoints for simple polytopes and present scaling-invariant upper bounds for general polytopes based on the number of facets. We exemplify the benefit of using Shadow-CG computationally for various applications, while raising an open question about tightening the bound on the number of breakpoints for general polytopes.

artificial intelligence, machine learning, projection curve, (18 more...)

arXiv.org Artificial Intelligence

2006.08426

Country:

North America > United States (0.14)
Europe (0.14)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

Mixed-Integer Projections for Automated Data Correction of EMRs Improve Predictions of Sepsis among Hospitalized Patients

Arora, Mehak, Mortagy, Hassan, Dwarshius, Nathan, Gupta, Swati, Holder, Andre L., Kamaleswaran, Rishikesan

arXiv.org Artificial IntelligenceAug-21-2023

Machine learning (ML) models are increasingly pivotal in automating clinical decisions. Yet, a glaring oversight in prior research has been the lack of proper processing of Electronic Medical Record (EMR) data in the clinical context for errors and outliers. Addressing this oversight, we introduce an innovative projections-based method that seamlessly integrates clinical expertise as domain constraints, generating important meta-data that can be used in ML workflows. In particular, by using high-dimensional mixed-integer programs that capture physiological and biological constraints on patient vitals and lab values, we can harness the power of mathematical "projections" for the EMR data to correct patient data. Consequently, we measure the distance of corrected data from the constraints defining a healthy range of patient data, resulting in a unique predictive metric we term as "trust-scores". These scores provide insight into the patient's health status and significantly boost the performance of ML classifiers in real-life clinical settings. We validate the impact of our framework in the context of early detection of sepsis using ML. We show an AUROC of 0.865 and a precision of 0.922, that surpasses conventional ML models without such projections.

artificial intelligence, constraint, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2308.10781

Country: North America > United States > Massachusetts (0.14)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.69)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Health Care Technology > Medical Record (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Add feedback

TACOS: Topology-Aware Collective Algorithm Synthesizer for Distributed Training

Won, William, Elavazhagan, Midhilesh, Srinivasan, Sudarshan, Durg, Ajaya, Gupta, Swati, Krishna, Tushar

arXiv.org Artificial IntelligenceApr-11-2023

Collective communications are an indispensable part of distributed training. Running a topology-aware collective algorithm is crucial for optimizing communication performance by minimizing congestion. Today such algorithms only exist for a small set of simple topologies, limiting the topologies employed in training clusters and handling irregular topologies due to network failures. In this paper, we propose TACOS, an automated topology-aware collective synthesizer for arbitrary input network topologies. TACOS synthesized 3.73x faster All-Reduce algorithm over baselines, and synthesized collective algorithms for 512-NPU system in just 6.1 minutes.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2304.05301

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > India > Karnataka (0.14)

Genre: Research Report (0.50)

Industry: Transportation (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Networks (0.89)
Information Technology > Artificial Intelligence > Natural Language (0.67)

Add feedback

Artificial Intelligence/Operations Research Workshop 2 Report Out

Dickerson, John, Dilkina, Bistra, Ding, Yu, Gupta, Swati, Van Hentenryck, Pascal, Koenig, Sven, Krishnan, Ramayya, Kulkarni, Radhika, Gill, Catherine, Griffin, Haley, Hunter, Maddy, Schwartz, Ann

arXiv.org Artificial IntelligenceApr-10-2023

Artificial intelligence (AI) has received significant attention in recent years, primarily due to breakthroughs in game playing, computer vision, and natural language processing that captured the imagination of the scientific community and the public at large. Many businesses, industries, and academic disciplines are now contemplating the application of AI to their own challenges. The federal government in the US and other countries have also invested significantly in advancing AI research and created funding initiatives and programs to promote greater collaboration across multiple communities. Some of the investment examples in the US include the establishment of the National AI Initiative Office, the launch of the National AI Research Resource Task Force, and more recently, the establishment of the National AI Advisory Committee. In 2021 INFORMS and ACM SIGAI joined together with the Computing Community Consortium (CCC) to organize a series of three workshops. The objective for this workshop series is to explore ways to exploit the synergies of the AI and Operations Research (OR) communities to transform decision making.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2304.04677

Country:

North America > United States > California (0.29)
North America > United States > Texas > Travis County > Austin (0.14)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Law (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)

Add feedback

Reusing Combinatorial Structure: Faster Iterative Projections over Submodular Base Polytopes

Moondra, Jai, Mortagy, Hassan, Gupta, Swati

arXiv.org Artificial IntelligenceMar-10-2023

Optimization algorithms such as projected Newton's method, FISTA, mirror descent, and its variants enjoy near-optimal regret bounds and convergence rates, but suffer from a computational bottleneck of computing ``projections'' in potentially each iteration (e.g., $O(T^{1/2})$ regret of online mirror descent). On the other hand, conditional gradient variants solve a linear optimization in each iteration, but result in suboptimal rates (e.g., $O(T^{3/4})$ regret of online Frank-Wolfe). Motivated by this trade-off in runtime v/s convergence rates, we consider iterative projections of close-by points over widely-prevalent submodular base polytopes $B(f)$. We first give necessary and sufficient conditions for when two close points project to the same face of a polytope, and then show that points far away from the polytope project onto its vertices with high probability. We next use this theory and develop a toolkit to speed up the computation of iterative projections over submodular polytopes using both discrete and continuous perspectives. We subsequently adapt the away-step Frank-Wolfe algorithm to use this information and enable early termination. For the special case of cardinality-based submodular polytopes, we improve the runtime of computing certain Bregman projections by a factor of $\Omega(n/\log(n))$. Our theoretical results show orders of magnitude reduction in runtime in preliminary computational experiments.

artificial intelligence, machine learning, projection, (19 more...)

arXiv.org Artificial Intelligence

2106.11943

Country:

Asia > Japan (0.14)
North America > United States (0.14)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Tree DNN: A Deep Container Network

Singh, Brijraj, Gupta, Swati, Das, Mayukh, Naidu, Praveen Doreswamy, Allur, Sharan Kumar

arXiv.org Artificial IntelligenceDec-7-2022

Multi-Task Learning (MTL) has shown its importance at user products for fast training, data efficiency, reduced overfitting etc. MTL achieves it by sharing the network parameters and training a network for multiple tasks simultaneously. However, MTL does not provide the solution, if each task needs training from a different dataset. In order to solve the stated problem, we have proposed an architecture named TreeDNN along with it's training methodology. TreeDNN helps in training the model with multiple datasets simultaneously, where each branch of the tree may need a different training dataset. We have shown in the results that TreeDNN provides competitive performance with the advantage of reduced ROM requirement for parameter storage and increased responsiveness of the system by loading only specific branch at inference time.

artificial intelligence, machine learning, treednn, (16 more...)

arXiv.org Artificial Intelligence

2212.03474

Country: Asia > India (0.16)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

Add feedback

Taming Wild Price Fluctuations: Monotone Stochastic Convex Optimization with Bandit Feedback

Salem, Jad, Gupta, Swati, Kamble, Vijay

arXiv.org Artificial IntelligenceMar-16-2021

Prices generated by automated price experimentation algorithms often display wild fluctuations, leading to unfavorable customer perceptions and violations of individual fairness: e.g., the price seen by a customer can be significantly higher than what was seen by her predecessors, only to fall once again later. To address this concern, we propose demand learning under a monotonicity constraint on the sequence of prices, within the framework of stochastic convex optimization with bandit feedback. Our main contribution is the design of the first sublinear-regret algorithms for monotonic price experimentation for smooth and strongly concave revenue functions under noisy as well as noiseless bandit feedback. The monotonicity constraint presents a unique challenge: since any increase (or decrease) in the decision-levels is final, an algorithm needs to be cautious in its exploration to avoid over-shooting the optimum. At the same time, minimizing regret requires that progress be made towards the optimum at a sufficient pace. Balancing these two goals is particularly challenging under noisy feedback, where obtaining sufficiently accurate gradient estimates is expensive. Our key innovation is to utilize conservative gradient estimates to adaptively tailor the degree of caution to local gradient information, being aggressive far from the optimum and being increasingly cautious as the prices approach the optimum. Importantly, we show that our algorithms guarantee the same regret rates (up to logarithmic factors) as the best achievable rates of regret without the monotonicity requirement.

algorithm, artificial intelligence, big data, (18 more...)

arXiv.org Artificial Intelligence

2103.09287

Country: North America > United States > Illinois (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.46)

Add feedback

Multi-Threshold Attention U-Net (MTAU) based Model for Multimodal Brain Tumor Segmentation in MRI scans

Awasthi, Navchetan, Pardasani, Rohit, Gupta, Swati

arXiv.org Artificial IntelligenceJan-28-2021

Gliomas are one of the most frequent brain tumors and are classified into high grade and low grade gliomas. The segmentation of various regions such as tumor core, enhancing tumor etc. plays an important role in determining severity and prognosis. Here, we have developed a multi-threshold model based on attention U-Net for identification of various regions of the tumor in magnetic resonance imaging (MRI). We propose a multi-path segmentation and built three separate models for the different regions of interest. The proposed model achieved mean Dice Coefficient of 0.59, 0.72, and 0.61 for enhancing tumor, whole tumor and tumor core respectively on the training dataset. The same model gave mean Dice Coefficient of 0.57, 0.73, and 0.61 on the validation dataset and 0.59, 0.72, and 0.57 on the test dataset.

dataset, deep learning, neural network, (21 more...)

arXiv.org Artificial Intelligence

2101.12404

Country:

Asia > India (0.14)
North America > United States (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback