AITopics | Sukumaran, Rohan

Collaborating Authors

Sukumaran, Rohan

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

FairLoRA: Unpacking Bias Mitigation in Vision Models with Fairness-Driven Low-Rank Adaptation

Sukumaran, Rohan, Feizi, Aarash, Romero-Sorian, Adriana, Farnadi, Golnoosh

arXiv.org Artificial IntelligenceOct-22-2024

Recent advances in parameter-efficient fine-tuning methods, such as Low Rank Adaptation (LoRA), have gained significant attention for their ability to efficiently adapt large foundational models to various downstream tasks. These methods are appreciated for achieving performance comparable to full fine-tuning on aggregate-level metrics, while significantly reducing computational costs. To systematically address fairness in LLMs previous studies fine-tune on fairness specific data using a larger LoRA rank than typically used. In this paper, we introduce FairLoRA, a novel fairness-specific regularizer for LoRA aimed at reducing performance disparities across data subgroups by minimizing per-class variance in loss. To the best of our knowledge, we are the first to introduce a fairness based finetuning through LoRA. Our results demonstrate that the need for higher ranks to mitigate bias is not universal; it depends on factors such as the pre-trained model, dataset, and task. More importantly, we systematically evaluate FairLoRA across various vision models, including ViT, DiNO, and CLIP, in scenarios involving distribution shifts. We further emphasize the necessity of using multiple fairness metrics to obtain a holistic assessment of fairness, rather than relying solely on the metric optimized during training.

fairlora, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2410.17358

Country: North America > Canada > Quebec (0.29)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Balancing Act: Constraining Disparate Impact in Sparse Models

Hashemizadeh, Meraj, Ramirez, Juan, Sukumaran, Rohan, Farnadi, Golnoosh, Lacoste-Julien, Simon, Gallego-Posada, Jose

arXiv.org Artificial IntelligenceOct-31-2023

Model pruning is a popular approach to enable the deployment of large deep learning models on edge devices with restricted computational or storage capacities. Although sparse models achieve performance comparable to that of their dense counterparts at the level of the entire dataset, they exhibit high accuracy drops for some data sub-groups. Existing methods to mitigate this disparate impact induced by pruning (i) rely on surrogate metrics that address the problem indirectly and have limited interpretability; or (ii) scale poorly with the number of protected sub-groups in terms of computational cost. We propose a constrained optimization approach that directly addresses the disparate impact of pruning: our formulation bounds the accuracy change between the dense and sparse models, for each subgroup. This choice of constraints provides an interpretable success criterion to determine if a pruned model achieves acceptable disparity levels. Experimental results demonstrate that our technique scales reliably to problems involving large models and hundreds of protected sub-groups. Current deep learning practice displays a trend towards larger architectures (Bommasani et al., 2021), as exemplified by popular models such as GPT-4 (OpenAI, 2023), Llama 2 (Touvron et al., 2023) and DALL-E 2 (Ramesh et al., 2022). Model compression techniques such as pruning (Gale et al., 2019), knowledge distillation (Hinton et al., 2015), or quantization (Gholami et al., 2021) are crucial towards enabling the deployment of large models across a wide range of platforms, including resource-constrained edge devices like smartphones. Despite achieving comparable performance at an aggregate level over the entire dataset, pruned models often exhibit significant accuracy reduction for some data sub-groups (Hooker et al., 2019; 2020; Paganini, 2020). In particular, under-represented groups can suffer high performance degradation while the overall performance remains unaffected, thus exacerbating systemic biases in machine learning models. Tran et al. (2022) refer to this phenomenon as the disparate impact of pruning. Existing mitigation methods face challenges in terms of interpretability and scalability to a large number of sub-groups. Tran et al. (2022) introduce constraints aiming to equalize the loss of the sparse model across sub-groups. However, their approach does not account for the unequal grouplevel performance of the dense model. Moreover, while the loss can be a useful surrogate for training, this method addresses the disparate impact issue indirectly as it focuses on controlling the loss, rather than group-level changes in accuracy. Alternatively, Lin et al. (2022) compute per-group importance scores for every model parameter to determine the weights to be pruned. This approach becomes prohibitively expensive when the model or the number of sub-groups is large.

disparate impact, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2310.20673

Country:

North America > Canada > Quebec (0.28)
North America > Canada > Ontario (0.28)

Genre: Research Report > New Finding (0.34)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.88)

Add feedback

Omega: Optimistic EMA Gradients

Ramirez, Juan, Sukumaran, Rohan, Bertrand, Quentin, Gidel, Gauthier

arXiv.org Artificial IntelligenceJun-13-2023

Stochastic min-max optimization has gained interest in the machine learning community with the advancements in GANs and adversarial training. Although game optimization is fairly well understood in the deterministic setting, some issues persist in the stochastic regime. Recent work has shown that stochastic gradient descent-ascent methods such as the optimistic gradient are highly sensitive to noise or can fail to converge. Although alternative strategies exist, they can be prohibitively expensive. We introduce Omega, a method with optimistic-like updates that mitigates the impact of noise by incorporating an EMA of historic gradients in its update rule. We also explore a variation of this algorithm that incorporates momentum. Although we do not provide convergence guarantees, our experiments on stochastic games show that Omega outperforms the optimistic gradient method when applied to linear players.

artificial intelligence, machine learning, omega, (18 more...)

arXiv.org Artificial Intelligence

2306.07905

Country:

North America > Canada (0.29)
North America > United States (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

Offense Detection in Dravidian Languages using Code-Mixing Index based Focal Loss

Tula, Debapriya, MS, Shreyas, Reddy, Viswanatha, Sahu, Pranjal, Doddapaneni, Sumanth, Potluri, Prathyush, Sukumaran, Rohan, Patwa, Parth

arXiv.org Artificial IntelligenceNov-12-2021

Over the past decade, we have seen exponential growth in online content fueled by social media platforms. Data generation of this scale comes with the caveat of insurmountable offensive content in it. The complexity of identifying offensive content is exacerbated by the usage of multiple modalities (image, language, etc.), code mixed language and more. Moreover, even if we carefully sample and annotate offensive content, there will always exist significant class imbalance in offensive vs non offensive content. In this paper, we introduce a novel Code-Mixing Index (CMI) based focal loss which circumvents two challenges (1) code mixing in languages (2) class imbalance problem for Dravidian language offense detection. We also replace the conventional dot product-based classifier with the cosine-based classifier which results in a boost in performance. Further, we use multilingual models that help transfer characteristics learnt across languages to work effectively with low resourced languages. It is also important to note that our model handles instances of mixed script (say usage of Latin and Dravidian - Tamil script) as well. Our model can handle offensive language detection in a low-resource, class imbalanced, multilingual and code mixed setting.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2111.06916

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)

Genre: Research Report (0.64)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.47)
Media > Film (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Hostility Detection and Covid-19 Fake News Detection in Social Media

Gupta, Ayush, Sukumaran, Rohan, John, Kevin, Teki, Sundeep

arXiv.org Artificial IntelligenceJan-14-2021

Withtheadventofsocialmedia,therehasbeenanextremely rapid increase in the content shared online. Consequently, the propagation of fake news and hostile messages on social media platforms has also skyrocketed. In this paper, we address the problem of detecting hostile and fake content in the Devanagari (Hindi) script as a multi-class, multi-label problem. Using NLP techniques, we build a model that makes use of an abusive language detector coupled with features extracted via Hindi BERT and Hindi FastText models and metadata. Our model achieves a 0.97 F1 score on coarse grain evaluation on Hostility detection task. Additionally, we built models to identify fake news related to Covid-19 in English tweets. We leverage entity information extracted from the tweets along with textual representations learned from word embeddings and achieve a 0.93 F1 score on the English fake news detection task.

deep learning, immunology, tweet, (19 more...)

arXiv.org Artificial Intelligence

2101.05953

Country: Asia (0.14)

Genre: Research Report (0.64)

Industry:

Media > News (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.63)
Health & Medicine > Therapeutic Area > Immunology (0.63)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

Add feedback