Yousefi, Niloofar
Predicting Through Generation: Why Generation Is Better for Prediction
Kowsher, Md, Prottasha, Nusrat Jahan, Bhat, Prakash, Yu, Chun-Nam, Soltanalian, Mojtaba, Garibay, Ivan, Garibay, Ozlem, Chen, Chen, Yousefi, Niloofar
This paper argues that generating output tokens is more effective than using pooled representations for prediction tasks because token-level generation retains more mutual information. Since LLMs are trained on massive text corpora using next-token prediction, generation aligns naturally with their learned behavior. Using the Data Processing Inequality (DPI), we provide both theoretical and empirical evidence supporting this claim. However, autoregressive models face two key challenges when used for prediction: (1) exposure bias, where the model sees ground-truth tokens during training but relies on its own predictions during inference, leading to errors, and (2) format mismatch, where discrete tokens do not always align with the task's required output structure. To address these challenges, we introduce PredGen (Predicting Through Generating), an end-to-end framework that (i) uses scheduled sampling to reduce exposure bias, and (ii) introduces a task adapter to convert the generated tokens into structured outputs. Additionally, we introduce Writer-Director Alignment Loss (WDAL), which ensures consistency between token generation and final task predictions, improving both text coherence and numerical accuracy. We evaluate PredGen on multiple classification and regression benchmarks. Our results show that PredGen consistently outperforms standard baselines, demonstrating its effectiveness in structured prediction tasks.
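As a rough illustration of the scheduled-sampling component (not the authors' released implementation), the sketch below mixes ground-truth and self-predicted tokens during decoder training; the toy decoder, vocabulary size, and fixed teacher probability are illustrative assumptions.

```python
import torch
import torch.nn as nn


class TinyDecoder(nn.Module):
    """Toy autoregressive decoder used only to illustrate scheduled sampling."""

    def __init__(self, vocab_size=100, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.GRUCell(hidden, hidden)
        self.head = nn.Linear(hidden, vocab_size)

    def step(self, token, h):
        h = self.rnn(self.embed(token), h)
        return self.head(h), h


def scheduled_sampling_step(decoder, targets, teacher_prob):
    """One training pass that mixes ground-truth and self-predicted tokens.

    targets: (batch, seq_len) token ids; teacher_prob: probability of feeding
    the ground-truth token at each step.
    """
    batch, seq_len = targets.shape
    h = torch.zeros(batch, decoder.rnn.hidden_size)
    token = targets[:, 0]                      # start from the first token
    ce = nn.CrossEntropyLoss()
    loss = 0.0
    for t in range(1, seq_len):
        logits, h = decoder.step(token, h)
        loss = loss + ce(logits, targets[:, t])
        use_teacher = torch.rand(batch) < teacher_prob
        predicted = logits.argmax(dim=-1)
        # Next input: ground-truth token with prob. teacher_prob, else the model's own guess.
        token = torch.where(use_teacher, targets[:, t], predicted)
    return loss / (seq_len - 1)


decoder = TinyDecoder()
dummy_targets = torch.randint(0, 100, (4, 12))
loss = scheduled_sampling_step(decoder, dummy_targets, teacher_prob=0.8)
loss.backward()
```

Annealing the teacher probability toward zero over training gradually exposes the decoder to its own predictions, which is the standard way scheduled sampling narrows the train/inference gap.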
BnTTS: Few-Shot Speaker Adaptation in Low-Resource Setting
Basher, Mohammad Jahid Ibna, Kowsher, Md, Islam, Md Saiful, Nandi, Rabindra Nath, Prottasha, Nusrat Jahan, Menon, Mehadi Hasan, Muntasir, Tareq Al, Chowdhury, Shammur Absar, Alam, Firoj, Yousefi, Niloofar, Garibay, Ozlem Ozmen
This paper introduces BnTTS (Bangla Text-To-Speech), the first framework for Bangla speaker-adaptation-based TTS, designed to bridge the gap in Bangla speech synthesis using minimal training data. Building upon the XTTS architecture, our approach integrates Bangla into a multilingual TTS pipeline, with modifications to account for the phonetic and linguistic characteristics of the language. We pre-train BnTTS on a 3.85k-hour Bangla speech dataset with corresponding text labels and evaluate performance in both zero-shot and few-shot settings on our proposed test dataset. Empirical evaluations in few-shot settings show that BnTTS significantly improves the naturalness, intelligibility, and speaker fidelity of synthesized Bangla speech. Compared to state-of-the-art Bangla TTS systems, BnTTS exhibits superior performance in Subjective Mean Opinion Score (SMOS), Naturalness, and Clarity metrics.
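The BnTTS pipeline itself builds on XTTS; as a heavily simplified, hypothetical sketch of the general few-shot speaker-adaptation idea (not the paper's code), one can freeze a pretrained backbone and fit only a new speaker embedding on a handful of reference utterances. The toy model, feature dimensions, and loss below are assumptions.

```python
import torch
import torch.nn as nn


class ToyTTS(nn.Module):
    """Toy stand-in for a pretrained TTS backbone conditioned on a speaker embedding."""

    def __init__(self, text_dim=16, spk_dim=8, mel_dim=80):
        super().__init__()
        self.backbone = nn.Linear(text_dim + spk_dim, mel_dim)

    def forward(self, text_feat, spk_emb):
        spk = spk_emb.expand(text_feat.size(0), -1)
        return self.backbone(torch.cat([text_feat, spk], dim=-1))


model = ToyTTS()
for p in model.parameters():               # freeze the "pretrained" backbone
    p.requires_grad_(False)

# Few-shot adaptation: learn only a new speaker embedding from a few clips.
spk_emb = nn.Parameter(torch.zeros(1, 8))
opt = torch.optim.Adam([spk_emb], lr=1e-2)

text_feat = torch.randn(5, 16)             # features of 5 reference utterances
target_mel = torch.randn(5, 80)            # their (toy) mel-spectrogram targets
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(text_feat, spk_emb), target_mel)
    loss.backward()
    opt.step()
```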
RoCoFT: Efficient Finetuning of Large Language Models with Row-Column Updates
Kowsher, Md, Esmaeilbeig, Tara, Yu, Chun-Nam, Soltanalian, Mojtaba, Yousefi, Niloofar
We propose RoCoFT, a parameter-efficient fine-tuning method for large-scale language models (LMs) based on updating only a few rows and columns of the weight matrices in transformers. Through extensive experiments with medium-sized LMs like BERT and RoBERTa, and larger LMs like Bloom-7B, Llama2-7B, and Llama2-13B, we show that our method gives comparable or better accuracies than state-of-the-art PEFT methods while also being more memory- and computation-efficient. We also study the reason behind the effectiveness of our method with tools from neural tangent kernel theory. We empirically demonstrate that our kernel, constructed using a restricted set of row and column parameters, is numerically close to the full-parameter kernel and gives comparable classification performance. Ablation studies are conducted to investigate the impact of different algorithmic choices, including the selection strategy for rows and columns as well as the optimal rank for effective implementation of our method.
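A minimal sketch of the general row/column-restricted update idea (the paper's actual selection strategies and training integration may differ): mask a weight matrix's gradient so only a few chosen rows and columns are updated. The layer size, random selection, and rank of 2 are illustrative assumptions.

```python
import torch
import torch.nn as nn


def restrict_to_rows_and_cols(linear, row_idx, col_idx):
    """Keep only the chosen rows and columns of `linear.weight` trainable.

    A gradient hook zeroes updates for every other entry, which is one simple
    way to emulate row/column-restricted fine-tuning.
    """
    mask = torch.zeros_like(linear.weight)
    mask[row_idx, :] = 1.0
    mask[:, col_idx] = 1.0
    linear.weight.register_hook(lambda grad: grad * mask)


layer = nn.Linear(768, 768)
# Illustrative choice: train 2 random rows and 2 random columns only.
rows = torch.randperm(768)[:2]
cols = torch.randperm(768)[:2]
restrict_to_rows_and_cols(layer, rows, cols)

x = torch.randn(4, 768)
loss = layer(x).pow(2).mean()
loss.backward()
# Entries of layer.weight outside the selected rows/columns now receive zero gradient.
```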
LLM-Mixer: Multiscale Mixing in LLMs for Time Series Forecasting
Kowsher, Md, Sobuj, Md. Shohanur Islam, Prottasha, Nusrat Jahan, Alanis, E. Alejandro, Garibay, Ozlem Ozmen, Yousefi, Niloofar
Time series forecasting remains a challenging task, particularly in the context of complex multiscale temporal patterns. This study presents LLM-Mixer, a framework that improves forecasting accuracy by combining multiscale time-series decomposition with pre-trained Large Language Models (LLMs). LLM-Mixer captures both short-term fluctuations and long-term trends by decomposing the data into multiple temporal resolutions and processing them with a frozen LLM, guided by a textual prompt specifically designed for time-series data. Extensive experiments conducted on multivariate and univariate datasets demonstrate that LLM-Mixer achieves competitive performance, outperforming recent state-of-the-art models across various forecasting horizons. This work highlights the potential of combining multiscale analysis and LLMs for effective and scalable time-series forecasting.
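A minimal sketch of the multiscale decomposition step described above, assuming simple average pooling to produce coarser temporal resolutions (the actual LLM-Mixer decomposition, prompting, and frozen-LLM interface are described in the paper):

```python
import torch
import torch.nn.functional as F


def multiscale_views(series, scales=(1, 2, 4)):
    """Return the same series at several temporal resolutions.

    series: (batch, length, channels). Each scale s average-pools the time
    axis by a factor of s, giving the coarse-to-fine views a multiscale
    mixer can feed to a (frozen) backbone model.
    """
    x = series.transpose(1, 2)                 # (batch, channels, length)
    views = []
    for s in scales:
        pooled = F.avg_pool1d(x, kernel_size=s, stride=s) if s > 1 else x
        views.append(pooled.transpose(1, 2))   # back to (batch, length/s, channels)
    return views


batch = torch.randn(8, 96, 7)                  # e.g. 96 time steps, 7 variables
for v in multiscale_views(batch):
    print(v.shape)                             # (8, 96, 7), (8, 48, 7), (8, 24, 7)
```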
Parameter-Efficient Fine-Tuning of Large Language Models using Semantic Knowledge Tuning
Prottasha, Nusrat Jahan, Mahmud, Asif, Sobuj, Md. Shohanur Islam, Bhat, Prakash, Kowsher, Md, Yousefi, Niloofar, Garibay, Ozlem Ozmen
Large Language Models (LLMs) have gained significant popularity in recent years for specialized tasks using prompts, owing to their low computational cost. Standard methods like prefix tuning utilize special, modifiable tokens that lack semantic meaning and require extensive training for best performance, often falling short. In this context, we propose a novel method called Semantic Knowledge Tuning (SK-Tuning) for prompt and prefix tuning that employs meaningful words instead of random tokens. This method uses a fixed LLM to understand and process the semantic content of the prompt through its zero-shot capabilities, and then integrates the processed prompt with the input text to improve the model's performance on particular tasks. Our experimental results show that SK-Tuning exhibits faster training times, fewer parameters, and superior performance on tasks such as text classification and understanding compared to other tuning methods. This approach offers a promising method for optimizing the efficiency and effectiveness of LLMs in processing language tasks.
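SK-Tuning's full procedure relies on a frozen LLM processing the prompt's semantic content; a related, simplified sketch is shown below, where trainable prompt vectors are initialized from embeddings of meaningful words rather than random tokens. The toy embedding table and token ids are stand-ins, not the paper's setup.

```python
import torch
import torch.nn as nn


class SemanticPrompt(nn.Module):
    """Trainable prompt vectors seeded from meaningful word embeddings.

    Instead of randomly initialized soft-prompt tokens, the prompt starts from
    the frozen model's embeddings of real words (here a toy embedding table
    and word ids stand in for a pretrained LLM's vocabulary).
    """

    def __init__(self, frozen_embedding, seed_token_ids):
        super().__init__()
        with torch.no_grad():
            init = frozen_embedding(seed_token_ids).clone()
        self.prompt = nn.Parameter(init)        # (prompt_len, dim), trainable

    def forward(self, input_embeds):
        # Prepend the prompt to every sequence in the batch.
        batch = input_embeds.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prompt, input_embeds], dim=1)


# Toy stand-ins for a frozen LLM's embedding table and tokenized prompt words.
vocab, dim = 1000, 32
frozen_emb = nn.Embedding(vocab, dim)
frozen_emb.weight.requires_grad_(False)
seed_ids = torch.tensor([5, 17, 42])             # ids of meaningful words
prompt_layer = SemanticPrompt(frozen_emb, seed_ids)

inputs = frozen_emb(torch.randint(0, vocab, (4, 10)))
print(prompt_layer(inputs).shape)                # (4, 13, 32)
```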
FragXsiteDTI: Revealing Responsible Segments in Drug-Target Interaction with Transformer-Driven Interpretation
Yalabadi, Ali Khodabandeh, Yazdani-Jahromi, Mehdi, Yousefi, Niloofar, Tayebi, Aida, Abdidizaji, Sina, Garibay, Ozlem Ozmen
Drug-Target Interaction (DTI) prediction is vital for drug discovery, yet challenges persist in achieving model interpretability and optimizing performance. We propose a novel transformer-based model, FragXsiteDTI, that aims to address these challenges in DTI prediction. Notably, FragXsiteDTI is the first DTI model to simultaneously leverage drug molecule fragments and protein pockets. Our information-rich representations for both proteins and drugs offer a detailed perspective on their interaction. Inspired by the Perceiver IO framework, our model features a learnable latent array that first interacts with protein binding-site embeddings through cross-attention, is then refined through self-attention, and is finally used as a query to the drug fragments in the drug's cross-attention transformer block. This learnable query array serves as a mediator and enables seamless information translation, preserving critical nuances in drug-protein interactions. Our computational results on three benchmark datasets demonstrate the superior predictive power of our model over several state-of-the-art models. We also show the interpretability of our model in terms of the critical components of both target proteins and drug molecules within drug-target pairs.
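A minimal, hypothetical sketch of the Perceiver IO-style mechanism described above: a learnable latent array first cross-attends to protein-pocket embeddings, is refined by self-attention, and then queries drug-fragment embeddings. The dimensions, head counts, and single shared block are illustrative assumptions, not the FragXsiteDTI architecture.

```python
import torch
import torch.nn as nn


class LatentCrossAttention(nn.Module):
    """Perceiver IO-style block: learnable latents attend to a set of inputs."""

    def __init__(self, num_latents=16, dim=64, heads=4):
        super().__init__()
        self.latents = nn.Parameter(torch.randn(num_latents, dim))
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, context):
        # context: (batch, set_size, dim), e.g. protein-pocket embeddings
        batch = context.size(0)
        q = self.latents.unsqueeze(0).expand(batch, -1, -1)
        q, _ = self.cross_attn(q, context, context)   # latents query the context
        q, _ = self.self_attn(q, q, q)                # refine latents among themselves
        return q


# Illustrative shapes: 20 pocket embeddings and 30 fragment embeddings per pair.
pockets = torch.randn(2, 20, 64)
fragments = torch.randn(2, 30, 64)

block = LatentCrossAttention()
latents = block(pockets)                              # latents summarize binding sites
refined = block.cross_attn(latents, fragments, fragments)[0]  # latents then query fragments
print(refined.shape)                                  # (2, 16, 64)
```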
DeepFork: Supervised Prediction of Information Diffusion in GitHub
Akula, Ramya, Yousefi, Niloofar, Garibay, Ivan
Information spreads extremely fast on complex social networks; a piece of information can go viral in almost no time. It is often hard to contain this diffusion before significant disruption occurs, whether on a social media platform or an online coding platform. GitHub is one such trending online focal point where businesses can reach their potential contributors and customers simultaneously. By exploiting this software development paradigm, millions of free software projects have emerged recently across diverse communities. To understand human influence, information spread, and the evolution of transmitted information among assorted users on GitHub, we developed DeepFork, a supervised deep neural network model that aims to predict information diffusion in complex social networks, considering both node and topological features. In our empirical studies, we observed that information diffusion can be detected by link prediction using supervised learning. DeepFork outperforms other machine learning models as it better learns the discriminative patterns from the input features. DeepFork aids in understanding information spread and evolution through a bipartite network of users and repositories, i.e., information flows from a user to a repository to a user.
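A minimal sketch of framing fork prediction as supervised link prediction, as described above: concatenate user, repository, and topological features and train a small feed-forward classifier on observed versus unobserved links. The feature dimensions and architecture are illustrative assumptions, not the DeepFork configuration.

```python
import torch
import torch.nn as nn

# Toy setup: each (user, repository) pair is described by concatenated node
# features plus a few topological features (e.g. common neighbors, degree).
USER_DIM, REPO_DIM, TOPO_DIM = 8, 8, 4


class LinkPredictor(nn.Module):
    """Small feed-forward network predicting whether a user will fork a repo."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(USER_DIM + REPO_DIM + TOPO_DIM, 32),
            nn.ReLU(),
            nn.Linear(32, 1),
        )

    def forward(self, user_feat, repo_feat, topo_feat):
        x = torch.cat([user_feat, repo_feat, topo_feat], dim=-1)
        return self.net(x).squeeze(-1)          # logit for "link will form"


model = LinkPredictor()
users = torch.randn(64, USER_DIM)
repos = torch.randn(64, REPO_DIM)
topo = torch.randn(64, TOPO_DIM)
labels = torch.randint(0, 2, (64,)).float()     # 1 = fork observed, 0 = no fork

loss = nn.BCEWithLogitsLoss()(model(users, repos, topo), labels)
loss.backward()
```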
Multi-Task Learning Using Neighborhood Kernels
Yousefi, Niloofar, Li, Cong, Mollaghasemi, Mansooreh, Anagnostopoulos, Georgios, Georgiopoulos, Michael
This paper introduces a new and effective algorithm for learning kernels in a Multi-Task Learning (MTL) setting. Although we consider an MTL scenario here, our approach can easily be applied to standard single-task learning as well. As shown by our empirical results, our algorithm consistently outperforms traditional kernel learning algorithms such as the uniform combination solution, convex combinations of base kernels, and kernel alignment-based models, which have been shown to give promising results in the past. We present a Rademacher complexity bound, based on which a new Multi-Task Multiple Kernel Learning (MT-MKL) model is derived. In particular, we propose a Support Vector Machine-regularized model in which, for each task, an optimal kernel is learned based on a neighborhood-defining kernel that is not restricted to be positive semi-definite. Comparative experimental results are showcased that underline the merits of our neighborhood-defining framework in both classification and regression problems.
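For orientation, the sketch below shows the conventional multiple-kernel-learning setup that the paper compares against: a fixed convex combination of base kernels fed to a precomputed-kernel SVM. It does not implement the proposed neighborhood-defining (possibly non-PSD) kernel or the MT-MKL objective, and the data and weights are illustrative.

```python
import numpy as np
from sklearn.metrics.pairwise import linear_kernel, polynomial_kernel, rbf_kernel
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = (X[:, 0] + 0.3 * rng.normal(size=100) > 0).astype(int)

# Base kernels evaluated on the training data.
base_kernels = [linear_kernel(X), rbf_kernel(X, gamma=0.5), polynomial_kernel(X, degree=2)]

# Fixed convex combination weights; a learned combination would instead
# optimize these, e.g. via kernel alignment or an MT-MKL objective.
weights = np.array([0.2, 0.5, 0.3])
K = sum(w * k for w, k in zip(weights, base_kernels))

clf = SVC(kernel="precomputed").fit(K, y)
print("training accuracy:", clf.score(K, y))
```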