AITopics | Khalil, Mohammad

Collaborating Authors

Khalil, Mohammad

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Towards Privacy-Preserving Data-Driven Education: The Potential of Federated Learning

Khalil, Mohammad, Shakya, Ronas, Liu, Qinyi

arXiv.org Artificial IntelligenceMar-16-2025

The increasing adoption of data-driven applications in education such as in learning analytics and AI in education has raised significant privacy and data protection concerns. While these challenges have been widely discussed in previous works, there are still limited practical solutions. Federated learning has recently been discoursed as a promising privacy-preserving technique, yet its application in education remains scarce. This paper presents an experimental evaluation of federated learning for educational data prediction, comparing its performance to traditional non-federated approaches. Our findings indicate that federated learning achieves comparable predictive accuracy. Furthermore, under adversarial attacks, federated learning demonstrates greater resilience compared to non-federated settings. We summarise that our results reinforce the value of federated learning as a potential approach for balancing predictive performance and privacy in educational contexts.

artificial intelligence, data mining, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2503.1355

Country: Europe > Austria > Vienna (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Education (1.00)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Add feedback

A Showdown of ChatGPT vs DeepSeek in Solving Programming Tasks

Shakya, Ronas, Vadiee, Farhad, Khalil, Mohammad

arXiv.org Artificial IntelligenceMar-16-2025

The advancement of large language models (LLMs) has created a competitive landscape for AI-assisted programming tools. This study evaluates two leading models: ChatGPT 03-mini and DeepSeek-R1 on their ability to solve competitive programming tasks from Codeforces. Using 29 programming tasks of three levels of easy, medium, and hard difficulty, we assessed the outcome of both models by their accepted solutions, memory efficiency, and runtime performance. Our results indicate that while both models perform similarly on easy tasks, ChatGPT outperforms DeepSeek-R1 on medium-difficulty tasks, achieving a 54.5% success rate compared to DeepSeek 18.1%. Both models struggled with hard tasks, thus highlighting some ongoing challenges LLMs face in handling highly complex programming problems. These findings highlight key differences in both model capabilities and their computational power, offering valuable insights for developers and researchers working to advance AI-driven programming tools.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2503.13549

Country: Europe > Norway (0.15)

Genre: Research Report > New Finding (0.88)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Can Synthetic Data be Fair and Private? A Comparative Study of Synthetic Data Generation and Fairness Algorithms

Liu, Qinyi, Deho, Oscar, Vadiee, Farhad, Khalil, Mohammad, Joksimovic, Srecko, Siemens, George

arXiv.org Artificial IntelligenceJan-3-2025

The increasing use of machine learning in learning analytics (LA) has raised significant concerns around algorithmic fairness and privacy. Synthetic data has emerged as a dual-purpose tool, enhancing privacy and improving fairness in LA models. However, prior research suggests an inverse relationship between fairness and privacy, making it challenging to optimize both. This study investigates which synthetic data generators can best balance privacy and fairness, and whether pre-processing fairness algorithms, typically applied to real datasets, are effective on synthetic data. Our results highlight that the DEbiasing CAusal Fairness (DECAF) algorithm achieves the best balance between privacy and fairness. However, DECAF suffers in utility, as reflected in its predictive accuracy. Notably, we found that applying pre-processing fairness algorithms to synthetic data improves fairness even more than when applied to real data. These findings suggest that combining synthetic data generation with fairness pre-processing offers a promising approach to creating fairer LA models.

artificial intelligence, fairness, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2501.01785

Country:

Europe (0.47)
North America > United States (0.28)
Oceania > Australia > South Australia (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Education (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback

Creating Artificial Students that Never Existed: Leveraging Large Language Models and CTGANs for Synthetic Data Generation

Khalil, Mohammad, Vadiee, Farhad, Shakya, Ronas, Liu, Qinyi

arXiv.org Artificial IntelligenceJan-3-2025

In this study, we explore the growing potential of AI and deep learning technologies, particularly Generative Adversarial Networks (GANs) and Large Language Models (LLMs), for generating synthetic tabular data. Access to quality students data is critical for advancing learning analytics, but privacy concerns and stricter data protection regulations worldwide limit their availability and usage. Synthetic data offers a promising alternative. We investigate whether synthetic data can be leveraged to create artificial students for serving learning analytics models. Using the popular GAN model CTGAN and three LLMs- GPT2, DistilGPT2, and DialoGPT, we generate synthetic tabular student data. Our results demonstrate the strong potential of these methods to produce high-quality synthetic datasets that resemble real students data. To validate our findings, we apply a comprehensive set of utility evaluation metrics to assess the statistical and predictive performance of the synthetic data and compare the different generator models used, specially the performance of LLMs. Our study aims to provide the learning analytics community with valuable insights into the use of synthetic data, laying the groundwork for expanding the field methodological toolbox with new innovative approaches for learning analytics data generation.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2501.01793

Country:

North America > United States > New York (0.14)
North America > United States > Maryland (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Education > Educational Setting > Online (0.68)
Education > Educational Technology > Educational Software > Computer Based Training (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Scaling While Privacy Preserving: A Comprehensive Synthetic Tabular Data Generation and Evaluation in Learning Analytics

Liu, Qinyi, Khalil, Mohammad, Shakya, Ronas, Jovanovic, Jelena

arXiv.org Artificial IntelligenceJan-12-2024

Privacy poses a significant obstacle to the progress of learning analytics (LA), presenting challenges like inadequate anonymization and data misuse that current solutions struggle to address. Synthetic data emerges as a potential remedy, offering robust privacy protection. However, prior LA research on synthetic data lacks thorough evaluation, essential for assessing the delicate balance between privacy and data utility. Synthetic data must not only enhance privacy but also remain practical for data analytics. Moreover, diverse LA scenarios come with varying privacy and utility needs, making the selection of an appropriate synthetic data approach a pressing challenge. To address these gaps, we propose a comprehensive evaluation of synthetic data, which encompasses three dimensions of synthetic data quality, namely resemblance, utility, and privacy. We apply this evaluation to three distinct LA datasets, using three different synthetic data generation methods. Our results show that synthetic data can maintain similar utility (i.e., predictive performance) as real data, while preserving privacy. Furthermore, considering different privacy and data utility requirements in different LA scenarios, we make customized recommendations for synthetic data generation. This paper not only presents a comprehensive evaluation of synthetic data but also illustrates its potential in mitigating privacy concerns within the field of LA, thus contributing to a wider application of synthetic data in LA and promoting a better practice for open science.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2401.06883

Country:

Oceania > Australia (0.28)
Europe > United Kingdom (0.28)
North America > United States > New York (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Information Technology > Security & Privacy (1.00)
Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Data Science > Data Mining > Big Data (0.65)

Add feedback

Enhancing Polynomial Chaos Expansion Based Surrogate Modeling using a Novel Probabilistic Transfer Learning Strategy

Bridgman, Wyatt, Balakrishnan, Uma, Jones, Reese, Chen, Jiefu, Wu, Xuqing, Safta, Cosmin, Huang, Yueqin, Khalil, Mohammad

arXiv.org Machine LearningDec-7-2023

In the field of surrogate modeling, polynomial chaos expansion (PCE) allows practitioners to construct inexpensive yet accurate surrogates to be used in place of the expensive forward model simulations. For black-box simulations, non-intrusive PCE allows the construction of these surrogates using a set of simulation response evaluations. In this context, the PCE coefficients can be obtained using linear regression, which is also known as point collocation or stochastic response surfaces. Regression exhibits better scalability and can handle noisy function evaluations in contrast to other non-intrusive approaches, such as projection. However, since over-sampling is generally advisable for the linear regression approach, the simulation requirements become prohibitive for expensive forward models. We propose to leverage transfer learning whereby knowledge gained through similar PCE surrogate construction tasks (source domains) is transferred to a new surrogate-construction task (target domain) which has a limited number of forward model simulations (training data). The proposed transfer learning strategy determines how much, if any, information to transfer using new techniques inspired by Bayesian modeling and data assimilation. The strategy is scrutinized using numerical investigations and applied to an engineering problem from the oil and gas industry.

artificial intelligence, machine learning, objective function, (18 more...)

arXiv.org Machine Learning

2312.04648

Country:

North America > United States > California (0.14)
North America > United States > Texas (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Government > Regional Government > North America Government > United States Government (0.94)
Energy > Oil & Gas > Upstream (0.93)

Add feedback

Transfer learning for predicting source terms of principal component transport in chemically reactive flow

Jung, Ki Sung, Echekki, Tarek, Chen, Jacqueline H., Khalil, Mohammad

arXiv.org Artificial IntelligenceDec-1-2023

The objective of this study is to evaluate whether the number of requisite training samples can be reduced with the use of various transfer learning models for predicting, for example, the chemical source terms of the data-driven reduced-order model that represents the homogeneous ignition process of a hydrogen/air mixture. Principal component analysis is applied to reduce the dimensionality of the hydrogen/air mixture in composition space. Artificial neural networks (ANNs) are used to tabulate the reaction rates of principal components, and subsequently, a system of ordinary differential equations is solved. As the number of training samples decreases at the target task (i.e.,for T0 > 1000 K and various phi), the reduced-order model fails to predict the ignition evolution of a hydrogen/air mixture. Three transfer learning strategies are then applied to the training of the ANN model with a sparse dataset. The performance of the reduced-order model with a sparse dataset is found to be remarkably enhanced if the training of the ANN model is restricted by a regularization term that controls the degree of knowledge transfer from source to target tasks. To this end, a novel transfer learning method is introduced, parameter control via partial initialization and regularization (PaPIR), whereby the amount of knowledge transferred is systemically adjusted for the initialization and regularization of the ANN model in the target task. It is found that an additional performance gain can be achieved by changing the initialization scheme of the ANN model in the target task when the task similarity between source and target tasks is relatively low.

artificial intelligence, machine learning, target task, (17 more...)

arXiv.org Artificial Intelligence

2312.00356

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.68)

Industry:

Energy (1.00)
Government > Regional Government > North America Government > United States Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Robust scalable initialization for Bayesian variational inference with multi-modal Laplace approximations

Bridgman, Wyatt, Jones, Reese, Khalil, Mohammad

arXiv.org Machine LearningJul-12-2023

For predictive modeling relying on Bayesian inversion, fully independent, or ``mean-field'', Gaussian distributions are often used as approximate probability density functions in variational inference since the number of variational parameters is twice the number of unknown model parameters. The resulting diagonal covariance structure coupled with unimodal behavior can be too restrictive when dealing with highly non-Gaussian behavior, including multimodality. High-fidelity surrogate posteriors in the form of Gaussian mixtures can capture any distribution to an arbitrary degree of accuracy while maintaining some analytical tractability. Variational inference with Gaussian mixtures with full-covariance structures suffers from a quadratic growth in variational parameters with the number of model parameters. Coupled with the existence of multiple local minima due to nonconvex trends in the loss functions often associated with variational inference, these challenges motivate the need for robust initialization procedures to improve the performance and scalability of variational inference with mixture models. In this work, we propose a method for constructing an initial Gaussian mixture model approximation that can be used to warm-start the iterative solvers for variational inference. The procedure begins with an optimization stage in model parameter space in which local gradient-based optimization, globalized through multistart, is used to determine a set of local maxima, which we take to approximate the mixture component centers. Around each mode, a local Gaussian approximation is constructed via the Laplace method. Finally, the mixture weights are determined through constrained least squares regression. Robustness and scalability are demonstrated using synthetic tests. The methodology is applied to an inversion problem in structural dynamics involving unknown viscous damping coefficients.

approximation, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

2307.06424

Country: North America > United States > California (0.28)

Genre: Research Report (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (0.94)
Health & Medicine (0.93)
Energy (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
(2 more...)

Add feedback

Will ChatGPT get you caught? Rethinking of Plagiarism Detection

Khalil, Mohammad, Er, Erkan

arXiv.org Artificial IntelligenceFeb-8-2023

The rise of Artificial Intelligence (AI) technology and its impact on education has been a topic of growing concern in recent years. The new generation AI systems such as chatbots have become more accessible on the Internet and stronger in terms of capabilities. The use of chatbots, particularly ChatGPT, for generating academic essays at schools and colleges has sparked fears among scholars. This study aims to explore the originality of contents produced by one of the most popular AI chatbots, ChatGPT. To this end, two popular plagiarism detection tools were used to evaluate the originality of 50 essays generated by ChatGPT on various topics. Our results manifest that ChatGPT has a great potential to generate sophisticated text outputs without being well caught by the plagiarism check software. In other words, ChatGPT can create content on many topics with high originality as if they were written by someone. These findings align with the recent concerns about students using chatbots for an easy shortcut to success with minimal or no effort. Moreover, ChatGPT was asked to verify if the essays were generated by itself, as an additional measure of plagiarism check, and it showed superior performance compared to the traditional plagiarism-detection tools. The paper discusses the need for institutions to consider appropriate measures to mitigate potential plagiarism issues and advise on the ongoing debate surrounding the impact of AI technology on education. Further implications are discussed in the paper.

artificial intelligence, chatbot, natural language, (16 more...)

arXiv.org Artificial Intelligence

2302.04335

Country:

Oceania > Australia (0.29)
North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Education > Educational Setting > Online (1.00)
Education > Educational Technology > Educational Software > Computer-Aided Assessment (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback