AITopics | Alassan, Mahaman Sanoussi Yahaya

Collaborating Authors

Alassan, Mahaman Sanoussi Yahaya

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Benchmarking Open-Source Language Models for Efficient Question Answering in Industrial Applications

Alassan, Mahaman Sanoussi Yahaya, Espejel, Jessica López, Bouhandi, Merieme, Dahhane, Walid, Ettifouri, El Hassane

arXiv.org Artificial IntelligenceJun-19-2024

In the rapidly evolving landscape of Natural Language Processing (NLP), Large Language Models (LLMs) have demonstrated remarkable capabilities in tasks such as question answering (QA). However, the accessibility and practicality of utilizing these models for industrial applications pose significant challenges, particularly concerning cost-effectiveness, inference speed, and resource efficiency. This paper presents a comprehensive benchmarking study comparing open-source LLMs with their non-open-source counterparts on the task of question answering. Our objective is to identify open-source alternatives capable of delivering comparable performance to proprietary models while being lightweight in terms of resource requirements and suitable for Central Processing Unit (CPU)-based inference. Through rigorous evaluation across various metrics including accuracy, inference speed, and resource consumption, we aim to provide insights into selecting efficient LLMs for real-world applications. Our findings shed light on viable open-source alternatives that offer acceptable performance and efficiency, addressing the pressing need for accessible and efficient NLP solutions in industry settings.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2406.13713

Country: Europe > Spain (0.14)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine (0.93)
Education > Educational Setting > Online (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Low-Cost Language Models: Survey and Performance Evaluation on Python Code Generation

Espejel, Jessica López, Alassan, Mahaman Sanoussi Yahaya, Bouhandi, Merieme, Dahhane, Walid, Ettifouri, El Hassane

arXiv.org Artificial IntelligenceApr-17-2024

Large Language Models (LLMs) have become the go-to solution for many Natural Language Processing (NLP) tasks due to their ability to tackle various problems and produce high-quality results. Specifically, they are increasingly used to automatically generate code, easing the burden on developers by handling repetitive tasks. However, this improvement in quality has led to high computational and memory demands, making LLMs inaccessible to users with limited resources. In this paper, we focus on Central Processing Unit (CPU)-compatible models and conduct a thorough semi-manual evaluation of their strengths and weaknesses in generating Python code. We enhance their performance by introducing a Chain-of-Thought prompt that guides the model in problem-solving. Additionally, we propose a dataset of 60 programming problems with varying difficulty levels for evaluation purposes. Our assessment also includes testing these models on two state-of-the-art datasets: HumanEval and EvalPlus. We commit to sharing our dataset and experimental results publicly to ensure transparency.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2404.1116

Country:

Asia (0.28)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

GPT-3.5, GPT-4, or BARD? Evaluating LLMs Reasoning Ability in Zero-Shot Setting and Performance Boosting Through Prompts

Espejel, Jessica López, Ettifouri, El Hassane, Alassan, Mahaman Sanoussi Yahaya, Chouham, El Mehdi, Dahhane, Walid

arXiv.org Artificial IntelligenceSep-19-2023

Large Language Models (LLMs) have exhibited remarkable performance on various Natural Language Processing (NLP) tasks. However, there is a current hot debate regarding their reasoning capacity. In this paper, we examine the performance of GPT-3.5, GPT-4, and BARD models, by performing a thorough technical evaluation on different reasoning tasks across eleven distinct datasets. Our paper provides empirical evidence showcasing the superior performance of ChatGPT-4 in comparison to both ChatGPT-3.5 and BARD in zero-shot setting throughout almost all evaluated tasks. While the superiority of GPT-4 compared to GPT-3.5 might be explained by its larger size and NLP efficiency, this was not evident for BARD. We also demonstrate that the three models show limited proficiency in Inductive, Mathematical, and Multi-hop Reasoning Tasks. To bolster our findings, we present a detailed and comprehensive analysis of the results from these three models. Furthermore, we propose a set of engineered prompts that enhances the zero-shot setting performance of all three models.

large language model, machine learning, natural language, (9 more...)

arXiv.org Artificial Intelligence

2305.12477

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Entity Identifier: A Natural Text Parsing-based Framework For Entity Relation Extraction

Chouham, El Mehdi, Espejel, Jessica López, Alassan, Mahaman Sanoussi Yahaya, Dahhane, Walid, Ettifouri, El Hassane

arXiv.org Artificial IntelligenceJul-10-2023

The field of programming has a diversity of paradigms that are used according to the working framework. While current neural code generation methods are able to learn and generate code directly from text, we believe that this approach is not optimal for certain code tasks, particularly the generation of classes in an object-oriented project. Specifically, we use natural language processing techniques to extract structured information from requirements descriptions, in order to automate the generation of CRUD (Create, Read, Update, Delete) class code. To facilitate this process, we introduce a pipeline for extracting entity and relation information, as well as a representation called an "Entity Tree" to model this information. We also create a dataset to evaluate the effectiveness of our approach.

artificial intelligence, natural language, text processing, (15 more...)

arXiv.org Artificial Intelligence

2307.04892

Country:

Europe > France (0.14)
North America > United States > Louisiana (0.14)
North America > Canada (0.14)
Europe > Portugal (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

Add feedback

A Comprehensive Review of State-of-The-Art Methods for Java Code Generation from Natural Language Text

Espejel, Jessica López, Alassan, Mahaman Sanoussi Yahaya, Chouham, El Mehdi, Dahhane, Walid, Ettifouri, El Hassane

arXiv.org Artificial IntelligenceJun-10-2023

Java Code Generation consists in generating automatically Java code from a Natural Language Text. This NLP task helps in increasing programmers' productivity by providing them with immediate solutions to the simplest and most repetitive tasks. Code generation is a challenging task because of the hard syntactic rules and the necessity of a deep understanding of the semantic aspect of the programming language. Many works tried to tackle this task using either RNN-based, or Transformer-based models. The latter achieved remarkable advancement in the domain and they can be divided into three groups: (1) encoder-only models, (2) decoder-only models, and (3) encoder-decoder models. In this paper, we provide a comprehensive review of the evolution and progress of deep learning models in Java code generation task. We focus on the most important methods and present their merits and limitations, as well as the objective functions used by the community. In addition, we provide a detailed description of datasets and evaluation metrics used in the literature. Finally, we discuss results of different models on CONCODE dataset, then propose some future directions.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.nlp.2023.100013

2306.06371

Country:

Europe (0.92)
North America > United States > California (0.28)
North America > United States > New York > New York County > New York City (0.14)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.65)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Automatic Programming (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

JaCoText: A Pretrained Model for Java Code-Text Generation

Espejel, Jessica López, Alassan, Mahaman Sanoussi Yahaya, Dahhane, Walid, Ettifouri, El Hassane

arXiv.org Artificial IntelligenceMar-22-2023

Pretrained transformer-based models have shown high performance in natural language generation task. However, a new wave of interest has surged: automatic programming language generation. This task consists of translating natural language instructions to a programming code. Despite the fact that well-known pretrained models on language generation have achieved good performance in learning programming languages, effort is still needed in automatic code generation. In this paper, we introduce JaCoText, a model based on Transformers neural network. It aims to generate java source code from natural language text. JaCoText leverages advantages of both natural language and code generation models. More specifically, we study some findings from the state of the art and use them to (1) initialize our model from powerful pretrained models, (2) explore additional pretraining on our java dataset, (3) carry out experiments combining the unimodal and bimodal data in the training, and (4) scale the input and output length during the fine-tuning of the model. Conducted experiments on CONCODE dataset show that JaCoText achieves new state-of-the-art results.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2303.12869

Country: Europe > France (0.28)

Genre: Research Report (0.82)

Industry: Education > Curriculum > Subject-Specific Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Automatic Programming (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback