Yazdani, Majid
Leeroo Orchestrator: Elevating LLMs Performance Through Model Integration
Mohammadshahi, Alireza, Shaikh, Ali, Yazdani, Majid
In this paper, we propose an architecture that harnesses the collective knowledge of multiple trained LLMs to create a new state of the art. At the core of this framework is an LLM-based orchestrator that is adept at picking the right underlying LLM experts for optimal task execution. Inspired by self-play in reinforcement learning, we created a loop of query generation, orchestration, and evaluation to generate training data for the orchestrator. Our evaluation focused on the MMLU benchmark, employing models with 7B, 13B, and 34B parameters available on Hugging Face. The results demonstrate new state-of-the-art open-source models: our Leeroo orchestrator achieves performance on par with the Mixtral model while incurring only two-thirds of its cost. Moreover, when the allowed cost is raised to match Mixtral's, it surpasses Mixtral's accuracy by over 5%, reaching 75.9%. Further improvements were observed when integrating GPT4 into the underlying model pool: the Leeroo orchestrator nearly matches GPT4's performance at half the cost and even exceeds GPT4's results with a 25% cost reduction. These findings illustrate the potential of our architecture to create state-of-the-art, cost-effective LLMs by optimizing the synergy between multiple underlying models.
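Below is a minimal sketch of the cost-aware routing idea described in this abstract: given a query, a budget, and per-expert quality/cost estimates, pick the expert with the best predicted quality that fits the budget. In the paper the orchestrator is itself a trained LLM; here a plain scoring function stands in, and the expert names, costs, and quality scores are purely illustrative.

```python
# Minimal sketch of cost-aware orchestration over a pool of expert LLMs.
# The expert names, per-query costs, and quality predictors below are
# illustrative placeholders, not the models or numbers used in the paper.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Expert:
    name: str
    cost_per_query: float                     # normalised inference cost
    predict_quality: Callable[[str], float]   # orchestrator's score for a query

def orchestrate(query: str, experts: list[Expert], budget: float) -> Expert:
    """Pick the highest-predicted-quality expert whose cost fits the budget."""
    affordable = [e for e in experts if e.cost_per_query <= budget]
    if not affordable:                         # fall back to the cheapest expert
        return min(experts, key=lambda e: e.cost_per_query)
    return max(affordable, key=lambda e: e.predict_quality(query))

# Toy pool: a small maths-leaning expert and a larger generalist.
experts = [
    Expert("small-maths-7b", cost_per_query=1.0,
           predict_quality=lambda q: 0.9 if "integral" in q else 0.4),
    Expert("generalist-34b", cost_per_query=4.0,
           predict_quality=lambda q: 0.7),
]
print(orchestrate("Compute the integral of x^2", experts, budget=2.0).name)
```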
RQUGE: Reference-Free Metric for Evaluating Question Generation by Answering the Question
Mohammadshahi, Alireza, Scialom, Thomas, Yazdani, Majid, Yanki, Pouya, Fan, Angela, Henderson, James, Saeidi, Marzieh
Existing metrics for evaluating the quality of automatically generated questions, such as BLEU, ROUGE, BERTScore, and BLEURT, compare the reference and predicted questions, assigning a high score when there is considerable lexical overlap or semantic similarity between the candidate and the reference questions. This approach has two major shortcomings. First, it requires expensive human-provided reference questions. Second, it penalises valid questions that may not have high lexical or semantic similarity to the reference questions. In this paper, we propose a new metric, RQUGE, based on the answerability of the candidate question given the context. The metric consists of a question-answering module and a span-scorer module, built on pre-trained models from the existing literature, so it can be used without any further training. We demonstrate that RQUGE achieves a higher correlation with human judgment than reference-based metrics, without relying on a reference question. Additionally, RQUGE is shown to be more robust to several adversarial corruptions. Furthermore, we show that the performance of QA models on out-of-domain datasets can be significantly improved by fine-tuning on synthetic data generated by a question generation model and re-ranked with RQUGE.
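The following is an illustrative, reference-free scoring loop in the spirit of RQUGE: answer the candidate question with an off-the-shelf QA model, then compare the predicted span to the gold answer. A simple token-level F1 stands in for the paper's learned span scorer, and the QA checkpoint name is only an example, not the one used in the paper.

```python
# Illustrative answerability-based scoring in the spirit of RQUGE.
# Token-level F1 is a simplification of the paper's trained span scorer.
from collections import Counter
from transformers import pipeline

qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

def token_f1(pred: str, gold: str) -> float:
    p, g = pred.lower().split(), gold.lower().split()
    common = sum((Counter(p) & Counter(g)).values())
    if common == 0:
        return 0.0
    precision, recall = common / len(p), common / len(g)
    return 2 * precision * recall / (precision + recall)

def answerability_score(context: str, candidate_question: str, gold_answer: str) -> float:
    # Answer the candidate question from the context, then score the span.
    predicted = qa(question=candidate_question, context=context)["answer"]
    return token_f1(predicted, gold_answer)

context = "Marie Curie won the Nobel Prize in Physics in 1903."
print(answerability_score(context, "When did Marie Curie win the Nobel Prize in Physics?", "1903"))
```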
Database Reasoning Over Text
Thorne, James, Yazdani, Majid, Saeidi, Marzieh, Silvestri, Fabrizio, Riedel, Sebastian, Halevy, Alon
Neural models have shown impressive performance gains in answering queries from natural language text. However, existing works are unable to support database queries, such as "List/Count all female athletes who were born in the 20th century", which require reasoning over sets of relevant facts with operations such as join, filtering, and aggregation. We show that while state-of-the-art transformer models perform very well on small databases, they exhibit limitations in processing noisy data, numerical operations, and queries that aggregate facts. We propose a modular architecture that answers such database-style queries over multiple spans extracted from text and aggregates the results at scale. We evaluate the architecture using WikiNLDB, a novel dataset for exploring such queries. Our architecture scales to databases containing thousands of facts, whereas contemporary models are limited by how many facts they can encode. In a direct comparison on small databases, our approach increases overall answer accuracy from 85% to 90%. On larger databases, our approach retains its accuracy, whereas transformer baselines could not encode the context.
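The select-extract-aggregate pattern behind such database-style queries can be sketched as follows. The per-fact relevance and extraction step is stubbed out with a toy function in place of the paper's neural modules, and the facts are invented examples; only the overall control flow mirrors the described architecture.

```python
# Schematic select -> extract -> aggregate over a set of textual facts.
# The `extract` callable stands in for the neural per-fact modules.
from typing import Callable, Iterable, Optional

def answer_query(
    facts: Iterable[str],
    extract: Callable[[str], Optional[str]],  # returns an entity if the fact is relevant
    aggregate: str = "list",
):
    hits = [e for e in (extract(f) for f in facts) if e is not None]
    if aggregate == "count":
        return len(set(hits))
    return sorted(set(hits))

# Toy query: "Count all female athletes born in the 20th century."
facts = [
    "Simone Biles (born 1997) is an American gymnast.",
    "Jesse Owens (born 1913) was an American sprinter.",
    "Serena Williams (born 1981) is an American tennis player.",
]
relevant = {"Simone Biles", "Serena Williams"}  # pretend per-fact neural output
extract = lambda f: next((n for n in relevant if n in f), None)
print(answer_query(facts, extract, aggregate="count"))  # -> 2
```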
KILT: a Benchmark for Knowledge Intensive Language Tasks
Petroni, Fabio, Piktus, Aleksandra, Fan, Angela, Lewis, Patrick, Yazdani, Majid, De Cao, Nicola, Thorne, James, Jernite, Yacine, Plachouras, Vassilis, Rocktäschel, Tim, Riedel, Sebastian
Challenging problems such as open-domain question answering, fact checking, slot filling and entity linking require access to large, external knowledge sources. While some models do well on individual tasks, developing general models is difficult as each task might require computationally expensive indexing of custom knowledge sources, in addition to dedicated infrastructure. To catalyze research on models that condition on specific information in large textual resources, we present a benchmark for knowledge-intensive language tasks (KILT). All tasks in KILT are grounded in the same snapshot of Wikipedia, reducing engineering turnaround through the re-use of components, as well as accelerating research into task-agnostic memory architectures. We test both task-specific and general baselines, evaluating downstream performance in addition to the ability of the models to provide provenance. We find that a shared dense vector index coupled with a seq2seq model is a strong baseline, outperforming more tailor-made approaches for fact checking, open-domain question answering and dialogue, and yielding competitive results on entity linking and slot filling, by generating disambiguated text. KILT data and code are available at https://github.com/facebookresearch/KILT.
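As a starting point, the KILT tasks can be pulled from the Hugging Face hub. The snippet below assumes the hub-hosted `kilt_tasks` dataset and field names from the published KILT schema (each output carries answers plus Wikipedia provenance), so the exact identifiers should be checked against the loaded dataset.

```python
# Sketch of loading one KILT task and inspecting answers and provenance.
# Dataset id and field names follow the published KILT release ("kilt_tasks")
# and may need verification against the current hub version.
from datasets import load_dataset

nq = load_dataset("kilt_tasks", "nq", split="validation")

example = nq[0]
print("Input:", example["input"])
for output in example["output"]:
    if output["answer"]:
        print("Answer:", output["answer"])
    for prov in output["provenance"]:
        # Provenance points back into the shared Wikipedia snapshot,
        # enabling evaluation of both the answer and its evidence.
        print("  Evidence page:", prov["title"], "wikipedia_id:", prov["wikipedia_id"])
```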