AITopics | Hosseini, Mohammad Javad

Collaborating Authors

Hosseini, Mohammad Javad

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Entailed Between the Lines: Incorporating Implication into NLI

Havaldar, Shreya, Alvari, Hamidreza, Fabrikant, Alex, Palowitch, John, Hosseini, Mohammad Javad, Buthpitiya, Senaka

arXiv.org Artificial IntelligenceJan-13-2025

Much of human communication depends on implication, conveying meaning beyond literal words to express a wider range of thoughts, intentions, and feelings. For models to better understand and facilitate human communication, they must be responsive to the text's implicit meaning. We focus on Natural Language Inference (NLI), a core tool for many language tasks, and find that state-of-the-art NLI models and datasets struggle to recognize a range of cases where entailment is implied, rather than explicit from the text. We formalize implied entailment as an extension of the NLI task and introduce the Implied NLI dataset (INLI) to help today's LLMs both recognize a broader variety of implied entailments and to distinguish between implicit and explicit entailment. We show how LLMs fine-tuned on INLI understand implied entailment and can generalize this understanding across datasets and domains.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2501.07719

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area (0.68)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Scalable and Domain-General Abstractive Proposition Segmentation

Hosseini, Mohammad Javad, Gao, Yang, Baumgärtner, Tim, Fabrikant, Alex, Amplayo, Reinald Kim

arXiv.org Artificial IntelligenceJun-28-2024

Segmenting text into fine-grained units of meaning is important to a wide range of NLP applications. The default approach of segmenting text into sentences is often insufficient, especially since sentences are usually complex enough to include multiple units of meaning that merit separate treatment in the downstream task. We focus on the task of abstractive proposition segmentation: transforming text into simple, self-contained, well-formed sentences. Several recent works have demonstrated the utility of proposition segmentation with few-shot prompted LLMs for downstream tasks such as retrieval-augmented grounding and fact verification. However, this approach does not scale to large amounts of text and may not always extract all the facts from the input text. In this paper, we first introduce evaluation metrics for the task to measure several dimensions of quality. We then propose a scalable, yet accurate, proposition segmentation model. We model proposition segmentation as a supervised task by training LLMs on existing annotated datasets and show that training yields significantly improved results. We further show that by using the fine-tuned LLMs as teachers for annotating large amounts of multi-domain synthetic distillation data, we can train smaller student models with results similar to the teacher LLMs. We then demonstrate that our technique leads to effective domain generalization, by annotating data in two domains outside the original training data and evaluating on them. Finally, as a key contribution of the paper, we share an easy-to-use API for NLP practitioners to use.

artificial intelligence, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2406.19803

Country:

Europe > United Kingdom > England (0.14)
North America > United States > Massachusetts (0.14)
Asia > Middle East > UAE (0.14)

Genre: Research Report (0.50)

Industry:

Leisure & Entertainment (0.46)
Education (0.36)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Sources of Hallucination by Large Language Models on Inference Tasks

McKenna, Nick, Li, Tianyi, Cheng, Liang, Hosseini, Mohammad Javad, Johnson, Mark, Steedman, Mark

arXiv.org Artificial IntelligenceOct-22-2023

Large Language Models (LLMs) are claimed to be capable of Natural Language Inference (NLI), necessary for applied tasks like question answering and summarization. We present a series of behavioral studies on several LLM families (LLaMA, GPT-3.5, and PaLM) which probe their behavior using controlled experiments. We establish two biases originating from pretraining which predict much of their behavior, and show that these are major sources of hallucination in generative LLMs. First, memorization at the level of sentences: we show that, regardless of the premise, models falsely label NLI test samples as entailing when the hypothesis is attested in training data, and that entities are used as ``indices'' to access the memorized data. Second, statistical patterns of usage learned at the level of corpora: we further show a similar effect when the premise predicate is less frequent than that of the hypothesis in the training data, a bias following from previous studies. We demonstrate that LLMs perform significantly worse on NLI test samples which do not conform to these biases than those which do, and we offer these as valuable controls for future LLM evaluation.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2305.14552

Country:

Europe (1.00)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > United States > Louisiana (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

LAIT: Efficient Multi-Segment Encoding in Transformers with Layer-Adjustable Interaction

Milbauer, Jeremiah, Louis, Annie, Hosseini, Mohammad Javad, Fabrikant, Alex, Metzler, Donald, Schuster, Tal

arXiv.org Artificial IntelligenceMay-31-2023

Transformer encoders contextualize token representations by attending to all other tokens at each layer, leading to quadratic increase in compute effort with the input length. In practice, however, the input text of many NLP tasks can be seen as a sequence of related segments (e.g., the sequence of sentences within a passage, or the hypothesis and premise in NLI). While attending across these segments is highly beneficial for many tasks, we hypothesize that this interaction can be delayed until later encoding stages. To this end, we introduce Layer-Adjustable Interactions in Transformers (LAIT). Within LAIT, segmented inputs are first encoded independently, and then jointly. This partial two-tower architecture bridges the gap between a Dual Encoder's ability to pre-compute representations for segments and a fully self-attentive Transformer's capacity to model cross-segment attention. The LAIT framework effectively leverages existing pretrained Transformers and converts them into the hybrid of the two aforementioned architectures, allowing for easy and intuitive control over the performance-efficiency tradeoff. Experimenting on a wide range of NLP tasks, we find LAIT able to reduce 30-50% of the attention FLOPs on many tasks, while preserving high accuracy; in some practical settings, LAIT could reduce actual latency by orders of magnitude.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2305.19585

Country:

Europe (0.67)
North America > United States > Minnesota (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)

Add feedback

Resolving Indirect Referring Expressions for Entity Selection

Hosseini, Mohammad Javad, Radlinski, Filip, Pareti, Silvia, Louis, Annie

arXiv.org Artificial IntelligenceMay-26-2023

Recent advances in language modeling have enabled new conversational systems. In particular, it is often desirable for people to make choices among specified options when using such systems. We address this problem of reference resolution, when people use natural expressions to choose between the entities. For example, given the choice `Should we make a Simnel cake or a Pandan cake?' a natural response from a dialog participant may be indirect: `let's make the green one'. Such natural expressions have been little studied for reference resolution. We argue that robustly understanding such language has large potential for improving naturalness in dialog, recommendation, and search systems. We create AltEntities (Alternative Entities), a new public dataset of 42K entity pairs and expressions (referring to one entity in the pair), and develop models for the disambiguation problem. Consisting of indirect referring expressions across three domains, our corpus enables for the first time the study of how language models can be adapted to this task. We find they achieve 82%-87% accuracy in realistic settings, which while reasonable also invites further advances.

annotator, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2212.10933

Country:

Europe (0.67)
Asia > Middle East (0.28)
North America > United States > Minnesota (0.28)

Genre: Research Report > Experimental Study (0.67)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.54)

Add feedback

Learning Sparse Gaussian Graphical Models with Overlapping Blocks

Hosseini, Mohammad Javad, Lee, Su-In

Neural Information Processing SystemsDec-31-2016

We present a novel framework, called GRAB (GRaphical models with overlApping Blocks), to capture densely connected components in a network estimate. GRAB takes as input a data matrix of p variables and n samples, and jointly learns both a network among p variables and densely connected groups of variables (called `blocks'). GRAB has four major novelties as compared to existing network estimation methods: 1) It does not require the blocks to be given a priori. 2) Blocks can overlap. 3) It can jointly learn a network structure and overlapping blocks. 4) It solves a joint optimization problem with the block coordinate descent method that is convex in each step. We show that GRAB reveals the underlying network structure substantially better than four state-of-the-art competitors on synthetic data. When applied to cancer gene expression data, GRAB outperforms its competitors in revealing known functional gene sets and potentially novel genes that drive cancer.

health & medicine, matrix, oncology, (19 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Add feedback