AITopics | body text

Collaborating Authors

body text

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Empowering Interdisciplinary Research with BERT-Based Models: An Approach Through SciBERT-CNN with Topic Modeling

Likhareva, Darya, Sankaran, Hamsini, Thiyagarajan, Sivakumar

arXiv.org Artificial IntelligenceApr-23-2024

Researchers must stay current in their fields by regularly reviewing academic literature, a task complicated by the daily publication of thousands of papers. Traditional multi-label text classification methods often ignore semantic relationships and fail to address the inherent class imbalances. This paper introduces a novel approach using the SciBERT model and CNNs to systematically categorize academic abstracts from the Elsevier OA CC-BY corpus. We use a multi-segment input strategy that processes abstracts, body text, titles, and keywords obtained via BERT topic modeling through SciBERT. Here, the [CLS] token embeddings capture the contextual representation of each segment, concatenated and processed through a CNN. The CNN uses convolution and pooling to enhance feature extraction and reduce dimensionality, optimizing the data for classification. Additionally, we incorporate class weights based on label frequency to address the class imbalance, significantly improving the classification F1 score and enhancing text classification systems and literature review efficiency.

classification, dataset, text classification, (17 more...)

arXiv.org Artificial Intelligence

2404.13078

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
Asia (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

ISQA: Informative Factuality Feedback for Scientific Summarization

Li, Zekai, Qin, Yanxia, Liu, Qian, Kan, Min-Yen

arXiv.org Artificial IntelligenceApr-19-2024

We propose Iterative Facuality Refining on Informative Scientific Question-Answering (ISQA) feedback\footnote{Code is available at \url{https://github.com/lizekai-richard/isqa}}, a method following human learning theories that employs model-generated feedback consisting of both positive and negative information. Through iterative refining of summaries, it probes for the underlying rationale of statements to enhance the factuality of scientific summarization. ISQA does this in a fine-grained manner by asking a summarization agent to reinforce validated statements in positive feedback and fix incorrect ones in negative feedback. Our findings demonstrate that the ISQA feedback mechanism significantly improves the factuality of various open-source LLMs on the summarization task, as evaluated across multiple scientific datasets.

computational linguistic, factuality, llm, (15 more...)

arXiv.org Artificial Intelligence

2404.13246

Country:

North America > Canada > Ontario > Toronto (0.05)
Asia > Singapore (0.05)
North America > Dominican Republic (0.04)
(6 more...)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)

Add feedback

How Good Are SOTA Fake News Detectors

Iceland, Matthew

arXiv.org Artificial IntelligenceAug-4-2023

Automatic fake news detection with machine learning can prevent the dissemination of false statements before they gain many views. Several datasets labeling statements as legitimate or false have been created since the 2016 United States presidential election for the prospect of training machine learning models. We evaluate the robustness of both traditional and deep state-of-the-art models to gauge how well they may perform in the real world. We find that traditional models tend to generalize better to data outside the distribution it was trained on compared to more recently-developed large language models, though the best model to use may depend on the specific task at hand.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2308.02727

Country:

Europe > Russia (0.04)
Asia > Russia (0.04)
North America > United States > Wisconsin (0.04)
(7 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Government > Voting & Elections (0.86)
Media > News (0.78)
Government > Regional Government > North America Government > United States Government (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.71)

Add feedback

Emulating Reader Behaviors for Fake News Detection

Yin, Junwei, Gao, Min, Shu, Kai, Zhao, Zehua, Huang, Yinqiu, Wang, Jia

arXiv.org Artificial IntelligenceJun-27-2023

The wide dissemination of fake news has affected our lives in many aspects, making fake news detection important and attracting increasing attention. Existing approaches make substantial contributions in this field by modeling news from a single-modal or multi-modal perspective. However, these modal-based methods can result in sub-optimal outcomes as they ignore reader behaviors in news consumption and authenticity verification. For instance, they haven't taken into consideration the component-by-component reading process: from the headline, images, comments, to the body, which is essential for modeling news with more granularity. To this end, we propose an approach of Emulating the behaviors of readers (Ember) for fake news detection on social media, incorporating readers' reading and verificating process to model news from the component perspective thoroughly. Specifically, we first construct intra-component feature extractors to emulate the behaviors of semantic analyzing on each component. Then, we design a module that comprises inter-component feature extractors and a sequence-based aggregator. This module mimics the process of verifying the correlation between components and the overall reading and verification sequence. Thus, Ember can handle the news with various components by emulating corresponding sequences. We conduct extensive experiments on nine real-world datasets, and the results demonstrate the superiority of Ember.

data mining, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2306.15231

Country:

Oceania > Australia > New South Wales > Sydney (0.14)
Asia > China > Chongqing Province > Chongqing (0.06)
North America > United States > New York > New York County > New York City (0.04)
(6 more...)

Genre: Research Report (0.70)

Industry: Media > News (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Add feedback

Detecting Contextomized Quotes in News Headlines by Contrastive Learning

Song, Seonyeong, Song, Hyeonho, Park, Kunwoo, Han, Jiyoung, Cha, Meeyoung

arXiv.org Artificial IntelligenceFeb-9-2023

Quotes are critical for establishing credibility in news articles. A direct quote enclosed in quotation marks has a strong visual appeal and is a sign of a reliable citation. Unfortunately, this journalistic practice is not strictly Figure 1: The central idea of QuoteCSE is based on followed, and a quote in the headline is often journalism principles, where quotes from news headlines "contextomized." Such a quote uses words out and body text should be matched. The proposed of context in a way that alters the speaker's contrastive learning framework maximizes the intention so that there is no semantically semantic similarity between the headline quote and the matching quote in the body text. We present matched quote in the body text while minimizing the QuoteCSE, a contrastive learning framework similarity for other unmatched quotes in the same or that represents the embedding of news quotes other articles.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2302.04465

Country:

North America > United States > Missouri (0.05)
Europe > Greece (0.05)
Asia > South Korea (0.04)
Asia > Japan > Hokkaidō > Hokkaidō Prefecture > Sapporo (0.04)

Genre: Research Report (1.00)

Industry: Media > News (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.48)

Add feedback

Incongruity Detection between Bangla News Headline and Body Content through Graph Neural Network

Palash, Md Aminul Haque, Khan, Akib, Islam, Kawsarul, Nasim, MD Abdullah Al, Shahjahan, Ryan Mohammad Bin

arXiv.org Artificial IntelligenceOct-26-2022

Incongruity between news headlines and the body content is a common method of deception used to attract readers. Profitable headlines pique readers' interest and encourage them to visit a specific website. This is usually done by adding an element of dishonesty, using enticements that do not precisely reflect the content being delivered. As a result, automatic detection of incongruent news between headline and body content using language analysis has gained the research community's attention. However, various solutions are primarily being developed for English to address this problem, leaving low-resource languages out of the picture. Bangla is ranked 7th among the top 100 most widely spoken languages, which motivates us to pay special attention to the Bangla language. Furthermore, Bangla has a more complex syntactic structure and fewer natural language processing resources, so it becomes challenging to perform NLP tasks like incongruity detection and stance detection. To tackle this problem, for the Bangla language, we offer a graph-based hierarchical dual encoder (BGHDE) model that learns the content similarity and contradiction between Bangla news headlines and content paragraphs effectively. The experimental results show that the proposed Bangla graph-based neural network model achieves above 90% accuracy on various Bangla news datasets.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2211.07709

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Texas > Dallas County > Dallas (0.04)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
(6 more...)

Genre: Research Report > New Finding (0.88)

Industry: Media > News (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Combination Of Convolution Neural Networks And Deep Neural Networks For Fake News Detection

Jawad, Zainab A., Obaid, Ahmed J.

arXiv.org Artificial IntelligenceOct-15-2022

Nowadays, People prefer to follow the latest news on social media, as it is cheap, easily accessible, and quickly disseminated. However, it can spread fake or unreliable, low-quality news that intentionally contains false information. The spread of fake news can have a negative effect on people and society. Given the seriousness of such a problem, researchers did their best to identify patterns and characteristics that fake news may exhibit to design a system that can detect fake news before publishing. In this paper, we have described the Fake News Challenge stage #1 (FNC-1) dataset and given an overview of the competitive attempts to build a fake news detection system using the FNC-1 dataset. The proposed model was evaluated with the FNC-1 dataset. A competitive dataset is considered an open problem and a challenge worldwide. This system's procedure implies processing the text in the headline and body text columns with different natural language processing techniques. After that, the extracted features are reduced using the elbow truncated method, finding the similarity between each pair using the soft cosine similarity method. The new feature is entered into CNN and DNN deep learning approaches. The proposed system detects all the categories with high accuracy except the disagree category. As a result, the system achieves up to 84.6 % accuracy, classifying it as the second ranking based on other competitive studies regarding this dataset.

accuracy, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2210.08331

Country: Asia > Middle East > Iraq > Najaf Governorate > Najaf (0.04)

Genre: Research Report > New Finding (0.47)

Industry: Media > News (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Machine Learning-based Automatic Annotation and Detection of COVID-19 Fake News

Akhtar, Mohammad Majid, Sharma, Bibhas, Karunanayake, Ishan, Masood, Rahat, Ikram, Muhammad, Kanhere, Salil S.

arXiv.org Artificial IntelligenceSep-7-2022

COVID-19 impacted every part of the world, although the misinformation about the outbreak traveled faster than the virus. Misinformation spread through online social networks (OSN) often misled people from following correct medical practices. In particular, OSN bots have been a primary source of disseminating false information and initiating cyber propaganda. Existing work neglects the presence of bots that act as a catalyst in the spread and focuses on fake news detection in 'articles shared in posts' rather than the post (textual) content. Most work on misinformation detection uses manually labeled datasets that are hard to scale for building their predictive models. In this research, we overcome this challenge of data scarcity by proposing an automated approach for labeling data using verified fact-checked statements on a Twitter dataset. In addition, we combine textual features with user-level features (such as followers count and friends count) and tweet-level features (such as number of mentions, hashtags and urls in a tweet) to act as additional indicators to detect misinformation. Moreover, we analyzed the presence of bots in tweets and show that bots change their behavior over time and are most active during the misinformation campaign. We collected 10.22 Million COVID-19 related tweets and used our annotation model to build an extensive and original ground truth dataset for classification purposes. We utilize various machine learning models to accurately detect misinformation and our best classification model achieves precision (82%), recall (96%), and false positive rate (3.58%). Also, our bot analysis indicates that bots generated approximately 10% of misinformation tweets. Our methodology results in substantial exposure of false information, thus improving the trustworthiness of information disseminated through social media platforms.

dataset, misinformation, tweet, (16 more...)

arXiv.org Artificial Intelligence

2209.03162

Country:

Europe > United Kingdom (0.14)
Oceania > Australia > New South Wales > Sydney (0.05)
North America > United States > Indiana (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry:

Media > News (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Add feedback

Less is Less: When Are Snippets Insufficient for Human vs Machine Relevance Estimation?

Kazai, Gabriella, Mitra, Bhaskar, Dong, Anlei, Craswell, Nick, Yang, Linjun

arXiv.org Artificial IntelligenceJan-21-2022

Traditional information retrieval (IR) ranking models process the full text of documents. Newer models based on Transformers, however, would incur a high computational cost when processing long texts, so typically use only snippets from the document instead. The model's input based on a document's URL, title, and snippet (UTS) is akin to the summaries that appear on a search engine results page (SERP) to help searchers decide which result to click. This raises questions about when such summaries are sufficient for relevance estimation by the ranking model or the human assessor, and whether humans and machines benefit from the document's full text in similar ways. To answer these questions, we study human and neural model based relevance assessments on 12k query-documents sampled from Bing's search logs. We compare changes in the relevance assessments when only the document summaries and when the full text is also exposed to assessors, studying a range of query and document properties, e.g., query type, snippet length. Our findings show that the full text is beneficial for humans and a BERT model for similar query and document types, e.g., tail, long queries. A closer look, however, reveals that humans and machines respond to the additional input in very different ways. Adding the full text can also hurt the ranker's performance, e.g., for navigational queries.

assessor, body text, query, (13 more...)

arXiv.org Artificial Intelligence

2201.08721

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Washington > King County > Redmond (0.04)
Asia (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

POSHAN: Cardinal POS Pattern Guided Attention for News Headline Incongruence

Mishra, Rahul, Zhang, Shuo

arXiv.org Artificial IntelligenceNov-5-2021

Automatic detection of click-bait and incongruent news headlines is crucial to maintaining the reliability of the Web and has raised much research attention. However, most existing methods perform poorly when news headlines contain contextually important cardinal values, such as a quantity or an amount. In this work, we focus on this particular case and propose a neural attention based solution, which uses a novel cardinal Part of Speech (POS) tag pattern based hierarchical attention network, namely POSHAN, to learn effective representations of sentences in a news article. In addition, we investigate a novel cardinal phrase guided attention, which uses word embeddings of the contextually-important cardinal value and neighbouring words. In the experiments conducted on two publicly available datasets, we observe that the proposed methodgives appropriate significance to cardinal values and outperforms all the baselines. An ablation study of POSHAN shows that the cardinal POS-tag pattern-based hierarchical attention is very effective for the cases in which headlines contain cardinal values.

cardinal phrase, dataset, representation, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3459637.3482376

2111.03547

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > New York > New York County > New York City (0.04)
Oceania > Australia (0.04)
(9 more...)

Genre: Research Report (1.00)

Industry: Media > News (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
(3 more...)

Add feedback