AITopics | microblog

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

A Novel End-To-End Event Geolocation Method Leveraging Hyperbolic Space and Toponym Hierarchies

Qiao, Yaqiong, Huang, Guojun

arXiv.org Artificial Intelligence, Dec-14-2024

Abstract: Timely detection and geolocation of events based on social data can provide critical information for applications such as crisis response and resource allocation. However, most existing methods are greatly affected by event detection errors, leading to insufficient geolocation accuracy. To this end, this paper proposes a novel end-to-end event geolocation method (GTOP) leveraging Hyperbolic space and toponym hierarchies. Specifically, the proposed method contains one event detection module and one geolocation module. The event detection module constructs a heterogeneous information network based on social data, and then constructs a homogeneous message graph and combines it with the text and time features of the messages to learn initial features of nodes. Node features are updated in Hyperbolic space and then fed into a classifier for event detection. To reduce the geolocation error, this paper proposes a noise toponym filtering algorithm (HIST) based on the hierarchical structure of toponyms. HIST analyzes the hierarchical structure of toponyms mentioned in the event cluster, taking the highly frequent city-level locations as the coarse-grained locations for events. To further improve the geolocation accuracy, we propose a fine-grained pseudo toponym generation algorithm (FIT) based on the output of HIST, and combine generated pseudo toponyms with filtered toponyms to locate events based on the geographic center points of the combined toponyms. Extensive experiments are conducted on the Chinese dataset constructed in this paper and another public English dataset. The experimental results show that the proposed method is superior to the state-of-the-art baselines.
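
The abstract describes keeping toponyms that belong to the most frequent city-level location and then locating the event at the geographic center of the remaining toponyms. The sketch below illustrates only that general idea and is not the authors' HIST or FIT implementation. The record layout (`name`, `city`, `lat`, `lon`), the function names, and the sample coordinates are all assumptions made for illustration.

```python
import math
from collections import Counter

def filter_by_dominant_city(mentions):
    """Keep only mentions whose city-level parent is the most frequent one."""
    counts = Counter(m["city"] for m in mentions)
    dominant_city, _ = counts.most_common(1)[0]
    return [m for m in mentions if m["city"] == dominant_city]

def geographic_center(points):
    """Center of (lat, lon) points in degrees, averaged as 3D unit vectors."""
    x = y = z = 0.0
    for lat, lon in points:
        phi, lam = math.radians(lat), math.radians(lon)
        x += math.cos(phi) * math.cos(lam)
        y += math.cos(phi) * math.sin(lam)
        z += math.sin(phi)
    n = len(points)
    x, y, z = x / n, y / n, z / n
    center_lat = math.degrees(math.atan2(z, math.hypot(x, y)))
    center_lon = math.degrees(math.atan2(y, x))
    return center_lat, center_lon

# Hypothetical toponym mentions extracted from one event cluster.
mentions = [
    {"name": "place A", "city": "Hangzhou", "lat": 30.25, "lon": 120.15},
    {"name": "place B", "city": "Hangzhou", "lat": 30.27, "lon": 120.16},
    {"name": "place C", "city": "Beijing", "lat": 39.90, "lon": 116.40},  # noise
]

kept = filter_by_dominant_city(mentions)
center = geographic_center([(m["lat"], m["lon"]) for m in kept])
```

Averaging unit vectors rather than raw degrees avoids errors near the antimeridian; for toponyms within a single city, a plain mean of latitude and longitude would give nearly the same result.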

data mining, machine learning, toponym, (19 more...)

2412.1087

Country:

North America > United States > Texas > Dallas County > Dallas (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)
South America > Brazil > Ceará > Fortaleza (0.04)

Genre: Research Report > New Finding (0.66)

Industry:

Information Technology (0.68)
Law (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

QuakeBERT: Accurate Classification of Social Media Texts for Rapid Earthquake Impact Assessment

Han, Jin, Zheng, Zhe, Lu, Xin-Zheng, Chen, Ke-Yin, Lin, Jia-Rui

arXiv.org Artificial Intelligence, May-6-2024

Social media aids disaster response but suffers from noise, which hinders accurate impact assessment and decision making for resilient cities, a problem that few studies have considered. To address the problem, this study proposes the first domain-specific LLM and an integrated method for rapid earthquake impact assessment. First, a few categories are introduced to classify and filter microblogs according to their relationship to the physical and social impacts of earthquakes, and a dataset comprising 7282 earthquake-related microblogs from twenty earthquakes in different locations is developed. Then, with a systematic analysis of various influential factors, QuakeBERT, a domain-specific large language model (LLM), is developed and fine-tuned for accurate classification and filtering of microblogs. Meanwhile, an integrated method combining public opinion trend analysis, sentiment analysis, and keyword-based physical impact quantification is introduced to assess both the physical and social impacts of earthquakes based on social media texts. Experiments show that data diversity and data volume dominate the performance of QuakeBERT and increase the macro average F1 score by 27%, while the best classification model, QuakeBERT, outperforms the CNN- or RNN-based models by improving the macro average F1 score from 60.87% to 84.33%. Finally, the proposed approach is applied to assess two earthquakes with the same magnitude and focal depth. Results show that the proposed approach can effectively enhance the impact assessment process through accurate detection of noisy microblogs, which enables effective post-disaster emergency responses to create more resilient cities.
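
The macro average F1 score reported above is the unweighted mean of the per-class F1 scores, so rare classes count as much as frequent ones. The following is a minimal sketch of that metric in plain Python. The class labels and predictions are invented for illustration and are not taken from the paper's dataset.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical labels for six microblogs.
y_true = ["damage", "damage", "noise", "noise", "noise", "help"]
y_pred = ["damage", "noise", "noise", "noise", "help", "help"]
score = macro_f1(y_true, y_pred)
```

In this toy example every class happens to have an F1 of 2/3, so the macro average is also 2/3.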

dataset, earthquake, microblog, (16 more...)

doi: 10.1016/j.ijdrr.2024.104574

2405.06684

Country:

North America > Haiti (0.14)
Asia > China > Beijing > Beijing (0.04)
Asia > China > Hebei Province (0.04)

Genre: Research Report > New Finding (0.66)

Industry:

Information Technology > Services (0.67)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.48)
Health & Medicine > Therapeutic Area > Immunology (0.48)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.66)

IKDSumm: Incorporating Key-phrases into BERT for extractive Disaster Tweet Summarization

Garg, Piyush Kumar, Chakraborty, Roshni, Gupta, Srishti, Dandapat, Sourav Kumar

arXiv.org Artificial Intelligence, May-19-2023

Online social media platforms, such as Twitter, are one of the most valuable sources of information during disaster events. Therefore, humanitarian organizations, government agencies, and volunteers rely on a summary of this information, i.e., tweets, for effective disaster management. Although there are several existing supervised and unsupervised approaches for automated tweet summarization, these approaches either require extensive labeled information or do not incorporate specific domain knowledge of disasters. Additionally, the most recent approaches to disaster summarization have proposed BERT-based models to enhance the summary quality. However, to improve performance further, we introduce the use of domain-specific knowledge, without any human effort, to understand the importance (salience) of a tweet, which aids summary creation and improves summary quality. In this paper, we propose a disaster-specific tweet summarization framework, IKDSumm, which initially identifies the crucial and important information from each tweet related to a disaster through key-phrases of that tweet. We identify these key-phrases by utilizing the domain knowledge (using existing ontology) of disasters without any human intervention. Further, we utilize these key-phrases to automatically generate a summary of the tweets. Therefore, given tweets related to a disaster, IKDSumm ensures fulfillment of the key summarization objectives, such as information coverage, relevance, and diversity in the summary, without any human intervention. We compare the performance of IKDSumm with 8 state-of-the-art techniques on 12 disaster datasets. The evaluation results show that IKDSumm outperforms existing techniques by approximately 2-79% in terms of ROUGE-N F1-score.

data mining, information retrieval, machine learning, (21 more...)

2305.11592

Country:

North America > United States > New York > New York County > New York City (0.05)
Asia > Pakistan (0.04)
North America > United States > California > Los Angeles County > Los Angeles (0.04)

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.34)

Industry:

Education (1.00)
Health & Medicine (0.74)
Information Technology > Services (0.66)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Earthquake Impact Analysis Based on Text Mining and Social Media Analytics

Zheng, Zhe, Shi, Hong-Zheng, Zhou, Yu-Cheng, Lu, Xin-Zheng, Lin, Jia-Rui

arXiv.org Artificial Intelligence, Dec-12-2022

Earthquakes have a deep impact on wide areas, and emergency rescue operations may benefit from social media information about the scope and extent of the disaster. Therefore, this work presents a text mining-based approach to collect and analyze social media data for early earthquake impact analysis. First, disaster-related microblogs are collected from the Sina microblog based on crawler technology. Then, after data cleaning, a series of analyses are conducted, including (1) the hot words analysis, (2) the trend of the number of microblogs, (3) the trend of public opinion sentiment, and (4) a keyword and rule-based text classification for earthquake impact analysis. Finally, two recent earthquakes with the same magnitude and focal depth in China are analyzed to compare their impacts. The results show that the public opinion trend analysis and the trend of public opinion sentiment can estimate the earthquake's social impact at an early stage, which will be helpful to decision-making and rescue management.
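
Step (4) above, keyword and rule-based text classification, can be illustrated with a very small sketch. The categories, the English keywords, and the fallback label `other` are all hypothetical; the paper works with Chinese microblogs and its actual rules are not given in the abstract.

```python
# Hypothetical impact categories and trigger keywords (not from the paper).
RULES = {
    "building_damage": ("collapse", "crack", "damaged"),
    "casualties": ("injured", "dead", "trapped"),
    "lifeline": ("power outage", "no water", "road closed"),
}

def classify(text, rules=RULES):
    """Return every category with at least one keyword in the text."""
    lowered = text.lower()
    matched = [
        category
        for category, keywords in rules.items()
        if any(keyword in lowered for keyword in keywords)
    ]
    # Posts that trigger no rule fall into a catch-all class.
    return matched or ["other"]

labels_a = classify("Several houses collapsed and two people are trapped")
labels_b = classify("Thinking of everyone tonight")
```

Plain substring matching is the simplest possible rule and will over-match in practice; a real system would likely need tokenization and negation handling.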

data mining, machine learning, natural language, (22 more...)

2212.06765

Country:

North America > Haiti (0.14)
Asia > Japan > Honshū > Tōhoku (0.04)
Asia > China > Sichuan Province (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Social Sector (0.35)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Data Science > Data Mining > Text Mining (0.50)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Early Discovery of Disappearing Entities in Microblogs

Akasaki, Satoshi, Yoshinaga, Naoki, Toyoda, Masashi

arXiv.org Artificial Intelligence, Oct-13-2022

We make decisions by reacting to changes in the real world, in particular, the emergence and disappearance of impermanent entities such as events, restaurants, and services. Because we want to avoid missing out on opportunities or taking fruitless actions after they have disappeared, it is important to know as early as possible when entities disappear. We thus tackle the task of detecting disappearing entities from microblogs, whose posts mention various entities, in a timely manner. The major challenge is detecting uncertain contexts of disappearing entities from noisy microblog posts. To collect these disappearing contexts, we design time-sensitive distant supervision, which utilizes entities from the knowledge base and time-series posts, to build large-scale Twitter datasets for English and Japanese (we will release the datasets, as tweet IDs, used in the experiments to promote reproducibility). To ensure robust detection in noisy environments, we refine pretrained word embeddings of the detection model on microblog streams of the target day. Experimental results on the Twitter datasets confirmed the effectiveness of the collected labeled data and refined word embeddings; more than 70% of the detected disappearing entities in Wikipedia are discovered earlier than the update on Wikipedia, and the average lead-time is over one month.

artificial intelligence, machine learning, social media, (19 more...)

2210.07404

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)
Europe > United Kingdom > Scotland (0.04)
Europe > United Kingdom > England (0.04)

Genre: Research Report (0.50)

Industry:

Media (0.93)
Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Probing Spurious Correlations in Popular Event-Based Rumor Detection Benchmarks

Wu, Jiaying, Hooi, Bryan

arXiv.org Artificial Intelligence, Sep-19-2022

As social media becomes a hotbed for the spread of misinformation, the crucial task of rumor detection has witnessed promising advances fostered by open-source benchmark datasets. Despite being widely used, we find that these datasets suffer from spurious correlations, which are ignored by existing studies and lead to severe overestimation of existing rumor detection performance. The spurious correlations stem from three causes: (1) event-based data collection and labeling schemes assign the same veracity label to multiple highly similar posts from the same underlying event; (2) merging multiple data sources spuriously relates source identities to veracity labels; and (3) labeling bias. In this paper, we closely investigate three of the most popular rumor detection benchmark datasets (i.e., Twitter15, Twitter16 and PHEME), and propose event-separated rumor detection as a solution to eliminate spurious cues. Under the event-separated setting, we observe that the accuracy of existing state-of-the-art models drops significantly by over 40%, becoming only comparable to a simple neural classifier. To better address this task, we propose Publisher Style Aggregation (PSA), a generalizable approach that aggregates publisher posting records to learn writing style and veracity stance. Extensive experiments demonstrate that our method outperforms existing baselines in terms of effectiveness, efficiency and generalizability.
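
The event-separated setting described above means that all posts from one underlying event land on the same side of the train/test split, so a model cannot score well by memorizing event-specific cues. The sketch below shows one way such a split could be built. The `event` field, the function name, and the sample posts are assumptions, not the authors' code or the actual layout of Twitter15, Twitter16, or PHEME.

```python
import random

def event_separated_split(posts, test_fraction=0.2, seed=0):
    """Split posts so that no event appears in both train and test."""
    events = sorted({p["event"] for p in posts})
    rng = random.Random(seed)
    rng.shuffle(events)
    # Hold out whole events, at least one.
    n_test = max(1, round(len(events) * test_fraction))
    test_events = set(events[:n_test])
    train = [p for p in posts if p["event"] not in test_events]
    test = [p for p in posts if p["event"] in test_events]
    return train, test

# Hypothetical posts: five events with three posts each.
posts = [
    {"id": i, "event": f"event_{i % 5}", "label": "rumor" if i % 2 else "non-rumor"}
    for i in range(15)
]
train, test = event_separated_split(posts)
```

A random post-level split of the same data would almost always place near-duplicate posts from one event on both sides, which is the leakage the paper warns about.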

artificial intelligence, machine learning, social media, (14 more...)

2209.08799

Country:

Asia > Singapore > Central Region > Singapore (0.04)
Asia > Malaysia (0.04)

Genre: Research Report > Promising Solution (0.48)

Industry: Media > News (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

A Small Survey On Event Detection Using Twitter

Datta, Debanjan

arXiv.org Artificial Intelligence, Jul-30-2022

This is evident from popular phenomena such as effects of fake news and online social movements. However, the data obtained from social media presents itself with large volume and velocity, accompanied by a significant amount of irrelevant data pertaining to general discussions, personal messages and spam. Social media has been shown to be effective for detecting, forecasting and tracking real world events. The ability to detect real world events is crucial and has applications in disease surveillance, commerce, governance and other areas. Thus extraction of useful information and modelling the characteristics of social media to detect real world events is an important problem. To outline the research problem we need to define events, a term that has multiple interpretations.

information retrieval, machine learning, natural language, (20 more...)

2011.05801

Country:

North America > United States > Virginia > Arlington County > Arlington (0.04)
Asia (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Services (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

T-BERT -- Model for Sentiment Analysis of Micro-blogs Integrating Topic Model and BERT

Palani, Sarojadevi, Rajagopal, Prabhu, Pancholi, Sidharth

arXiv.org Artificial Intelligence, Jun-2-2021

Sentiment analysis (SA) has become an extensive research area in recent years impacting diverse fields including e-commerce, consumer business, and politics, driven by increasing adoption and usage of social media platforms. It is challenging to extract topics and sentiments from unsupervised short texts emerging in such contexts, as they may contain figurative words, strident data, and co-existence of many possible meanings for a single word or phrase, all contributing to obtaining incorrect topics. Most prior research is based on a specific theme/rhetoric/focused-content on a clean dataset. In the work reported here, the effectiveness of BERT (Bidirectional Encoder Representations from Transformers) in sentiment classification tasks on a raw live dataset taken from a popular microblogging platform is demonstrated. A novel T-BERT framework is proposed to show the enhanced performance obtainable by combining latent topics with contextual BERT embeddings. Numerical experiments were conducted on an ensemble with about 42000 datasets using the NimbleBox.ai platform with a hardware configuration consisting of an Nvidia Tesla K80 (CUDA), a 4-core CPU, and 15GB RAM running on an isolated Google Cloud Platform instance. The empirical results show that the model's performance improves when topics are added to BERT, and that BERT with the proposed approach achieves an accuracy of 90.81% on sentiment classification.

bert, information, sentiment analysis, (12 more...)

2106.01097

Country:

North America > United States (0.28)
Asia > Singapore (0.04)
Asia > Indonesia > Java > East Java > Surabaya (0.04)
Asia > India > NCT > Delhi (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Streaming Social Event Detection and Evolution Discovery in Heterogeneous Information Networks

Peng, Hao, Li, Jianxin, Song, Yangqiu, Yang, Renyu, Ranjan, Rajiv, Yu, Philip S., He, Lifang

arXiv.org Artificial Intelligence, Apr-1-2021

Events happen in the real world and in real time, and can be planned and organized for occasions such as social gatherings, festival celebrations, influential meetings or sports activities. Social media platforms generate a lot of real-time text information regarding public events with different topics. However, mining social events is challenging because events typically exhibit heterogeneous texture and metadata are often ambiguous. In this paper, we first design a novel event-based meta-schema to characterize the semantic relatedness of social events and then build an event-based heterogeneous information network (HIN) integrating information from an external knowledge base. Second, we propose a novel Pairwise Popularity Graph Convolutional Network, named PP-GCN, based on weighted meta-path instance similarity and textual semantic representation as inputs, to perform fine-grained social event categorization and learn the optimal weights of meta-paths in different tasks. Third, we propose a streaming social event detection and evolution discovery framework for HINs based on meta-path similarity search, historical information about meta-paths, and a heterogeneous DBSCAN clustering method. Comprehensive experiments on real-world streaming social text data are conducted to compare various social event detection and evolution discovery algorithms. Experimental results demonstrate that our proposed framework outperforms other alternative social event detection and evolution discovery techniques.

discovery, event detection, proceedings, (13 more...)

2104.00853

Country:

Europe > France > Île-de-France > Paris > Paris (0.14)
Asia > China > Beijing > Beijing (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Leisure & Entertainment > Social Events (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

An Unsupervised Normalization Algorithm for Noisy Text: A Case Study for Information Retrieval and Stance Detection

Roy, Anurag, Ghosh, Shalmoli, Ghosh, Kripabandhu, Ghosh, Saptarshi

arXiv.org Artificial Intelligence, Jan-9-2021

A large fraction of textual data available today contains various types of 'noise', such as OCR noise in digitized documents, noise due to the informal writing style of users on microblogging sites, and so on. To enable tasks such as search/retrieval and classification over all the available data, we need robust algorithms for text normalization, i.e., for cleaning different kinds of noise in the text. There have been several efforts towards cleaning or normalizing noisy text; however, many of the existing text normalization methods are supervised and require language-dependent resources or large amounts of training data that are difficult to obtain. We propose an unsupervised algorithm for text normalization that does not need any training data or human intervention. The proposed algorithm is applicable to text in different languages, and can handle both machine-generated and human-generated noise. Experiments over several standard datasets show that text normalization through the proposed algorithm enables better retrieval and stance detection than several baseline text normalization methods. Implementation of our algorithm can be found at https://github.com/ranarag/UnsupClean.

algorithm, dataset, similarity, (13 more...)

2101.03303

Country:

North America > United States (0.28)
Asia > India > West Bengal > Kharagpur (0.04)
North America > Cuba (0.04)

Genre:

Workflow (0.68)
Research Report > Experimental Study (0.46)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
