 Information Retrieval


Evo* 2022 -- Late-Breaking Abstracts Volume

arXiv.org Artificial Intelligence

This volume contains the Late-Breaking Abstracts accepted at the Evo* 2022 Conference, held in Madrid (Spain) from 20 to 22 April. They were presented both as short talks and at the conference's poster session. The works present ongoing research and preliminary results on the application of different Evolutionary Computation approaches and other Nature-Inspired techniques to a variety of problems, most of them real-world ones. These are promising contributions, since they outline some of the upcoming advances and applications in the area of nature-inspired methods, mainly Evolutionary Algorithms.


A Small Survey On Event Detection Using Twitter

arXiv.org Artificial Intelligence

This is evident from popular phenomena such as the effects of fake news and online social movements. However, the data obtained from social media arrives in large volume and at high velocity, accompanied by a significant amount of irrelevant data pertaining to general discussions, personal messages and spam. Social media has been shown to be effective for detecting, forecasting and tracking real-world events. The ability to detect real-world events is crucial and has applications in disease surveillance, commerce, governance and other areas. Thus, extracting useful information and modelling the characteristics of social media to detect real-world events is an important problem.

2 RESEARCH PROBLEM

To outline the research problem we need to define events, a term that has multiple interpretations.


PHEMEPlus: Enriching Social Media Rumour Verification with External Evidence

arXiv.org Artificial Intelligence

Work on social media rumour verification utilises signals from posts, their propagation and the users involved. Other lines of work target identifying and fact-checking claims based on information from Wikipedia or trustworthy news articles, without considering social media context. However, works combining the information from social media with external evidence from the wider web are lacking. To facilitate research in this direction, we release a novel dataset, PHEMEPlus, an extension of the PHEME benchmark, which contains social media conversations as well as relevant external evidence for each rumour. We demonstrate the effectiveness of incorporating such evidence in improving rumour verification models. Additionally, as part of the evidence collection, we evaluate various ways of query formulation to identify the most effective method.


AI Tools Streamline Content Marketing and SEO

#artificialintelligence

The aim of content marketing is to attract, engage, and retain customers. It takes many forms, including videos, podcasts, graphics, articles, and whitepapers. Each of those could have a sub-task. This article focuses on attracting an audience -- driving top-of-the-funnel prospects -- with blog content. A blog post that ranks well on search engine results pages must include the words and phrases that searchers use.


UNIMIB at TREC 2021 Clinical Trials Track

arXiv.org Artificial Intelligence

This contribution summarizes the participation of the UNIMIB team in the TREC 2021 Clinical Trials Track. We have investigated the effect of different query representations combined with several retrieval models on retrieval performance. First, we have implemented a neural re-ranking approach to study the effectiveness of dense text representations. Additionally, we have investigated the effectiveness of a novel decision-theoretic model for relevance estimation. Finally, both of the above relevance models have been compared with standard retrieval approaches. In particular, we combined a keyword extraction method with a standard retrieval process based on the BM25 model and a decision-theoretic relevance model that exploits the characteristics of this particular search task. The obtained results show that the proposed keyword extraction method improves 84% of the queries over TREC's median NDCG@10 measure when combined with either traditional or decision-theoretic relevance models. Moreover, regarding RPEC@10, the employed decision-theoretic model improves 85% of the queries over the reported TREC median value.
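To make the reported comparison concrete, the following minimal sketch computes NDCG@10 for a ranked list of graded relevance labels and the share of queries on which a system beats a per-query median. It uses one common formulation of NDCG (linear gain, log2 discount); the exact variant used in the TREC evaluation is not stated in the abstract, and the function names and sample scores are hypothetical.

```python
import math

def dcg_at_k(gains, k=10):
    """Discounted cumulative gain over the top-k graded relevance labels."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))

def ndcg_at_k(ranked_gains, k=10):
    """DCG of the ranking divided by the DCG of the ideal (sorted) ranking.

    Assumption: the ideal ranking is built from the labels of the ranked
    list itself; a full evaluation would use all judged documents.
    """
    ideal_dcg = dcg_at_k(sorted(ranked_gains, reverse=True), k)
    return dcg_at_k(ranked_gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0

def share_above_median(system_scores, median_scores):
    """Fraction of queries where the system score exceeds the median score."""
    wins = sum(1 for q, s in system_scores.items() if s > median_scores[q])
    return wins / len(system_scores)

# Hypothetical per-query scores, for illustration only.
system = {"q1": ndcg_at_k([2, 1, 0]), "q2": ndcg_at_k([0, 2])}
median = {"q1": 0.8, "q2": 0.7}
print(share_above_median(system, median))
```

A figure such as "improves 84% of the queries over the median" corresponds to the value returned by `share_above_median` under this reading.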


BioADAPT-MRC: Adversarial Learning-based Domain Adaptation Improves Biomedical Machine Reading Comprehension Task

arXiv.org Artificial Intelligence

Biomedical machine reading comprehension (biomedical-MRC) aims to comprehend complex biomedical narratives and assist healthcare professionals in retrieving information from them. The high performance of modern neural network-based MRC systems depends on high-quality, large-scale, human-annotated training datasets. In the biomedical domain, a crucial challenge in creating such datasets is the requirement for domain knowledge, which leads to a scarcity of labeled data and the need for transfer learning from the labeled general-purpose (source) domain to the biomedical (target) domain. However, there is a discrepancy in marginal distributions between the general-purpose and biomedical domains due to differences in topics. Therefore, directly transferring learned representations from a model trained on a general-purpose domain to the biomedical domain can hurt the model's performance. We present an adversarial learning-based domain adaptation framework for the biomedical machine reading comprehension task (BioADAPT-MRC), a neural network-based method to address the discrepancies in the marginal distributions between the general and biomedical domain datasets. BioADAPT-MRC relaxes the need for generating pseudo labels for training a well-performing biomedical-MRC model. We extensively evaluate the performance of BioADAPT-MRC by comparing it with the best existing methods on three widely used benchmark biomedical-MRC datasets -- BioASQ-7b, BioASQ-8b, and BioASQ-9b. Our results suggest that without using any synthetic or human-annotated data from the biomedical domain, BioADAPT-MRC can achieve state-of-the-art performance on these datasets. Availability: BioADAPT-MRC is freely available as an open-source project at https://github.com/mmahbub/BioADAPT-MRC.


Buffer Pool Aware Query Scheduling via Deep Reinforcement Learning

arXiv.org Artificial Intelligence

[...] query scheduling with the explicit goal of reducing disk reads and thus implicitly increasing query performance. We introduce SmartQueue, a learned scheduler that leverages overlapping data reads among incoming queries and learns a scheduling strategy that improves cache hits. SmartQueue relies on deep reinforcement learning to produce workload-specific scheduling strategies that focus on long-term performance [...]

One could imagine many simple heuristics, such as greedily selecting the next query with the highest expected buffer usage, to solve this problem. However, a hand-designed policy to handle the complexity of the entire problem, including different buffer sizes, shifting query workloads, heterogeneous data types (e.g., index files vs base relations), and balancing short-term gains against long-term strategy is much more difficult to conceive.
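The simple greedy heuristic mentioned in this snippet (pick the next query with the highest expected buffer usage) can be sketched as follows. This is only an illustration of that baseline, not of SmartQueue itself. The definition of expected buffer usage as the fraction of a query's blocks already cached, the LRU buffer, and the sample queries are all assumptions made for the example.

```python
from collections import OrderedDict

def expected_buffer_usage(blocks, buffer):
    """Fraction of a query's blocks already cached (assumed definition)."""
    return len(blocks & set(buffer)) / len(blocks) if blocks else 0.0

def execute(blocks, buffer, buffer_size):
    """Run one query against an LRU buffer; return the number of disk reads."""
    misses = 0
    for block in sorted(blocks):
        if block in buffer:
            buffer.move_to_end(block)       # cache hit: mark as recently used
        else:
            misses += 1                     # cache miss: read from disk
            buffer[block] = True
            if len(buffer) > buffer_size:
                buffer.popitem(last=False)  # evict least recently used block
    return misses

def greedy_schedule(queries, buffer_size):
    """Repeatedly run the pending query with the highest expected buffer usage."""
    buffer, pending, order, disk_reads = OrderedDict(), dict(queries), [], 0
    while pending:
        # sorted() makes tie-breaking deterministic (alphabetical by name).
        best = max(sorted(pending),
                   key=lambda q: expected_buffer_usage(pending[q], buffer))
        disk_reads += execute(pending.pop(best), buffer, buffer_size)
        order.append(best)
    return order, disk_reads

def reads_for_order(queries, order, buffer_size):
    """Disk reads when queries run in a fixed order, e.g. arrival order."""
    buffer = OrderedDict()
    return sum(execute(queries[q], buffer, buffer_size) for q in order)

# Hypothetical workload: each query is the set of disk blocks it reads.
queries = {"q1": {1, 2, 3}, "q2": {7, 8, 9}, "q3": {2, 3, 4}}
order, reads = greedy_schedule(queries, buffer_size=4)
print(order, reads)                                        # greedy order
print(reads_for_order(queries, ["q1", "q2", "q3"], 4))     # arrival order
```

On this toy workload the greedy order runs `q3` right after `q1` because they share blocks, which lowers the disk read count compared with arrival order. As the snippet notes, such a myopic rule does not account for buffer sizes, shifting workloads, or long-term effects.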


What you Know about Keywords and their Importance in SEO

#artificialintelligence

The growth of the modern internet has created demand for many skills, known as digital skills. One of these skills is search engine optimization (SEO). People can improve their skills by reading online content from our website. Although this article is aimed at beginners, professionals can also use it to refresh their knowledge. If you know about keywords, then you must know about search engine optimization.


Facing Changes: Continual Entity Alignment for Growing Knowledge Graphs

arXiv.org Artificial Intelligence

Entity alignment is a basic and vital technique in knowledge graph (KG) integration. Over the years, research on entity alignment has rested on the assumption that KGs are static, which neglects the growing nature of real-world KGs. As KGs grow, previous alignment results need to be revisited while new entity alignments wait to be discovered. In this paper, we propose and dive into a realistic yet unexplored setting, referred to as continual entity alignment. To avoid retraining an entire model on the whole KGs whenever new entities and triples arrive, we present a continual alignment method for this task. It reconstructs an entity's representation based on entity adjacency, enabling it to generate embeddings for new entities quickly and inductively using their existing neighbors. It selects and replays partial pre-aligned entity pairs to train only parts of KGs while extracting trustworthy alignment for knowledge augmentation. Because growing KGs inevitably contain non-matchable entities, the proposed method, unlike previous works, employs bidirectional nearest neighbor matching to find new entity alignments and update old ones. Furthermore, we also construct new datasets by simulating the growth of multilingual DBpedia. Extensive experiments demonstrate that our continual alignment method is more effective than baselines based on retraining or inductive learning.
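As a rough illustration of bidirectional nearest neighbor matching, the sketch below keeps a pair only when each entity is the other's nearest neighbor, so entities without a mutual match stay unaligned. The use of cosine similarity, the function names, and the toy embeddings are assumptions for this example; the abstract does not specify these details.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nearest(source, target):
    """Map each source entity to its most similar target entity."""
    return {
        s: max(sorted(target), key=lambda t: cosine(vec, target[t]))
        for s, vec in source.items()
    }

def bidirectional_matches(kg1, kg2):
    """Keep only pairs that are nearest neighbors in both directions."""
    forward = nearest(kg1, kg2)
    backward = nearest(kg2, kg1)
    return {(a, b) for a, b in forward.items() if backward[b] == a}

# Hypothetical entity embeddings for two KGs.
kg_en = {"Paris_en": [1.0, 0.0], "Berlin_en": [0.0, 1.0], "Orphan_en": [0.9, 0.2]}
kg_fr = {"Paris_fr": [0.95, 0.05], "Berlin_fr": [0.1, 1.0]}
print(sorted(bidirectional_matches(kg_en, kg_fr)))
```

Here `Orphan_en` points to `Paris_fr` in the forward direction, but `Paris_fr` prefers `Paris_en`, so the orphan is left unmatched. This is the property that makes the mutual check useful when some entities have no counterpart.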


Meta AI introduces Sphere, a model designed to verify citations on Wikipedia - Actu IA

#artificialintelligence

When we search the Internet, the search engine very often suggests the community encyclopedia Wikipedia. It contains about 6.5 million articles by volunteer contributors, but how can we know if these are reliable, even though the sources of the articles are cited? Meta relied on Meta AI's research and advances to develop SPHERE, an open source model capable of automatically analyzing hundreds of thousands of citations at a time to check whether they actually support the corresponding claims. It recently published the model on GitHub. Meta said it is not partnering with Wikimedia, the foundation that runs Wikipedia, on this project. Its goal is to create a platform to help Wikipedia editors systematically spot citation problems and quickly correct the citation or the corresponding article content.