Overview
A Brief History of Named Entity Recognition
A large amount of information in today's world is now stored in knowledge bases. Named Entity Recognition (NER) is the process of extracting, disambiguating, and linking entities from raw text to insightful and structured knowledge bases. More concretely, it identifies and classifies entities in text, a step that is crucial for Information Extraction, Semantic Annotation, Question Answering, Ontology Population, and so on. NER has evolved over the last three decades since it first appeared in 1996. In this survey, we study the evolution of the techniques employed for NER and compare their results, from supervised methods to the still-developing unsupervised learning methods.
Intellectual Property Protection for Deep Learning Model and Dataset Intelligence
Jiang, Yongqi, Gao, Yansong, Zhou, Chunyi, Hu, Hongsheng, Fu, Anmin, Susilo, Willy
With the growing applications of Deep Learning (DL), especially recent spectacular achievements of Large Language Models (LLMs) such as ChatGPT and LLaMA, the commercial significance of these remarkable models has soared. However, acquiring well-trained models is costly and resource-intensive. It requires a large, high-quality dataset, substantial investment in dedicated architecture design, expensive computational resources, and efforts to develop technical expertise. Consequently, safeguarding the Intellectual Property (IP) of well-trained models is attracting increasing attention. In contrast to existing surveys, which focus overwhelmingly on model IP protection (IPP), this survey covers the protection of not only model-level intelligence but also valuable dataset intelligence. Firstly, according to the requirements for effective IPP design, this work systematically summarizes the general and scheme-specific performance evaluation metrics. Secondly, from proactive IP infringement prevention and reactive IP ownership verification perspectives, it comprehensively investigates and analyzes the existing IPP methods for both dataset and model intelligence. Additionally, from the standpoint of training settings, it delves into the unique challenges that distributed settings pose to IPP compared to centralized settings. Furthermore, this work examines various attacks faced by deep IPP techniques. Finally, we outline prospects for promising future directions that may act as a guide for innovative research.
Survey on Semantic Interpretation of Tabular Data: Challenges and Directions
Cremaschi, Marco, Spahiu, Blerina, Palmonari, Matteo, Jimenez-Ruiz, Ernesto
Tabular data plays a pivotal role in various fields, making it a popular format for data manipulation and exchange, particularly on the web. The interpretation, extraction, and processing of tabular information are invaluable for knowledge-intensive applications. Notably, significant efforts have been invested in annotating tabular data with ontologies and entities from background knowledge graphs, a process known as Semantic Table Interpretation (STI). STI automation aids in building knowledge graphs, enriching data, and enhancing web-based question answering. This survey aims to provide a comprehensive overview of the STI landscape. It starts by categorizing approaches using a taxonomy of 31 attributes, allowing for comparisons and evaluations. It also examines available tools, assessing them based on 12 criteria. Furthermore, the survey offers an in-depth analysis of the Gold Standards used for evaluating STI approaches. Finally, it provides practical guidance to help end-users choose the most suitable approach for their specific tasks while also discussing unresolved issues and suggesting potential future research directions.
Interplay between Federated Learning and Explainable Artificial Intelligence: a Scoping Review
Lopez-Ramos, Luis M., Leiser, Florian, Rastogi, Aditya, Hicks, Steven, Strümke, Inga, Madai, Vince I., Budig, Tobias, Sunyaev, Ali, Hilbert, Adam
The joint implementation of Federated learning (FL) and Explainable artificial intelligence (XAI) will allow training models from distributed data and explaining their inner workings while preserving important aspects of privacy. Towards establishing the benefits and tensions associated with their interplay, this scoping review maps those publications that jointly deal with FL and XAI, focusing on publications where an interplay between FL and model interpretability or post-hoc explanations was found. In total, 37 studies met our criteria, with more papers focusing on explanation methods (mainly feature relevance) than on interpretability (mainly algorithmic transparency). Most works used simulated horizontal FL setups involving 10 or fewer data centers. Only one study explicitly and quantitatively analyzed the influence of FL on model explanations, revealing a significant research gap. Aggregation of interpretability metrics across FL nodes created generalized global insights at the expense of node-specific patterns being diluted. Eight papers addressed the benefits of incorporating explanation methods as a component of the FL algorithm. Studies using established FL libraries or following reporting guidelines are a minority. More quantitative research and structured, transparent practices are needed to fully understand the mutual impact of FL and XAI and the conditions under which it occurs.
Seeing is Deceiving: Exploitation of Visual Pathways in Multi-Modal Language Models
Janowczyk, Pete, Laurier, Linda, Giulietta, Ave, Octavia, Arlo, Cleti, Meade
Multi-Modal Language Models (MLLMs) have transformed artificial intelligence by combining visual and text data, making applications like image captioning, visual question answering, and multi-modal content creation possible. This ability to understand and work with complex information has made MLLMs useful in areas such as healthcare, autonomous systems, and digital content. However, integrating multiple types of data also creates security risks. Attackers can manipulate either the visual or text inputs, or both, to make the model produce unintended or even harmful responses. This paper reviews how visual inputs in MLLMs can be exploited by various attack strategies. We break down these attacks into categories: simple visual tweaks and cross-modal manipulations, as well as advanced strategies like VLATTACK, HADES, and Collaborative Multimodal Adversarial Attack (Co-Attack). These attacks can mislead even the most robust models while looking nearly identical to the original visuals, making them hard to detect. We also discuss the broader security risks, including threats to privacy and safety in important applications. To counter these risks, we review current defense methods like the SmoothVLM framework, pixel-wise randomization, and MirrorCheck, looking at their strengths and limitations. We also discuss new methods to make MLLMs more secure, including adaptive defenses, better evaluation tools, and security approaches that protect both visual and text data. By bringing together recent developments and identifying key areas for improvement, this review aims to support the creation of more secure and reliable multi-modal AI systems for real-world use.
Beyond the Numbers: Transparency in Relation Extraction Benchmark Creation and Leaderboards
This paper investigates the transparency in the creation of benchmarks and the use of leaderboards for measuring progress in NLP, with a focus on the relation extraction (RE) task. Existing RE benchmarks often suffer from insufficient documentation, lacking crucial details such as data sources, inter-annotator agreement, the algorithms used for the selection of instances for datasets, and information on potential biases like dataset imbalance. Progress in RE is frequently measured by leaderboards that rank systems based on evaluation methods, typically limited to aggregate metrics like F1-score. However, the absence of detailed performance analysis beyond these metrics can obscure the true generalisation capabilities of models. Our analysis reveals that widely used RE benchmarks, such as TACRED and NYT, tend to be highly imbalanced and contain noisy labels. Moreover, without class-based performance metrics, reported results fail to accurately reflect model performance on datasets with a large number of relation types. These limitations should be carefully considered when reporting progress in RE. While our discussion centers on the transparency of RE benchmarks and leaderboards, the observations we discuss are broadly applicable to other NLP tasks as well. Rather than undermining the significance and value of existing RE benchmarks and the development of new models, this paper advocates for improved documentation and more rigorous evaluation to advance the field.
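The gap between an aggregate score and class-based scores on an imbalanced label set can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the three relation labels, and the toy label counts are invented, and the micro-averaged score shown is the generic single-label version (equal to accuracy), which may differ from the scoring convention of any particular RE benchmark.

```python
def per_class_f1(gold, pred):
    """Return {label: F1} for every label seen in gold or pred."""
    scores = {}
    for label in set(gold) | set(pred):
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the label never occurs
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def micro_f1(gold, pred):
    """Micro-averaged F1; for single-label classification this equals accuracy."""
    return sum(1 for g, p in zip(gold, pred) if g == p) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(gold, pred)
    return sum(scores.values()) / len(scores)


# Hypothetical, heavily imbalanced toy data: 90 / 8 / 2 instances per class.
gold = ["no_relation"] * 90 + ["born_in"] * 8 + ["founded_by"] * 2
# A degenerate system that always predicts the majority class.
pred = ["no_relation"] * 100

print("micro F1:", round(micro_f1(gold, pred), 3))
print("macro F1:", round(macro_f1(gold, pred), 3))
print("per class:", per_class_f1(gold, pred))
```

In this toy setting the aggregate score looks strong even though the two minority relations are never predicted correctly, which is the kind of effect a per-class breakdown makes visible.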
Boosting the Efficiency of Metaheuristics Through Opposition-Based Learning in Optimum Locating of Control Systems in Tall Buildings
Farahmand-Tabar, Salar, Shirgir, Sina
Opposition-based learning (OBL) is an effective approach to improve the performance of metaheuristic optimization algorithms, which are commonly used for solving complex engineering problems. This chapter provides a comprehensive review of the literature on the use of opposition strategies in metaheuristic optimization algorithms, discussing the benefits and limitations of this approach. An overview of the opposition strategy concept, its various implementations, and its impact on the performance of metaheuristic algorithms is presented. Furthermore, case studies on the application of opposition strategies in engineering problems are provided, including the optimum locating of control systems in tall buildings. A shear frame with a Magnetorheological (MR) fluid damper is considered as a case study. The results demonstrate that the incorporation of opposition strategies in metaheuristic algorithms significantly enhances the quality and speed of the optimization process. This chapter aims to provide a clear understanding of the opposition strategy in metaheuristic optimization algorithms and its engineering applications, with the ultimate goal of facilitating its adoption in real-world engineering problems.
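As a rough illustration of the basic opposition idea, the sketch below uses the commonly cited definition of an opposite point, x_opp = lower + upper - x for a variable bounded in [lower, upper], to build an opposition-based initial population. It is a generic sketch, not the chapter's implementation: the function names, the sphere test objective, the bounds, and the population size are all assumptions chosen for illustration, and the damper placement problem itself is not modeled.

```python
import random


def opposite(x, lower, upper):
    """Opposite point of x within box bounds: x_opp[i] = lower[i] + upper[i] - x[i]."""
    return [lo + hi - xi for xi, lo, hi in zip(x, lower, upper)]


def obl_init(objective, lower, upper, pop_size, rng):
    """Opposition-based initialization (minimization).

    Draw a random population, add the opposite of every member, and keep the
    pop_size candidates with the lowest objective value.
    """
    population = [
        [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        for _ in range(pop_size)
    ]
    candidates = population + [opposite(x, lower, upper) for x in population]
    candidates.sort(key=objective)
    return candidates[:pop_size]


def sphere(x):
    """Hypothetical test objective with its minimum at the origin."""
    return sum(xi * xi for xi in x)


lower, upper = [-5.0, -5.0], [10.0, 10.0]
pop = obl_init(sphere, lower, upper, pop_size=6, rng=random.Random(0))
print("best initial candidate:", pop[0], "objective:", sphere(pop[0]))
```

Because the selected population is drawn from the union of the random points and their opposites, its best member is never worse than the best member of the random population alone; the price is twice as many objective evaluations at initialization.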
A Guide to Misinformation Detection Datasets
Thibault, Camille, Peloquin-Skulski, Gabrielle, Tian, Jacob-Junqi, Laflamme, Florence, Guan, Yuxiang, Rabbany, Reihaneh, Godbout, Jean-François, Pelrine, Kellin
Misinformation is a complex societal issue, and mitigating solutions are difficult to create due to data deficiencies. To address this problem, we have curated the largest collection of (mis)information datasets in the literature, totaling 75. From these, we evaluated the quality of all of the 36 datasets that consist of statements or claims. We assess these datasets to identify those with solid foundations for empirical work and those with flaws that could result in misleading and non-generalizable results, such as insufficient label quality, spurious correlations, or political bias. We further provide state-of-the-art baselines on all these datasets, but show that regardless of label quality, categorical labels may no longer give an accurate evaluation of detection model performance. We discuss alternatives to mitigate this problem. Overall, this guide aims to provide a roadmap for obtaining higher quality data and conducting more effective evaluations, ultimately improving research in misinformation detection. All datasets and other artifacts are available at https://misinfo-datasets.complexdatalab.com/.
Deploying Large Language Models With Retrieval Augmented Generation
Prabhune, Sonal, Berndt, Donald J.
Knowing that the generative capabilities of large language models (LLMs) are sometimes hampered by tendencies to hallucinate or create non-factual responses, researchers have increasingly focused on methods to ground generated outputs in factual data. Retrieval Augmented Generation (RAG) has emerged as a key approach for integrating knowledge from data sources outside of the LLM's training set, including proprietary and up-to-date information. While many research papers explore various RAG strategies, their true efficacy is tested in real-world applications with actual data. The journey from conceiving an idea to actualizing it in the real world is a lengthy process. We present insights from the development and field-testing of a pilot project that integrates LLMs with RAG for information retrieval. Additionally, we examine the impacts on the information value chain, encompassing people, processes, and technology. Our aim is to identify the opportunities and challenges of implementing this emerging technology, particularly within the context of behavioral research in the information systems (IS) field. The contributions of this work include the development of best practices and recommendations for adopting this promising technology while ensuring compliance with industry regulations through a proposed AI governance model.
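The retrieve-then-generate pattern behind RAG can be sketched in a few lines. This is a toy illustration under stated assumptions, not the pilot system described in the paper: the bag-of-words cosine retriever stands in for whatever retriever a real deployment would use, the documents and prompt wording are invented, all function names are hypothetical, and no LLM is actually called (the sketch stops at the grounded prompt).

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=2):
    """Return up to k documents most similar to the query, dropping zero-score ones."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]


def build_prompt(query, passages):
    """Assemble a prompt that asks the model to answer from retrieved context only."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Hypothetical proprietary documents that would sit outside an LLM's training set.
documents = [
    "Expense reports must be filed within 30 days of travel.",
    "The cafeteria is open from 8 am to 3 pm on weekdays.",
    "Travel bookings require manager approval before purchase.",
]
query = "When must expense reports be filed?"
passages = retrieve(query, documents, k=1)
prompt = build_prompt(query, passages)
print(prompt)
```

The resulting prompt would then be passed to a language model; grounding comes from restricting the answer to the retrieved passages rather than to whatever the model memorized during training.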
MissionGPT: Mission Planner for Mobile Robot based on Robotics Transformer Model
Berman, Vladimir, Bazhenov, Artem, Tsetserukou, Dzmitry
This paper presents a novel approach to building mission planners based on neural networks with Transformer architecture and Large Language Models (LLMs). This approach demonstrates the possibility of setting a task for a mobile robot and having it executed successfully without the use of perception algorithms, based only on the data coming from the camera. In this work, a success rate of more than 50% was obtained for one of the basic actions for mobile robots. The proposed approach is of practical importance in the field of warehouse logistics robots, as in the future it may eliminate the need for markings, LiDARs, beacons, and other tools for robot orientation in space. In conclusion, this approach can be scaled to any type of robot and to any number of robots.