Friesland
- Europe > Netherlands > North Holland > Amsterdam (0.12)
- Europe > Netherlands > South Holland > Rotterdam (0.05)
- Europe > Netherlands > Gelderland > Nijmegen (0.05)
9 Appendix: Supplementary material for the paper Causal analysis of Covid-19 spread in Germany
W in V, W is independent of V \ (Descendants(W) ∪ Parents(W)) given Parents(W). As expected, we see that Granger detects several times more causes than SyPI; in most cases, Granger detects all the candidate states as causes. On the other hand, SyPI does not suffer from such problems, even when there are latent confounders. Finally, in the third column, we report the detected distant causes. Strict thresholds (the default of the SyPI method) are used for the analysis.
- Europe > Germany > Berlin (0.15)
- Europe > Germany > Schleswig-Holstein (0.08)
- Europe > Germany > Mecklenburg-Vorpommern (0.06)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
- Europe > Germany > Schleswig-Holstein (0.04)
- Research Report > New Finding (0.68)
- Research Report > Experimental Study (0.46)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Netherlands > North Holland > Amsterdam (0.10)
- Europe > Netherlands > South Holland > Rotterdam (0.05)
Inferring Adjective Hypernyms with Language Models to Increase the Connectivity of Open English Wordnet
Augello, Lorenzo, McCrae, John P.
Open English Wordnet is a key resource published in OntoLex-lemon as part of the linguistic linked open data cloud. There are, however, many links missing in the resource, and in this paper, we look at how we can establish hypernymy between adjectives. We present a theoretical discussion of the hypernymy relation and how it differs for adjectives in contrast to nouns and verbs. We develop a new resource for adjective hypernymy and fine-tune large language models to predict adjective hypernymy, showing that the methodology of TaxoLLaMa can be adapted to this task.
- North America > Dominican Republic (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Europe > Czechia > Prague (0.04)
Evaluating Standard and Dialectal Frisian ASR: Multilingual Fine-tuning and Language Identification for Improved Low-resource Performance
Amooie, Reihaneh, de Vries, Wietse, Hao, Yun, Dijkstra, Jelske, Coler, Matt, Wieling, Martijn
Automatic Speech Recognition (ASR) performance for low-resource languages is still far behind that of higher-resource languages such as English, due to a lack of sufficient labeled data. State-of-the-art methods deploy self-supervised transfer learning where a model pre-trained on large amounts of data is fine-tuned using little labeled data in a target low-resource language. In this paper, we present and examine a method for fine-tuning an SSL-based model in order to improve the performance for Frisian and its regional dialects (Clay Frisian, Wood Frisian, and South Frisian). We show that Frisian ASR performance can be improved by using multilingual (Frisian, Dutch, English and German) fine-tuning data and an auxiliary language identification task. In addition, our findings show that performance on dialectal speech suffers substantially, and, importantly, that this effect is moderated by the elicitation approach used to collect the dialectal data. Our findings also suggest that relying solely on standard language data for ASR evaluation may underestimate real-world performance, particularly in languages with substantial dialectal variation.
AMuSeD: An Attentive Deep Neural Network for Multimodal Sarcasm Detection Incorporating Bi-modal Data Augmentation
Gao, Xiyuan, Bansal, Shubhi, Gowda, Kushaan, Li, Zhu, Nayak, Shekhar, Kumar, Nagendra, Coler, Matt
Detecting sarcasm effectively requires a nuanced understanding of context, including vocal tones and facial expressions. The progression towards multimodal computational methods in sarcasm detection, however, faces challenges due to the scarcity of data. To address this, we present AMuSeD (Attentive deep neural network for MUltimodal Sarcasm dEtection incorporating bi-modal Data augmentation). This approach utilizes the Multimodal Sarcasm Detection Dataset (MUStARD) and introduces a two-phase bimodal data augmentation strategy. The first phase involves generating varied text samples through Back Translation from several secondary languages. The second phase involves the refinement of a FastSpeech 2-based speech synthesis system, tailored specifically for sarcasm to retain sarcastic intonations. Alongside a cloud-based Text-to-Speech (TTS) service, this Fine-tuned FastSpeech 2 system produces corresponding audio for the text augmentations. We also investigate various attention mechanisms for effectively merging text and audio data, finding self-attention to be the most efficient for bimodal integration. Our experiments reveal that this combined augmentation and attention approach achieves a significant F1-score of 81.0% in text-audio modalities, surpassing even models that use three modalities from the MUStARD dataset.
- Asia > Singapore (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- North America > Canada > Ontario > Toronto (0.04)
NERsocial: Efficient Named Entity Recognition Dataset Construction for Human-Robot Interaction Utilizing RapidNER
Atuhurra, Jesse, Kamigaito, Hidetaka, Ouchi, Hiroki, Shindo, Hiroyuki, Watanabe, Taro
Adapting named entity recognition (NER) methods to new domains poses significant challenges. We introduce RapidNER, a framework designed for the rapid deployment of NER systems through efficient dataset construction. RapidNER operates through three key steps: (1) extracting domain-specific sub-graphs and triples from a general knowledge graph, (2) collecting and leveraging texts from various sources to build the NERsocial dataset, which focuses on entities typical in human-robot interaction, and (3) implementing an annotation scheme using Elasticsearch (ES) to enhance efficiency. NERsocial, validated by human annotators, includes six entity types, 153K tokens, and 99.4K sentences, demonstrating RapidNER's capability to expedite dataset creation.
- North America > United States > Virginia (0.04)
- Asia > India (0.04)
- South America > Peru (0.04)
- Media > News (1.00)
- Media > Music (1.00)
- Leisure & Entertainment > Sports > Motorsports (1.00)
Reveal the Unknown: Out-of-Knowledge-Base Mention Discovery with Entity Linking
Dong, Hang, Chen, Jiaoyan, He, Yuan, Liu, Yinan, Horrocks, Ian
Discovering entity mentions that are out of a Knowledge Base (KB) from texts plays a critical role in KB maintenance, but has not yet been fully explored. The current methods are mostly limited to the simple threshold-based approach and feature-based classification, and the datasets for evaluation are relatively rare. We propose BLINKout, a new BERT-based Entity Linking (EL) method which can identify mentions that do not have corresponding KB entities by matching them to a special NIL entity. To better utilize BERT, we propose new techniques including NIL entity representation and classification, with synonym enhancement. We also apply KB Pruning and Versioning strategies to automatically construct out-of-KB datasets from common in-KB EL datasets. Results on five datasets of clinical notes, biomedical publications, and Wikipedia articles in various domains show the advantages of BLINKout over existing methods to identify out-of-KB mentions for the medical ontologies, UMLS, SNOMED CT, and the general KB, WikiData.
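The "simple threshold-based approach" mentioned above as an existing baseline can be illustrated with a minimal sketch. This is not BLINKout itself: the function name `link_with_threshold`, the entity IDs, the scores, and the default threshold of 0.5 are all hypothetical and chosen only for illustration.

```python
from typing import Dict

NIL = "NIL"  # hypothetical label meaning "no matching KB entity"


def link_with_threshold(scores: Dict[str, float], threshold: float = 0.5) -> str:
    """Return the best-scoring entity ID, or NIL if no score reaches the threshold.

    `scores` maps candidate entity IDs to linking scores produced by some
    upstream entity linker (assumed here, not specified in the abstract).
    """
    if not scores:
        # No candidates at all: treat the mention as out-of-KB.
        return NIL
    best_id, best_score = max(scores.items(), key=lambda kv: kv[1])
    return best_id if best_score >= threshold else NIL


# Illustrative scores only; the IDs are placeholders, not real KB entries.
print(link_with_threshold({"ent_a": 0.91, "ent_b": 0.42}))
print(link_with_threshold({"ent_a": 0.31}))
```

The second call falls below the threshold, so the mention is treated as out-of-KB. The abstract's point is that this kind of fixed cut-off is limited, which motivates matching mentions to a dedicated NIL entity instead.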
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > United Kingdom > England > West Midlands > Birmingham (0.05)
- North America > United States > New York > New York County > New York City (0.04)
Explainable Contextual Anomaly Detection using Quantile Regression Forests
Li, Zhong, van Leeuwen, Matthijs
Chandola et al (2009) subdivided anomalies into three types: point anomalies (an object is considered anomalous when compared against the rest of objects), contextual anomalies (an object is anomalous in a specific context), and collective anomalies (a collection of objects is anomalous with respect to the entire dataset). The analysis of anomalies has a wide range of applications, such as in network security (Ahmed et al, 2016a), bioinformatics (Spinosa and Carvalho, 2005), fraud detection (Ahmed et al, 2016b), and fault detection and isolation (Hwang et al, 2009). Anomaly analysis consists of two equally important tasks: anomaly detection and anomaly explanation. A wealth of 'shallow' machine learning based methods, i.e., not based on deep learning, have been proposed to detect anomalies (Chandola et al, 2009). More recently, many deep learning based anomaly detection methods have also been developed (Pang et al, 2021). However, deep learning based anomaly detection methods are notorious for not being interpretable, in the sense that generally both the model itself is non-transparent and the resulting anomaly scores are challenging to interpret without the use of a post-hoc explainer.
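To make the notion of a contextual anomaly concrete, the following is a minimal sketch that flags a value as anomalous when it falls outside an empirical quantile interval computed within its own context. It is not the Quantile Regression Forest method named in the title: it swaps in plain per-context empirical quantiles, and the context labels, the temperature data, and the 5%/95% bounds are assumptions made purely for illustration.

```python
from statistics import quantiles
from typing import Dict, List, Tuple


def quantile_interval(values: List[float],
                      lower_pct: int = 5,
                      upper_pct: int = 95) -> Tuple[float, float]:
    """Empirical [lower_pct, upper_pct] percentile interval of `values`."""
    # n=100 yields 99 cut points; index p-1 holds the p-th percentile.
    cuts = quantiles(values, n=100, method="inclusive")
    return cuts[lower_pct - 1], cuts[upper_pct - 1]


def is_contextual_anomaly(value: float,
                          context: str,
                          history: Dict[str, List[float]]) -> bool:
    """True if `value` lies outside the quantile interval of its context."""
    low, high = quantile_interval(history[context])
    return value < low or value > high


# Made-up temperature readings (degrees Celsius), grouped by season.
history = {
    "winter": [-2.0, 0.0, 1.0, 3.0, 4.0, 5.0, 2.0, -1.0, 0.5, 3.5],
    "summer": [22.0, 25.0, 27.0, 30.0, 24.0, 26.0, 28.0, 23.0, 29.0, 31.0],
}

# The same reading is unremarkable in one context and anomalous in another.
print(is_contextual_anomaly(28.0, "winter", history))
print(is_contextual_anomaly(28.0, "summer", history))
```

The interval bounds also serve as a simple explanation of why a value was flagged, which is the kind of interpretability the passage contrasts with deep learning based detectors.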
- Europe > Netherlands > South Holland > Leiden (0.04)
- Europe > Netherlands > South Holland > Rotterdam (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Research Report > New Finding (0.46)
- Research Report > Experimental Study (0.46)
- Leisure & Entertainment > Sports (1.00)
- Health & Medicine > Therapeutic Area (1.00)
- Information Technology (0.68)
- Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)