
Collaborating Authors

 Pikuliak, Matúš


Generative Large Language Models in Automated Fact-Checking: A Survey

arXiv.org Artificial Intelligence

The dissemination of false information across online platforms poses a serious societal challenge, necessitating robust measures for information verification. While manual fact-checking efforts are still instrumental, the growing volume of false information requires automated methods. Large language models (LLMs) offer promising opportunities to assist fact-checkers, leveraging LLMs' extensive knowledge and robust reasoning capabilities. In this survey paper, we investigate the utilization of generative LLMs in the realm of fact-checking, illustrating various approaches that have been employed and techniques for prompting or fine-tuning LLMs. By providing an overview of existing approaches, this survey aims to improve the understanding of utilizing LLMs in fact-checking and to facilitate further progress in LLMs' involvement in this process.
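
As a rough illustration of the prompting approaches covered by the survey, the sketch below asks an instruction-tuned LLM to verify a claim against a piece of evidence. It assumes the Hugging Face transformers library (a recent version whose text-generation pipeline accepts chat-style messages); the model name, label set, and example claim are illustrative choices, not something prescribed by the survey.

    # Minimal claim-verification prompt for an instruction-tuned LLM (illustrative sketch).
    # Assumes a recent transformers release; the model and labels are placeholder choices.
    from transformers import pipeline

    generator = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")

    claim = "The Eiffel Tower was completed in 1889."
    evidence = "The Eiffel Tower opened on 31 March 1889 for the World's Fair in Paris."

    prompt = (
        "You are a fact-checking assistant.\n"
        f"Claim: {claim}\n"
        f"Evidence: {evidence}\n"
        "Answer with one of SUPPORTED, REFUTED, NOT ENOUGH INFO and a one-sentence rationale."
    )

    # Chat-style input; the pipeline applies the model's chat template automatically.
    output = generator([{"role": "user", "content": prompt}], max_new_tokens=80)
    print(output[0]["generated_text"][-1]["content"])  # the assistant's verdict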


Women Are Beautiful, Men Are Leaders: Gender Stereotypes in Machine Translation and Language Modeling

arXiv.org Artificial Intelligence

We present GEST -- a new dataset for measuring gender-stereotypical reasoning in masked LMs and English-to-X machine translation systems. GEST contains samples that are compatible with 9 Slavic languages and English for 16 gender stereotypes about men and women (e.g., Women are beautiful, Men are leaders). The definition of said stereotypes was informed by gender experts. We used GEST to evaluate 11 masked LMs and 4 machine translation systems. We discovered significant and consistent amounts of stereotypical reasoning in almost all the evaluated models and languages.
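
To illustrate the kind of probing GEST performs on masked LMs, here is a minimal sketch that compares the probability mass a fill-mask model assigns to male versus female pronouns in a stereotype-flavored template. The template, pronoun sets, and model are illustrative assumptions, not items taken from the GEST dataset.

    # Compare male vs. female pronoun probabilities in a masked-LM template (illustrative sketch).
    from transformers import pipeline

    fill = pipeline("fill-mask", model="bert-base-cased")

    template = '[MASK] said: "I have always been good at taking care of others."'
    male, female = {"He"}, {"She"}

    scores = {"male": 0.0, "female": 0.0}
    for pred in fill(template, top_k=50):
        token = pred["token_str"].strip()
        if token in male:
            scores["male"] += pred["score"]
        elif token in female:
            scores["female"] += pred["score"]

    # A consistent gap between the two sums across many templates would indicate
    # gender-stereotypical reasoning of the kind GEST measures.
    print(scores)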


Disinformation Capabilities of Large Language Models

arXiv.org Artificial Intelligence

Automated disinformation generation is often listed as one of the risks of large language models (LLMs). The theoretical ability to flood the information space with disinformation content might have dramatic consequences for democratic societies around the world. This paper presents a comprehensive study of the disinformation capabilities of the current generation of LLMs to generate false news articles in the English language. In our study, we evaluated the capabilities of 10 LLMs using 20 disinformation narratives. We evaluated several aspects of the LLMs: how good they are at generating news articles, how strongly they tend to agree or disagree with the disinformation narratives, how often they generate safety warnings, etc. We also evaluated the abilities of detection models to detect these articles as LLM-generated. We conclude that LLMs are able to generate convincing news articles that agree with dangerous disinformation narratives.


MULTITuDE: Large-Scale Multilingual Machine-Generated Text Detection Benchmark

arXiv.org Artificial Intelligence

There is a lack of research into the capabilities of recent LLMs to generate convincing text in languages other than English and into the performance of detectors of machine-generated text in multilingual settings. This is also reflected in the available benchmarks, which lack authentic texts in languages other than English and predominantly cover older generators. To fill this gap, we introduce MULTITuDE, a novel benchmarking dataset for multilingual machine-generated text detection comprising 74,081 authentic and machine-generated texts in 11 languages (ar, ca, cs, de, en, es, nl, pt, ru, uk, and zh) generated by 8 multilingual LLMs. Using this benchmark, we compare the performance of zero-shot (statistical and black-box) and fine-tuned detectors. Considering the multilinguality, we evaluate 1) how these detectors generalize to unseen languages (linguistically similar as well as dissimilar) and unseen LLMs and 2) whether the detectors improve their performance when trained on multiple languages.
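
As an illustration of the zero-shot statistical detectors compared on this benchmark, the sketch below scores a text by its average token log-likelihood under a reference causal LM and applies a threshold. The reference model (English-only GPT-2) and the threshold are illustrative assumptions, not the benchmark's actual configuration.

    # Zero-shot statistical detection sketch: threshold the average token log-likelihood.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    lm = AutoModelForCausalLM.from_pretrained("gpt2")

    def avg_log_likelihood(text: str) -> float:
        ids = tok(text, return_tensors="pt").input_ids
        with torch.no_grad():
            out = lm(ids, labels=ids)
        return -out.loss.item()  # negative mean cross-entropy = mean token log-likelihood

    text = "A sample article whose origin (human or machine) we want to classify."
    score = avg_log_likelihood(text)
    # Machine-generated text tends to be more predictable (higher likelihood); the
    # cut-off of -3.0 is an arbitrary placeholder that would need to be calibrated.
    print("machine-generated" if score > -3.0 else "human-written", round(score, 2))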


Multilingual Previously Fact-Checked Claim Retrieval

arXiv.org Artificial Intelligence

Fact-checkers are often hampered by the sheer amount of online content that needs to be fact-checked. NLP can help them by retrieving already existing fact-checks relevant to the content being investigated. This paper introduces a new multilingual dataset -- MultiClaim -- for previously fact-checked claim retrieval. We collected 28k posts in 27 languages from social media, 206k fact-checks in 39 languages written by professional fact-checkers, as well as 31k connections between these two groups. This is the most extensive and the most linguistically diverse dataset of this kind to date. We evaluated how different unsupervised methods fare on this dataset and its various dimensions. We show that evaluating such a diverse dataset has its complexities and that proper care needs to be taken before interpreting the results. We also evaluated a supervised fine-tuning approach, which improves significantly upon the unsupervised methods.
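
A minimal sketch of the unsupervised setup described above: embed the social-media post and the fact-checks with a multilingual sentence encoder and rank fact-checks by cosine similarity. The encoder name and the toy post/fact-check pair are illustrative assumptions, not the MultiClaim configuration itself.

    # Embedding-based retrieval of previously fact-checked claims (illustrative sketch).
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

    post = "Drinking hot water cures the virus, doctors confirm."
    fact_checks = [
        "False: hot water does not cure or prevent viral infections.",
        "The claim that a new tax applies to all pensioners is misleading.",
    ]

    post_emb = model.encode(post, convert_to_tensor=True)
    fc_embs = model.encode(fact_checks, convert_to_tensor=True)

    # Rank all fact-checks by cosine similarity to the post and return the best match.
    scores = util.cos_sim(post_emb, fc_embs)[0]
    best = int(scores.argmax())
    print(fact_checks[best], float(scores[best]))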


In-Depth Look at Word Filling Societal Bias Measures

arXiv.org Artificial Intelligence

Language models (LMs) are ubiquitous in current NLP and have brought undeniable performance improvements for many tasks. At the same time, concerns have been raised about the fairness of these models (Blodgett et al., 2020; Shah et al., 2020; Dev et al., 2021b), since LMs are usually trained with web-based text corpora generated by a general population. We propose a way to improve the existing bias-measurement methodologies by introducing a new score definition. During our experiments we introduce several new variants of the existing datasets and a completely new dataset in Slovak. These new datasets are used to compare the expected behavior of the LMs with their actual behavior. Our results challenge the validity of previous studies.
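
For context, a word-filling bias measure of the kind analysed here can be sketched as follows: a masked LM fills a template slot and the probabilities assigned to a stereotypical versus an anti-stereotypical filler are compared. The template and word pair below are illustrative (they echo the stereotype names used in GEST above), not items from the datasets discussed, and both fillers are assumed to be single tokens in the model's vocabulary.

    # Word-filling bias score sketch: compare filler probabilities under a masked LM.
    from transformers import pipeline

    fill = pipeline("fill-mask", model="bert-base-cased")

    template = "Women are [MASK]."
    candidates = ["beautiful", "leaders"]  # stereotypical vs. anti-stereotypical filler

    probs = {}
    for pred in fill(template, targets=candidates):
        probs[pred["token_str"].strip()] = pred["score"]

    # A large gap between the two probabilities is read as a stereotypical preference;
    # a full measure aggregates such comparisons over many templates.
    print(probs)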


Average Is Not Enough: Caveats of Multilingual Evaluation

arXiv.org Artificial Intelligence

Recent research has led to improvements of various multilingual technologies, such as machine translation (Arivazhagan et al., 2019), multilingual language models (Devlin et al., 2019; Conneau and Lample, 2019), cross-lingual transfer learning (Pikuliak et al., 2021) or language-independent representations (Ruder et al., 2019). It is now possible to create well-performing multilingual methods for many tasks. When dealing with multilingual methods, we need to be able to evaluate how good they really are, i.e. how effective they are for the individual languages, not just on average. We believe that such qualitative analysis is an often overlooked tool in our research toolkit that should be used more to ensure that we are able to properly interpret results from multilingual evaluation and detect various linguistic biases and problems. In addition to this discussion, which we consider a contribution in itself, we also propose a visualization based on the URIEL typological database (Littell et al., 2017) as an example of such qualitative analysis, and we show that it is able to discover linguistic biases in published results.
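
As an example of the kind of qualitative analysis proposed here, the sketch below correlates per-language scores with typological distance from English using the URIEL-based lang2vec package (assuming its distance API) and a Spearman correlation. The per-language scores are made-up placeholders purely to illustrate the analysis.

    # Correlate per-language results with URIEL typological distance from English.
    # Assumes the lang2vec package; the scores are placeholder values, not real results.
    from scipy.stats import spearmanr
    import lang2vec.lang2vec as l2v

    scores = {"deu": 0.81, "ces": 0.74, "slk": 0.72, "rus": 0.69, "zho": 0.55}  # placeholders
    dists = [l2v.distance("syntactic", "eng", lang) for lang in scores]

    rho, p = spearmanr(dists, list(scores.values()))
    # A strongly negative rho (performance dropping with typological distance) would hint
    # at a linguistic bias hidden behind the average score.
    print(f"Spearman rho={rho:.2f}, p={p:.3f}")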


Cross-Lingual Learning With Distributed Representations

AAAI Conferences

Cross-lingual learning can help to bring state-of-the-art deep learning solutions to smaller languages. These languages generally lack the resources needed to train advanced neural networks. By transferring knowledge across languages, we can improve results on various NLP tasks.