AITopics | swissbert

swissbert

Fine-tuning the SwissBERT Encoder Model for Embedding Sentences and Documents

Grosjean, Juri, Vamvas, Jannis

arXiv.org Artificial Intelligence, May-13-2024

Encoder models trained for the embedding of sentences or short documents have proven useful for tasks such as semantic search and topic modeling. In this paper, we present a version of the SwissBERT encoder model that we specifically fine-tuned for this purpose. SwissBERT contains language adapters for the four national languages of Switzerland -- German, French, Italian, and Romansh -- and has been pre-trained on a large number of news articles in those languages. Using contrastive learning based on a subset of these articles, we trained a fine-tuned version, which we call SentenceSwissBERT. Multilingual experiments on document retrieval and text classification in a Switzerland-specific setting show that SentenceSwissBERT surpasses the accuracy of the original SwissBERT model and of a comparable baseline. The model is openly available for research use.
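The abstract above names contrastive learning but does not spell out the objective. As a minimal sketch, assuming an InfoNCE-style loss with in-batch negatives (a common choice for sentence embeddings, not confirmed here as the authors' exact setup), the idea can be illustrated in plain Python. The function names, the temperature of 0.05, and the toy vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def in_batch_contrastive_loss(anchors, positives, temperature=0.05):
    """Mean InfoNCE-style loss over a batch.

    anchors[i] is expected to match positives[i]; every other
    positives[j] in the batch serves as a negative example.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over all candidates.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(
            sum(math.exp(x - max_logit) for x in logits)
        )
        # Cross-entropy with the matching positive as the target.
        total += log_sum_exp - logits[i]
    return total / len(anchors)


# Toy 3-dimensional "embeddings" (hypothetical values, not model output).
anchors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
aligned = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]]
swapped = [aligned[1], aligned[0]]  # deliberately mismatched pairs

loss_aligned = in_batch_contrastive_loss(anchors, aligned)
loss_swapped = in_batch_contrastive_loss(anchors, swapped)
print(loss_aligned < loss_swapped)
```

The loss is small when each anchor is closest to its own positive and large when the pairs are mismatched, which is the signal such training would use to pull related texts together in embedding space.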

dataset, proceedings, swissbert, (15 more...)

arXiv.org Artificial Intelligence

2405.07513

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
Europe > Switzerland > Neuchâtel > Neuchâtel (0.04)
(7 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.35)

Modular Adaptation of Multilingual Encoders to Written Swiss German Dialect

Vamvas, Jannis, Aepli, Noëmi, Sennrich, Rico

arXiv.org Artificial Intelligence, Jan-25-2024

Creating neural text encoders for written Swiss German is challenging due to a dearth of training data combined with dialectal variation. In this paper, we build on several existing multilingual encoders and adapt them to Swiss German using continued pre-training. Evaluation on three diverse downstream tasks shows that simply adding a Swiss German adapter to a modular encoder achieves 97.5% of fully monolithic adaptation performance. We further find that for the task of retrieving Swiss German sentences given Standard German queries, adapting a character-level model is more effective than the other adaptation strategies. We release our code and the models trained for our experiments at https://github.com/ZurichNLP/swiss-german-text-encoders
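The retrieval task mentioned above (finding the Swiss German sentence that matches a Standard German query) is typically scored by nearest-neighbour search over sentence embeddings. The following is a rough sketch of such a scoring loop, assuming cosine similarity and that query i's correct match is candidate i. The helper names and the toy vectors are hypothetical and do not come from the released code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve(query_vec, candidate_vecs):
    """Return the index of the candidate most similar to the query."""
    scores = [cosine(query_vec, c) for c in candidate_vecs]
    return max(range(len(scores)), key=scores.__getitem__)


def top1_accuracy(query_vecs, candidate_vecs):
    """Fraction of queries whose nearest candidate is the aligned one.

    Assumes query i's correct match is candidate i.
    """
    hits = sum(
        1 for i, q in enumerate(query_vecs)
        if retrieve(q, candidate_vecs) == i
    )
    return hits / len(query_vecs)


# Toy embeddings standing in for encoder output (hypothetical values).
queries = [[1.0, 0.2, 0.0], [0.0, 1.0, 0.3], [0.1, 0.0, 1.0]]
candidates = [[0.9, 0.1, 0.1], [0.1, 0.8, 0.2], [0.7, 0.1, 0.6]]

accuracy = top1_accuracy(queries, candidates)
print(f"top-1 retrieval accuracy: {accuracy:.2f}")
```

In a real evaluation the vectors would come from the adapted encoder, with one embedding per query sentence and one per candidate sentence.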

adapter, computational linguistic, proceedings, (15 more...)

arXiv.org Artificial Intelligence

2401.144

Country:

Europe > Switzerland > Zürich > Zürich (0.05)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
(13 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)

SCALE: Scaling up the Complexity for Advanced Language Model Evaluation

Rasiah, Vishvaksenan, Stern, Ronja, Matoshi, Veton, Stürmer, Matthias, Chalkidis, Ilias, Ho, Daniel E., Niklaus, Joel

arXiv.org Artificial Intelligence, Sep-1-2023

Recent strides in Large Language Models (LLMs) have saturated many NLP benchmarks (even professional domain-specific ones), emphasizing the need for novel, more challenging ones to properly assess LLM capabilities. In this paper, we introduce a novel NLP benchmark that poses challenges to current LLMs across four key dimensions: processing long documents (up to 50K tokens), utilizing domain-specific knowledge (embodied in legal texts), multilingual understanding (covering five languages), and multitasking (comprising legal document-to-document Information Retrieval, Court View Generation, Leading Decision Summarization, Citation Extraction, and eight challenging Text Classification tasks). Our benchmark comprises diverse legal NLP datasets from the Swiss legal system, allowing for a comprehensive study of the underlying non-English, inherently multilingual, federal legal system. Despite recent advances, efficiently processing long documents for intense review/analysis tasks remains an open challenge for language models. Also, comprehensive, domain-specific benchmarks requiring high expertise to develop are rare, as are multilingual benchmarks. This scarcity underscores our contribution's value, considering most public models are trained predominantly on English corpora, while other languages remain understudied, particularly for practical domain-specific NLP tasks. Our benchmark allows for testing and advancing state-of-the-art LLMs. As part of our study, we evaluate several pre-trained multilingual language models on our benchmark to establish strong baselines as a point of reference. Despite the large size of our datasets (tens to hundreds of thousands of examples), existing publicly available models struggle with most tasks, even after in-domain pretraining. We publish all resources (benchmark suite, pre-trained models, code) under a fully permissive open CC BY-SA license.

citation extraction, federal supreme court decision, international joint conference, (13 more...)

arXiv.org Artificial Intelligence

2306.09237

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Germany (0.14)
Europe > France (0.14)
(29 more...)

Genre: Research Report > New Finding (0.92)

Industry:

Law > Criminal Law (0.92)
Government > Regional Government > Europe Government (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

SwissBERT: The Multilingual Language Model for Switzerland

Vamvas, Jannis, Graën, Johannes, Sennrich, Rico

arXiv.org Artificial Intelligence, Jun-12-2023

We present SwissBERT, a masked language model created specifically for processing Switzerland-related text. SwissBERT is a pre-trained model that we adapted to news articles written in the national languages of Switzerland -- German, French, Italian, and Romansh. We evaluate SwissBERT on natural language understanding tasks related to Switzerland and find that it tends to outperform previous models on these tasks, especially when processing contemporary news and/or Romansh Grischun. Since SwissBERT uses language adapters, it may be extended to Swiss German dialects in future work. The model and our open-source code are publicly released at https://github.com/ZurichNLP/swissbert.

artificial intelligence, natural language, text processing, (19 more...)

arXiv.org Artificial Intelligence

2303.1331

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Dominican Republic (0.04)
(15 more...)

Genre: Research Report (1.00)

Industry: Energy (0.68)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.94)
