AITopics | Sifa, Rafet


Sifa, Rafet


[Vision Paper] PRObot: Enhancing Patient-Reported Outcome Measures for Diabetic Retinopathy using Chatbots and Generative AI

Pielka, Maren, Schneider, Tobias, Terheyden, Jan, Sifa, Rafet

arXiv.org Artificial Intelligence, Nov-5-2024

We present an outline of the first large language model (LLM) based chatbot application in the context of patient-reported outcome measures (PROMs) for diabetic retinopathy. By utilizing the capabilities of current LLMs, we enable patients to provide feedback about their quality of life and treatment progress via an interactive application. The proposed framework offers significant advantages over the current approach, which encompasses only qualitative collection of survey data or a static survey with limited answer options. Using the PRObot LLM-PROM application, patients will be asked tailored questions about their individual challenges and can give more detailed feedback on the progress of their treatment. Based on this input, we will use machine learning to infer conventional PROM scores, which can be used by clinicians to evaluate the treatment status. The goal of the application is to improve adherence to the healthcare system and treatments, and thus ultimately reduce cases of subsequent vision impairment. The approach needs to be further validated using a survey and a clinical study.

large language model, machine learning, prom, (17 more...)

arXiv.org Artificial Intelligence

2411.02973

Country: Europe > Germany > North Rhine-Westphalia (0.14)

Genre:

Questionnaire & Opinion Survey (0.69)
Research Report > New Finding (0.34)

Industry:

Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.61)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.64)


Informed Deep Abstaining Classifier: Investigating noise-robust training for diagnostic decision support systems

Schneider, Helen, Nowak, Sebastian, Parikh, Aditya, Layer, Yannik C., Theis, Maike, Block, Wolfgang, Sprinkart, Alois M., Attenberger, Ulrike, Sifa, Rafet

arXiv.org Artificial Intelligence, Oct-28-2024

Image-based diagnostic decision support systems (DDSS) utilizing deep learning have the potential to optimize clinical workflows. However, developing DDSS requires extensive datasets with expert annotations and is therefore costly. Leveraging report contents from radiological databases with Natural Language Processing to annotate the corresponding image data promises to replace labor-intensive manual annotation. As mining "real world" databases can introduce label noise, noise-robust training losses are of great interest. However, current noise-robust losses do not consider noise estimations that can, for example, be derived from the performance of the automatic label generator used. In this study, we expand the noise-robust Deep Abstaining Classifier (DAC) loss to an Informed Deep Abstaining Classifier (IDAC) loss by incorporating noise level estimations during training. Our findings demonstrate that IDAC enhances the noise robustness compared to DAC and several state-of-the-art loss functions. The results are obtained on various simulated noise levels using a public chest X-ray dataset. These findings are reproduced on an in-house noisy dataset, where labels were extracted from the clinical systems of the University Hospital Bonn by a text-based transformer. The IDAC can therefore be a valuable tool for researchers, companies or clinics aiming to develop accurate and reliable DDSS from routine clinical data.
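To make the abstention idea concrete, the following is a minimal sketch of the per-sample DAC loss as it is commonly stated in the earlier DAC literature: the classifier has one extra "abstain" output, and a weight `alpha` penalizes abstention. This is an illustration under that assumption, not the authors' code; the function name and argument layout are hypothetical, and the noise-informed modification that defines IDAC is not reproduced here because the abstract does not specify it.

```python
import math
from typing import Sequence


def dac_loss(probs: Sequence[float], true_class: int, alpha: float) -> float:
    """Per-sample Deep Abstaining Classifier loss (illustrative sketch).

    probs      -- softmax output with k class entries followed by one
                  abstention entry (so len(probs) == k + 1)
    true_class -- index of the (possibly noisy) label, 0 <= true_class < k
    alpha      -- abstention penalty weight (assumed to be set by the caller)
    """
    p_abstain = probs[-1]
    keep = 1.0 - p_abstain  # probability mass assigned to the real classes
    if keep <= 0.0 or probs[true_class] <= 0.0:
        raise ValueError("loss is undefined for zero probability mass")
    # Cross-entropy over the class probabilities renormalized without abstention.
    cross_entropy = -math.log(probs[true_class] / keep)
    # Abstaining scales down the cross-entropy term but pays the alpha penalty.
    return keep * cross_entropy + alpha * math.log(1.0 / keep)
```

With zero abstention mass the expression reduces to ordinary cross-entropy, which is a quick sanity check on the formula.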

decision support system, machine learning, noise level, (17 more...)

arXiv.org Artificial Intelligence

2410.21014

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Nuclear Medicine (0.89)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Decision Support Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Teuken-7B-Base & Teuken-7B-Instruct: Towards European LLMs

Ali, Mehdi, Fromm, Michael, Thellmann, Klaudia, Ebert, Jan, Weber, Alexander Arno, Rutmann, Richard, Jain, Charvi, Lübbering, Max, Steinigen, Daniel, Leveling, Johannes, Klug, Katrin, Buschhoff, Jasper Schulze, Jurkschat, Lena, Abdelwahab, Hammam, Stein, Benny Jörg, Sylla, Karl-Heinz, Denisov, Pavel, Brandizzi, Nicolo', Saleem, Qasid, Bhowmick, Anirban, Helmer, Lennard, John, Chelsea, Suarez, Pedro Ortiz, Ostendorff, Malte, Jude, Alex, Manjunath, Lalith, Weinbach, Samuel, Penke, Carolin, Filatov, Oleg, Asaadi, Shima, Barth, Fabio, Sifa, Rafet, Küch, Fabian, Herten, Andreas, Jäkel, René, Rehm, Georg, Kesselheim, Stefan, Köhler, Joachim, Flores-Herr, Nicolas

arXiv.org Artificial Intelligence, Oct-15-2024

We present two multilingual LLMs designed to embrace Europe's linguistic diversity by supporting all 24 official languages of the European Union. Trained on a dataset comprising around 60% non-English data and utilizing a custom multilingual tokenizer, our models address the limitations of existing LLMs that predominantly focus on English or a few high-resource languages. We detail the models' development principles, i.e., data composition, tokenizer optimization, and training methodologies. The models demonstrate competitive performance across multilingual benchmarks, as evidenced by their performance on European versions of ARC, HellaSwag, MMLU, and TruthfulQA.

large language model, machine learning, meta-llama-3, (15 more...)

arXiv.org Artificial Intelligence

2410.0373

Country:

Europe (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (1.00)

Industry: Government > Regional Government > Europe Government (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.75)


Pointer-Guided Pre-Training: Infusing Large Language Models with Paragraph-Level Contextual Awareness

Hillebrand, Lars, Pradhan, Prabhupad, Bauckhage, Christian, Sifa, Rafet

arXiv.org Artificial Intelligence, Jun-6-2024

We introduce "pointer-guided segment ordering" (SO), a novel pre-training technique aimed at enhancing the contextual understanding of paragraph-level text representations in large language models. Our methodology leverages a self-attention-driven pointer network to restore the original sequence of shuffled text segments, addressing the challenge of capturing the structural coherence and contextual dependencies within documents. This pre-training approach is complemented by a fine-tuning methodology that incorporates dynamic sampling, augmenting the diversity of training instances and improving sample efficiency for various downstream applications. We evaluate our method on a diverse set of datasets, demonstrating its efficacy in tasks requiring sequential text classification across scientific literature and financial reporting domains. Our experiments show that pointer-guided pre-training significantly enhances the model's ability to understand complex document structures, leading to state-of-the-art performance in downstream classification tasks.
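As a rough illustration of the segment-ordering objective described above, the sketch below builds one training example: it shuffles a document's segments and records, for each original position, where that segment ended up. A pointer network would then be trained to predict these targets. The function names and the seeding scheme are assumptions made for this example; the model itself and the paper's actual data pipeline are not shown.

```python
import random
from typing import List, Tuple


def make_ordering_example(segments: List[str], seed: int = 0) -> Tuple[List[str], List[int]]:
    """Shuffle segments and return (shuffled, targets).

    targets[k] is the index in `shuffled` that holds the k-th segment of the
    original document, i.e. the position a pointer should select at step k.
    """
    rng = random.Random(seed)  # local generator keeps the example reproducible
    order = list(range(len(segments)))
    rng.shuffle(order)
    shuffled = [segments[i] for i in order]
    targets = [0] * len(order)
    for position, original_index in enumerate(order):
        targets[original_index] = position
    return shuffled, targets


def restore(shuffled: List[str], targets: List[int]) -> List[str]:
    """Rebuild the original order by following the pointer targets."""
    return [shuffled[position] for position in targets]
```

Calling `restore(*make_ordering_example(segments))` returns the original segment list, which is the behavior a perfectly trained ordering model would reproduce.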

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2406.04156

Country: Europe > Germany > North Rhine-Westphalia (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


SugarViT -- Multi-objective Regression of UAV Images with Vision Transformers and Deep Label Distribution Learning Demonstrated on Disease Severity Prediction in Sugar Beet

Günder, Maurice, Yamati, Facundo Ramón Ispizua, Alcántara, Abel Andree Barreto, Mahlein, Anne-Katrin, Sifa, Rafet, Bauckhage, Christian

arXiv.org Artificial Intelligence, Feb-1-2024

Remote sensing and artificial intelligence are pivotal technologies of precision agriculture. The efficient retrieval of large-scale field imagery combined with machine learning techniques shows success in various tasks like phenotyping, weeding, cropping, and disease control. This work introduces a machine learning framework for automated large-scale plant-specific trait annotation, demonstrated on the use case of disease severity scoring for Cercospora Leaf Spot (CLS) in sugar beet. With concepts of Deep Label Distribution Learning (DLDL), special loss functions, and a tailored model architecture, we develop an efficient Vision Transformer based model for disease severity scoring called SugarViT. One novelty in this work is the combination of remote sensing data with environmental parameters of the experimental sites for disease severity prediction. Although the model is evaluated on this special use case, it is kept as generic as possible so that it is also applicable to various image-based classification and regression tasks. With our framework, it is even possible to learn models on multi-objective problems, as we show by pretraining on environmental metadata.

artificial intelligence, information, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2311.03076

Country: Europe > Germany (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine (1.00)
Food & Agriculture > Agriculture (1.00)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.54)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.82)


Generating Prototypes for Contradiction Detection Using Large Language Models and Linguistic Rules

Pielka, Maren, Schmidt, Svetlana, Sifa, Rafet

arXiv.org Artificial Intelligence, Oct-23-2023

Detecting contradictions in text is one of the hardest tasks for a language model to comprehend. This is due to the complex semantic nature of contradictions, and the variety of contexts in which they can occur. For this reason, a multitude of data sets and models have been developed to solve this task. Meanwhile, the recent onset of large generative language models has given rise to new possibilities for problem solving as well as data augmentation, which we aim to explore in this …

The intuition is to use linguistic and factual rules where this is applicable, i.e. for contradictions based on antonymy, negations and numeric mismatches. For more complex relations such as factive or structural contradictions, we instruct a generative model to produce new samples, either based on given premises or on the type description alone. To our knowledge, this is the first work implementing such a hybrid data generation method with respect to NLI.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2310.14732

Country:

North America > United States (0.29)
Europe > Germany (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)


Controlled Randomness Improves the Performance of Transformer Models

Deußer, Tobias, Zhao, Cong, Krämer, Wolfgang, Leonhard, David, Bauckhage, Christian, Sifa, Rafet

arXiv.org Artificial Intelligence, Oct-20-2023

The emergence of pre-trained transformer models brought a massive breakthrough in the field of natural language processing. During pre-training, such transformer models can learn generic language representations with strong generalization capabilities by applying a self-supervised learning approach and leveraging large text corpora. These pre-trained language models can be fine-tuned on various downstream tasks without training from scratch, which significantly reduces training costs compared to traditional training methods while achieving excellent performance. Models like BERT (Devlin et al., 2019), ELECTRA (Clark et al., 2020), or T5 (Raffel et al., 2020) have achieved remarkable results on several language processing tasks and the most recent developments of even larger language models, made prominent by GPT-3 (Brown et al., 2020) and GPT-4 (OpenAI, 2023) but not limited to these two

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2310.13526

Country: Europe > Germany (0.28)

Genre: Research Report (1.00)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Tokenizer Choice For LLM Training: Negligible or Crucial?

Ali, Mehdi, Fromm, Michael, Thellmann, Klaudia, Rutmann, Richard, Lübbering, Max, Leveling, Johannes, Klug, Katrin, Ebert, Jan, Doll, Niclas, Buschhoff, Jasper Schulze, Jain, Charvi, Weber, Alexander Arno, Jurkschat, Lena, Abdelwahab, Hammam, John, Chelsea, Suarez, Pedro Ortiz, Ostendorff, Malte, Weinbach, Samuel, Sifa, Rafet, Kesselheim, Stefan, Flores-Herr, Nicolas

arXiv.org Artificial Intelligence, Oct-18-2023

The recent success of LLMs has been predominantly driven by curating the training dataset composition, scaling of model architectures and dataset sizes, and advancements in pretraining objectives, leaving tokenizer influence as a blind spot. Shedding light on this underexplored area, we conduct a comprehensive study on the influence of tokenizer choice on LLM downstream performance by training 24 mono- and multilingual LLMs at a 2.6B parameter scale, ablating different tokenizer algorithms and parameterizations. Our studies highlight that the tokenizer choice can significantly impact the model's downstream performance as well as training and inference costs. In particular, we find that the common tokenizer evaluation metrics fertility and parity are not always predictive of model downstream performance, rendering these metrics a questionable proxy for the model's downstream performance. Furthermore, we show that multilingual tokenizers trained on the five most frequent European languages require a vocabulary size larger by a factor of three in comparison to English. While English-only tokenizers have been applied to the training of multilingual LLMs, we find that this approach results in a severe downstream performance degradation and additional training costs of up to 68%, due to an inefficient tokenization vocabulary.
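The two metrics named above can be illustrated with a small sketch. It assumes the usual definitions: fertility as the average number of tokens per whitespace-separated word, and parity as the ratio of token counts for a pair of parallel sentences (a value near 1 means both languages are tokenized at similar cost). `chunk_tokenizer` is a toy stand-in invented for this example, not a tokenizer from the study.

```python
from typing import Callable, List

Tokenizer = Callable[[str], List[str]]


def chunk_tokenizer(text: str, size: int = 4) -> List[str]:
    """Toy stand-in tokenizer: cut every word into chunks of at most `size` characters."""
    tokens: List[str] = []
    for word in text.split():
        tokens.extend(word[i:i + size] for i in range(0, len(word), size))
    return tokens


def fertility(tokenize: Tokenizer, text: str) -> float:
    """Average number of tokens produced per whitespace-separated word."""
    words = text.split()
    if not words:
        raise ValueError("text contains no words")
    return len(tokenize(text)) / len(words)


def parity(tokenize: Tokenizer, sentence_a: str, sentence_b: str) -> float:
    """Token-count ratio for two parallel sentences; 1.0 means equal cost."""
    tokens_b = tokenize(sentence_b)
    if not tokens_b:
        raise ValueError("reference sentence produced no tokens")
    return len(tokenize(sentence_a)) / len(tokens_b)
```

For instance, `fertility(chunk_tokenizer, "the cat sat on the mat")` is 1.0 because every word fits in one chunk, while a sentence with longer words yields a higher value. A real evaluation would swap in the tokenizer under test and average over a corpus.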

large language model, natural language, tokenizer choice, (2 more...)

arXiv.org Artificial Intelligence

2310.08754

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)


Informed Named Entity Recognition Decoding for Generative Language Models

Deußer, Tobias, Hillebrand, Lars, Bauckhage, Christian, Sifa, Rafet

arXiv.org Artificial Intelligence, Aug-15-2023

Ever-larger language models with ever-increasing capabilities are by now well-established text processing tools. Alas, information extraction tasks such as named entity recognition are still largely unaffected by this progress as they are primarily based on the previous generation of encoder-only transformer models. Here, we propose a simple yet effective approach, Informed Named Entity Recognition Decoding (iNERD), which treats named entity recognition as a generative process. It leverages the language understanding capabilities of recent generative models in a future-proof manner and employs an informed decoding scheme incorporating the restricted nature of information extraction into open-ended text generation, improving performance and eliminating any risk of hallucinations. We coarse-tune our model on a merged named entity corpus to strengthen its performance, evaluate five generative language models on eight named entity recognition datasets, and achieve remarkable results, especially in an environment with an unknown entity class set, demonstrating the adaptability of the approach.

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2308.07791

Country:

Europe > Germany (0.28)
North America (0.28)

Genre: Research Report (1.00)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Improving Zero-Shot Text Matching for Financial Auditing with Large Language Models

Hillebrand, Lars, Berger, Armin, Deußer, Tobias, Dilmaghani, Tim, Khaled, Mohamed, Kliem, Bernd, Loitz, Rüdiger, Pielka, Maren, Leonhard, David, Bauckhage, Christian, Sifa, Rafet

arXiv.org Artificial Intelligence, Aug-14-2023

Auditing financial documents is a very tedious and time-consuming process. As of today, it can already be simplified by employing AI-based solutions to recommend relevant text passages from a report for each legal requirement of rigorous accounting standards. However, these methods need to be fine-tuned regularly, and they require abundant annotated data, which is often lacking in industrial environments. Hence, we present ZeroShotALI, a novel recommender system that leverages a state-of-the-art large language model (LLM) in conjunction with a domain-specifically optimized transformer-based text-matching solution. We find that a two-step approach of first retrieving a number of best matching document sections per legal requirement with a custom BERT-based model and second filtering these selections using an LLM yields significant performance improvements over existing approaches.
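The two-step structure described above (retrieve candidate sections, then filter them) can be sketched as follows. Everything here is a simplified stand-in: `overlap_score` replaces the custom BERT-based matcher with plain word overlap, and the `keep` callback stands in for the LLM judgment. Both are hypothetical helpers for illustration and say nothing about how ZeroShotALI is actually implemented.

```python
from typing import Callable, List


def overlap_score(requirement: str, section: str) -> float:
    """Stand-in relevance score: Jaccard overlap of lowercased word sets."""
    req_words = set(requirement.lower().split())
    sec_words = set(section.lower().split())
    union = req_words | sec_words
    return len(req_words & sec_words) / len(union) if union else 0.0


def retrieve_then_filter(
    requirement: str,
    sections: List[str],
    top_k: int,
    keep: Callable[[str, str], bool],
) -> List[str]:
    """Step 1: rank sections and keep the top_k. Step 2: filter with `keep`."""
    ranked = sorted(sections, key=lambda s: overlap_score(requirement, s), reverse=True)
    candidates = ranked[:top_k]
    return [section for section in candidates if keep(requirement, section)]
```

The design point is that the cheap first stage narrows the search so the more expensive second-stage judgment only runs on a handful of candidates per requirement.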

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2308.06111

Country: Europe > Ireland (0.16)

Genre: Research Report > New Finding (0.46)

Industry: Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
