Collaborating Authors

 Kraus, Mathias


CareerBERT: Matching Resumes to ESCO Jobs in a Shared Embedding Space for Generic Job Recommendations

arXiv.org Artificial Intelligence

The rapidly evolving labor market, driven by technological advancements and economic shifts, presents significant challenges for traditional job matching and consultation services. In response, we introduce an advanced support tool for career counselors and job seekers based on CareerBERT, a novel approach that leverages the power of unstructured textual data sources, such as resumes, to provide more accurate and comprehensive job recommendations. In contrast to previous approaches that primarily focus on job recommendations based on a fixed set of concrete job advertisements, our approach involves the creation of a corpus that combines data from the European Skills, Competences, and Occupations (ESCO) taxonomy and EURopean Employment Services (EURES) job advertisements, ensuring an up-to-date and well-defined representation of general job titles in the labor market. Our two-step evaluation approach, consisting of an application-grounded evaluation using EURES job advertisements and a human-grounded evaluation using real-world resumes and Human Resources (HR) expert feedback, provides a comprehensive assessment of CareerBERT's performance. Our experimental results demonstrate that CareerBERT outperforms both traditional and state-of-the-art embedding approaches while showing robust effectiveness in human expert evaluations. These results confirm the effectiveness of CareerBERT in supporting career consultants by generating relevant job recommendations based on resumes, ultimately enhancing the efficiency of job consultations and expanding the perspectives of job seekers. This research contributes to the field of NLP and job recommendation systems, offering valuable insights for both researchers and practitioners in the domain of career consulting and job matching.
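The abstract does not spell out implementation details of CareerBERT; as a rough illustration of the shared-embedding-space idea, the following sketch ranks a handful of placeholder ESCO-style job titles against a resume with a generic sentence-embedding model (the model name, titles, and resume text are assumptions, not the actual CareerBERT model or corpus):

```python
# Minimal sketch of resume-to-job matching in a shared embedding space.
# The encoder and the job-title list are placeholders, not CareerBERT itself.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in for a domain-adapted encoder

esco_job_titles = [
    "data scientist",
    "career counsellor",
    "software developer",
]
resume_text = "Experienced in Python, statistics, and building predictive models."

# Encode the job titles and the resume into the same vector space.
job_embeddings = model.encode(esco_job_titles, convert_to_tensor=True)
resume_embedding = model.encode(resume_text, convert_to_tensor=True)

# Rank job titles by cosine similarity to the resume.
scores = util.cos_sim(resume_embedding, job_embeddings)[0]
ranked = sorted(zip(esco_job_titles, scores.tolist()), key=lambda x: x[1], reverse=True)
for title, score in ranked:
    print(f"{title}: {score:.3f}")
```

Ranking generic job titles rather than individual job advertisements mirrors the paper's stated goal of recommending well-defined occupations instead of a fixed set of concrete postings.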


Hate Speech and Sentiment of YouTube Video Comments From Public and Private Sources Covering the Israel-Palestine Conflict

arXiv.org Artificial Intelligence

This study explores the prevalence of hate speech (HS) and sentiment in YouTube video comments concerning the Israel-Palestine conflict by analyzing content from both public and private news sources. The research involved annotating 4983 comments for HS and sentiments (neutral, pro-Israel, and pro-Palestine). Subsequently, machine learning (ML) models were developed, demonstrating robust predictive capabilities with area under the receiver operating characteristic (AUROC) scores ranging from 0.83 to 0.90. These models were applied to the extracted comment sections of YouTube videos from public and private sources, uncovering a higher incidence of HS in public sources (40.4%) compared to private sources (31.6%). Sentiment analysis revealed a predominantly neutral stance in both source types, with more pronounced sentiments towards Israel and Palestine observed in public sources. This investigation highlights the dynamic nature of online discourse surrounding the Israel-Palestine conflict and underscores the potential of moderating content in a politically charged environment.
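The abstract does not state which model family was used; as a purely illustrative baseline for how a hate-speech classifier can be trained and scored with AUROC, a minimal sketch (with toy placeholder comments and labels, not the annotated dataset) could look like this:

```python
# Illustrative baseline for binary hate-speech detection evaluated with AUROC.
# The comments and labels below are toy placeholders, not the annotated data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

comments = ["neutral comment about the news", "hateful slur against a group",
            "thoughtful analysis of the conflict", "another hostile insult"]
labels = [0, 1, 0, 1]  # 1 = hate speech

X_train, X_test, y_train, y_test = train_test_split(
    comments, labels, test_size=0.5, random_state=0, stratify=labels)

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(X_train, y_train)

# AUROC is computed from the predicted probability of the positive class.
probs = clf.predict_proba(X_test)[:, 1]
print("AUROC:", roc_auc_score(y_test, probs))
```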


The Impact of Transparency in AI Systems on Users' Data-Sharing Intentions: A Scenario-Based Experiment

arXiv.org Artificial Intelligence

Artificial Intelligence (AI) systems are frequently employed in online services to provide personalized experiences to users based on large collections of data. However, AI systems can be designed in different ways, with black-box AI systems appearing as complex data-processing engines and white-box AI systems appearing as fully transparent data-processors. As such, it is reasonable to assume that these different design choices also affect user perception and thus their willingness to share data. To this end, we conducted a pre-registered, scenario-based online experiment with 240 participants and investigated how transparent and non-transparent data-processing entities influenced data-sharing intentions. Surprisingly, our results revealed no significant difference in willingness to share data across entities, challenging the notion that transparency increases data-sharing willingness. Furthermore, we found that a general attitude of trust towards AI has a significant positive influence, especially in the transparent AI condition, whereas privacy concerns did not significantly affect data-sharing decisions.


Quantifying Visual Properties of GAM Shape Plots: Impact on Perceived Cognitive Load and Interpretability

arXiv.org Artificial Intelligence

Generalized Additive Models (GAMs) offer a balance between performance and interpretability in machine learning. The interpretability aspect of GAMs is expressed through shape plots, which represent the model's decision-making process. However, the visual properties of these plots, e.g., the number of kinks (local maxima and minima), can increase their complexity and the cognitive load imposed on the viewer, compromising interpretability. Our study, including 57 participants, investigates the relationship between the visual properties of GAM shape plots and the cognitive load they induce. We quantify various visual properties of shape plots and evaluate their alignment with participants' perceived cognitive load, based on 144 plots. Our results indicate that the number of kinks is the most effective metric, explaining 86.4% of the variance in users' ratings. We develop a simple model based on the number of kinks that provides a practical tool for predicting cognitive load, enabling the assessment of one aspect of GAM interpretability without direct user involvement.
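The exact way the paper quantifies kinks is not given in the abstract; a minimal sketch of the underlying idea, counting local maxima and minima of a shape function sampled on a grid (the synthetic sine curve stands in for a fitted GAM shape plot), might look like this:

```python
# Sketch of the "number of kinks" idea: count local maxima and minima of a
# shape function sampled on a grid. The shape function here is synthetic and
# stands in for one feature's shape plot from a fitted GAM.
import numpy as np

def count_kinks(y: np.ndarray) -> int:
    """Count strict local maxima and minima in a 1-D sequence of shape values."""
    diffs = np.diff(y)
    diffs = diffs[diffs != 0]                 # ignore flat segments
    sign_changes = np.diff(np.sign(diffs))
    return int(np.sum(sign_changes != 0))

x = np.linspace(0, 4 * np.pi, 200)
shape_values = np.sin(x)                      # synthetic shape function with 4 kinks
print("Number of kinks:", count_kinks(shape_values))
```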


Challenging the Performance-Interpretability Trade-off: An Evaluation of Interpretable Machine Learning Models

arXiv.org Artificial Intelligence

Machine learning is permeating every conceivable domain to promote data-driven decision support. The focus is often on advanced black-box models due to their assumed performance advantages, whereas interpretable models are often associated with inferior predictive qualities. More recently, however, a new generation of generalized additive models (GAMs) has been proposed that offer promising properties for capturing complex, non-linear patterns while remaining fully interpretable. To uncover the merits and limitations of these models, this study examines the predictive performance of seven different GAMs in comparison to seven commonly used machine learning models based on a collection of twenty tabular benchmark datasets. To ensure a fair and robust model comparison, an extensive hyperparameter search combined with cross-validation was performed, resulting in 68,500 model runs. In addition, this study qualitatively examines the visual output of the models to assess their level of interpretability. Based on these results, the paper dispels the misconception that only black-box models can achieve high accuracy by demonstrating that there is no strict trade-off between predictive performance and model interpretability for tabular data. Furthermore, the paper discusses the importance of GAMs as powerful interpretable models for the field of information systems and derives implications for future work from a socio-technical perspective.
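The specific GAM implementations and search spaces are not listed in the abstract; as a small sketch of the comparison protocol only (hyperparameter search nested inside cross-validation so each model family is tuned fairly), two sklearn models serve as stand-ins for the interpretable and black-box candidates:

```python
# Sketch of the comparison protocol: hyperparameter search nested inside
# cross-validation. The two sklearn models are stand-ins for the GAMs and
# black-box models benchmarked in the paper.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

candidates = {
    "logistic_regression": GridSearchCV(
        make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        {"logisticregression__C": [0.01, 0.1, 1, 10]}, cv=3),
    "gradient_boosting": GridSearchCV(
        GradientBoostingClassifier(random_state=0),
        {"n_estimators": [50, 200]}, cv=3),
}

for name, search in candidates.items():
    # Outer CV estimates generalization; the inner CV inside GridSearchCV tunes.
    scores = cross_val_score(search, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: mean AUROC = {scores.mean():.3f}")
```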


Towards Faithful and Robust LLM Specialists for Evidence-Based Question-Answering

arXiv.org Artificial Intelligence

Advances towards more faithful and traceable answers from Large Language Models (LLMs) are crucial for various research and practical endeavors. One avenue towards this goal is basing answers on reliable sources. However, this Evidence-Based QA has proven to work insufficiently with LLMs in terms of citing the correct sources (source quality) and truthfully representing the information within sources (answer attributability). In this work, we systematically investigate how to robustly fine-tune LLMs for better source quality and answer attributability. Specifically, we introduce a data generation pipeline with automated data quality filters, which can synthesize diversified, high-quality training and testing data at scale. We further introduce four test sets to benchmark the robustness of fine-tuned specialist models. Extensive evaluation shows that fine-tuning on synthetic data improves performance on both in- and out-of-distribution data. Furthermore, we show that data quality, which can be drastically improved by the proposed quality filters, matters more than quantity in improving Evidence-Based QA.
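The paper's actual quality filters are not specified in the abstract; as a hypothetical example of what an automated filter in this spirit could check, the sketch below keeps a synthesized training example only if its answer cites at least one source and every cited source ID exists in the provided evidence set (the citation format and example are assumptions):

```python
# Hypothetical data quality filter: accept an answer only if it contains
# citations and every cited source ID appears in the supplied evidence set.
import re

def passes_citation_filter(answer: str, evidence_ids: set) -> bool:
    cited = set(re.findall(r"\[(\d+)\]", answer))
    return bool(cited) and cited.issubset(evidence_ids)

evidence_ids = {"1", "2", "3"}
good = "Emissions fell by 12% [2], driven mainly by the energy sector [3]."
bad = "Emissions fell sharply [7]."  # cites a source not in the evidence set

print(passes_citation_filter(good, evidence_ids))  # True
print(passes_citation_filter(bad, evidence_ids))   # False
```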


A machine learning framework for interpretable predictions in patient pathways: The case of predicting ICU admission for patients with symptoms of sepsis

arXiv.org Artificial Intelligence

Proactive analysis of patient pathways helps healthcare providers anticipate treatment-related risks, identify outcomes, and allocate resources. Machine learning (ML) can leverage a patient's complete health history to make informed decisions about future events. However, previous work has mostly relied on so-called black-box models, which are unintelligible to humans, making it difficult for clinicians to apply such models. Our work introduces PatWay-Net, an ML framework designed for interpretable predictions of admission to the intensive care unit (ICU) for patients with symptoms of sepsis. We propose a novel type of recurrent neural network and combine it with multi-layer perceptrons to process the patient pathways and produce predictive yet interpretable results. We demonstrate its utility through a comprehensive dashboard that visualizes patient health trajectories, predictive outcomes, and associated risks. Our evaluation includes both predictive performance - where PatWay-Net outperforms standard models such as decision trees, random forests, and gradient-boosted decision trees - and clinical utility, validated through structured interviews with clinicians. By providing improved predictive accuracy along with interpretable and actionable insights, PatWay-Net serves as a valuable tool for healthcare decision support in the critical case of patients with symptoms of sepsis.
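The abstract describes a novel interpretable recurrent cell combined with multi-layer perceptrons, but gives no architectural details; the following is only a rough architectural sketch (a plain GRU plus MLP, not the actual PatWay-Net cell) of fusing a sequential patient pathway with static patient attributes for a binary ICU-admission prediction:

```python
# Rough architectural sketch (not the actual PatWay-Net cell): a recurrent
# branch for the sequential pathway and an MLP branch for static attributes,
# fused into a binary ICU-admission prediction.
import torch
import torch.nn as nn

class PathwayModel(nn.Module):
    def __init__(self, n_seq_features: int, n_static_features: int, hidden: int = 16):
        super().__init__()
        self.rnn = nn.GRU(n_seq_features, hidden, batch_first=True)
        self.static_mlp = nn.Sequential(nn.Linear(n_static_features, hidden), nn.ReLU())
        self.head = nn.Linear(2 * hidden, 1)

    def forward(self, seq, static):
        _, h = self.rnn(seq)                      # last hidden state of the pathway
        fused = torch.cat([h[-1], self.static_mlp(static)], dim=1)
        return torch.sigmoid(self.head(fused))   # P(ICU admission)

model = PathwayModel(n_seq_features=5, n_static_features=3)
seq = torch.randn(2, 10, 5)      # batch of 2 pathways, 10 time steps, 5 measurements
static = torch.randn(2, 3)       # 3 static attributes (e.g., age, sex, comorbidity)
print(model(seq, static).shape)  # torch.Size([2, 1])
```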


IGANN Sparse: Bridging Sparsity and Interpretability with Non-linear Insight

arXiv.org Artificial Intelligence

Feature selection is a critical component in predictive analytics that significantly affects the prediction accuracy and interpretability of models. Intrinsic methods for feature selection are built directly into model learning, providing a fast and attractive option for large amounts of data. Machine learning algorithms such as penalized regression models (e.g., the lasso) are the most common choice for in-built feature selection. However, they fail to capture non-linear relationships, which ultimately affects their ability to predict outcomes in intricate datasets. In this paper, we propose IGANN Sparse, a novel machine learning model from the family of generalized additive models, which promotes sparsity through a non-linear feature selection process during training. This ensures interpretability through improved model sparsity without sacrificing predictive performance. Moreover, IGANN Sparse serves as an exploratory tool for information systems researchers to unveil important non-linear relationships in domains characterized by complex patterns. Our ongoing research is directed at a thorough evaluation of the IGANN Sparse model, including user studies that assess how well users of the model can benefit from the reduced number of features. This will allow for a deeper understanding of the interactions between linear versus non-linear modeling, the number of selected features, and predictive performance.
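IGANN Sparse's actual selection mechanism is not described in the abstract; the sketch below illustrates only the general idea of shape-function-based sparsity, dropping features whose learned shape function barely varies (the "shape functions" here are hand-written stand-ins for the per-feature functions a fitted GAM would produce):

```python
# Sketch of shape-function pruning (not IGANN Sparse itself): drop features
# whose shape function varies too little over the observed data.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))

# Stand-in "shape functions"; in practice these come from the fitted GAM.
shape_outputs = {
    0: np.sin(2 * X[:, 0]),   # strong non-linear effect
    1: 0.05 * X[:, 1],        # weak linear effect
    2: np.zeros(500),         # pure noise feature, flat shape function
}

threshold = 0.1  # minimum standard deviation a shape function must exhibit
selected = [j for j, f in shape_outputs.items() if f.std() > threshold]
print("Selected features:", selected)   # keeps feature 0, prunes 1 and 2
```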


A Globally Convergent Algorithm for Neural Network Parameter Optimization Based on Difference-of-Convex Functions

arXiv.org Artificial Intelligence

We propose an algorithm for optimizing the parameters of single hidden layer neural networks. Specifically, we derive a blockwise difference-of-convex (DC) functions representation of the objective function. Based on the latter, we propose a block coordinate descent (BCD) approach that we combine with a tailored difference-of-convex functions algorithm (DCA). We prove global convergence of the proposed algorithm. Furthermore, we mathematically analyze the convergence rate of parameters and the convergence rate in value (i.e., the training loss). We give conditions under which our algorithm converges linearly or even faster depending on the local shape of the loss function. We confirm our theoretical derivations numerically and compare our algorithm against state-of-the-art gradient-based solvers in terms of both training loss and test loss.
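The paper's blockwise DC decomposition is not reproduced in the abstract; purely as a generic reference point, the standard DCA iteration, together with one common (non-blockwise) way to obtain a DC split when the loss f has curvature bounded below by -ρ, can be written as follows:

```latex
% Generic DCA scheme (an assumption for illustration, not the paper's
% specific blockwise decomposition). Let f(\theta) = g(\theta) - h(\theta)
% with g, h convex; the second line gives one sufficient construction.
\begin{align}
  \theta^{k+1} &\in \arg\min_{\theta} \; g(\theta)
      - \big\langle \nabla h(\theta^{k}),\, \theta \big\rangle, \\
  g(\theta) &= f(\theta) + \tfrac{\rho}{2}\lVert\theta\rVert^{2}, \qquad
  h(\theta) = \tfrac{\rho}{2}\lVert\theta\rVert^{2}.
\end{align}
```

With this particular split, the DCA step reduces to a proximal-point update, θ^{k+1} = argmin_θ f(θ) + (ρ/2)‖θ − θ^k‖²; the blockwise construction derived in the paper will differ and is what enables the BCD structure and the stated convergence guarantees.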


Paradigm Shift in Sustainability Disclosure Analysis: Empowering Stakeholders with CHATREPORT, a Language Model-Based Tool

arXiv.org Artificial Intelligence

This paper introduces a novel approach to enhance Large Language Models (LLMs) with expert knowledge to automate the analysis of corporate sustainability reports by benchmarking them against the Task Force on Climate-related Financial Disclosures (TCFD) recommendations. Corporate sustainability reports are crucial in assessing organizations' environmental and social risks and impacts. However, the vast amount of information in these reports often makes human analysis too costly. As a result, only a few entities worldwide have the resources to analyze these reports, which could lead to a lack of transparency. While AI-powered tools can automatically analyze the data, they are prone to inaccuracies because they lack domain-specific expertise. We christen our tool CHATREPORT and apply it in a first use case to assess corporate climate risk disclosures following the TCFD recommendations. CHATREPORT results from a collaboration with experts in climate science, finance, economic policy, and computer science, demonstrating how domain experts can be involved in developing AI tools. We make our prompt templates, generated data, and scores available to the public to encourage transparency.
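The released CHATREPORT prompts are public, but they are not reproduced in the abstract; the sketch below is only a hypothetical prompt-template in the same spirit, injecting one TCFD recommendation and a retrieved report excerpt into a single prompt to be sent to any LLM client (the wording, function, and example texts are assumptions):

```python
# Hypothetical prompt-template sketch: pair a TCFD recommendation with a
# report excerpt and ask an LLM for an assessment. Not the CHATREPORT prompts.
PROMPT_TEMPLATE = """You are an analyst reviewing a corporate sustainability report.

TCFD recommendation:
{recommendation}

Report excerpt:
{excerpt}

Question: Does the excerpt address this recommendation? Give a short
assessment and a score from 0 (not addressed) to 100 (fully addressed)."""

def build_prompt(recommendation: str, excerpt: str) -> str:
    return PROMPT_TEMPLATE.format(recommendation=recommendation, excerpt=excerpt)

prompt = build_prompt(
    "Describe the board's oversight of climate-related risks and opportunities.",
    "The sustainability committee of the board reviews climate risks twice a year.",
)
print(prompt)  # send this string to the LLM client of your choice
```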