Enhancing Job Matching: Occupation, Skill and Qualification Linking with the ESCO and EQF taxonomies

Saroglou, Stylianos, Diamantaras, Konstantinos, Preta, Francesco, Delianidi, Marina, Benisis, Apostolos, Meyer, Christian Johannes

arXiv.org Artificial Intelligence

This study investigates the potential of language models to improve the classification of labor market information by linking job vacancy texts to two major European frameworks: the European Skills, Competences, Qualifications and Occupations (ESCO) taxonomy and the European Qualifications Framework (EQF). We examine and compare two prominent methodologies from the literature: Sentence Linking and Entity Linking. In support of ongoing research, we release an open-source tool, incorporating these two methodologies, designed to facilitate further work on labor classification and employment discourse. To move beyond surface-level skill extraction, we introduce two annotated datasets specifically aimed at evaluating how occupations and qualifications are represented within job vacancy texts. Additionally, we examine different ways to utilize generative large language models for this task. Our findings contribute to advancing the state of the art in job entity extraction and offer computational infrastructure for examining work, skills, and labor market narratives in a digitally mediated economy. Our code is made publicly available: https://github.com/tabiya-tech/tabiya-livelihoods-classifier
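The entity-linking route described above can be illustrated with a minimal sketch: an extracted job mention is embedded and matched against ESCO occupation labels by cosine similarity. This is not the authors' tabiya-livelihoods-classifier implementation; the occupation codes and the bag-of-words "encoder" below are toy stand-ins for a real sentence-embedding model.

```python
from collections import Counter
import math

# Toy ESCO-style occupation labels (illustrative subset, not the full taxonomy).
ESCO_OCCUPATIONS = {
    "2512.4": "software developer",
    "5131.1": "waiter",
    "2221.1": "nurse",
}

def embed(text: str) -> Counter:
    """Stand-in encoder: bag-of-words counts. A real linker would use a
    fine-tuned sentence-embedding model here instead."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def link_entity(mention: str) -> str:
    """Return the best-matching occupation code for an extracted job mention."""
    scores = {code: cosine(embed(mention), embed(label))
              for code, label in ESCO_OCCUPATIONS.items()}
    return max(scores, key=scores.get)

print(link_entity("senior software developer wanted"))  # → 2512.4
```

Sentence linking works analogously, except that whole vacancy sentences, rather than extracted mentions, are compared against taxonomy entries.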


Causal Synthetic Data Generation in Recruitment

Iommi, Andrea, Mastropietro, Antonio, Guidotti, Riccardo, Monreale, Anna, Ruggieri, Salvatore

arXiv.org Artificial Intelligence

The importance of Synthetic Data Generation (SDG) has increased significantly in domains where data quality is poor or access is limited due to privacy and regulatory constraints. One such domain is recruitment, where publicly available datasets are scarce due to the sensitive nature of information typically found in curricula vitae, such as gender, disability status, or age. This lack of accessible, representative data presents a significant obstacle to the development of fair and transparent machine learning models, particularly ranking algorithms that require large volumes of data to effectively learn how to recommend candidates. In the absence of such data, these models are prone to poor generalisation and may fail to perform reliably in real-world scenarios. Recent advances in Causal Generative Models (CGMs) offer a promising solution. CGMs enable the generation of synthetic datasets that preserve the underlying causal relationships within the data, providing greater control over fairness and interpretability in the data generation process. In this study, we present a specialised SDG method involving two CGMs: one modelling job offers and the other modelling curricula. Each model is structured according to a causal graph informed by domain expertise. We use these models to generate synthetic datasets and evaluate the fairness of candidate rankings under controlled scenarios that introduce specific biases.
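The core idea of a causal generative model, sampling each variable from its parents in a causal graph, can be sketched as follows. The variables and coefficients here are illustrative inventions, not the paper's expert-designed graphs; the point is only that a structural model makes the bias pathway (here, gender affecting recorded experience) an explicit, controllable edge.

```python
import random

# Minimal structural causal model sketch: gender -> years_experience -> score.
# All variable choices and coefficients are hypothetical.
def sample_candidate(rng: random.Random) -> dict:
    gender = rng.choice(["F", "M"])
    # Exogenous noise plus a causal effect of gender on experience, standing
    # in for a historical bias one may want to dial up or down.
    years = max(0.0, rng.gauss(8.0, 3.0) + (-1.0 if gender == "F" else 0.0))
    score = 0.1 * years + rng.gauss(0.0, 0.05)
    return {"gender": gender, "years": years, "score": score}

def generate(n: int, seed: int = 0) -> list[dict]:
    """Draw a synthetic dataset by ancestral sampling from the causal graph."""
    rng = random.Random(seed)
    return [sample_candidate(rng) for _ in range(n)]

data = generate(1000)
print(len(data))  # → 1000
```

Removing or rescaling the gender term in `years` yields the "controlled scenarios" idea: datasets that differ only in one causal mechanism, against which a ranker's fairness can be tested.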



references of the submitted work

Neural Information Processing Systems

We would like to thank all the reviewers for their constructive comments. We will stress this important fact. We expect similar results could hold for other losses. What is new and what is known? How does the qualification affect the learning bound?


GDPval: Evaluating AI Model Performance on Real-World Economically Valuable Tasks

Patwardhan, Tejal, Dias, Rachel, Proehl, Elizabeth, Kim, Grace, Wang, Michele, Watkins, Olivia, Fishman, Simón Posada, Aljubeh, Marwan, Thacker, Phoebe, Fauconnet, Laurance, Kim, Natalie S., Chao, Patrick, Miserendino, Samuel, Chabot, Gildas, Li, David, Sharman, Michael, Barr, Alexandra, Glaese, Amelia, Tworek, Jerry

arXiv.org Artificial Intelligence

We introduce GDPval, a benchmark evaluating AI model capabilities on real-world economically valuable tasks. GDPval covers the majority of U.S. Bureau of Labor Statistics Work Activities for 44 occupations across the top 9 sectors contributing to U.S. GDP (Gross Domestic Product). Tasks are constructed from the representative work of industry professionals with an average of 14 years of experience. We find that frontier model performance on GDPval is improving roughly linearly over time, and that the current best frontier models are approaching industry experts in deliverable quality. We analyze the potential for frontier models, when paired with human oversight, to perform GDPval tasks cheaper and faster than unaided experts. We also demonstrate that increased reasoning effort, increased task context, and increased scaffolding improve model performance on GDPval. Finally, we open-source a gold subset of 220 tasks and provide a public automated grading service at evals.openai.com to facilitate future research in understanding real-world model capabilities.


A Meta-Analysis of LLM Effects on Students across Qualification, Socialisation, and Subjectification

Huang, Jiayu, Wang, Ruoxin Ritter, Liu, Jen-Hao, Xia, Boming, Huang, Yue, Sun, Ruoxi, Xue, Jason Minhui, Zou, Jinan

arXiv.org Artificial Intelligence

Large language models (LLMs) are increasingly positioned as solutions for education, yet evaluations often reduce their impact to narrow performance metrics. This paper reframes the question by asking "what kind of impact should LLMs have in education?" Drawing on Biesta's tripartite account of good education (qualification, socialisation, and subjectification), we present a meta-analysis of 133 experimental and quasi-experimental studies (k = 188). Overall, the impact of LLMs on student learning is positive but uneven. Strong effects emerge in qualification, particularly when LLMs function as tutors in sustained interventions. Socialisation outcomes appear more variable, concentrated in sustained, reflective interventions. Subjectification, linked to autonomy and learner development, remains fragile, with improvements confined to small-scale, long-term studies. This purpose-level view highlights design as the decisive factor: without scaffolds for participation and agency, LLMs privilege what is easiest to measure while neglecting broader aims of education. For HCI and education, the issue is not just whether LLMs work, but what futures they enable or foreclose.


Unified Crew Planning and Replanning Optimization in Multi-Line Metro Systems Considering Workforce Heterogeneity

Chen, Qihang

arXiv.org Artificial Intelligence

Metro crew planning is a key component of smart city development, as it directly impacts the operational efficiency and service reliability of public transportation. With the rapid expansion of metro networks, effective multi-line scheduling and emergency management have become essential for large-scale seamless operations. However, current research focuses primarily on individual metro lines, with insufficient attention to cross-line coordination and rapid replanning during disruptions. Here, a unified optimization framework is presented for multi-line metro crew planning and replanning with a heterogeneous workforce. Specifically, a hierarchical time-space network model is proposed to represent the unified crew action space, and computationally efficient constraints and formulations are derived for the crew's heterogeneous qualifications and preferences. Solution algorithms based on column generation and shortest-path adjustment are further developed, utilizing the proposed network model. Experiments with real data from the Shanghai and Beijing Metro demonstrate that the proposed methods outperform benchmark heuristics in both cost reduction and task completion, and achieve notable efficiency gains by incorporating cross-line operations, particularly for urgent tasks during disruptions. This work highlights the role of global optimization and cross-line coordination in multi-line metro system operations, providing insights into the efficient and reliable functioning of public transportation in smart cities. Metro systems are vital to urban transportation, offering high efficiency and large capacity to meet growing mobility demands. Within the context of metro operations, labor costs account for a significant share of expenses [1]. Consequently, metro crew planning plays a crucial role in achieving smooth, cost-effective operations.
As metro systems continue to expand rapidly, the need for optimized crew planning approaches has become increasingly critical to realize efficient and intelligent metro operations that support the broader goals of smart city development [2]. Existing research on metro crew planning primarily focuses on single-line operations [3], [4], [5], [6], [7], [8].
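The time-space network idea above can be sketched in miniature: nodes are (location, time) pairs, arcs are deadheads or driving tasks, and a crew member's qualifications filter which task arcs are admissible before a shortest-path duty is computed. The network, costs, and line names below are hypothetical toys, not the paper's formulation; the paper's column-generation pricing step solves many such qualification-filtered shortest-path problems.

```python
import heapq

# Toy time-space network (hypothetical): arcs are
# (from_node, to_node, cost, required_line); None means a deadhead
# that needs no line qualification.
ARCS = [
    ("depot@0", "A@1", 1.0, None),
    ("A@1", "B@2", 2.0, "line1"),    # driving task on line 1
    ("A@1", "C@2", 1.5, "line2"),    # driving task on line 2
    ("B@2", "depot@3", 1.0, None),
    ("C@2", "depot@3", 1.0, None),
]

def cheapest_duty(qualifications: set[str], start="depot@0", end="depot@3"):
    """Dijkstra over the time-space network, skipping task arcs the crew
    member is not qualified to operate. Returns the duty cost or None."""
    graph = {}
    for u, v, cost, line in ARCS:
        if line is None or line in qualifications:
            graph.setdefault(u, []).append((v, cost))
    dist, heap = {start: 0.0}, [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == end:
            return d
        if d > dist.get(u, float("inf")):
            continue
        for v, c in graph.get(u, []):
            if d + c < dist.get(v, float("inf")):
                dist[v] = d + c
                heapq.heappush(heap, (d + c, v))
    return None

print(cheapest_duty({"line1"}))           # → 4.0
print(cheapest_duty({"line1", "line2"}))  # → 3.5
```

Cross-line qualification (`{"line1", "line2"}`) opens the cheaper route, which is the intuition behind the efficiency gains the paper reports for cross-line operations.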


Ontology-Aligned Embeddings for Data-Driven Labour Market Analytics

Hihn, Heinke, Dittrich, Dennis A. V., Jeske, Carl, Sobral, Cayo Costa, Pais, Helio, Lochmann, Timm

arXiv.org Artificial Intelligence

The limited ability to reason across occupational data from different sources is a long-standing bottleneck for data-driven labour market analytics. Previous research has relied on hand-crafted ontologies that allow such reasoning but are computationally expensive and require careful maintenance by human experts. The rise of machine learning models for language processing offers a scalable alternative by learning shared semantic spaces that bridge diverse occupational vocabularies without extensive human curation. We present an embedding-based alignment process that links any free-form German job title to two established ontologies - the German Klassifikation der Berufe and the International Standard Classification of Education. Using publicly available data from the German Federal Employment Agency, we construct a dataset to fine-tune a Sentence-BERT model to learn the structure imposed by the ontologies. The enriched pairs (job title, embedding) define a similarity graph structure that we can use for efficient approximate nearest-neighbour search, allowing us to frame the classification process as a semantic search problem. This allows for greater flexibility, e.g., adding more classes. We discuss design decisions, open challenges, and outline ongoing work on extending the graph with other ontologies and multilingual titles.
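Framing classification as semantic search, as the abstract describes, reduces to nearest-neighbour lookup in embedding space: adding a class means indexing one more vector, with no retraining. The sketch below uses toy character-trigram vectors in place of the fine-tuned Sentence-BERT embeddings, and invented codes in place of actual Klassifikation der Berufe entries.

```python
import math
from collections import Counter

# Illustrative class labels; codes and titles are hypothetical, not real
# Klassifikation der Berufe entries.
ONTOLOGY = {
    "83112": "Softwareentwickler",
    "81302": "Krankenpfleger",
    "63302": "Kellner",
}

def trigram_vec(title: str) -> Counter:
    """Stand-in embedding: character trigram counts instead of Sentence-BERT."""
    t = f"  {title.lower()}  "
    return Counter(t[i:i + 3] for i in range(len(t) - 2))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

INDEX = {code: trigram_vec(label) for code, label in ONTOLOGY.items()}

def classify(free_form_title: str) -> str:
    """Classification as semantic search: return the nearest indexed class."""
    q = trigram_vec(free_form_title)
    return max(INDEX, key=lambda code: cosine(q, INDEX[code]))

print(classify("Senior Softwareentwicklerin (m/w/d)"))  # → 83112
```

At the scale of a full ontology, the exhaustive `max` over the index would be replaced by an approximate nearest-neighbour structure, which is the similarity-graph search the abstract mentions.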


Linking heterogeneous microstructure informatics with expert characterization knowledge through customized and hybrid vision-language representations for industrial qualification

Safdar, Mutahar, Wood, Gentry, Zimmermann, Max, Lamouche, Guy, Wanjara, Priti, Zhao, Yaoyao Fiona

arXiv.org Artificial Intelligence

Rapid and reliable qualification of advanced materials remains a bottleneck in industrial manufacturing, particularly for heterogeneous structures produced via non-conventional additive manufacturing processes. This study introduces a novel framework that links microstructure informatics with a range of expert characterization knowledge using customized and hybrid vision-language representations (VLRs). By integrating deep semantic segmentation with pre-trained multi-modal models (CLIP and FLAVA), we encode both visual microstructural data and textual expert assessments into shared representations. To overcome limitations in general-purpose embeddings, we develop a customized similarity-based representation that incorporates both positive and negative references from expert-annotated images and their associated textual descriptions. This allows zero-shot classification of previously unseen microstructures through a net similarity scoring approach. Validation on an additively manufactured metal matrix composite dataset demonstrates the framework's ability to distinguish between acceptable and defective samples across a range of characterization criteria. Comparative analysis reveals that the FLAVA model offers higher visual sensitivity, while the CLIP model provides consistent alignment with the textual criteria. Z-score normalization adjusts raw unimodal and cross-modal similarity scores based on their local dataset-driven distributions, enabling more effective alignment and classification in the hybrid vision-language framework. The proposed method enhances traceability and interpretability in qualification pipelines by enabling human-in-the-loop decision-making without task-specific model retraining. By advancing semantic interoperability between raw data and expert knowledge, this work contributes toward scalable and domain-adaptable qualification strategies in engineering informatics.
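The net similarity scoring and z-score normalization steps described above can be sketched as follows. The 3-dimensional vectors stand in for CLIP/FLAVA embeddings and the reference sets are invented; the mechanics are generic: score each sample against positive and negative expert references, then normalize against the local dataset distribution before thresholding.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Toy expert reference embeddings (hypothetical 3-d stand-ins for CLIP/FLAVA).
POSITIVE_REFS = [(1.0, 0.1, 0.0), (0.9, 0.2, 0.1)]   # expert-approved examples
NEGATIVE_REFS = [(0.0, 1.0, 0.2), (0.1, 0.9, 0.3)]   # expert-rejected examples

def net_score(sample):
    """Mean similarity to positive references minus mean to negatives."""
    pos = sum(cosine(sample, r) for r in POSITIVE_REFS) / len(POSITIVE_REFS)
    neg = sum(cosine(sample, r) for r in NEGATIVE_REFS) / len(NEGATIVE_REFS)
    return pos - neg

def zscores(samples):
    """Normalize raw net scores against the local dataset distribution."""
    raw = [net_score(s) for s in samples]
    mu = sum(raw) / len(raw)
    sd = math.sqrt(sum((r - mu) ** 2 for r in raw) / len(raw)) or 1.0
    return [(r - mu) / sd for r in raw]

samples = [(0.95, 0.15, 0.05), (0.05, 0.95, 0.25)]
print(["acceptable" if z > 0 else "defective" for z in zscores(samples)])
```

Because the threshold is applied to z-scores rather than raw similarities, the decision adapts to each dataset's score distribution, which is the normalization benefit the abstract claims.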


A-levels and GCSEs need overhaul to keep pace with generative AI, experts say

The Guardian

Oral assessments, more security checks and speedier marking are all on the cards as generative artificial intelligence (AI) could transform exams for the next generation of students. As the 2025 exam season drew to a close with GCSE students picking up their results on Thursday, after mostly sitting traditional pen and paper exams, AI is already changing the landscape. Exam preparation is undergoing a revolution, with students increasingly creating personal AI tutors, available around the clock to generate learning materials suited to individual needs, potentially leading to better results. "Using AI can give a student a much better understanding of a subject because they can ask those questions they wouldn't ask in class, or at odd hours, without being judged," said Dr Andrew Rogoyski of the Surrey Institute for People-Centred AI. "It really took off this summer," said Sandra Leaton Gray, a professor of education futures at University College London's Institute of Education. "So they're able to talk to it about the marking frameworks that are in use and upload those, and then they're able to do sample answers on their own. And then they're able to say to the AI: 'How would you improve the answer?' It's like having a tireless tutor."