ncc
Mitigating Label Length Bias in Large Language Models
Sanz-Guerrero, Mario, von der Wense, Katharina
Large language models (LLMs) are powerful zero- and few-shot learners. However, when predicting over a set of candidate options, LLMs suffer from label biases, and existing calibration methods overlook biases arising from multi-token class labels. We tackle an issue we call label length bias, where labels of different lengths are treated inconsistently, even after standard length normalization. To mitigate it, we propose normalized contextual calibration (NCC), an effective method that normalizes and calibrates predictions at the full-label level. NCC achieves statistically significant improvements over prior approaches across multiple datasets and models, with gains of up to 10% F1. Moreover, NCC extends bias mitigation to broader tasks such as multiple-choice question answering. Our analysis shows that, when combined with in-context learning, NCC is less sensitive to few-shot example selection, requires fewer examples for competitive performance, and produces more reliable confidence estimates. These findings highlight the importance of mitigating full-label biases to improve the performance and robustness of LLM-based methods, particularly in real-world applications where class labels naturally consist of multiple tokens.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- (9 more...)
- Education (0.48)
- Information Technology (0.46)
56f9f88906aebf4ad985aaec7fa01313-AuthorFeedback.pdf
We would correct all these points for any camera ready copy of the manuscript. R2 is correct in suggesting that the ultimate goal of machine learning for healthcare should be explainable models. However, interpretability and explainability need not be mutually exclusive. Accordingly we ran two experiments 1) we applied ICAM on Alzheimer's (unseen) We find R2's request for reporting image generation quality reasonable; although we stress that the objectives of We agree with R1 that more thorough details of the training process should go in the supplement. We also appreciate R1 literature suggestions and request for more benchmarking. NCC(+), it is still worse than V A-GAN and ICAM (see Table 3 in paper).
Annotating Compositionality Scores for Irish Noun Compounds is Hard Work
Walsh, Abigail, Clifford, Teresa, Daly, Emma, Dunne, Jane, Davis, Brian, Cleircín, Gearóid Ó
Noun compounds constitute a challenging construction for NLP applications, given their variability in idiomaticity and interpretation. In this paper, we present an analysis of compound nouns identified in Irish text of varied domains by expert annotators, focusing on compositionality as a key feature, but also domain specificity, as well as familiarity and confidence of the annotator giving the ratings. Our findings and the discussion that ensued contributes towards a greater understanding of how these constructions appear in Irish language, and how they might be treated separately from English noun compounds.
- South America > Colombia > Meta Department > Villavicencio (0.05)
- North America > United States > New Mexico > Santa Fe County > Santa Fe (0.04)
- North America > United States > Colorado > Denver County > Denver (0.04)
- (5 more...)
Neural Conformal Control for Time Series Forecasting
Li, Ruipu, Rodríguez, Alexander
We introduce a neural network conformal prediction method for time series that enhances adaptivity in non-stationary environments. Our approach acts as a neural controller designed to achieve desired target coverage, leveraging auxiliary multi-view data with neural network encoders in an end-to-end manner to further enhance adaptivity. Additionally, our model is designed to enhance the consistency of prediction intervals in different quantiles by integrating monotonicity constraints and leverages data from related tasks to boost few-shot learning performance. Using real-world datasets from epidemics, electric demand, weather, and others, we empirically demonstrate significant improvements in coverage and probabilistic accuracy, and find that our method is the only one that combines good calibration with consistency in prediction intervals.
- North America > United States > Michigan (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Germany (0.04)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
- Health & Medicine > Therapeutic Area > Immunology (1.00)
- Health & Medicine > Epidemiology (1.00)
- Health & Medicine > Public Health (0.68)
Neuromorphic Correlates of Artificial Consciousness
The concept of neural correlates of consciousness (NCC), which suggests that specific neural activities are linked to conscious experiences, has gained widespread acceptance. This acceptance is based on a wealth of evidence from experimental studies, brain imaging techniques such as fMRI and EEG, and theoretical frameworks like integrated information theory (IIT) within neuroscience and the philosophy of mind. This paper explores the potential for artificial consciousness by merging neuromorphic design and architecture with brain simulations. It proposes the Neuromorphic Correlates of Artificial Consciousness (NCAC) as a theoretical framework. While the debate on artificial consciousness remains contentious due to our incomplete grasp of consciousness, this work may raise eyebrows and invite criticism. Nevertheless, this optimistic and forward-thinking approach is fueled by insights from the Human Brain Project, advancements in brain imaging like EEG and fMRI, and recent strides in AI and computing, including quantum and neuromorphic designs. Additionally, this paper outlines how machine learning can play a role in crafting artificial consciousness, aiming to realise machine consciousness and awareness in the future.
- Oceania > Australia > Queensland (0.04)
- North America > United States > New York (0.04)
- North America > Canada (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Health Care Technology (1.00)
Bounding Reconstruction Attack Success of Adversaries Without Data Priors
Ziller, Alexander, Riess, Anneliese, Schwethelm, Kristian, Mueller, Tamara T., Rueckert, Daniel, Kaissis, Georgios
Reconstruction attacks on machine learning (ML) models pose a strong risk of leakage of sensitive data. In specific contexts, an adversary can (almost) perfectly reconstruct training data samples from a trained model using the model's gradients. When training ML models with differential privacy (DP), formal upper bounds on the success of such reconstruction attacks can be provided. So far, these bounds have been formulated under worst-case assumptions that might not hold high realistic practicality. In this work, we provide formal upper bounds on reconstruction success under realistic adversarial settings against ML models trained with DP and support these bounds with empirical results. With this, we show that in realistic scenarios, (a) the expected reconstruction success can be bounded appropriately in different contexts and by different metrics, which (b) allows for a more educated choice of a privacy parameter.
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
From chocolate bunny to chocolate crocodile: Do Language Models Understand Noun Compounds?
Noun compound interpretation is the task of expressing a noun compound (e.g. chocolate bunny) in a free-text paraphrase that makes the relationship between the constituent nouns explicit (e.g. bunny-shaped chocolate). We propose modifications to the data and evaluation setup of the standard task (Hendrickx et al., 2013), and show that GPT-3 solves it almost perfectly. We then investigate the task of noun compound conceptualization, i.e. paraphrasing a novel or rare noun compound. E.g., chocolate crocodile is a crocodile-shaped chocolate. This task requires creativity, commonsense, and the ability to generalize knowledge about similar concepts. While GPT-3's performance is not perfect, it is better than that of humans -- likely thanks to its access to vast amounts of knowledge, and because conceptual processing is effortful for people (Connell and Lynott, 2012). Finally, we estimate the extent to which GPT-3 is reasoning about the world vs. parroting its training data. We find that the outputs from GPT-3 often have significant overlap with a large web corpus, but that the parroting strategy is less beneficial for novel noun compounds.
- North America > United States > Colorado > Denver County > Denver (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Oceania > Australia (0.04)
- (12 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.94)
Neural Capacitated Clustering
Falkner, Jonas K., Schmidt-Thieme, Lars
Recent work on deep clustering has found new promising methods also for constrained clustering problems. Their typically pairwise constraints often can be used to guide the partitioning of the data. Many problems however, feature cluster-level constraints, e.g. the Capacitated Clustering Problem (CCP), where each point has a weight and the total weight sum of all points in each cluster is bounded by a prescribed capacity. In this paper we propose a new method for the CCP, Neural Capacited Clustering, that learns a neural network to predict the assignment probabilities of points to cluster centers from a data set of optimal or near optimal past solutions of other problem instances. During inference, the resulting scores are then used in an iterative k-means like procedure to refine the assignment under capacity constraints. In our experiments on artificial data and two real world datasets our approach outperforms several state-of-the-art mathematical and heuristic solvers from the literature. Moreover, we apply our method in the context of a cluster-first-route-second approach to the Capacitated Vehicle Routing Problem (CVRP) and show competitive results on the well-known Uchoa benchmark.
- Asia > China > Shanghai > Shanghai (0.05)
- North America > United States > New York (0.04)
- Europe > Italy > Trentino-Alto Adige/Südtirol > Trentino Province (0.04)
- Europe > Germany (0.04)
Reactive Answer Set Programming
Broda, Krysia, Sadri, Fariba, Butler, Stephen
Logic Production System (LPS) is a logic-based framework for modelling reactive behaviour. Based on abductive logic programming, it combines reactive rules with logic programs, a database and a causal theory that specifies transitions between the states of the database. This paper proposes a systematic mapping of the Kernel of this framework (called KELPS) into an answer set program (ASP). For this purpose a new variant of KELPS with finite models, called $n$-distance KELPS, is introduced. A formal definition of the mapping from this $n$-distance KELPS to ASP is given and proven sound and complete. The Answer Set Programming paradigm allows to capture additional behaviours to the basic reactivity of KELPS, in particular proactive, preemptive and prospective behaviours. These are all discussed and illustrated with examples. Then a hybrid framework is proposed that integrates KELPS and ASP, allowing to combine the strengths of both paradigms. Under consideration in Theory and Practice of Logic Programming (TPLP).
- North America > Canada (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Germany > Brandenburg > Potsdam (0.04)