typicality
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > United States > California > San Diego County > La Jolla (0.04)
- Oceania > New Zealand (0.04)
- (5 more...)
- Law (0.72)
- Health & Medicine > Therapeutic Area > Neurology (0.46)
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > United States > California > San Diego County > La Jolla (0.04)
- Oceania > New Zealand (0.04)
- (5 more...)
- Health & Medicine > Therapeutic Area > Neurology (0.46)
- Law > Civil Rights & Constitutional Law (0.31)
Exploring the Possibility of TypiClust for Low-Budget Federated Active Learning
Ono, Yuta, Nakamura, Hiroshi, Takase, Hideki
--Federated Active Learning (F AL) seeks to reduce the burden of annotation under the realistic constraints of federated learning by leveraging Active Learning (AL). As F AL settings make it more expensive to obtain ground truth labels, F AL strategies that work well in low-budget regimes, where the amount of annotation is very limited, are needed. In this work, we investigate the effectiveness of TypiClust, a successful low-budget AL strategy, in low-budget F AL settings. Our empirical results show that TypiClust works well even in low-budget F AL settings contrasted with relatively low performances of other methods, although these settings present additional challenges, such as data heterogeneity, compared to AL. In addition, we show that F AL settings cause distribution shifts in terms of typicality, but TypiClust is not very vulnerable to the shifts. We also analyze the sensitivity of TypiClust to feature extraction methods, and it suggests a way to perform F AL even in limited data situations.
Not quite Sherlock Holmes: Language model predictions do not reliably differentiate impossible from improbable events
Michaelov, James A., Estacio, Reeka, Zhang, Zhien, Bergen, Benjamin K.
Can language models reliably predict that possible events are more likely than merely improbable ones? By teasing apart possibility, typicality, and contextual relatedness, we show that despite the results of previous work, language models' ability to do this is far from robust. In fact, under certain conditions, all models tested - including Llama 3, Gemma 2, and Mistral NeMo - perform at worse-than-chance level, assigning higher probabilities to impossible sentences such as 'the car was given a parking ticket by the brake' than to merely unlikely sentences such as 'the car was given a parking ticket by the explorer'.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.28)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Middle East > Jordan (0.04)
- (13 more...)
- Transportation (1.00)
- Health & Medicine > Therapeutic Area > Neurology (0.46)
Cognitive and Cultural Topology of Linguistic Categories:A Semantic-Pragmatic Metric Approach
In recent years, the field of NLP has seen growing interest in modeling both semantic and pragmatic dimensions. Despite this progress, two key challenges persist: firstly, the complex task of mapping and analyzing the interactions between semantic and pragmatic features; secondly, the insufficient incorporation of relevant insights from related disciplines outside NLP. Addressing these issues, this study introduces a novel geometric metric that utilizes word co-occurrence patterns. This metric maps two fundamental properties - semantic typicality (cognitive) and pragmatic salience (socio-cultural) - for basic-level categories within a two-dimensional hyperbolic space. Our evaluations reveal that this semantic-pragmatic metric produces mappings for basic-level categories that not only surpass traditional cognitive semantics benchmarks but also demonstrate significant socio-cultural relevance. This finding proposes that basic-level categories, traditionally viewed as semantics-driven cognitive constructs, should be examined through the lens of both semantic and pragmatic dimensions, highlighting their role as a cognitive-cultural interface. The broad contribution of this paper lies in the development of medium-sized, interpretable, and human-centric language embedding models, which can effectively blend semantic and pragmatic dimensions to elucidate both the cognitive and socio-cultural significance of linguistic categories.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (11 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Ranking Counterfactual Explanations
Lim, Suryani, Prade, Henri, Richard, Gilles
AI-driven outcomes can be challenging for end-users to understand. Explanations can address two key questions: "Why this outcome?" (factual) and "Why not another?" (counterfactual). While substantial efforts have been made to formalize factual explanations, a precise and comprehensive study of counterfactual explanations is still lacking. This paper proposes a formal definition of counterfactual explanations, proving some properties they satisfy, and examining the relationship with factual explanations. Given that multiple counterfactual explanations generally exist for a specific case, we also introduce a rigorous method to rank these counterfactual explanations, going beyond a simple minimality condition, and to identify the optimal ones. Our experiments with 12 real-world datasets highlight that, in most cases, a single optimal counterfactual explanation emerges. We also demonstrate, via three metrics, that the selected optimal explanation exhibits higher representativeness and can explain a broader range of elements than a random minimal counterfactual. This result highlights the effectiveness of our approach in identifying more robust and comprehensive counterfactual explanations.
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Europe > Spain > Galicia > A Coruña Province > Santiago de Compostela (0.04)
- Oceania > Australia (0.04)
- (7 more...)
- Information Technology > Security & Privacy (0.46)
- Government (0.46)
Generating event descriptions under syntactic and semantic constraints
Cao, Angela, Holt, Faye, Chan, Jonas, Richter, Stephanie, Glass, Lelia, White, Aaron Steven
With the goal of supporting scalable lexical semantic annotation, analysis, and theorizing, we conduct a comprehensive evaluation of different methods for generating event descriptions under both syntactic constraints -- e.g. desired clause structure -- and semantic constraints -- e.g. desired verb sense. We compare three different methods -- (i) manual generation by experts; (ii) sampling from a corpus annotated for syntactic and semantic information; and (iii) sampling from a language model (LM) conditioned on syntactic and semantic information -- along three dimensions of the generated event descriptions: (a) naturalness, (b) typicality, and (c) distinctiveness. We find that all methods reliably produce natural, typical, and distinctive event descriptions, but that manual generation continues to produce event descriptions that are more natural, typical, and distinctive than the automated generation methods. We conclude that the automated methods we consider produce event descriptions of sufficient quality for use in downstream annotation and analysis insofar as the methods used for this annotation and analysis are robust to a small amount of degradation in the resulting event descriptions.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
- (13 more...)
- Research Report > Experimental Study (0.46)
- Research Report > New Finding (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.86)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Temporal Many-valued Conditional Logics: a Preliminary Report
Alviano, Mario, Giordano, Laura, Dupré, Daniele Theseider
In this paper we propose a many-valued temporal conditional logic. We start from a many-valued logic with typicality, and extend it with the temporal operators of the Linear Time Temporal Logic (LTL), thus providing a formalism which is able to capture the dynamics of a system, trough strict and defeasible temporal properties. We also consider an instantiation of the formalism for gradual argumentation.
- Oceania > Australia > Victoria > Melbourne (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- North America > United States > California > Monterey County > Pacific Grove (0.04)
- (7 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Nonmonotonic Logic (0.71)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
TANGO: Clustering with Typicality-Aware Nonlocal Mode-Seeking and Graph-Cut Optimization
Ma, Haowen, Long, Zhiguo, Meng, Hua
Density-based clustering methods by mode-seeking usually achieve clustering by using local density estimation to mine structural information, such as local dependencies from lower density points to higher neighbors. However, they often rely too heavily on \emph{local} structures and neglect \emph{global} characteristics, which can lead to significant errors in peak selection and dependency establishment. Although introducing more hyperparameters that revise dependencies can help mitigate this issue, tuning them is challenging and even impossible on real-world datasets. In this paper, we propose a new algorithm (TANGO) to establish local dependencies by exploiting a global-view \emph{typicality} of points, which is obtained by mining further the density distributions and initial dependencies. TANGO then obtains sub-clusters with the help of the adjusted dependencies, and characterizes the similarity between sub-clusters by incorporating path-based connectivity. It achieves final clustering by employing graph-cut on sub-clusters, thus avoiding the challenging selection of cluster centers. Moreover, this paper provides theoretical analysis and an efficient method for the calculation of typicality. Experimental results on several synthetic and $16$ real-world datasets demonstrate the effectiveness and superiority of TANGO.
- Asia > Middle East > Jordan (0.04)
- Asia > China > Sichuan Province > Chengdu (0.04)
- North America > United States > New York (0.04)
Diffusion Models as Data Mining Tools
Siglidis, Ioannis, Holynski, Aleksander, Efros, Alexei A., Aubry, Mathieu, Ginosar, Shiry
This paper demonstrates how to use generative models trained for image synthesis as tools for visual data mining. Our insight is that since contemporary generative models learn an accurate representation of their training data, we can use them to summarize the data by mining for visual patterns. Concretely, we show that after finetuning conditional diffusion models to synthesize images from a specific dataset, we can use these models to define a typicality measure on that dataset. This measure assesses how typical visual elements are for different data labels, such as geographic location, time stamps, semantic labels, or even the presence of a disease. This analysis-by-synthesis approach to data mining has two key advantages. First, it scales much better than traditional correspondence-based approaches since it does not require explicitly comparing all pairs of visual elements. Second, while most previous works on visual data mining focus on a single dataset, our approach works on diverse datasets in terms of content and scale, including a historical car dataset, a historical face dataset, a large worldwide street-view dataset, and an even larger scene dataset. Furthermore, our approach allows for translating visual elements across class labels and analyzing consistent changes.
- Europe > United Kingdom (0.14)
- Asia > Thailand (0.05)
- South America > Brazil (0.05)
- (8 more...)
- Health & Medicine > Diagnostic Medicine > Imaging (0.46)
- Leisure & Entertainment > Sports (0.46)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)