distinctiveness
Contrastive Learning for Image Captioning
Image captioning, a popular topic in computer vision, has achieved substantial progress in recent years. However, the distinctiveness of natural descriptions is often overlooked in previous work. It is closely related to the quality of captions, as distinctive captions are more likely to describe images with their unique aspects. In this work, we propose a new learning method, Contrastive Learning (CL), for image captioning. Specifically, via two constraints formulated on top of a reference model, the proposed method can encourage distinctiveness, while maintaining the overall quality of the generated captions. We tested our method on two challenging datasets, where it improves the baseline model by significant margins. We also showed in our studies that the proposed method is generic and can be used for models with various structures.
- Asia > China > Hong Kong (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Asia > Singapore (0.05)
- North America > United States > California > San Francisco County > San Francisco (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.68)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Mitra: Mixed Synthetic Priors for Enhancing Tabular Foundation Models
Zhang, Xiyuan, Maddix, Danielle C., Yin, Junming, Erickson, Nick, Ansari, Abdul Fatir, Han, Boran, Zhang, Shuai, Akoglu, Leman, Faloutsos, Christos, Mahoney, Michael W., Hu, Cuixiong, Rangwala, Huzefa, Karypis, George, Wang, Bernie
Since the seminal work of TabPFN, research on tabular foundation models (TFMs) based on in-context learning (ICL) has challenged long-standing paradigms in machine learning. Without seeing any real-world data, models pretrained on purely synthetic datasets generalize remarkably well across diverse datasets, often using only a moderate number of in-context examples. This shifts the focus in tabular machine learning from model architecture design to the design of synthetic datasets, or, more precisely, to the prior distributions that generate them. Yet the guiding principles for prior design remain poorly understood. This work marks the first attempt to address the gap. We systematically investigate and identify key properties of synthetic priors that allow pretrained TFMs to generalize well. Based on these insights, we introduce Mitra, a TFM trained on a curated mixture of synthetic priors selected for their diversity, distinctiveness, and performance on real-world tabular data. Mitra consistently outperforms state-of-the-art TFMs, such as TabPFNv2 and TabICL, across both classification and regression benchmarks, with better sample efficiency.
The Basic B*** Effect: The Use of LLM-based Agents Reduces the Distinctiveness and Diversity of People's Choices
Matz, Sandra C., Horton, C. Blaine, Goethals, Sofie
Large language models (LLMs) increasingly act on people's behalf: they write emails, buy groceries, and book restaurants. While the outsourcing of human decision-making to AI can be both efficient and effective, it raises a fundamental question: how does delegating identity-defining choices to AI reshape who people become? We study the impact of agentic LLMs on two identity-relevant outcomes: interpersonal distinctiveness - how unique a person's choices are relative to others - and intrapersonal diversity - the breadth of a single person's choices over time. Using real choices drawn from social-media behavior of 1,000 U.S. users (110,000 choices in total), we compare a generic and personalized agent to a human baseline. Both agents shift people's choices toward more popular options, reducing the distinctiveness of their behaviors and preferences. While the use of personalized agents tempers this homogenization (compared to the generic AI), it also more strongly compresses the diversity of people's preference portfolios by narrowing what they explore across topics and psychological affinities. Understanding how AI agents might flatten human experience, and how using generic versus personalized agents involves distinctiveness-diversity trade-offs, is critical for designing systems that augment rather than constrain human agency, and for safeguarding diversity in thought, taste, and expression.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Asia > South Korea > Seoul > Seoul (0.04)
- Asia > Singapore (0.05)
- North America > United States > California > San Francisco County > San Francisco (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.68)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
The role of media memorability in facilitating startups' access to venture capital funding
Toschi, L., Torrisi, S., Colladon, A. Fronzetti
Media reputation plays an important role in attracting venture capital investment. However, prior research has focused too narrowly on general media exposure, limiting our understanding of how media truly influences funding decisions. As informed decision-makers, venture capitalists respond to more nuanced aspects of media content. We introduce the concept of media memorability - the media's ability to imprint a startup's name in the memory of relevant investors. Using data from 197 UK startups in the micro and nanotechnology sector (funded between 1995 and 2004), we show that media memorability significantly influences investment outcomes. Our findings suggest that venture capitalists rely on detailed cues such as a startup's distinctiveness and connectivity within news semantic networks. This contributes to research on entrepreneurial finance and media legitimation. In practice, startups should go beyond frequent media mentions to strengthen brand memorability through more targeted, meaningful coverage highlighting their uniqueness and relevance within the broader industry conversation.
- Europe > United Kingdom > England > Tyne and Wear > Newcastle (0.04)
- Asia > China (0.04)
- North America > United States > California > Santa Clara County > Mountain View (0.04)
- (12 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
RE-TRIP : Reflectivity Instance Augmented Triangle Descriptor for 3D Place Recognition
Park, Yechan, Pak, Gyuhyeon, Kim, Euntai
While most people associate LiDAR primarily with its ability to measure distances and provide geometric information about the environment (via point clouds), LiDAR also captures additional data, including reflectivity or intensity values. Unfortunately, when LiDAR is applied to Place Recognition (PR) in mobile robotics, most previous works on LiDAR-based PR rely only on geometric measurements, neglecting the additional reflectivity information that LiDAR provides. In this paper, we propose a novel descriptor for 3D PR, named RE-TRIP (REflectivity-instance augmented TRIangle descriPtor). This new descriptor leverages both geometric measurements and reflectivity to enhance robustness in challenging scenarios such as geometric degeneracy, high geometric similarity, and the presence of dynamic objects. To implement RE-TRIP in real-world applications, we further propose (1) a keypoint extraction method, (2) a key instance segmentation method, (3) a RE-TRIP matching method, and (4) a reflectivity-combined loop verification method. Finally, we conduct a series of experiments to demonstrate the effectiveness of RE-TRIP. Applied to public datasets (i.e., HELIPR, FusionPortable) containing diverse scenarios such as long corridors, bridges, large-scale urban areas, and highly dynamic environments -- our experimental results show that the proposed method outperforms existing state-of-the-art methods in terms of Scan Context, Intensity Scan Context, and STD.
- Information Technology > Artificial Intelligence > Vision (0.94)
- Information Technology > Artificial Intelligence > Robots (0.94)
BCE vs. CE in Deep Feature Learning
Li, Qiufu, Xiao, Huibin, Shen, Linlin
When training classification models, it expects that the learned features are compact within classes, and can well separate different classes. As the dominant loss function for training classification models, minimizing cross-entropy (CE) loss maximizes the compactness and distinctiveness, i.e., reaching neural collapse (NC). The recent works show that binary CE (BCE) performs also well in multi-class tasks. In this paper, we compare BCE and CE in deep feature learning. For the first time, we prove that BCE can also maximize the intra-class compactness and inter-class distinctiveness when reaching its minimum, i.e., leading to NC. We point out that CE measures the relative values of decision scores in the model training, implicitly enhancing the feature properties by classifying samples one-by-one. In contrast, BCE measures the absolute values of decision scores and adjust the positive/negative decision scores across all samples to uniformly high/low levels. Meanwhile, the classifier biases in BCE present a substantial constraint on the decision scores to explicitly enhance the feature properties in the training. The experimental results are aligned with above analysis, and show that BCE could improve the classification and leads to better compactness and distinctiveness among sample features. The codes will be released.
- Asia > China > Guangdong Province > Shenzhen (0.04)
- North America > Canada (0.04)
Reviews: Contrastive Learning for Image Captioning
The paper proposed a contrastive learning approach for image captioning models. Typical image captioning models utilize log-likelihood criteria for learning, which tends to result in preferring a safer generation that lacks specific and distinct concept in an image. The paper proposes to introduce contrastive learning objective, where the objective function is based on density ratio to the reference, without altering the captioning models. The paper evaluates multiple models in MSCOCO and InstaPIC datasets, and demonstrates the effectiveness as well as conducts ablation studies. The paper is well-written and has strength in the following points.