maximum entropy
DA W: Exploring the Better Weighting Function for Semi-supervised Semantic Segmentation Supplementary Material Rui Sun 1 Huayu Mai
In the supplementary material, we first introduce the pseudo algorithm of DA W . Then we clarify the Then, we provide a more detailed explanation of Figures 1, 2, 4, and 5, which are slightly abbreviated due to the limited space of the main paper. In the naive pseudo-labeling method, all pseudo labels are enrolled into training, i.e., E 1 + E 2, which is guaranteed by theoretical functional analysis in the next section. Inequality 45 holds true at all times. In this section, we provide more qualitative results between ours and other competitors.
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > California > Santa Clara County > Stanford (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.51)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.33)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > China > Beijing > Beijing (0.04)
Breaking the bonds of generative artificial intelligence by minimizing the maximum entropy
Miotto, Mattia, Monacelli, Lorenzo
The emergence of generative artificial intelligence (GenAI), comprising large language models, text-to-image generators, and AI algorithms for medical drug and material design, had a transformative impact on society. However, despite an initial exponential growth surpassing Moore's law, progress is now plateauing, suggesting we are approaching the limits of current technology. Indeed, these models are notoriously data-hungry, prone to overfitting, and challenging to direct during the generative process, hampering their effective professional employment. To cope with these limitations, we propose a paradigm shift in GenAI by introducing an ab initio method based on the minimal maximum entropy principle. Our approach does not fit the data. Instead, it compresses information in the training set by finding a latent representation parameterized by arbitrary nonlinear functions, such as neural networks. The result is a general physics-driven model, which is data-efficient, resistant to overfitting, and flexible, permitting to control and influence the generative process. Benchmarking shows that our method outperforms variational autoencoders (VAEs) with similar neural architectures, particularly on undersampled datasets. We demonstrate the methods effectiveness in generating images, even with limited training data, and its unprecedented capability to customize the generation process a posteriori without the need of any fine-tuning or retraining.
- Europe > Italy > Lazio > Rome (0.04)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- North America > United States > California > Orange County > Anaheim (0.04)
Data-Driven Priors in the Maximum Entropy on the Mean Method for Linear Inverse Problems
King-Roskamp, Matthew, Choksi, Rustum, Hoheisel, Tim
We establish the theoretical framework for implementing the maximumn entropy on the mean (MEM) method for linear inverse problems in the setting of approximate (data-driven) priors. We prove a.s. convergence for empirical means and further develop general estimates for the difference between the MEM solutions with different priors $\mu$ and $\nu$ based upon the epigraphical distance between their respective log-moment generating functions. These estimates allow us to establish a rate of convergence in expectation for empirical means. We illustrate our results with denoising on MNIST and Fashion-MNIST data sets.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Italy (0.04)
Improving Pre-Trained Self-Supervised Embeddings Through Effective Entropy Maximization
Chakraborty, Deep, LeCun, Yann, Rudner, Tim G. J., Learned-Miller, Erik
Self-supervised learning (SSL) methods are widely employed for pre-training features on unlabeled data and are highly effective for subsequent fine-tuning on a wide variety of downstream tasks [Che+20; Gri+20; Car+20; BPL21]. In this paper, we ask whether it is possible to formulate a well-motivated, general-purpose criterion that allows further improving already-trained, highly-optimized SSL embeddings with only a handful of epochs of continued pre-training. Like several previous works [BJ17; WI20; Liu+22; Ozs+22], we start with the principle of maximizing the entropy of embeddings. One well-known motivation for this is that for a discrete embedding space, maximizing the entropy of a deterministic mapping preserves as much information as possible about the inputs. That is, such a maximum-entropy embedding maximizes the mutual information between the embedding and the input distribution [see, for example, Hje+18]. Similar results hold for continuous embeddings under appropriate noise models [see, for example, discussion of the Gaussian channel in CT91]. By maximizing the amount of information retained, one hopes to prepare as well as possible for future, as-yet-unknown, discrimination tasks. Our contribution is thus not the maximization of embedding entropy, but rather how we go about it.
Adaptive Contrastive Search: Uncertainty-Guided Decoding for Open-Ended Text Generation
Arias, Esteban Garces, Rodemann, Julian, Li, Meimingwei, Heumann, Christian, Aßenmacher, Matthias
Decoding from the output distributions of large language models to produce high-quality text is a complex challenge in language modeling. Various approaches, such as beam search, sampling with temperature, $k-$sampling, nucleus $p-$sampling, typical decoding, contrastive decoding, and contrastive search, have been proposed to address this problem, aiming to improve coherence, diversity, as well as resemblance to human-generated text. In this study, we introduce adaptive contrastive search, a novel decoding strategy extending contrastive search by incorporating an adaptive degeneration penalty, guided by the estimated uncertainty of the model at each generation step. This strategy is designed to enhance both the creativity and diversity of the language modeling process while at the same time producing coherent and high-quality generated text output. Our findings indicate performance enhancement in both aspects, across different model architectures and datasets, underscoring the effectiveness of our method in text generation tasks. Our code base, datasets, and models are publicly available.
- Asia > Afghanistan > Kabul Province > Kabul (0.04)
- North America > United States > New York (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- (7 more...)
- Transportation (1.00)
- Law Enforcement & Public Safety (1.00)
- Government > Regional Government > North America Government > United States Government (0.68)
- Government > Military > Army (0.46)
On Maximum Entropy Linear Feature Inversion
We revisit the classical problem of inverting dimension-reducing linear mappings using the maximum entropy (MaxEnt) criterion. In the literature, solutions are problem-dependent, inconsistent, and use different entropy measures. We propose a new unified approach that not only specializes to the existing approaches, but offers solutions to new cases, such as when data values are constrained to [0, 1], which has new applications in machine learning.
- North America > United States > New Jersey (0.04)
- Europe > Spain > Galicia > A Coruña Province > A Coruña (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Europe > Germany (0.04)