ascent
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Rhode Island > Providence County > Providence (0.04)
- North America > United States > Ohio > Franklin County > Columbus (0.04)
- (3 more...)
Appendix
Details regarding the datasets used in the experiments are included in Table 2. For Yang et al. [2020], we progressively doubled the number of regions searched which is the only adjustable hyperparameter. To make this figure, we run all the experiments (all attacks, datasets, and choices of hyperparameters)onaserverwith40coresofIntel(R)Xeon(R)Gold6230CPU@2.10GHz. This outcome is seemingly perplexing than the previous one. We explain it for different values ofm, namely the small-mandthelarge-mregions.
- Asia > Middle East > Jordan (0.04)
- North America > United States (0.04)
- Europe > Russia (0.04)
- Asia > Russia (0.04)
Manifold Trajectories in Next-Token Prediction: From Replicator Dynamics to Softmax Equilibrium
Decoding in large language models is often described as scoring tokens and normalizing with softmax. We give a minimal, self-contained account of this step as a constrained variational principle on the probability simplex. The discrete, normalization-respecting ascent is the classical multiplicative-weights (entropic mirror) update; its continuous-time limit is the replicator flow. From these ingredients we prove that, for a fixed context and temperature, the next-token distribution follows a smooth trajectory inside the simplex and converges to the softmax equilibrium. This formalizes the common "manifold traversal" intuition at the output-distribution level. The analysis yields precise, practice-facing consequences: temperature acts as an exact rescaling of time along the same trajectory, while top-k and nucleus sampling restrict the flow to a face with identical guarantees. We also outline a controlled account of path-dependent score adjustments and their connection to loop-like, hallucination-style behavior. We make no claims about training dynamics or internal representations; those are deferred to future work.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
- Asia > Middle East > Jordan (0.05)
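The abstract's core claim can be checked numerically: entropic mirror ascent (the multiplicative-weights update) on the entropy-regularized objective F(p) = ⟨p, s⟩ + τH(p) converges, on the simplex, to softmax(s/τ). A minimal sketch below, with illustrative scores, step size, and step count (these are assumptions for demonstration, not values from the paper):

```python
import numpy as np

def softmax(s, tau=1.0):
    z = s / tau
    z = z - z.max()                   # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def entropic_mirror_ascent(s, tau=1.0, eta=0.5, steps=100):
    # Multiplicative-weights ascent on F(p) = <p, s> + tau * H(p).
    # The unique fixed point in the interior of the simplex is softmax(s / tau).
    p = np.full(len(s), 1.0 / len(s))  # start at the uniform distribution
    for _ in range(steps):
        g = s - tau * np.log(p)        # gradient of F, up to an additive constant
        p = p * np.exp(eta * g)        # multiplicative-weights step
        p = p / p.sum()                # renormalize: stay on the simplex
    return p

scores = np.array([2.0, 1.0, 0.5, -1.0])
p = entropic_mirror_ascent(scores, tau=1.0)
assert np.allclose(p, softmax(scores, tau=1.0))
```

Raising τ flattens the limiting distribution; in the paper's framing this corresponds to rescaling time along the same trajectory rather than changing the trajectory itself.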
Supplementary Material Training for the Future: A Simple Gradient Interpolation Loss to Generalize Along Time
Many algorithmic details were omitted from the main text and only discussed briefly; we expand on them here. A.1 Dataset Details. In this section we expand upon the seven datasets used in our experiments. The task is multi-class classification with heavy class imbalance. It has 8 features, including price, day of the week, and units transferred. We discard instances with missing values.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > Rhode Island > Providence County > Providence (0.04)
- North America > United States > Ohio > Franklin County > Columbus (0.04)
- (2 more...)
- Asia > Middle East > Jordan (0.04)
- North America > United States (0.04)
- Europe > Russia (0.04)
- Asia > Russia (0.04)
Realistic Image-to-Image Machine Unlearning via Decoupling and Knowledge Retention
Varshney, Ayush K., Torra, Vicenç
Machine Unlearning allows participants to remove their data from a trained machine learning model in order to preserve their privacy and security. However, the machine unlearning literature for generative models is rather limited. The literature for image-to-image generative models (I2I models) treats minimizing the distance between Gaussian noise and the I2I model's output on forget samples as machine unlearning. However, we argue that a machine learning model performs fairly well on unseen data, i.e., a retrained model will still capture generic patterns in the data and hence will not generate output equivalent to Gaussian noise. In this paper, we posit that the model after unlearning should treat forget samples as out-of-distribution (OOD) data, i.e., the unlearned model should no longer recognize or encode the specific patterns found in the forget samples. To achieve this, we propose a framework which decouples the model parameters with gradient ascent, ensuring that forget samples are OOD for the unlearned model with a theoretical guarantee. We also provide an $(\epsilon, \delta)$-unlearning guarantee for model updates with gradient ascent. The unlearned model is further fine-tuned on the remaining samples to maintain its performance. We also propose an attack model to verify that the unlearned model has effectively removed the influence of forget samples. Extensive empirical evaluation on two large-scale datasets, ImageNet-1K and Places365, highlights the superiority of our approach. To show comparable performance with a retrained model, we also compare a simple AutoEncoder against various baselines on the CIFAR-10 dataset.
- Information Technology > Security & Privacy (1.00)
- Law (0.93)
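The two-phase recipe the abstract describes (gradient ascent on the forget set to push it out of distribution, then fine-tuning on retained samples to recover utility) can be sketched on a toy linear autoencoder. This is a hedged illustration only: the model, learning rates, and step counts are assumptions for demonstration, not the paper's actual architecture or guarantees.

```python
import numpy as np

rng = np.random.default_rng(0)

def recon_loss(W, X):
    # Mean squared reconstruction error of a linear "autoencoder" x -> W x.
    R = X @ W.T - X
    return 0.5 * np.mean(np.sum(R * R, axis=1))

def grad(W, X):
    # Gradient of recon_loss with respect to W.
    R = X @ W.T - X
    return (R.T @ X) / len(X)

X_retain = rng.normal(size=(64, 8))    # samples the model should keep serving
X_forget = rng.normal(size=(16, 8))    # samples whose influence must be removed
W = np.eye(8) + 0.01 * rng.normal(size=(8, 8))  # near-perfect reconstructor

# Phase 1: gradient *ascent* on the forget set, degrading reconstruction there.
loss_before = recon_loss(W, X_forget)
for _ in range(20):
    W += 0.05 * grad(W, X_forget)
loss_after = recon_loss(W, X_forget)

# Phase 2: ordinary fine-tuning (descent) on retained samples to recover utility.
retain_mid = recon_loss(W, X_retain)
for _ in range(200):
    W -= 0.05 * grad(W, X_retain)
retain_final = recon_loss(W, X_retain)

assert loss_after > loss_before    # forget samples now reconstructed worse
assert retain_final < retain_mid   # utility on retained data is restored
```

In the paper's setting the ascent phase additionally comes with an OOD and $(\epsilon, \delta)$-unlearning guarantee; the sketch shows only the optimization mechanics.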
ASCENT: Amplifying Power Side-Channel Resilience via Learning & Monte-Carlo Tree Search
Bhandari, Jitendra, Chowdhury, Animesh Basak, Nabeel, Mohammed, Sinanoglu, Ozgur, Garg, Siddharth, Karri, Ramesh, Knechtel, Johann
Power side-channel (PSC) analysis is pivotal for securing cryptographic hardware. Prior art focused on securing gate-level netlists obtained as-is from chip design automation, neglecting the complexities and potential security side-effects arising from the design automation process. That is, automation traditionally prioritizes power, performance, and area (PPA), sidelining security. We propose a "security-first" approach, refining the logic synthesis stage to enhance the overall resilience of PSC countermeasures. We introduce ASCENT, a learning-and-search-based framework that (i) drastically reduces the time for post-design PSC evaluation and (ii) explores the security-vs-PPA design space. Thus, ASCENT enables an efficient exploration of a large number of candidate netlists, leading to an improvement in PSC resilience compared to regular PPA-optimized netlists. ASCENT is up to 120x faster than traditional PSC analysis and yields a 3.11x improvement in the PSC resilience of state-of-the-art PSC countermeasures.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- North America > United States > New York (0.04)
Fast Variational Inference in the Conjugate Exponential Family
We present a general method for deriving collapsed variational inference algorithms for probabilistic models in the conjugate exponential family. Our method unifies many existing approaches to collapsed variational inference. Our collapsed variational inference leads to a new lower bound on the marginal likelihood. We exploit the information geometry of the bound to derive much faster optimization methods based on conjugate gradients for these models. Our approach is very general and is easily applied to any model where the mean field update equations have been derived. Empirically we show significant speed-ups for probabilistic inference using our bound.
- South America > Paraguay > Asunción > Asunción (0.04)
- North America > United States > New York (0.04)
- Asia > Middle East > Jordan (0.04)
ChatGPT Made OpenAI a Powerhouse. Here's What Could Undo It.
This article is from Big Technology, a newsletter by Alex Kantrowitz. It's been a year of glossy profiles, breathless accolades, and billions in new funding for OpenAI, but the ChatGPT maker is far more vulnerable than the popular narrative suggests. Amid a seemingly unstoppable ascent, the company is facing fierce competition, a rising open-source movement, and pressure to deliver hits in an unpredictable discipline. While its marquee product has become practically synonymous with A.I., its perch atop the field is less than rock solid. OpenAI's weakness stems in part from its strength. It popularized generative A.I. by taking others' innovations--like the transformer model--and building stellar products on top of them.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)