Graph Coloring for Multi-Task Learning
Patapati, Santosh, Srinivasan, Trisanth
When different objectives conflict with each other in multi-task learning, gradients begin to interfere and slow convergence, potentially reducing the final model's performance. To address this, we introduce SON-GOKU, a scheduler that estimates pairwise gradient interference, constructs an interference graph, and then applies greedy graph coloring to partition tasks into groups that align well with each other. At each training step, only one group (color class) of tasks is activated, and the grouping partition is continually recomputed as task relationships evolve throughout training. By ensuring that each mini-batch contains only tasks that pull the model in the same direction, our method improves the effectiveness of any underlying multi-task learning optimizer without additional tuning. Since tasks within these groups update the model in compatible directions, multi-task learning improves performance rather than impeding it. Empirical results on six different datasets show that this interference-aware graph-coloring approach consistently outperforms baselines and state-of-the-art multi-task optimizers. We provide extensive theory showing why grouping and sequential updates improve multi-task learning, with guarantees on descent, convergence, and accurate identification of which tasks conflict or align.
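The pipeline the abstract describes (pairwise interference, interference graph, greedy coloring) can be sketched as follows. This is a minimal illustration of the idea, not the authors' implementation: the cosine-similarity interference measure, the `threshold`, and all function names are assumptions for the sketch.

```python
import numpy as np

def build_interference_graph(grads, threshold=0.0):
    """Connect two tasks when their gradients conflict.

    grads: (num_tasks, dim) array of per-task gradient vectors.
    An edge (i, j) means the cosine similarity of the two gradients
    falls below `threshold`, i.e. the tasks pull the model apart.
    """
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    cos = (grads @ grads.T) / (norms * norms.T + 1e-12)
    n = len(grads)
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if cos[i, j] < threshold}

def greedy_color(n, edges):
    """Greedy graph coloring: tasks sharing a color never conflict,
    so each color class can be activated together in one step."""
    neighbors = {i: set() for i in range(n)}
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)
    color = {}
    for v in sorted(range(n), key=lambda v: -len(neighbors[v])):
        used = {color[u] for u in neighbors[v] if u in color}
        color[v] = next(c for c in range(n) if c not in used)
    return color

# Toy example: tasks 0 and 2 align; task 1 conflicts with both axes it shares.
g = np.array([[1.0, 0.0], [-1.0, 0.1], [0.5, 0.5]])
edges = build_interference_graph(g)
coloring = greedy_color(3, edges)
```

In a training loop, one would recompute `coloring` every few steps from fresh gradient snapshots and, at each step, back-propagate only the losses of the currently scheduled color class.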
MARché: Fast Masked Autoregressive Image Generation with Cache-Aware Attention
Jiang, Chaoyi, Kim, Sungwoo, Gao, Lei, Zarch, Hossein Entezari, Ro, Won Woo, Annavaram, Murali
Masked autoregressive (MAR) models unify the strengths of masked and autoregressive generation by predicting tokens in a fixed order using bidirectional attention for image generation. While effective, MAR models suffer from significant computational overhead, as they recompute attention and feed-forward representations for all tokens at every decoding step, even though most tokens remain semantically stable across steps. We propose MARché, a training-free generation framework that addresses this inefficiency through two key components: cache-aware attention and selective KV refresh. Cache-aware attention partitions tokens into active and cached sets, enabling separate computation paths that efficiently reuse previously computed key/value projections without compromising full-context modeling. A cached token cannot be reused indefinitely, however, because the contextual information around it changes over successive steps. MARché addresses this with selective KV refresh, which identifies contextually relevant tokens based on the attention scores of newly generated tokens and recomputes only those tokens, while preserving image generation quality. MARché significantly reduces redundant computation in MAR without modifying the underlying architecture. Empirically, MARché achieves up to a 1.7x speedup with negligible impact on image quality, offering a scalable and broadly applicable solution for efficient masked transformer generation.
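The two components can be sketched in a few lines: fresh key/value projections are computed only for active tokens, concatenated with cached ones so queries still see the full context, and the attention weights themselves then nominate which cached tokens to refresh. This is a hedged single-head sketch of the idea, not the MARché implementation; all shapes and names are assumptions.

```python
import numpy as np

def cache_aware_attention(q_active, x_active, k_cache, v_cache, Wk, Wv):
    """Attention for active tokens only, reusing cached K/V.

    Only the active tokens get fresh key/value projections; cached
    tokens keep projections from earlier steps, while the active
    queries still attend over the full context (cached + new).
    """
    k_new = x_active @ Wk
    v_new = x_active @ Wv
    K = np.concatenate([k_cache, k_new], axis=0)   # full-context keys
    V = np.concatenate([v_cache, v_new], axis=0)
    scores = q_active @ K.T / np.sqrt(K.shape[1])
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)        # softmax rows
    return attn @ V, attn

def select_for_refresh(attn, n_cached, top_k):
    """Selective KV refresh: flag the cached tokens that newly
    generated tokens attend to most; only those are recomputed."""
    relevance = attn[:, :n_cached].sum(axis=0)     # score per cached token
    return np.argsort(relevance)[::-1][:top_k]

# Toy usage with hypothetical dimensions: 3 active tokens, 5 cached.
rng = np.random.default_rng(0)
d = 8
Wk, Wv = rng.normal(size=(d, d)), rng.normal(size=(d, d))
x_active = rng.normal(size=(3, d))
q_active = rng.normal(size=(3, d))
k_cache = rng.normal(size=(5, d))
v_cache = rng.normal(size=(5, d))
out, attn = cache_aware_attention(q_active, x_active, k_cache, v_cache, Wk, Wv)
refresh_idx = select_for_refresh(attn, n_cached=5, top_k=2)
```

The design point is that the expensive projections scale with the active set, while the cheap score matrix still spans all tokens, so full-context modeling is preserved.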
Efficient Quantum-Safe Homomorphic Encryption for Quantum Computer Programs
We present a lattice-based scheme for homomorphic evaluation of quantum programs and proofs that remains secure against quantum adversaries. Classical homomorphic encryption is lifted to the quantum setting by replacing composite-order groups with Module Learning-With-Errors (MLWE) lattices and by generalizing polynomial functors to bounded natural super functors (BNSFs). A secret depolarizing BNSF mask hides amplitudes, while each quantum state is stored as an MLWE ciphertext pair. We formalize security with the qIND-CPA game that allows coherent access to the encryption oracle and give a four-hybrid reduction to decisional MLWE. The design also covers practical issues usually left open. A typed QC-bridge keeps classical bits produced by measurements encrypted yet still usable as controls, with weak-measurement semantics for expectation-value workloads. Encrypted Pauli twirls add circuit privacy. If a fixed knowledge base is needed, its axioms are shipped as MLWE "capsules"; the evaluator can use them but cannot read them. A rho-calculus driver schedules encrypted tasks across several QPUs and records an auditable trace on an RChain-style ledger. Performance analysis shows that the extra lattice arithmetic fits inside today's QPU idle windows: a 100-qubit, depth-10^3 teleportation-based proof runs in about 10 ms, the public key (seed only) is 32 bytes, and even a CCA-level key stays below 300 kB. A photonic Dirac-3 prototype that executes homomorphic teleportation plus knowledge-base-relative amplitude checks appears feasible with current hardware. These results indicate that fully homomorphic, knowledge-base-aware quantum reasoning is compatible with near-term quantum clouds and standard post-quantum security assumptions.
Amazon is holding a devices event on February 26, here's what to expect
Amazon is holding an event on February 26 at 10AM ET, and that's unusually early in the year for the company, which typically has its launches in the fall like the rest of its peers. However, considering the last time Amazon had a "devices and services" showcase was in September 2023, this one is overdue. While we don't know exactly what the company plans on showing off, we certainly have some educated guesses. First of all, the company's hardware chief, Panos Panay, and his devices and services team will be on hand. This indicates the presence of new gadgets at the event. However, the main focus will likely be more information on the long-promised next-gen Alexa. The smarter, "remarkable" version of Amazon's Alexa finally seems to be launching on February 26.
Enhancement of Subjective Content Descriptions by using Human Feedback
Bender, Magnus, Braun, Tanya, Möller, Ralf, Gehrke, Marcel
An agent providing an information retrieval service may work with a corpus of text documents. The documents in the corpus may carry annotations such as Subjective Content Descriptions (SCDs) -- additional data associated with different sentences of the documents. Each SCD is associated with multiple sentences of the corpus, and the SCDs stand in relations to one another. The agent uses the SCDs to create its answers in response to queries supplied by users. However, the SCDs the agent uses might reflect the subjective perspective of another user. Hence, answers may be considered faulty by an agent's user, because the SCDs may not exactly match that user's perceptions. A naive and very costly approach would be to ask each user to create all the SCDs themselves from scratch. To reuse existing knowledge, this paper presents ReFrESH, an approach for Relation-preserving Feedback-reliant Enhancement of SCDs by Humans. An agent's user can give the agent feedback about faulty answers. ReFrESH then uses this feedback to update the SCDs incrementally. However, human feedback is not always unambiguous. Therefore, this paper additionally presents an approach to decide how to incorporate the feedback and when to update the SCDs. Altogether, SCDs can be updated with human feedback, allowing users to create even more specific SCDs for their needs.
A Unified and General Framework for Continual Learning
Wang, Zhenyi, Li, Yan, Shen, Li, Huang, Heng
Continual Learning (CL) focuses on learning from dynamic and changing data distributions while retaining previously acquired knowledge. Various methods have been developed to address the challenge of catastrophic forgetting, including regularization-based, Bayesian-based, and memory-replay-based techniques. However, these methods lack a unified framework and common terminology for describing their approaches. This research aims to bridge this gap by introducing a comprehensive and overarching framework that encompasses and reconciles these existing methodologies. Notably, this new framework is capable of encompassing established CL approaches as special instances within a unified and general optimization objective. An intriguing finding is that despite their diverse origins, these methods share common mathematical structures. This observation highlights the compatibility of these seemingly distinct techniques, revealing their interconnectedness through a shared underlying optimization objective. Moreover, the proposed general framework introduces an innovative concept called refresh learning, specifically designed to enhance the CL performance. This novel approach draws inspiration from neuroscience, where the human brain often sheds outdated information to improve the retention of crucial knowledge and facilitate the acquisition of new information. In essence, refresh learning operates by initially unlearning current data and subsequently relearning it. It serves as a versatile plug-in that seamlessly integrates with existing CL methods, offering an adaptable and effective enhancement to the learning process. Extensive experiments on CL benchmarks and theoretical analysis demonstrate the effectiveness of the proposed refresh learning. Code is available at \url{https://github.com/joey-wang123/CL-refresh-learning}.
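The "unlearn, then relearn" step at the core of refresh learning can be sketched as a plain gradient update pair. This is a hedged toy sketch of the concept only (the paper's actual formulation lives inside its unified optimization framework); the function name and both learning rates are assumptions.

```python
def refresh_step(params, grad_fn, lr=0.1, unlearn_lr=0.01):
    """One 'refresh learning' update (conceptual sketch):
    first take a small gradient-ascent step on the current batch
    (unlearn the current data), then a normal descent step
    (relearn it)."""
    params = params + unlearn_lr * grad_fn(params)  # unlearn
    params = params - lr * grad_fn(params)          # relearn
    return params

# Toy check on the quadratic loss f(w) = 0.5 * w**2, whose gradient is w:
grad = lambda w: w
w1 = refresh_step(1.0, grad)   # 1.0 -> 1.01 (unlearn) -> 0.909 (relearn)
```

The intuition matches the neuroscience analogy in the abstract: the brief ascent step sheds the most recently fitted information before it is consolidated again by the descent step.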
REFRESH: Responsible and Efficient Feature Reselection Guided by SHAP Values
Sharma, Shubham, Dutta, Sanghamitra, Albini, Emanuele, Lecue, Freddy, Magazzeni, Daniele, Veloso, Manuela
Feature selection is a crucial step in building machine learning models. The process is often carried out with accuracy as the objective, and can be cumbersome and computationally expensive for large-scale datasets. Several additional model performance characteristics, such as fairness and robustness, also matter for model development. As regulations drive the need for more trustworthy models, deployed models need to be corrected for characteristics associated with responsible artificial intelligence. When feature selection is done with respect to one model performance characteristic (e.g., accuracy), selecting features for secondary characteristics (e.g., fairness and robustness) would require going through the computationally expensive selection process from scratch. In this paper, we introduce the problem of feature \emph{reselection}, so that features can be selected with respect to secondary model performance characteristics efficiently even after a feature selection process has been completed with respect to a primary objective. To address this problem, we propose REFRESH, a method that reselects features so that additional desirable constraints on model performance can be satisfied without training several new models. REFRESH's underlying algorithm is a novel technique that uses SHAP values and correlation analysis to approximate the predictions of candidate models without training them. Empirical evaluations on three datasets, including a large-scale loan-default dataset, show that REFRESH can efficiently find alternate models with better model characteristics. We also discuss the need for reselection and for REFRESH in light of regulatory desiderata.
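One way the SHAP-plus-correlation idea can be illustrated: approximate the prediction of a model trained without a given feature by redistributing that feature's SHAP attribution onto correlated features, instead of retraining. This is a loose sketch of the general idea only, not REFRESH's actual algorithm; the redistribution rule and all names are assumptions.

```python
import numpy as np

def approx_prediction_without(shap_values, base_value, drop_idx, corr):
    """Approximate predictions if feature `drop_idx` were removed,
    without retraining (illustrative sketch).

    shap_values: (n_samples, n_features) per-feature attributions.
    base_value:  expected model output (SHAP base value).
    corr:        (n_features, n_features) feature correlation matrix;
                 the dropped feature's contribution is assumed to be
                 absorbed by correlated features, in proportion to
                 the absolute correlation.
    """
    contrib = shap_values.copy()
    w = np.abs(corr[drop_idx]).astype(float).copy()
    w[drop_idx] = 0.0
    if w.sum() > 0:
        w /= w.sum()
        contrib += contrib[:, [drop_idx]] * w  # redistribute attribution
    contrib[:, drop_idx] = 0.0                 # then drop the feature
    return base_value + contrib.sum(axis=1)

# Toy example: one sample, three features with SHAP values 1, 2, 3.
shap = np.array([[1.0, 2.0, 3.0]])
# Uncorrelated case: dropping feature 2 simply removes its contribution.
pred_uncorr = approx_prediction_without(shap, 0.5, 2, np.eye(3))
# Fully correlated with feature 1: the contribution is absorbed, not lost.
corr = np.eye(3); corr[1, 2] = corr[2, 1] = 1.0
pred_corr = approx_prediction_without(shap, 0.5, 2, corr)
```

The two toy cases bracket the behavior: with no correlation the approximate prediction drops by the feature's full attribution, while with perfect correlation it is unchanged, which is why correlation analysis lets reselection skip retraining.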
LG's latest Gram laptops are predictably stuffed with AI features
LG just announced new entries in its gram series of laptops as part of an early CES 2024 reveal. These include two new LG Gram Pro laptops and standard refreshes of the pre-existing gram line. The LG Gram Pro boasts impressive specs, with an Intel Core Ultra processor and a GeForce RTX 3050 GPU. These computers also ship with Intel's AI Boost technology. LG says this upgrade allows the laptop to "handle AI workloads even without a network connection."