Asia
Murata beats profit estimates as AI data-center demand strains production
The company is the world's leading supplier of multilayer ceramic capacitors, essential components for every device that uses electricity because they regulate power flow. Murata Manufacturing has reported fourth-quarter earnings that beat analyst estimates, fueled by robust demand from artificial-intelligence data-center builders. Net income in the three months through March was ¥76.57 billion ($477 million), the Kyoto-based company said Thursday. Analysts had estimated ¥60 billion on average. Revenue was ¥460.62 billion, also better than expected.
China to ban drone sales in Beijing citing security concerns
China will ban the sale of drones in Beijing and require permits to fly them under new rules that take effect on Friday. Drones and key components will be prohibited from being sold, rented or brought into the Chinese capital. Drone owners will also be required to register their devices with the police. China has gradually tightened regulations on drones in recent years, with authorities citing public safety concerns. Drones and flying taxis are part of the so-called low-altitude economy, a strategic priority for China that is expected to generate more than two trillion yuan ($290bn; £217bn) by 2035.
Appendix for "Episodic Multi-Task Learning with Heterogeneous Neural Processes"
In this section, we list frequently asked questions from researchers who helped proofread this manuscript. These questions might also be relevant for others and help in better understanding the paper, so we include more detailed discussions here. This work considers the multi-input multi-output setting of multi-task learning under the episodic training mechanism. As shown in Table 1, we use "Heterogeneous tasks" to distinguish the different branches of multi-task learning: (1) single-input multi-output (SIMO) considers different tasks which have the same input and different supervision information. All tasks are related since they share the target space. This setting encourages deep models to deal with the insufficient data of each task by aggregating the training data from related tasks in the spirit of data augmentation. Meanwhile, "Episodic training" is used to describe the data-feeding strategy. Multi-task meta-learning also benefits from episodic training, but it follows the SIMO setting in every single episode and cannot sufficiently handle heterogeneous tasks.
Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing
Transformer models have been widely adopted in various domains in recent years, and large language models in particular have advanced the field of AI significantly. Due to their size, the capability of these networks has increased tremendously, but this has come at the cost of a significant increase in necessary compute. Quantization is one of the most effective ways to reduce the computational time and memory consumption of neural networks. Many studies have shown, however, that modern transformer models tend to learn strong outliers in their activations, making them difficult to quantize. To retain acceptable performance, the existence of these outliers requires activations to be kept at a higher bitwidth, or requires the use of different numeric formats, extra fine-tuning, or other workarounds.
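To make concrete why a single strong outlier makes activations hard to quantize, the following is a minimal sketch, not the paper's method. It assumes symmetric per-tensor 8-bit quantization with an absolute-maximum scale, and the activation values and the helper names (quantize_dequantize, mean_abs_error) are hypothetical, chosen only for illustration.

```python
def quantize_dequantize(values, num_bits=8):
    """Symmetric per-tensor uniform quantization, then dequantization.

    Assumption: the scale is set by the largest absolute value, so one
    outlier stretches the grid for every other element.
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)
    scale = max_abs / qmax
    reconstructed = []
    for v in values:
        q = max(-qmax, min(qmax, round(v / scale)))  # integer grid index
        reconstructed.append(q * scale)
    return reconstructed


def mean_abs_error(original, reconstructed):
    """Average absolute reconstruction error over paired elements."""
    pairs = list(zip(original, reconstructed))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


# Hypothetical activations: 100 values spread evenly over [-1, 1].
typical = [-1.0 + 2.0 * i / 99 for i in range(100)]
# The same activations plus one strong outlier.
with_outlier = typical + [60.0]

err_typical = mean_abs_error(typical, quantize_dequantize(typical))

recon = quantize_dequantize(with_outlier)
# Error on the original 100 values only, ignoring the outlier itself.
err_typical_with_outlier = mean_abs_error(typical, recon[:100])

print(f"error without outlier: {err_typical:.5f}")
print(f"error with outlier:    {err_typical_with_outlier:.5f}")
```

In this toy setting the outlier itself is represented almost exactly, while the ordinary values collapse onto only a handful of grid points. That is the trade-off that pushes practitioners toward higher bitwidths or other workarounds.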
ed3fea9033a80fea1376299fa7863f4a-Paper-Conference.pdf
Large Language Models (LLMs) can achieve strong performance on many tasks by producing step-by-step reasoning before giving a final output, often referred to as chain-of-thought reasoning (CoT). It is tempting to interpret these CoT explanations as the LLM's process for solving a task. This level of transparency into LLMs' predictions would yield significant safety benefits. However, we find that CoT explanations can systematically misrepresent the true reason for a model's prediction. We demonstrate that CoT explanations can be heavily influenced by adding biasing features to model inputs--e.g., by reordering the multiple-choice options in a few-shot prompt to make the answer always "(A)"--which models systematically fail to mention in their explanations.
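The biasing feature mentioned above, reordering the options so that the answer is always "(A)", can be sketched as follows. This is an illustrative construction only, not the authors' code: the prompt layout, the sample questions, and the function names (reorder_correct_first, format_example, build_biased_prompt) are all assumptions.

```python
LETTERS = "ABCDEFGH"


def reorder_correct_first(options, correct_index):
    """Move the correct option to the front, keeping the others in order."""
    rest = [opt for i, opt in enumerate(options) if i != correct_index]
    return [options[correct_index]] + rest


def format_example(question, options, answer_letter=None):
    """Render one multiple-choice item; leave the answer blank if not given."""
    lines = [question]
    for letter, option in zip(LETTERS, options):
        lines.append(f"({letter}) {option}")
    if answer_letter is None:
        lines.append("Answer:")
    else:
        lines.append(f"Answer: ({answer_letter})")
    return "\n".join(lines)


def build_biased_prompt(few_shot, test_question, test_options):
    """Few-shot prompt in which every demonstration's answer is (A).

    few_shot: list of (question, options, correct_index) tuples.
    The test item is left in its original order and unanswered.
    """
    blocks = []
    for question, options, correct_index in few_shot:
        reordered = reorder_correct_first(options, correct_index)
        blocks.append(format_example(question, reordered, "A"))
    blocks.append(format_example(test_question, test_options))
    return "\n\n".join(blocks)


# Hypothetical demonstrations, for illustration only.
few_shot = [
    ("Which planet is closest to the Sun?", ["Venus", "Mercury", "Mars"], 1),
    ("What is 2 + 3?", ["4", "6", "5"], 2),
]
prompt = build_biased_prompt(
    few_shot, "Which gas do plants absorb?", ["Oxygen", "Carbon dioxide"]
)
print(prompt)
```

A model's explanation for the final item could then be inspected for whether it ever mentions the "always (A)" pattern in the demonstrations.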
Towards a Unified Analysis of Kernel-based Methods Under Covariate Shift
Covariate shift occurs prevalently in practice, where the input distributions of the source and target data are substantially different. Despite its practical importance in various learning problems, most of the existing methods only focus on some specific learning tasks and are not well validated theoretically and numerically. To tackle this problem, we propose a unified analysis of general nonparametric methods in a reproducing kernel Hilbert space (RKHS) under covariate shift. Our theoretical results are established for a general loss belonging to a rich loss function family, which includes many commonly used methods as special cases, such as mean regression, quantile regression, likelihood-based classification, and margin-based classification. This paper focuses on two types of covariate shift problems, and sharp convergence rates are established for a general loss function to provide a unified theoretical analysis, which concurs with the optimal results in the literature where the squared loss is used. Extensive numerical studies on synthetic and real examples confirm our theoretical findings and further illustrate the effectiveness of our proposed method.