cbe
UniCBE: An Uniformity-driven Comparing Based Evaluation Framework with Unified Multi-Objective Optimization
Yuan, Peiwen, Feng, Shaoxiong, Li, Yiwei, Wang, Xinglin, Zhang, Yueqi, Shi, Jiayi, Tan, Chuyi, Pan, Boyuan, Hu, Yao, Li, Kan
Human preference plays a significant role in measuring large language models and guiding them to align with human values. Unfortunately, current comparing-based evaluation (CBE) methods typically focus on a single optimization objective, failing to effectively utilize scarce yet valuable preference signals. To address this, we delve into key factors that can enhance the accuracy, convergence, and scalability of CBE: suppressing sampling bias, balancing the descending process of uncertainty, and mitigating updating uncertainty. Following the derived guidelines, we propose UniCBE, a unified uniformity-driven CBE framework which simultaneously optimizes these core objectives by constructing and integrating three decoupled sampling probability matrices, each designed to ensure uniformity in specific aspects. We further ablate the optimal tuple sampling and preference aggregation strategies to achieve efficient CBE. On the AlpacaEval benchmark, UniCBE saves over 17% of the evaluation budget while achieving a Pearson correlation with ground truth exceeding 0.995, demonstrating excellent accuracy and convergence. In scenarios where new models are continuously introduced, UniCBE can even save over 50% of evaluation costs, highlighting its improved scalability.
- Europe > Austria > Vienna (0.14)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- North America > Mexico > Mexico City > Mexico City (0.04)
A Channel-ensemble Approach: Unbiased and Low-variance Pseudo-labels is Critical for Semi-supervised Classification
Wu, Jiaqi, Pang, Junbiao, Zhang, Baochang, Huang, Qingming
Semi-supervised learning (SSL) is a practical challenge in computer vision. Pseudo-label (PL) methods, e.g., FixMatch and FreeMatch, achieve state-of-the-art (SOTA) performance in SSL. These approaches employ a threshold-to-pseudo-label (T2L) process to generate PLs by truncating the confidence scores of unlabeled data predicted by the self-training method. However, self-trained models typically yield biased and high-variance predictions, especially when only a small amount of labeled data is supplied. To address this issue, we propose a lightweight channel-based ensemble method that consolidates multiple inferior PLs into a single PL that is theoretically guaranteed to be unbiased and low-variance. Importantly, our approach can be readily extended to any SSL framework, such as FixMatch or FreeMatch. Experimental results demonstrate that our method significantly outperforms state-of-the-art techniques on CIFAR10/100 in terms of effectiveness and efficiency.
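The T2L step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes class probabilities are already available as plain Python lists. The function names, the 0.95 default threshold, and the use of simple averaging to combine several predictions are illustrative assumptions; in particular, the averaging is a generic stand-in and not the channel-based ensemble proposed in the paper.

```python
def threshold_to_pseudo_label(probs, threshold=0.95):
    """Return the argmax class if its confidence reaches the threshold, else None.

    `probs` is one unlabeled sample's predicted class distribution.
    The 0.95 default is an assumed value chosen for illustration.
    """
    confidence = max(probs)
    if confidence < threshold:
        return None  # sample is discarded: no pseudo-label is produced
    return probs.index(confidence)


def average_predictions(prob_vectors):
    """Average several predicted distributions for the same sample.

    Plain averaging is a stand-in assumption for combining multiple
    predictions; it is not the paper's channel-based ensemble.
    """
    n = len(prob_vectors)
    num_classes = len(prob_vectors[0])
    return [sum(p[c] for p in prob_vectors) / n for c in range(num_classes)]


if __name__ == "__main__":
    # Three hypothetical predictions for one unlabeled image.
    predictions = [[0.99, 0.01], [0.97, 0.03], [0.80, 0.20]]
    combined = average_predictions(predictions)
    print(combined, threshold_to_pseudo_label(combined))
```

Thresholding the combined distribution rather than each prediction separately means one over-confident predictor cannot produce a pseudo-label on its own.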
On the Optimal Bit Complexity of Circulant Binary Embedding
Kim, Saehoon (POSTECH and AItrics) | Kim, Jungtaek (POSTECH) | Choi, Seungjin (POSTECH)
Binary embedding refers to methods for embedding points in $\mathbb{R}^d$ into vertices of a Hamming cube of dimension $k$, such that the normalized Hamming distance preserves the pre-defined similarity between vectors in the original space. A common approach to binary embedding is to use random projection with an unstructured projection matrix, followed by one-bit quantization to produce binary codes; for this approach it has been proven that $k = O(\varepsilon^{-2} \log n)$ is required to approximate the angle up to $\varepsilon$-distortion, where $n$ is the number of data points. Of particular interest in this paper is circulant binary embedding (CBE) with angle preservation, where a random circulant matrix is used for projection. Compared to embedding methods relying on unstructured projection, it yields comparable performance while achieving nearly linear time and space complexity. To support promising empirical results, several non-asymptotic analyses have been introduced to establish conditions on the number of bits needed for an $\varepsilon$-distortion embedding; one of the state-of-the-art analyses achieves the optimal sample complexity $k = O(\varepsilon^{-3} \log n)$, while the distortion rate $\varepsilon^{-3}$ is far from optimal compared to $k = O(\varepsilon^{-2} \log n)$. In this paper, to support promising empirical results of CBE, we extend the previous theoretical framework to address the optimal condition on the number of bits, showing that CBE with $k = O(\varepsilon^{-2} \log n)$ approximates the angle up to $\varepsilon$-distortion under mild assumptions. We also provide numerical experiments to support our theoretical results.
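The projection-then-quantization idea in the abstract above can be sketched in a few lines of standard-library Python. This is a rough illustration under stated assumptions, not the paper's construction: the function names are hypothetical, the circulant matrix is defined by a Gaussian vector, and the random sign flip applied before projection is an assumption borrowed from common CBE formulations. The product is computed directly in O(kd) time for readability; an FFT-based product is what gives the nearly linear time mentioned in the abstract.

```python
import math
import random


def make_cbe_params(d, rng):
    """Draw a Gaussian vector r (first column of the circulant matrix)
    and random +/-1 signs (assumed preprocessing step)."""
    r = [rng.gauss(0.0, 1.0) for _ in range(d)]
    signs = [rng.choice((-1.0, 1.0)) for _ in range(d)]
    return r, signs


def cbe_encode(x, r, signs, k):
    """Return k bits: the signs of the first k entries of C_r (signs * x)."""
    d = len(x)
    if len(r) != d or len(signs) != d or not 1 <= k <= d:
        raise ValueError("r and signs must have length d, and 1 <= k <= d")
    z = [s * v for s, v in zip(signs, x)]
    bits = []
    for i in range(k):
        # Entry (i, j) of a circulant matrix with first column r is r[(i - j) mod d].
        proj = sum(r[(i - j) % d] * z[j] for j in range(d))
        bits.append(1 if proj >= 0.0 else 0)  # one-bit quantization
    return bits


def normalized_hamming(a, b):
    """Fraction of positions where two equal-length bit lists differ."""
    return sum(u != v for u, v in zip(a, b)) / len(a)


def angle_between(x, y):
    """Angle in radians between two nonzero vectors."""
    dot = sum(u * v for u, v in zip(x, y))
    norm = math.sqrt(sum(u * u for u in x)) * math.sqrt(sum(v * v for v in y))
    return math.acos(max(-1.0, min(1.0, dot / norm)))


if __name__ == "__main__":
    rng = random.Random(0)
    d = k = 256
    r, signs = make_cbe_params(d, rng)
    x = [rng.gauss(0.0, 1.0) for _ in range(d)]
    y = [rng.gauss(0.0, 1.0) for _ in range(d)]
    estimate = normalized_hamming(cbe_encode(x, r, signs, k), cbe_encode(y, r, signs, k))
    # The normalized Hamming distance serves as an estimate of angle / pi.
    print(estimate, angle_between(x, y) / math.pi)
```

Only the vector `r` and the sign pattern need to be stored, which is where the linear space cost comes from; the bit count `k` is the quantity the paper's bounds are about.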
- North America > Canada > Ontario > Toronto (0.14)
- Asia > Afghanistan > Parwan Province > Charikar (0.05)
- Asia > South Korea > Gyeongsangbuk-do > Pohang (0.05)
New Year Honours 2018: Barry Gibb, Ringo Starr and Darcey Bussell head list
Bee Gees singer Barry Gibb and Beatles drummer Ringo Starr have been knighted, and Strictly judge Darcey Bussell made a dame, in the New Year Honours. Ex-Deputy PM Nick Clegg and War Horse novelist Michael Morpurgo also receive knighthoods, and author Jilly Cooper and TV chef Rick Stein become CBEs. Among five honours for the World Cup-winning England Women cricket team is an OBE for captain Heather Knight. Ex-astronaut Helen Sharman joins the Order of St Michael and St George. Alexandra Shulman, who recently stood down as editor of British Vogue after 25 years; actors Hugh Laurie and Susan Hampshire, and leading artificial intelligence researcher Demis Hassabis are made CBEs.
- North America > United States > California > Los Angeles County > Los Angeles (0.05)
- Europe > United Kingdom > Scotland (0.05)
- Europe > United Kingdom > Northern Ireland > County Antrim > Ballymena (0.05)
- Media > Music (0.86)
- Leisure & Entertainment > Sports > Cricket (0.35)