AUCSeg: AUC-oriented Pixel-level Long-tail Semantic Segmentation
The Area Under the ROC Curve (AUC) is a well-known metric for evaluating instance-level long-tail learning problems. In the past two decades, many AUC optimization methods have been proposed to improve model performance under long-tail distributions. In this paper, we explore AUC optimization methods in the context of pixel-level long-tail semantic segmentation, a much more complicated scenario. This task introduces two major challenges for AUC optimization techniques. On the one hand, AUC optimization in a pixel-level task involves complex coupling across loss terms, with structured intra-image and pairwise inter-image dependencies, complicating theoretical analysis. On the other hand, we find that mini-batch estimation of the AUC loss in this case requires a larger batch size, resulting in unaffordable space complexity.
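To make the space-complexity issue concrete, the following is a minimal sketch (our illustration, not the paper's AUCSeg implementation) of a pairwise squared-hinge AUC surrogate applied at the pixel level. The (n_pos x n_neg) difference matrix it materializes per class is the term whose size grows quadratically with the number of pixels in the batch.

    # Minimal sketch of a pixel-level pairwise AUC surrogate (illustrative only).
    import torch

    def pairwise_auc_surrogate(scores: torch.Tensor, labels: torch.Tensor,
                               cls: int, margin: float = 1.0) -> torch.Tensor:
        """scores: (N,) class-`cls` scores for N pixels; labels: (N,) int labels."""
        pos = scores[labels == cls]      # pixels belonging to the class
        neg = scores[labels != cls]      # all other pixels
        if pos.numel() == 0 or neg.numel() == 0:
            return scores.new_zeros(())
        # (n_pos, n_neg) matrix of margin violations: the space bottleneck
        diff = margin - (pos.unsqueeze(1) - neg.unsqueeze(0))
        return torch.clamp(diff, min=0).pow(2).mean()

    # Example: a single 128x128 image already yields 16,384 pixels, so even a
    # rare class with 100 positive pixels pairs against ~16k negatives.
    scores = torch.randn(128 * 128)
    labels = torch.randint(0, 21, (128 * 128,))
    loss = pairwise_auc_surrogate(scores, labels, cls=3)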
Supplementary Materials for MEQA: A Benchmark for Multi-hop Event-centric Question Answering with Explanations
We utilize an open and widely used data format, i.e., the JSON format, for the MEQA dataset. A sample from the dataset, together with an explanation of the data format, is shown in Listing 1 (field names here are illustrative; see the repository for the exact schema).

    {
      "context": "Roadside IED kills Russian major general [...]",  # the context of the question
      "question": "Who died before Al-Monitor reported it online?",
      # the multi-hop decomposition, where #k refers to the answer of the k-th sub-question
      "sub_questions": [
        "What event contains Al-Monitor is the communicator?",
        "What event is after #1 has a victim?",
        "Who died in the #2?"
      ],
      "answers": ["major general", "local commander", "lieutenant general"]
    }

The dataset and source code for MEQA have been released on GitHub: https://github.com/du-nlp-lab/MEQA.
CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion
Code completion models have made significant progress in recent years, yet popular evaluation datasets such as HumanEval and MBPP predominantly focus on code completion within a single file. This oversimplified setting falls short of the real-world software development scenario, where repositories span multiple files with numerous cross-file dependencies and where accessing and understanding cross-file context is often required to complete the code correctly.
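As a concrete illustration of the task (file names and functions here are invented for this example), completing the last line of model.py requires knowing that a helper exists in the sibling file utils.py, context that is invisible in a single-file setting.

    # utils.py -- a sibling file elsewhere in the repository
    def normalize(scores):
        total = sum(scores)
        return [s / total for s in scores]

    # model.py -- the file being completed; producing the correct body below
    # requires knowing that `normalize` exists and where it is defined
    from utils import normalize

    def predict_proba(logits):
        exps = [2.718281828 ** l for l in logits]
        return normalize(exps)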
Entrywise error bounds for low-rank approximations of kernel matrices
In this paper, we derive entrywise error bounds for low-rank approximations of kernel matrices obtained using the truncated eigen-decomposition (or singular value decomposition). While this approximation is well-known to be optimal with respect to the spectral and Frobenius norm error, little is known about the statistical behaviour of individual entries. Our error bounds fill this gap. A key technical innovation is a delocalisation result for the eigenvectors of the kernel matrix corresponding to small eigenvalues, which takes inspiration from the field of Random Matrix Theory.
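A small numerical sketch (our illustration, not the paper's code) makes the distinction concrete: for a Gaussian kernel matrix, the rank-r truncated eigen-decomposition attains the optimal spectral-norm error, while the maximum entrywise error, the quantity this paper bounds, must be inspected separately.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))                      # 200 points in R^3
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq / 2.0)                              # Gaussian kernel matrix

    eigvals, eigvecs = np.linalg.eigh(K)               # eigenvalues in ascending order
    r = 20
    U = eigvecs[:, -r:]                                # top-r eigenvectors
    K_r = U @ np.diag(eigvals[-r:]) @ U.T              # rank-r truncation

    E = K - K_r
    print("spectral-norm error:", np.linalg.norm(E, 2))  # equals the (r+1)-th largest eigenvalue (K is PSD)
    print("max entrywise error:", np.abs(E).max())       # the quantity the entrywise bounds control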
Fine-grained Expressivity of Graph Neural Networks
Numerous recent works have analyzed the expressive power of message-passing graph neural networks (MPNNs), primarily utilizing combinatorial techniques such as the 1-dimensional Weisfeiler-Leman test (1-WL) for the graph isomorphism problem. However, the graph isomorphism objective is inherently binary, offering no insight into the degree of similarity between two given graphs.
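For readers unfamiliar with the test, here is a minimal sketch (ours, following the standard formulation) of 1-WL colour refinement, together with a classic pair of graphs it fails to distinguish; the binary verdict it returns is exactly the coarseness that a fine-grained, similarity-aware analysis must go beyond.

    from collections import Counter

    def wl_refine(adj: dict, rounds: int = 3) -> Counter:
        """adj maps node -> list of neighbours; returns the final colour histogram."""
        colour = {v: 0 for v in adj}                   # uniform initial colour
        for _ in range(rounds):
            # each node hashes its own colour with the multiset of neighbour colours
            colour = {v: hash((colour[v], tuple(sorted(colour[u] for u in adj[v]))))
                      for v in adj}
        return Counter(colour.values())

    # A 6-cycle vs. two disjoint triangles: both are 2-regular, so 1-WL
    # cannot tell them apart even though they are clearly non-isomorphic.
    cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
    two_tri = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
    print(wl_refine(cycle6) == wl_refine(two_tri))     # True: indistinguishable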
IPO: Interpretable Prompt Optimization for Vision-Language Models
Pre-trained vision-language models like CLIP have shown remarkable adaptability to various downstream tasks. Nonetheless, their performance heavily depends on the specificity of the input text prompts, which requires skillful prompt template engineering. Current approaches to prompt optimization instead learn the prompts through gradient descent, treating them as adjustable parameters. However, these methods tend to overfit the base classes seen during training and produce prompts that are no longer understandable by humans. This paper introduces a simple yet interpretable prompt optimizer (IPO) that utilizes large language models (LLMs) to generate textual prompts dynamically.
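The optimization loop the abstract describes can be sketched in a few lines (our paraphrase of the idea, not the released IPO code); `propose_prompts` and `clip_accuracy` are hypothetical stand-ins for an LLM API call and a CLIP evaluation on a labelled set.

    # Hedged sketch of an LLM-in-the-loop prompt optimizer (illustrative only).
    def optimize_prompt(propose_prompts, clip_accuracy, steps=10,
                        seed="a photo of a {}"):
        history = [(seed, clip_accuracy(seed))]
        for _ in range(steps):
            top = sorted(history, key=lambda p: -p[1])[:5]   # best prompts so far
            for prompt in propose_prompts(top):              # LLM writes new candidates
                history.append((prompt, clip_accuracy(prompt)))
        return max(history, key=lambda p: p[1])              # result stays human-readable

    # Dummy demo with a trivial "LLM" and scorer, just to show the loop runs.
    best = optimize_prompt(
        propose_prompts=lambda top: [top[0][0] + ", high quality"],
        clip_accuracy=lambda p: min(len(p), 60) * 0.01,
    )
    print(best)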