Multi-layer State Evolution Under Random Convolutional Design
Signal recovery under generative neural network priors has emerged as a promising direction in statistical inference and computational imaging. Theoretical analysis of reconstruction algorithms under generative priors is, however, challenging. For generative priors with fully connected layers and Gaussian i.i.d.
PointMamba: A Simple State Space Model for Point Cloud Analysis
Transformers have become one of the foundational architectures in point cloud analysis tasks due to their excellent global modeling ability. However, the attention mechanism has quadratic complexity, making the design of a linear-complexity method with global modeling appealing. In this paper, we propose PointMamba, transferring the success of Mamba, a recent representative state space model (SSM), from NLP to point cloud analysis tasks. Unlike traditional Transformers, PointMamba employs a linear-complexity algorithm, providing global modeling capacity while significantly reducing computational costs.
Solver-in-the-Loop: Learning from Differentiable Physics to Interact with Iterative PDE-Solvers
Kiwon Um, Robert Brand, Yun (Raymond) Fei, Philipp Holl, Nils Thuerey
Finding accurate solutions to partial differential equations (PDEs) is a crucial task in all scientific and engineering disciplines. It has recently been shown that machine learning methods can improve the solution accuracy by correcting for effects not captured by the discretized PDE. We target the problem of reducing numerical errors of iterative PDE solvers and compare different learning approaches for finding complex correction functions. We find that previously used learning approaches are significantly outperformed by methods that integrate the solver into the training loop and thereby allow the model to interact with the PDE during training. This provides the model with realistic input distributions that take previous corrections into account, yielding accuracy improvements, stable rollouts over several hundred recurrent evaluation steps, and performance surpassing even tailored supervised variants. We highlight the performance of the differentiable physics networks for a wide variety of PDEs, from non-linear advection-diffusion systems to three-dimensional Navier-Stokes flows.
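The "solver in the training loop" idea can be sketched on a toy problem. Everything below is an illustrative assumption, not the paper's implementation: the dynamics are a scalar decay ODE, the learned correction is a single scalar, and finite-difference gradients stand in for the autodiff a differentiable solver would provide.

```python
import numpy as np

# Reference dynamics: exact exponential decay u' = -lam * u.
# Coarse solver: explicit Euler with a deliberately wrong rate lam_hat,
# so it accumulates model/discretization error over the rollout.
lam, lam_hat, dt, steps = 1.0, 0.6, 0.1, 50
u0 = 1.0
ref = u0 * np.exp(-lam * dt * np.arange(1, steps + 1))  # exact trajectory

def rollout(c):
    """Unrolled rollout: each coarse solver step is followed by the learned
    correction c*u, so the correction sees solver states that already
    include its own previous corrections (the 'solver-in-the-loop' setup)."""
    u, traj = u0, []
    for _ in range(steps):
        u = (1.0 - lam_hat * dt) * u   # coarse solver step
        u = u + c * u                  # learned correction
        traj.append(u)
    return np.array(traj)

def loss(c):
    return float(np.mean((rollout(c) - ref) ** 2))

# Gradient descent on the unrolled multi-step loss; central finite
# differences replace backpropagation through the solver for simplicity.
c, lr, eps = 0.0, 0.05, 1e-6
for _ in range(500):
    g = (loss(c + eps) - loss(c - eps)) / (2 * eps)
    c -= lr * g

print(c, loss(c))
```

Because the loss is taken over the whole unrolled trajectory, the learned correction is trained on exactly the shifted input distribution it will see at test time, which is the key difference from purely supervised, one-step training.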
Author Feedback
We appreciate the reviewers' valuable comments, and we were glad to read the positive remarks. We also appreciate the thorough feedback for further improvements. Reviewer 1: What would be a real use case? What is trained in the PRE approach? Is there a benefit in using the differentiable PDE solver? Do steps of a differentiable simulator correspond to time steps?
Appendix: Structured Reordering for Modeling Latent Alignments in Sequence Transduction
The algorithm computes, for each segment, the total weight of all derivations with root X.
WCFG to PCFG Conversion
The algorithm for converting a WCFG to its equivalent PCFG is shown in Algorithm 1. A full proof of this equivalence can be found in Smith and Johnson [1].

Proof of the Dynamic Programming for Marginal Inference
We prove the correctness of the dynamic programming algorithm for computing the marginal permutation matrix of separable permutations by induction as follows. As a base case, each word (i.e., a segment of length 1) is associated with the identity permutation matrix 1.

Architecture and Hyperparameters
The detailed architecture of ReMoto is shown in Figure 1. Figure 1: The detailed architecture of our seq2seq model for semantic parsing (best viewed in color). First, the structured reordering module generates a (relaxed) permutation matrix given the input utterance. Then, the encoding module generates representations of the input utterance based on the reordered embeddings, which are computed from the original embeddings and the permutation matrix produced in the first step.
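As a minimal illustration of the weight-renormalization idea behind such conversions (a toy acyclic grammar, not Algorithm 1 from the appendix), each rule's weight is multiplied by the inside weights of its right-hand-side nonterminals and divided by the inside weight of its head, so that the rules for every nonterminal form a probability distribution:

```python
# Toy WCFG: head -> list of (weight, body); body items are nonterminals
# (keys of `rules`) or terminals (anything else). Acyclic for simplicity.
rules = {
    "S": [(2.0, ("A", "B")), (1.0, ("A",))],
    "A": [(3.0, ("a",)), (1.0, ("a", "a"))],
    "B": [(0.5, ("b",))],
}

def Z(x):
    """Inside weight of x: total weight of all derivations rooted at x
    (terminals contribute weight 1)."""
    if x not in rules:
        return 1.0
    return sum(
        w * prod(Z(sym) for sym in body) for w, body in rules[x]
    )

def prod(xs):
    out = 1.0
    for x in xs:
        out *= x
    return out

def to_pcfg(rules):
    """Renormalize: p(X -> body) = w(X -> body) * prod(Z(child)) / Z(X)."""
    return {
        head: [(w * prod(Z(s) for s in body) / Z(head), body)
               for w, body in rs]
        for head, rs in rules.items()
    }

pcfg = to_pcfg(rules)
for head, rs in pcfg.items():
    assert abs(sum(p for p, _ in rs) - 1.0) < 1e-12  # proper distribution
```

The equivalence guarantee (from Smith and Johnson [1]) is that this renormalization preserves the relative weight, and hence the conditional distribution, of every derivation.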
Fast Best-of-N Decoding via Speculative Rejection
Hanshi Sun
The safe and effective deployment of Large Language Models (LLMs) involves a critical step called alignment, which ensures that the model's responses are in accordance with human preferences. Prevalent alignment techniques, such as DPO, PPO and their variants, align LLMs by changing the pre-trained model weights during a phase called post-training. While predominant, these post-training methods add substantial complexity before LLMs can be deployed. Inference-time alignment methods avoid the complex post-training step and instead bias the generation towards responses that are aligned with human preferences. The best-known inference-time alignment method, called Best-of-N, is as effective as the state-of-the-art post-training procedures. Unfortunately, Best-of-N requires vastly more resources at inference time than standard decoding strategies, which makes it not computationally viable.
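The Best-of-N baseline the abstract refers to can be sketched as follows. The generator and reward model here are placeholder stubs (not a real LLM or learned reward model), and this shows only the baseline, not the paper's speculative-rejection speedup:

```python
import random

random.seed(0)

def generate(prompt):
    # Stand-in for sampling a full response from an LLM.
    return prompt + " -> response#" + str(random.randint(0, 999))

def reward(prompt, response):
    # Stand-in for a learned reward model scoring (prompt, response).
    return sum(map(ord, response)) % 1000

def best_of_n(prompt, n=8):
    """Best-of-N: sample n complete responses, score each with the
    reward model, and return the highest-scoring one."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda r: reward(prompt, r))

print(best_of_n("Explain alignment."))
```

The cost problem is visible in the structure: all N responses are generated to completion before any scoring happens, so compute scales linearly with N regardless of how unpromising most candidates are.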
Adaptive Image Quality Assessment via Teaching Large Multimodal Model to Compare
While recent advancements in large multimodal models (LMMs) have significantly improved their abilities in image quality assessment (IQA) relying on absolute quality rating, how to transfer reliable relative quality comparison outputs to continuous perceptual quality scores remains largely unexplored. To address this gap, we introduce Compare2Score, an all-around LMM-based no-reference IQA (NR-IQA) model, which is capable of producing qualitatively comparative responses and effectively translating these discrete comparative levels into a continuous quality score. Specifically, during training, we propose to generate scaled-up comparative instructions by comparing images from the same IQA dataset, allowing for more flexible integration of diverse IQA datasets. Utilizing the established large-scale training corpus, we develop a human-like visual quality comparator. During inference, moving beyond binary choices, we propose a soft comparison method that calculates the likelihood of the test image being preferred over multiple predefined anchor images. The quality score is further optimized by maximum a posteriori estimation with the resulting probability matrix. Extensive experiments on nine IQA datasets validate that Compare2Score effectively bridges the text-defined comparative levels used during training with the continuous quality scores produced at inference, surpassing state-of-the-art IQA models across diverse scenarios. Moreover, we verify that the probability-matrix-based inference conversion improves not only the rating accuracy of Compare2Score but also that of zero-shot general-purpose LMMs, suggesting its intrinsic effectiveness.
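A hedged sketch of turning soft pairwise comparisons against anchors into a scalar score. The assumptions here are mine, not the paper's: a Bradley-Terry-style model where P(test preferred over anchor_i) = sigmoid(q - s_i), anchors with known scores s_i, and a simple grid-search maximum-likelihood estimate of q; the paper instead uses a probability matrix with MAP estimation.

```python
import math

anchor_scores = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical anchor qualities
prefs = [0.95, 0.85, 0.60, 0.30, 0.10]      # P(test preferred over anchor_i)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def log_lik(q):
    """Soft-label Bernoulli log-likelihood of the preference probabilities
    under the assumed model P(win vs anchor) = sigmoid(q - s)."""
    ll = 0.0
    for s, p in zip(anchor_scores, prefs):
        pr = sigmoid(q - s)
        ll += p * math.log(pr) + (1 - p) * math.log(1 - pr)
    return ll

# Grid search over candidate scores for the MLE of the test image's quality.
grid = [i / 100 for i in range(0, 601)]
q_hat = max(grid, key=log_lik)
print(round(q_hat, 2))
```

The intuition matches the abstract: beating low-quality anchors with high probability and high-quality anchors with low probability pins the test image's score between them on the anchors' scale.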
Supplementary Material: Progressive Coordinate Transforms for Monocular 3D Object Detection
In this supplementary material, we provide additional experimental results and qualitative visualizations. Specifically, we demonstrate the impact of using different off-the-shelf models in Sec. 2, including 2D detectors and depth estimators. We show that our proposed PCT method achieves consistent improvements with all configurations. Additional qualitative results are visualized in Sec. 3. We present both successful predictions and failure cases. Our results suggest that these failure cases often stem from two sources: low recall of the 2D detectors and rotation errors in 3D box prediction.