PersonalSum: A User-Subjective Guided Personalized Summarization Dataset for Large Language Models
With the rapid advancement of Natural Language Processing in recent years, numerous studies have shown that generic summaries generated by Large Language Models (LLMs) can sometimes surpass those annotated by experts, such as journalists, according to human evaluations. However, there is limited research on whether these generic summaries meet the individual needs of ordinary people. The biggest obstacle is the lack of human-annotated datasets from the general public. Existing work on personalized summarization often relies on pseudo datasets created from generic summarization datasets, or on controllable tasks that focus on specific named entities or other aspects, such as the length and specificity of generated summaries, collected from hypothetical tasks without the annotators' initiative. To bridge this gap, we propose a high-quality, personalized, manually annotated abstractive summarization dataset called PersonalSum. This dataset is the first to investigate whether the focus of public readers differs from that of generic summaries generated by LLMs. It includes user profiles, personalized summaries accompanied by source sentences from given articles, and machine-generated generic summaries along with their sources. We investigate several personal signals -- entities/topics, plot, and structure of articles -- that may affect the generation of personalized summaries using LLMs in a few-shot in-context learning scenario. Our preliminary results and analysis indicate that entities/topics are only one of the key factors that impact the diverse preferences of users, and personalized summarization remains a significant challenge for existing LLMs.
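The few-shot in-context learning setup described above can be sketched as a prompt-construction step: a user profile and past (article, personalized summary) pairs are concatenated ahead of the new article. The profile fields, example texts, and function name below are hypothetical illustrations, not PersonalSum's actual schema.

```python
# Minimal sketch of few-shot in-context prompting for personalized
# summarization. All field names and example texts are made up.

def build_prompt(profile, examples, article):
    """Assemble a few-shot prompt from a user profile and past
    (article, personalized summary) pairs."""
    parts = [f"Reader profile: {profile}"]
    for i, (src, summ) in enumerate(examples, 1):
        parts.append(f"Example {i} article: {src}")
        parts.append(f"Example {i} summary the reader preferred: {summ}")
    parts.append(f"Now summarize this article for the same reader:\n{article}")
    return "\n\n".join(parts)

profile = "interested in economic impacts; skims for named entities"
examples = [("Article about a port strike ...",
             "The strike halted exports, costing millions per day ...")]
prompt = build_prompt(profile, examples, "A new trade agreement ...")
print(prompt.splitlines()[0])  # "Reader profile: ..."
```

The resulting string would then be passed to an LLM; the point is that personal signals enter only through the profile and the chosen exemplars.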
MVInpainter: Learning Multi-View Consistent Inpainting to Bridge 2D and 3D Editing
Novel View Synthesis (NVS) and 3D generation have recently achieved prominent improvements. However, these works mainly focus on confined categories or synthetic 3D assets, so they generalize poorly to challenging in-the-wild scenes and cannot be employed with 2D synthesis directly. Moreover, these methods depend heavily on camera poses, limiting their real-world applications. To overcome these issues, we propose MVInpainter, which re-formulates 3D editing as a multi-view 2D inpainting task. Specifically, MVInpainter partially inpaints multi-view images with reference guidance rather than intractably generating an entirely novel view from scratch, which largely reduces the difficulty of in-the-wild NVS and leverages unmasked clues instead of explicit pose conditions. To ensure cross-view consistency, MVInpainter is enhanced by video priors from motion components and appearance guidance from concatenated reference key & value attention. Furthermore, MVInpainter incorporates slot attention to aggregate high-level optical flow features from unmasked regions to control the camera movement with pose-free training and inference. Extensive scene-level experiments on both object-centric and forward-facing datasets verify the effectiveness of MVInpainter across diverse tasks, such as multi-view object removal, synthesis, insertion, and replacement.
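The "concatenated reference key & value attention" idea can be illustrated with a toy attention step: the target view attends over its own tokens plus the reference view's tokens, so appearance information propagates across views. The shapes, shared key/value tokens, and function name below are illustrative simplifications, not MVInpainter's actual implementation.

```python
import numpy as np

# Toy sketch: queries from the target view attend over target tokens
# concatenated with reference tokens (here a single matrix serves as
# both keys and values, a deliberate simplification).

def ref_kv_attention(q_tgt, kv_tgt, kv_ref):
    # q_tgt: (n, d) queries from the target view
    # kv_tgt, kv_ref: (m, d) key/value tokens from target and reference views
    kv = np.concatenate([kv_tgt, kv_ref], axis=0)      # concatenate along tokens
    scores = q_tgt @ kv.T / np.sqrt(q_tgt.shape[-1])   # scaled dot-product
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over all tokens
    return weights @ kv                                # aggregated values

rng = np.random.default_rng(0)
out = ref_kv_attention(rng.normal(size=(4, 8)),
                       rng.normal(size=(6, 8)),
                       rng.normal(size=(6, 8)))
print(out.shape)  # (4, 8)
```

Because the reference tokens sit in the same softmax as the target's own tokens, the network can trade off copying reference appearance against local context, without any explicit pose input.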
Self-Supervised Generalisation with Meta Auxiliary Learning
Shikun Liu, Andrew Davison, Edward Johns
Learning with auxiliary tasks can improve the ability of a primary task to generalise. However, this comes at the cost of manually labelling auxiliary data. We propose a new method which automatically learns appropriate labels for an auxiliary task, such that any supervised learning task can be improved without requiring access to any further data. The approach is to train two neural networks: a label-generation network to predict the auxiliary labels, and a multi-task network to train the primary task alongside the auxiliary task. The loss for the label-generation network incorporates the loss of the multi-task network, and so this interaction between the two networks can be seen as a form of meta learning with a double gradient. We show that our proposed method, Meta AuXiliary Learning (MAXL), outperforms single-task learning on 7 image datasets, without requiring any additional data. We also show that MAXL outperforms several other baselines for generating auxiliary labels, and is even competitive when compared with human-defined auxiliary labels. The self-supervised nature of our method leads to a promising new direction towards automated generalisation. Source code can be found at https://github.com/lorenmt/maxl.
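The double-gradient interaction between the two networks can be illustrated with a scalar toy problem: a label-generation parameter `theta` (standing in for the label-generation network) is updated through the effect its auxiliary target has on a simulated multi-task step of the primary parameter `w`. This is a minimal sketch of the mechanism, with made-up losses and a finite-difference meta gradient, not MAXL's actual training procedure.

```python
# Toy double-gradient sketch: the meta loss is the primary loss evaluated
# AFTER an inner multi-task update, so its gradient w.r.t. theta flows
# through that inner update (gradient of a gradient step).

def inner_update(w, theta, target, lr=0.1):
    # multi-task loss: primary (w - target)^2 plus auxiliary (w - theta)^2
    grad_w = 2 * (w - target) + 2 * (w - theta)
    return w - lr * grad_w

def meta_loss(w, theta, target):
    # primary-task loss evaluated after the inner multi-task step
    return (inner_update(w, theta, target) - target) ** 2

w, theta, target = 0.0, 0.0, 1.0
eps, meta_lr = 1e-5, 0.5
losses = []
for _ in range(20):
    # finite-difference gradient of the meta loss w.r.t. theta
    g = (meta_loss(w, theta + eps, target)
         - meta_loss(w, theta - eps, target)) / (2 * eps)
    theta -= meta_lr * g
    losses.append(meta_loss(w, theta, target))

print(losses[-1] < losses[0])  # True: auxiliary labels adapt to help the primary task
```

In MAXL the inner update is a real gradient step of the multi-task network and the meta gradient is computed by automatic differentiation rather than finite differences; the toy only shows why the label generator receives a learning signal at all.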
Thank you to all three reviewers for your positive and constructive feedback. Regarding the points raised, we would first like to emphasise that the primary objective of this work was not to achieve a new state-of-the-art. Here, we see a much more dramatic improvement of MAXL over single-task learning (around 4-6%). In the rightmost column, we show the test accuracy: MAXL improved performance over single-task learning, showing that MAXL is robust to the choice of this hyperparameter. Our preliminary findings do, however, show that there is no single value of this hyperparameter which consistently outperforms all others.
Make-it-Real: Unleashing Large Multimodal Model for Painting 3D Objects with Realistic Materials
Zeyi Sun
Physically realistic materials are pivotal in augmenting the realism of 3D assets across various applications and lighting conditions. However, existing 3D assets and generative models often lack authentic material properties, and manual assignment of materials using graphics software is a tedious and time-consuming task.
ControlSynth Neural ODEs: Modeling Dynamical Systems with Guaranteed Convergence
Neural ODEs (NODEs) are continuous-time neural networks (NNs) that can process data without the limitation of fixed time intervals. They have advantages in learning and understanding the evolution of complex real dynamics. Many previous works have focused on NODEs in concise forms, yet numerous physical systems that appear straightforward in fact belong to more complex quasi-classes, calling for a class of general NODEs with the scalability and flexibility to model such systems. This, however, may result in intricate nonlinear properties. In this paper, we introduce ControlSynth Neural ODEs (CSODEs). We show that despite their highly nonlinear nature, convergence can be guaranteed via tractable linear inequalities. In the composition of CSODEs, we introduce an extra control term for simultaneously capturing dynamics at different scales, which could be particularly useful for systems formulated as partial differential equations. Finally, we compare several representative NNs with CSODEs on important physical dynamics under the inductive biases of CSODEs, and illustrate that CSODEs have better learning and predictive abilities in these settings.
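The idea of a NODE augmented with an extra additive control term can be sketched as dx/dt = f(x) + b·u(t), integrated forward in time. The vector field `f`, control signal `u`, and forward-Euler integrator below are illustrative stand-ins, not the CSODE architecture or its convergence conditions.

```python
import numpy as np

# Minimal sketch of a NODE-style system with an additive control term,
# in the spirit of CSODEs: dx/dt = f(x) + b * u(t).

def f(x):
    return -x + np.tanh(x)          # stand-in for a learned vector field

def u(t):
    return np.sin(t)                # stand-in control input at a second time scale

def integrate(x0, t0=0.0, t1=5.0, dt=0.01, b=0.5):
    x, t = np.array(x0, dtype=float), t0
    while t < t1:
        x = x + dt * (f(x) + b * u(t))  # forward Euler step
        t += dt
    return x

x_final = integrate([1.0, -0.5])
print(x_final.shape)  # (2,)
```

In practice `f` would be a trained network and the integrator an adaptive ODE solver; the control term is what lets the model inject dynamics at a scale separate from the learned field.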
Q1: Why is Full-Batch outperformed by LADIES? A1: It is true that LADIES is designed as an approximation of the original GCN. The reason is that real graphs are often noisy and incomplete. Figure 1: Experiments on the PubMed dataset, which contains 19,717 nodes and 44,338 edges. Q1: "Similar names generate several misunderstandings and confusions." A1: We apologize that the title causes confusion.
Finding Friend and Foe in Multi-Agent Games
Jack Serrino, Max Kleiman-Weiner, David C. Parkes, Josh Tenenbaum
AI for multi-agent games like Go, Poker, and Dota has seen great strides in recent years. Yet none of these games addresses the real-life challenge of cooperation in the presence of unknown and uncertain teammates. This challenge is a key game mechanism in hidden role games. Here we develop the DeepRole algorithm, a multi-agent reinforcement learning agent that we test on The Resistance: Avalon, the most popular hidden role game. DeepRole combines counterfactual regret minimization (CFR) with deep value networks trained through self-play.
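The CFR component rests on regret matching: at each information set, the next strategy is proportional to the positive part of cumulative regrets. The fixed payoff vector below is arbitrary toy data; DeepRole's actual CFR runs over Avalon's game tree with neural value estimates rather than known payoffs.

```python
import numpy as np

# Minimal sketch of regret matching, the per-infoset update inside CFR.

def regret_matching(cum_regret):
    pos = np.maximum(cum_regret, 0.0)
    total = pos.sum()
    if total > 0:
        return pos / total
    return np.full_like(cum_regret, 1.0 / len(cum_regret))  # uniform fallback

cum_regret = np.zeros(3)
payoffs = np.array([0.2, 0.5, 0.1])     # toy counterfactual values per action
for _ in range(100):
    strategy = regret_matching(cum_regret)
    ev = strategy @ payoffs
    cum_regret += payoffs - ev          # regret of each action vs. current EV

print(regret_matching(cum_regret))      # mass concentrates on the best action
```

With stationary payoffs the strategy quickly collapses onto the highest-value action; in a real game the counterfactual values change as all players' strategies co-evolve, which is what drives convergence toward equilibrium.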
We thank all three reviewers for their comments and insightful suggestions; we outline some of the resulting changes here. Our approach uses CFR instead of MCTS, and we have added a sentence beginning "Compared to ...". On whether the proposed method generalizes to other games such as Werewolf or Saboteur: DeepRole could be applied directly to Saboteur, and we mention this in the discussion ("In future ..."). On the need for ablation and analysis: trained agents are known to be vulnerable to adversarial human players. Another interesting observation is that the bot does not need conversation.