Supplementary for Paper2Poster: Benchmarking Multimodal Poster Automation from Scientific Papers

Neural Information Processing Systems 

A Ablation Study

We conduct ablation studies to evaluate three key design choices in PosterAgent: (1) the binary-tree layout strategy for layout planning; (2) the inclusion of a commenter module as a visual critic; and (3) the use of in-context examples to enhance the visual perception capabilities of the commenter. We define the following variants:

- Direct: replacing the binary-tree layout with direct layout generation by an LLM;
- Tree: using the binary-tree layout strategy but removing the commenter module;
- Tree + Commenter: including the commenter module but without in-context examples;
- Tree + Commenter + IC: the full system, with both the commenter and in-context examples.

All ablation variants are implemented using PosterAgent-4o, keeping all other components unchanged to isolate the effect of each factor. We visualize and compare results across five randomly selected papers from Paper2Poster, as shown in Figures 1 to 5.

When prompting the LLM to directly generate poster layouts (Direct), the results are often structurally compromised (e.g., Figures 1a-3a), or resemble blog-style layouts that lack visual hierarchy and appeal (Figures 4a and 5a). Fine-grained layout components, such as text boxes and figures, are especially challenging to synthesize in this setting: for instance, Figures 1a-4a exhibit missing text boxes that leave noticeable blank areas, and Figure 4a fails to preserve the correct aspect ratio of figures.
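As an illustration only, the four variants defined above can be written down as a small configuration table, which makes explicit which factor changes between consecutive variants. This is a minimal sketch and not the PosterAgent code: the class name `AblationVariant`, its field names, and the helper `changed_factors` are hypothetical. The text does not state whether the Direct variant keeps the commenter, so those fields are left as `None` for it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AblationVariant:
    """Hypothetical description of one ablation setting (names are assumptions)."""
    name: str
    layout: str                          # "direct" or "binary_tree"
    commenter: Optional[bool]            # None: not stated in the text
    in_context_examples: Optional[bool]  # None: not stated in the text


VARIANTS = [
    # Direct: an LLM generates the layout directly; commenter use is unspecified.
    AblationVariant("Direct", "direct", None, None),
    # Tree: binary-tree layout, commenter removed.
    AblationVariant("Tree", "binary_tree", False, False),
    # Tree + Commenter: commenter added, no in-context examples.
    AblationVariant("Tree + Commenter", "binary_tree", True, False),
    # Tree + Commenter + IC: the full system.
    AblationVariant("Tree + Commenter + IC", "binary_tree", True, True),
]


def changed_factors(a: AblationVariant, b: AblationVariant) -> List[str]:
    """Return the configuration fields (excluding the name) on which a and b differ."""
    fields = ("layout", "commenter", "in_context_examples")
    return [f for f in fields if getattr(a, f) != getattr(b, f)]


if __name__ == "__main__":
    # Print which factor each step of the ablation isolates.
    for prev, curr in zip(VARIANTS[1:], VARIANTS[2:]):
        print(f"{prev.name} -> {curr.name}: {changed_factors(prev, curr)}")
```

Under these assumptions, each step from Tree to Tree + Commenter to Tree + Commenter + IC differs in exactly one field, which matches the stated goal of isolating the effect of each factor.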
