 Large Language Model

Towards Neuron Attributions in Multimodal Large Language Models

Neural Information Processing Systems

As Large Language Models (LLMs) demonstrate impressive capabilities, demystifying their internal mechanisms becomes increasingly vital.

Synergistic Dual Spatial-aware Generation of Image-to-Text and Text-to-Image

Yu Zhao

Neural Information Processing Systems

In visual spatial understanding (VSU), spatial image-to-text (SI2T) and spatial text-to-image (ST2I) are two fundamental tasks that are dual to each other. Existing methods that address SI2T or ST2I in isolation fall short in spatial understanding because 3D spatial features are difficult to model.