 Large Language Model







A distributional simplicity bias in the learning dynamics of transformers

Neural Information Processing Systems

The remarkable capability of over-parameterised neural networks to generalise effectively has been explained by invoking a "simplicity bias": neural networks prevent overfitting by initially learning simple classifiers before progressing to


ImageNet3D: Towards General-Purpose Object-Level 3D Understanding

Wufei Ma

Neural Information Processing Systems

A vision model with general-purpose object-level 3D understanding should be capable of inferring both 2D (e.g., class name and bounding box) and 3D information (e.g., 3D location and 3D viewpoint) for arbitrary rigid objects in natural




A Additional Results

Neural Information Processing Systems

The acronym dataset is a QA task that requires models to decode financial acronyms. The FinMA7B-full model achieved the highest ROUGE-1 score of 0.12 and the B.1 Why was the datasheet created? B.2 Has the dataset been used already? If so, where are the results so others can compare (e.g., links to published papers)? Yes, the dataset has already been used. It was employed in the FinLLM Share Task during the FinNLP-AgentScen Workshop at IJCAI 2024, known as the FinLLM Challenge.