Noh, Heewoong
Image is All You Need: Towards Efficient and Effective Large Language Model-Based Recommender Systems
Kim, Kibum, Kim, Sein, Kang, Hongseok, Kim, Jiwan, Noh, Heewoong, In, Yeonjun, Yoon, Kanghoon, Oh, Jinoh, Park, Chanyoung
Large Language Models (LLMs) have recently emerged as a powerful backbone for recommender systems. Existing LLM-based recommender systems take two different approaches to representing items in natural language, i.e., Attribute-based Representation and Description-based Representation. In this work, we aim to address the trade-off between efficiency and effectiveness that these two approaches encounter when representing items consumed by users. Based on our observation that there is a significant information overlap between images and descriptions associated with items, we propose a novel method, Image is all you need for LLM-based Recommender system (I-LLMRec). Our main idea is to leverage images as an alternative to lengthy textual descriptions for representing items, aiming to reduce token usage while preserving the rich semantic information of item descriptions. Through extensive experiments, we demonstrate that I-LLMRec outperforms existing methods in both efficiency and effectiveness by leveraging images. Moreover, a further appeal of I-LLMRec is its ability to reduce sensitivity to noise in descriptions, leading to more robust recommendations.
3D Interaction Geometric Pre-training for Molecular Relational Learning
Lee, Namkyeong, Oh, Yunhak, Noh, Heewoong, Na, Gyoung S., Xu, Minkai, Wang, Hanchen, Fu, Tianfan, Park, Chanyoung
Molecular relational learning (MRL) focuses on understanding the interaction dynamics between molecules and has gained significant attention from researchers thanks to its diverse applications [20]. For instance, understanding how a medication dissolves in different solvents (medication-solvent interaction) is vital in pharmacy [30, 26, 3], while predicting the optical and photophysical properties of chromophores in various solvents (chromophore-solvent interaction) is essential for material discovery [16]. Because of the high time and financial costs of conducting wet lab experiments to test the interaction behavior of all possible molecular pairs [31], machine learning methods have been quickly embraced for MRL. Despite recent advancements in MRL, previous works tend to ignore molecules' 3D geometric information and instead focus solely on their 2D topological structures. However, in molecular science, the 3D geometric information of molecules (Figure 1 (a)) is crucial for understanding and predicting molecular behavior across various contexts, ranging from physical properties [1] to biological functions [10, 46]. This is particularly important in MRL, as geometric information plays a key role in molecular interactions by determining how molecules recognize, interact, and bind with one another in their interaction environment [34]. In traditional molecular dynamics simulations, explicit solvent models, which directly consider the detailed environment of molecular interaction, have demonstrated superior performance compared to implicit solvent models, which simplify the solvent as a continuous medium. This highlights the significance of explicitly modeling the complex geometries of interaction environments [47]. However, acquiring stereochemical structures of molecules is often very costly, resulting in limited availability of such 3D geometric information for downstream tasks [23].
Retrieval-Retro: Retrieval-based Inorganic Retrosynthesis with Expert Knowledge
Noh, Heewoong, Lee, Namkyeong, Na, Gyoung S., Park, Chanyoung
While inorganic retrosynthesis planning is essential in the field of chemical science, the application of machine learning in this area has been notably less explored compared to organic retrosynthesis planning. In this paper, we propose Retrieval-Retro for inorganic retrosynthesis planning, which implicitly extracts the precursor information of reference materials that are retrieved from the knowledge base regarding domain expertise in the field. Specifically, instead of directly employing the precursor information of reference materials, we propose implicitly extracting it with various attention layers, which enables the model to learn novel synthesis recipes more effectively. Moreover, during retrieval, we consider the thermodynamic relationship between target material and precursors, which is essential domain expertise in identifying the most probable precursor set among various options. Extensive experiments demonstrate the superiority of Retrieval-Retro in retrosynthesis planning, especially in discovering novel synthesis recipes, which is crucial for materials discovery. The source code for Retrieval-Retro is available at https://github.com/HeewoongNoh/Retrieval-Retro.
Density of States Prediction of Crystalline Materials via Prompt-guided Multi-Modal Transformer
Lee, Namkyeong, Noh, Heewoong, Kim, Sungwon, Hyun, Dongmin, Na, Gyoung S., Park, Chanyoung
The density of states (DOS) is a spectral property of crystalline materials, which provides fundamental insights into various characteristics of the materials. While previous works mainly focus on obtaining high-quality representations of crystalline materials for DOS prediction, we focus on predicting the DOS from the obtained representations by reflecting the nature of DOS: DOS determines the general distribution of states as a function of energy. That is, DOS is determined not solely by the crystalline material but also by the energy levels, a dependence that has been neglected in previous works. In this paper, we propose to integrate heterogeneous information obtained from the crystalline materials and the energies via a multi-modal transformer, thereby modeling the complex relationships between the atoms in the crystalline materials and various energy levels for DOS prediction. Moreover, we propose to utilize prompts to guide the model to learn the crystal structural system-specific interactions between crystalline materials and energies. Extensive experiments on two types of DOS, i.e., Phonon DOS and Electron DOS, with various real-world scenarios demonstrate the superiority of DOSTransformer.
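The idea of letting energy levels interact with the atoms of a material can be illustrated with a minimal cross-attention sketch. This is not the DOSTransformer architecture: the function name `cross_attention`, the random embeddings, the dimensions, and the linear read-out `w_out` are all assumptions made for illustration, and no learned projections or prompts are included.

```python
import numpy as np

def cross_attention(energy_emb, atom_emb):
    """Scaled dot-product attention with energies as queries and atoms as keys/values."""
    d = energy_emb.shape[-1]
    scores = energy_emb @ atom_emb.T / np.sqrt(d)          # (n_energies, n_atoms)
    scores = scores - scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over atoms
    return weights @ atom_emb, weights                     # fused: (n_energies, d)

# Hypothetical sizes and random embeddings standing in for learned ones.
rng = np.random.default_rng(0)
n_atoms, n_energies, d = 4, 6, 8
atom_emb = rng.normal(size=(n_atoms, d))      # one vector per atom in the crystal
energy_emb = rng.normal(size=(n_energies, d)) # one vector per energy level
w_out = rng.normal(size=(d,))                 # hypothetical linear read-out

fused, weights = cross_attention(energy_emb, atom_emb)
dos = fused @ w_out                           # one predicted value per energy level
```

Each energy level attends over all atoms, so the output is a sequence with one value per energy rather than a single scalar per material, which is the property the abstract emphasizes.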
Stoichiometry Representation Learning with Polymorphic Crystal Structures
Lee, Namkyeong, Noh, Heewoong, Na, Gyoung S., Fu, Tianfan, Sun, Jimeng, Park, Chanyoung
Despite the recent success of machine learning (ML) in materials science, its success heavily relies on the structural description of crystals, which is itself computationally demanding and occasionally unattainable. Stoichiometry descriptors, which reveal the ratio between the elements that form a certain compound without any structural information, can be an alternative. However, it is not trivial to learn the representations of stoichiometry due to polymorphism, i.e., a single stoichiometry can exist in multiple structural forms due to the flexibility of atomic arrangements, inducing uncertainties in representation. To this end, we propose PolySRL, which learns the probabilistic representation of stoichiometry by utilizing the readily available structural information, whose uncertainty reveals the polymorphic structures of stoichiometry. Extensive experiments on sixteen datasets demonstrate the superiority of PolySRL, and analysis of uncertainties sheds light on the applicability of PolySRL in real-world material discovery.
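To make the notion of a stoichiometry descriptor concrete, the following minimal sketch computes element ratios from a formula string. It is an illustration only and not part of PolySRL: the helper name `stoichiometry` is hypothetical, and the parser assumes simple formulas without parentheses, hydrates, or charges.

```python
import re
from collections import Counter

def stoichiometry(formula: str) -> dict[str, float]:
    """Return the fraction of each element in a simple formula such as 'Fe2O3'."""
    counts = Counter()
    # Each match is an element symbol followed by an optional numeric amount.
    for element, amount in re.findall(r"([A-Z][a-z]?)(\d*\.?\d*)", formula):
        counts[element] += float(amount) if amount else 1.0
    total = sum(counts.values())
    return {element: n / total for element, n in counts.items()}

# Rutile and anatase are both TiO2, so this descriptor cannot tell them apart.
rutile = stoichiometry("TiO2")
anatase = stoichiometry("TiO2")
same_descriptor = rutile == anatase
```

Because the descriptor discards all structural information, distinct polymorphs map to the same input, which is the source of the representation uncertainty described in the abstract.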
Predicting Density of States via Multi-modal Transformer
Lee, Namkyeong, Noh, Heewoong, Kim, Sungwon, Hyun, Dongmin, Na, Gyoung S., Park, Chanyoung
The density of states (DOS) is a spectral property of materials, which provides fundamental insights into various characteristics of materials. In this paper, we propose a model to predict the DOS by reflecting the nature of DOS: DOS determines the general distribution of states as a function of energy. Specifically, we integrate the heterogeneous information obtained from the crystal structure and the energies via a multi-modal transformer, thereby modeling the complex relationships between the atoms in the crystal structure and various energy levels. Extensive experiments on two types of DOS, i.e., Phonon DOS and Electron DOS, with various real-world scenarios demonstrate the superiority of DOSTransformer. Despite the recent progress of machine learning (ML) in materials science, most ML models developed in the field have focused on single-valued material properties (Kong et al., 2022), e.g., band gap energy (Lee et al., 2016), formation energy (Ward et al., 2016), and Fermi energy (Xie & Grossman, 2018). On the other hand, spectral properties are ubiquitous in materials science, characterizing various properties of materials, e.g., X-ray absorption, dielectric function, and electronic density of states (Kong et al., 2022) (see Figure 1(a)).