Survey of Cultural Awareness in Language Models: Text and Beyond
Pawar, Siddhesh, Park, Junyeong, Jin, Jiho, Arora, Arnav, Myung, Junho, Yadav, Srishti, Haznitrama, Faiz Ghifari, Song, Inhwa, Oh, Alice, Augenstein, Isabelle
arXiv.org Artificial Intelligence
Large-scale deployment of large language models (LLMs) in various applications, such as chatbots and virtual assistants, requires LLMs to be culturally sensitive to the user to ensure inclusivity. Culture has been widely studied in psychology and anthropology, and there has been a recent surge in research on making LLMs more culturally inclusive, going beyond multilinguality and building on findings from psychology and anthropology. In this paper, we survey efforts towards incorporating cultural awareness into text-based and multimodal LLMs. We start by defining cultural awareness in LLMs, taking the definitions of culture from anthropology and psychology as a point of departure. We then examine methodologies adopted for creating cross-cultural datasets, strategies for cultural inclusion in downstream tasks, and methodologies that have been used for benchmarking cultural awareness in LLMs. Further, we discuss the ethical implications of cultural alignment, the role of Human-Computer Interaction in driving cultural inclusion in LLMs, and the role of cultural alignment in driving social science research. Finally, we provide pointers to future research based on the gaps we identify in the literature.
Oct-30-2024
- Country:
- Oceania > Australia (0.04)
- South America
- Brazil (0.04)
- Guyana (0.04)
- Chile > Santiago Metropolitan Region
- Santiago Province > Santiago (0.04)
- North America
- Dominican Republic (0.04)
- Central America (0.04)
- United States
- Maryland > Baltimore (0.04)
- Illinois (0.04)
- Texas > Travis County
- Austin (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- Hawaii > Honolulu County
- Honolulu (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Utah > Salt Lake County
- Salt Lake City (0.04)
- Washington > King County
- Seattle (0.27)
- California
- San Francisco County > San Francisco (0.13)
- Los Angeles County > Long Beach (0.04)
- New York > New York County
- New York City (0.04)
- Mexico > Mexico City
- Mexico City (0.04)
- Canada
- Ontario > Toronto (0.04)
- Quebec > Montreal (0.04)
- British Columbia > Metro Vancouver Regional District
- Vancouver (0.14)
- Europe
- Austria > Vienna (0.13)
- Romania (0.04)
- Germany > Hamburg (0.04)
- Greece (0.04)
- Eastern Europe (0.04)
- United Kingdom (0.04)
- Netherlands > North Holland
- Amsterdam (0.04)
- Middle East > Malta
- Eastern Region > Northern Harbour District > St. Julian's (0.04)
- Spain
- Basque Country (0.04)
- Valencian Community > Valencia Province
- Valencia (0.04)
- Denmark > Capital Region
- Copenhagen (0.04)
- Switzerland
- Zürich > Zürich (0.13)
- Basel-City > Basel (0.04)
- Bulgaria > Sofia City Province
- Sofia (0.04)
- France
- Île-de-France > Paris
- Paris (0.04)
- Provence-Alpes-Côte d'Azur > Bouches-du-Rhône
- Marseille (0.04)
- Italy
- Veneto > Venice (0.04)
- Trentino-Alto Adige/Südtirol > Trentino Province
- Trento (0.04)
- Croatia > Dubrovnik-Neretva County
- Dubrovnik (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- Asia
- South Korea (0.14)
- North Korea (0.13)
- Singapore (0.05)
- India (0.04)
- Taiwan (0.04)
- Southeast Asia (0.04)
- Nepal (0.04)
- East Asia (0.04)
- Central Asia (0.04)
- Macao (0.04)
- Myanmar > Tanintharyi Region
- Dawei (0.04)
- Thailand > Bangkok
- Bangkok (0.04)
- China
- Jiangsu Province > Yancheng (0.04)
- Hong Kong (0.04)
- Beijing > Beijing (0.04)
- Middle East
- Indonesia
- Japan > Kyūshū & Okinawa
- Kyūshū > Miyazaki Prefecture > Miyazaki (0.04)
- Africa
- Sudan (0.04)
- Kenya (0.04)
- North Africa (0.04)
- Middle East > Egypt (0.04)
- Eswatini > Manzini
- Manzini (0.04)
- Central African Republic > Ombella-M'Poko
- Bimbo (0.04)
- Genre:
- Research Report > New Finding (1.00)
- Overview (1.00)
- Industry:
- Leisure & Entertainment (1.00)
- Health & Medicine > Therapeutic Area (1.00)
- Government (0.92)
- Information Technology > Services (0.67)
- Media
- Education > Educational Setting
- K-12 Education (1.00)
- Technology: