Large Language Model
SAFEWORLD: Geo-Diverse Safety Alignment
Despite significant progress in this area, an essential factor often remains overlooked: geo-diversity. Recognizing and incorporating geographical variations [41, 40, 4, 10, 31, 6] in safety principles is crucial in the global landscape of LLM safety. Cultural norms and legal frameworks vary widely, resulting in diverse definitions of safe and acceptable behavior. As shown in Figure 1, while giving a green hat as a gift might be benign in many cultures, it is considered offensive in China.
InfiBench: Evaluating the Question-Answering Capabilities of Code Large Language Models
With the rapid development of code LLMs, many popular evaluation benchmarks, such as HumanEval, DS-1000, and MBPP, have emerged to measure the performance of code LLMs with a particular focus on code generation tasks. However, they are insufficient to cover the full range of expected capabilities of code LLMs, which span beyond code generation to answering diverse coding-related questions.