Cultural Commonsense Knowledge for Intercultural Dialogues
T. Nguyen, S. Razniewski, and G. Weikum. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, pages 1774–1784. New York, NY, USA, Association for Computing Machinery (Oct 21, 2024)
DOI: 10.1145/3627673.3679768
Abstract
Despite recent progress, large language models (LLMs) still face the challenge of appropriately reacting to the intricacies of social and cultural conventions. This paper presents Mango, a methodology for distilling high-accuracy, high-recall assertions of cultural knowledge. We judiciously and iteratively prompt LLMs for this purpose from two entry points, concepts and cultures. Outputs are consolidated via clustering and generative summarization. Running the Mango method with GPT-3.5 as underlying LLM yields 167K high-accuracy assertions for 30K concepts and 11K cultures, surpassing prior resources by a large margin in quality and size. In an extrinsic evaluation for intercultural dialogues, we explore augmenting dialogue systems with cultural knowledge assertions. Notably, despite LLMs inherently possessing cultural knowledge, we find that adding knowledge from Mango improves the overall quality, specificity, and cultural sensitivity of dialogue responses, as judged by human annotators. Data and code are available for download.
%0 Conference Paper
%1 Nguyen2024
%A Nguyen, Tuan-Phong
%A Razniewski, Simon
%A Weikum, Gerhard
%B Proceedings of the 33rd ACM International Conference on Information and Knowledge Management
%C New York, NY, USA
%D 2024
%I Association for Computing Machinery
%K imported
%P 1774–1784
%R 10.1145/3627673.3679768
%T Cultural Commonsense Knowledge for Intercultural Dialogues
%U https://doi.org/10.1145/3627673.3679768
%X Despite recent progress, large language models (LLMs) still face the challenge of appropriately reacting to the intricacies of social and cultural conventions. This paper presents Mango, a methodology for distilling high-accuracy, high-recall assertions of cultural knowledge. We judiciously and iteratively prompt LLMs for this purpose from two entry points, concepts and cultures. Outputs are consolidated via clustering and generative summarization. Running the Mango method with GPT-3.5 as underlying LLM yields 167K high-accuracy assertions for 30K concepts and 11K cultures, surpassing prior resources by a large margin in quality and size. In an extrinsic evaluation for intercultural dialogues, we explore augmenting dialogue systems with cultural knowledge assertions. Notably, despite LLMs inherently possessing cultural knowledge, we find that adding knowledge from Mango improves the overall quality, specificity, and cultural sensitivity of dialogue responses, as judged by human annotators. Data and code are available for download.
%@ 9798400704369
@inproceedings{Nguyen2024,
abstract = {Despite recent progress, large language models (LLMs) still face the challenge of appropriately reacting to the intricacies of social and cultural conventions. This paper presents Mango, a methodology for distilling high-accuracy, high-recall assertions of cultural knowledge. We judiciously and iteratively prompt LLMs for this purpose from two entry points, concepts and cultures. Outputs are consolidated via clustering and generative summarization. Running the Mango method with GPT-3.5 as underlying LLM yields 167K high-accuracy assertions for 30K concepts and 11K cultures, surpassing prior resources by a large margin in quality and size. In an extrinsic evaluation for intercultural dialogues, we explore augmenting dialogue systems with cultural knowledge assertions. Notably, despite LLMs inherently possessing cultural knowledge, we find that adding knowledge from Mango improves the overall quality, specificity, and cultural sensitivity of dialogue responses, as judged by human annotators. Data and code are available for download.},
added-at = {2024-12-11T10:25:27.000+0100},
address = {New York, NY, USA},
author = {Nguyen, Tuan-Phong and Razniewski, Simon and Weikum, Gerhard},
biburl = {https://puma.scadsai.uni-leipzig.de/bibtex/284bd27300e3703b44ff7bf8b0cedc10e/scadsfct},
booktitle = {Proceedings of the 33rd ACM International Conference on Information and Knowledge Management},
day = 21,
doi = {10.1145/3627673.3679768},
interhash = {04fbf4322ea9005c532f33e3e0e8029f},
intrahash = {84bd27300e3703b44ff7bf8b0cedc10e},
isbn = {9798400704369},
keywords = {imported},
location = {Boise, ID, USA},
month = {10},
pages = {1774--1784},
publisher = {Association for Computing Machinery},
series = {CIKM '24},
timestamp = {2024-12-11T10:25:27.000+0100},
title = {Cultural Commonsense Knowledge for Intercultural Dialogues},
url = {https://doi.org/10.1145/3627673.3679768},
year = 2024
}