Assessing the Necessity and Impact of Localized Traditional Chinese Function Calling Benchmarks

EasyChair Preprint 15200
6 pages • Date: October 6, 2024

Abstract

The function calling capability of Large Language Models (LLMs) is becoming indispensable for their practical applications. Function calling refers to the ability to invoke external tools to obtain real-time information or interact with additional functionalities, and it is a prerequisite for applying LLMs successfully in localized commercial use. To develop or select the ideal models for these tasks, it is crucial to understand the importance of benchmark localization. In this study, we introduce our recreation of a Taiwan-specific standardized function-calling benchmark, adapted from the Gorilla function-calling framework for evaluating tool calls in English. Through experimental evaluation using our constructed data, question-answer scoring mechanisms, and additional tools for multilingual performance comparison, we completed the zh-TW localization process and assessed how its results differ from those of the English evaluation. These differences highlight the necessity of evaluating local Traditional Chinese performance, as it provides a clearer perspective on localized applications in commercial contexts and other fields in Taiwan.

Keyphrases: LLM Function Calling, Large Language Model, Traditional Chinese