The rapid scaling of large language models has accelerated progress in natural language processing, but Chinese NLP still faces distinctive challenges, including word segmentation difficulties and cultural complexity such as idioms and allusions. Moreover, systematic comparative analysis of mainstream models' performance in Chinese applications remains limited. This study compares five representative large language models (GPT, LLaMA, Claude, ERNIE, and Kimi) across three core Chinese NLP tasks: text generation, machine translation, and text analysis. It characterizes the performance and specializations of these models by evaluating their applications in real-world scenarios. ERNIE demonstrates strong cultural understanding and Chinese-specific content generation, while GPT tends to perform better in creative diversity. Claude shows excellent analytical precision and safety standards, LLaMA offers customization flexibility through its open-source nature, and Kimi specializes in long-form document processing. The results show that the models' relative strengths vary across application domains, so model selection requires balancing cultural factors, technical performance, and deployment constraints. These findings inform model selection strategies for Chinese NLP tasks.