Abstract Wordle, a daily puzzle game offered by the New York Times, has taken the world by storm, and analyzing the difficulty of its words has become a hot topic. However, word difficulty analysis is highly sensitive to how word features are extracted, which makes the task challenging. To address this, this paper proposes an improved K-means++ algorithm for clustering analysis of word difficulty and validates it on the data from Problem C of the U.S. Mathematical Contest in Modeling (MCM). The 359 words in the data are clustered into six classes, A, B, C, D, E, and F, and the word EERIE is assigned to class B based on the distance between its word attributes and those of the six cluster centers. The Davies-Bouldin (DB) index of the model is 0.815, and the analysis of variance (ANOVA) of the clustering indicators is highly significant, which supports the accuracy of the model.
Yin Sheng Zhang, Hui Lin Shan, Jia Qiang Li, Jie Zhou
Zhe Zhang, Junxi Zhang, Huifeng Xue
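The pipeline summarized in the abstract (cluster the words into six classes, assign a new word to the nearest cluster center, and score the partition with the DB index) can be illustrated with a minimal sketch. It uses the standard K-means++ seeding rather than the improved variant proposed in this paper, whose details the abstract does not give. The two-dimensional feature vectors are synthetic stand-ins for the word attributes, and all function names, the feature values, and `new_word` are assumptions made for illustration only.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(point, centers):
    """Index of the center closest to `point`."""
    return min(range(len(centers)), key=lambda i: dist(point, centers[i]))


def kmeanspp_init(points, k, rng):
    """Standard K-means++ seeding: each new center is sampled with
    probability proportional to its squared distance from the nearest
    center chosen so far."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(dist(p, c) ** 2 for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def kmeans(points, k, rng, iters=100):
    """Lloyd iterations started from K-means++ seeds."""
    centers = kmeanspp_init(points, k, rng)
    for _ in range(iters):
        labels = [nearest(p, centers) for p in points]
        new_centers = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[i])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    labels = [nearest(p, centers) for p in points]
    return centers, labels


def davies_bouldin(points, labels, centers):
    """Davies-Bouldin index: mean over clusters of the worst-case ratio
    (scatter_i + scatter_j) / distance(center_i, center_j)."""
    k = len(centers)
    scatter = []
    for i in range(k):
        members = [p for p, lab in zip(points, labels) if lab == i]
        scatter.append(
            sum(dist(p, centers[i]) for p in members) / len(members)
            if members else 0.0)
    total = 0.0
    for i in range(k):
        total += max((scatter[i] + scatter[j]) / dist(centers[i], centers[j])
                     for j in range(k) if j != i)
    return total / k


# Synthetic 2-D "word attributes" (hypothetical, not the contest data).
rng = random.Random(0)
blobs = [(0, 0), (10, 0), (0, 10), (10, 10), (20, 0), (20, 10)]
points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
          for cx, cy in blobs for _ in range(20)]

centers, labels = kmeans(points, k=6, rng=rng)
db = davies_bouldin(points, labels, centers)

# Assign a new (hypothetical) word to the class of its nearest center.
new_word = (10.2, 9.9)
new_label = nearest(new_word, centers)
print("DB index:", round(db, 3), "class:", "ABCDEF"[new_label])
```

The mapping from cluster indices to the letters A to F is arbitrary in this sketch. A lower DB index indicates more compact and better separated clusters.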