Several publicly accessible forums let patients post questions about their health conditions. The objective of this study was to develop a query-retrieval system that mines such forums and identifies the existing questions most similar to a given question. This pilot study, based on a bag-of-words model with latent semantic analysis and cosine similarity, suggests that text similarity-based mining holds promise for identifying diabetes-related questions in patient forums and for informing self-care management. Further studies using advanced natural language processing tools could reduce false positives and uncover semantically related questions.
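The retrieval pipeline named above (bag-of-words, latent semantic analysis, cosine similarity) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the tokenizer, the number of latent dimensions `k`, the function names, and the sample forum questions are all hypothetical, and LSA is approximated here with a plain truncated SVD of raw term counts via NumPy.

```python
import re
import numpy as np

def tokenize(text):
    # Hypothetical tokenizer: lowercase alphabetic tokens only.
    return re.findall(r"[a-z]+", text.lower())

def build_vocab(docs):
    # Map each distinct term in the corpus to a column index.
    terms = sorted({t for d in docs for t in tokenize(d)})
    return {t: i for i, t in enumerate(terms)}

def bow_vector(text, vocab):
    # Bag-of-words: raw term counts; out-of-vocabulary terms are ignored.
    v = np.zeros(len(vocab))
    for t in tokenize(text):
        if t in vocab:
            v[vocab[t]] += 1.0
    return v

def fit_lsa(docs, k=2):
    # Build the document-term matrix and keep the top-k right singular
    # vectors as the latent "topic" axes (k is an assumed setting).
    vocab = build_vocab(docs)
    A = np.stack([bow_vector(d, vocab) for d in docs])
    _, s, Vt = np.linalg.svd(A, full_matrices=False)
    components = Vt[: min(k, len(s))]
    doc_vecs = A @ components.T  # documents projected into latent space
    return vocab, components, doc_vecs

def cosine(a, b):
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    if na == 0.0 or nb == 0.0:
        return 0.0
    return float(a @ b / (na * nb))

def rank_similar(query, docs, k=2):
    # Project the query into the same latent space, then sort the
    # existing questions by cosine similarity, most similar first.
    vocab, components, doc_vecs = fit_lsa(docs, k)
    q = components @ bow_vector(query, vocab)
    scores = [cosine(q, d) for d in doc_vecs]
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [(docs[i], scores[i]) for i in order]

if __name__ == "__main__":
    # Hypothetical forum questions, invented for illustration only.
    forum = [
        "How often should I check my blood sugar with type 2 diabetes?",
        "Can I eat fruit if I have diabetes?",
        "What exercises help with lower back pain?",
    ]
    for text, score in rank_similar("How many times a day to test blood sugar?", forum):
        print(f"{score:.3f}  {text}")
```

Because the query and the stored questions are compared in the reduced space rather than on raw word overlap, two questions can score as similar even when they share few exact terms. That same property is one plausible source of the false positives mentioned above.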
Xiaoshuang Liu, Xinyue Ma, Xiaoyan Meng, Xiang Li, Guotong Xie
Zhiguo Yu, Byron Wallace, Todd R. Johnson