Lifelogging is the passive collection, storage and analysis of daily data through wearable sensors. Question Answering (QA) for lifelog data enables natural language interaction with personal daily life records, providing insights into individual routines and behaviours. While this task has great potential for personal analytics and memory augmentation, progress has been limited by the challenges of lifelog management, since lifelogs can comprise enormous multi-modal data sets spanning a lifetime. We introduce a Retrieval-Augmented Generation (RAG) approach to the lifelog QA task. In this approach, a retrieval model first finds the lifelog events that contain the answer, and a large language model (LLM) then generates the answer from the question and the retrieved events. In addition, we construct an open-ended lifelog QA benchmark with 14,187 QA pairs to evaluate the RAG approach. Using an embedding-based retrieval approach with the Stella 1.5B model, our lifelog context retriever achieves 77.67% Recall@5 and 94.35% Recall@20. Combined with the Mistral 7B model, the full pipeline achieves 39.54% ROUGE-L and an Accuracy of 3.475 on a 5-point scale as scored by GPT-4o. These results suggest that RAG can be an effective approach to lifelog QA that does not require fine-tuning.
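As an illustration only, the retrieve-then-generate flow and the Recall@k metric mentioned above can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: the three-dimensional vectors are hand-written stand-ins for embeddings that a model such as Stella 1.5B would produce, and the names `retrieve`, `recall_at_k`, `build_prompt`, `EVENT_VECS`, `EVENT_TEXTS` and `QUERIES` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def retrieve(query_vec, event_vecs, k):
    """Return the ids of the k events most similar to the query."""
    ranked = sorted(event_vecs,
                    key=lambda eid: cosine(query_vec, event_vecs[eid]),
                    reverse=True)
    return ranked[:k]

def recall_at_k(queries, event_vecs, k):
    """Fraction of queries whose gold event appears in the top-k results."""
    hits = sum(1 for q_vec, gold in queries
               if gold in retrieve(q_vec, event_vecs, k))
    return hits / len(queries)

def build_prompt(question, event_ids, event_texts):
    """Assemble retrieved events and the question into a single LLM prompt."""
    context = "\n".join(f"- {event_texts[eid]}" for eid in event_ids)
    return f"Lifelog events:\n{context}\n\nQuestion: {question}\nAnswer:"

# Toy data (assumed for illustration; real embeddings come from an encoder).
EVENT_VECS = {"e1": [1.0, 0.0, 0.0], "e2": [0.0, 1.0, 0.0], "e3": [0.0, 0.0, 1.0]}
EVENT_TEXTS = {"e1": "Had coffee at a cafe.",
               "e2": "Walked to the office.",
               "e3": "Watched TV at home."}
# Each query pairs a question embedding with the id of its gold event.
QUERIES = [([0.9, 0.1, 0.0], "e1"),
           ([0.1, 0.2, 0.9], "e3"),
           ([0.6, 0.7, 0.0], "e1")]

if __name__ == "__main__":
    for k in (1, 2):
        print(f"Recall@{k} = {recall_at_k(QUERIES, EVENT_VECS, k):.2f}")
    top = retrieve(QUERIES[0][0], EVENT_VECS, 2)
    print(build_prompt("Where did I have coffee?", top, EVENT_TEXTS))
```

In this toy setting the third query ranks its gold event second, so it counts as a miss at k=1 and a hit at k=2. The prompt returned by `build_prompt` would then be passed to a generator such as Mistral 7B; that call is omitted because it depends on the serving setup.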
Haozheng Luo, Ruiyang Qin, Chenwei Xu, Guo Ye, Zening Luo
Yiming Xu, Lin Chen, Zhongwei Cheng, Lixin Duan, Jiebo Luo
Zhou Zhao, Shuwen Xiao, Zehan Song, Chujie Lu, Jun Xiao, Yueting Zhuang
Sumedh Pendurkar, Sameer Kolpekwar, Shreyas Dhoot, Yashodhara Haribhakta, Biplab Banerjee
Ly-Duyen Tran, Thanh Cong Ho, Lan Anh Pham, Binh T. Nguyen, Cathal Gurrin, Liting Zhou