Monitoring vital signs is essential because they reflect the core functions of the body. Traditional approaches often rely on wearable devices, but these contact-based methods can be uncomfortable and impractical for long-term daily use. Interest in contact-free methods for vital signs monitoring has therefore grown. While existing contact-free methods have demonstrated promising results, they still face practical and fundamental issues that the research community has not fully addressed and that are difficult to resolve with conventional analytical techniques. Recent advances in artificial intelligence (AI), especially deep learning, offer new ways to overcome these obstacles. This thesis aims to harness AI to address existing issues in contact-free vital signs monitoring, with the goal of improving the accuracy and reliability of these systems in real-world settings.
First, this thesis lays the foundation for AI-based contact-free vital signs monitoring by addressing an issue in representation learning and signal processing. Specifically, in many practical scenarios of signal extraction from a nonlinear mixture, only one source needs to be extracted. Modern blind source separation methods are inefficient for this task because they are designed to recover all sources in the mixture. This thesis therefore proposes the Supervised Variational Component Decoder (sVCD), a method designed to extract a single source from a nonlinear mixture. The sVCD leverages the sequence-to-sequence (Seq2Seq) translation ability of a specially designed neural network to approximate a nonlinear inverse of the mixing process, assisted by priors on the source of interest.
To remain robust on real-world samples, sVCD combines Seq2Seq with variational inference to form a deep generative model, which is trained by optimizing a variant of the variational bound on the data likelihood that concerns only the source of interest. sVCD outperforms a state-of-the-art method on nonlinear source extraction on two datasets: artificially generated sequences and electroencephalogram data.
Building on this foundation, the concept of sVCD is applied to the problem of obtaining fine-grained heartbeat waveforms from contact-free radio frequency (RF) sensing, in which heartbeat and respiration are co-located and nonlinearly mixed. In recent years, RF-based contact-free vital signs monitoring has gained significant attention due to its non-intrusive nature, noise resistance, and low cost. While most existing systems can only monitor respiration or retrieve heart rate, few can recover the fine-grained heartbeat waveform. The major reason is that, although both respiration and heartbeat cause detectable micro-motions on the human body, the former is so strong that it overwhelms the latter. In addition, several mainstream methods give unsatisfactory results, primarily because they assume linear signal mixing, whereas the composition of respiration and heartbeat can be highly nonlinear. To decompose this nonlinear mixture, a novel deep generative model named the variational encoder-decoder (VED) is proposed. Exploiting the universal approximation ability of deep neural networks and the generative potential of variational inference, VED demonstrates a promising capability to recover fine-grained heartbeat waveforms from RF-sensing signals.
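The variational bound used to train sVCD and VED is not spelled out in this abstract. As a rough illustration only, the sketch below evaluates the standard negative evidence lower bound for a diagonal Gaussian posterior and a standard normal prior, with the reconstruction term computed against the source of interest alone (for example, the heartbeat waveform). The function names, the squared-error likelihood, and the `beta` weight are assumptions made for this example and are not taken from the thesis.

```python
import math


def gaussian_kl(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I), summed over dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))


def negative_bound(target, reconstruction, mu, log_var, beta=1.0):
    """Generic negative variational bound (hypothetical stand-in for the thesis's variant).

    target         -- samples of the source of interest only
    reconstruction -- decoder output for the same time steps
    mu, log_var    -- parameters of the approximate posterior over the latent code
    beta           -- assumed weight on the KL term (beta = 1 gives the plain bound)
    """
    # Squared error corresponds to a unit-variance Gaussian likelihood, up to a constant.
    recon = 0.5 * sum((t - r) ** 2 for t, r in zip(target, reconstruction))
    return recon + beta * gaussian_kl(mu, log_var)
```

Minimizing this quantity trades reconstruction accuracy on the single target source against keeping the latent code close to the prior, which is one common way to regularize a model against noisy real-world samples.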
Given that the VED addresses mutual interference in single-person scenarios, this thesis then advances to a more complex situation, multi-person vital signs monitoring, and focuses on multi-person respiration monitoring. Currently, most RF-based respiration monitoring solutions address only single-person scenarios because of the mutual interference among multiple individuals’ respiration signals. A few proposals simply apply blind source separation to handle inter-person interference without drawing a clear line between physical and algorithmic separability. Furthermore, there is a need to determine the conditions under which multiple respiration signals sensed by diverse RF technologies can be physically separated. Drawing inspiration from conventional signal processing, a novel metric termed respiration-to-interference-plus-noise ratio (RINR) is proposed, which takes into account the impact of both background noise and various interfering sources. Rather than being based on attenuation over Euclidean distance, RINR has to be evaluated on the range/angle bins where physical separation actually takes place. As signal attenuation has never been modeled in this manner, a deep learning model is proposed to fit a spread function over range/angle bins. The resulting RINR model makes it possible to indicate concretely the limit of physical separability in RF-based multi-person respiration monitoring. RINR is shown to be beneficial as a guideline for conducting respiration monitoring with different RF technologies.
These AI-driven solutions (sVCD, VED, and the RINR model) enhance the effectiveness and usability of contact-free monitoring, offering more convenient alternatives to traditional contact-based methods. The integration of AI with contact-free sensing opens new avenues for future research and applications, holding the potential for significant advancements in real-world healthcare monitoring.
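The abstract does not give a formula for the RINR metric described above. A minimal sketch, assuming RINR follows the familiar signal-to-interference-plus-noise form and that all powers are measured in the target subject's range/angle bin, might look as follows. The function name `rinr_db` and its parameters are hypothetical, and the learned spread function that would supply the interference powers is not modeled here.

```python
import math


def rinr_db(resp_power, interference_powers, noise_power):
    """Respiration-to-interference-plus-noise ratio in dB (assumed SINR-style form).

    resp_power          -- respiration power of the target subject in its own bin
    interference_powers -- power leaked into that bin by each other subject
    noise_power         -- background noise power in the same bin
    """
    denominator = sum(interference_powers) + noise_power
    if resp_power <= 0 or denominator <= 0:
        raise ValueError("powers must be positive to form a ratio in dB")
    return 10.0 * math.log10(resp_power / denominator)
```

Under this assumed form, a value near or below 0 dB would mean that interference and noise in the bin are at least as strong as the target's respiration, which is the kind of regime where physical separation becomes doubtful.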