1 Introduction
Wireless sensing has developed substantially and attracted growing interest over the past decade, especially in the fields of indoor localization [27], activity recognition [29], and physiological signal monitoring [16, 28, 31]. The initial research focus of the latter was on the physiological parameters that are most pronounced and most easily detectable in a contact-free manner, such as heart rate (HR) and respiratory rate (RR). In recent years, however, the field has expanded rapidly towards the monitoring of more subtle parameters, such as heart rate variability (HRV), blood oxygen saturation (SpO2), and blood pressure (BP). This was facilitated by the improved accuracy of sensors, going hand in hand with their reduced cost and subsequent widespread availability, especially in the form of RGB cameras. Alongside hardware development, there have also been major algorithmic advancements, especially with the transition towards deep learning, where deep convolutional neural networks (CNNs) and transformer architectures dominate the current state of the art [5, 14]. These developments are relevant in the context of AI of Things (AIoT), as the increasingly prevalent neural networks are one of the manifestations of the transition from IoT to AIoT. The latter has especially large potential in the healthcare domain [26]. Contact-free systems, similar to the one proposed in our work, can form an interconnected network of intelligent systems with many benefits. For example, they can monitor patients, give intelligent suggestions for disease management or prevention, and offer decision support, all while being non-invasive and not interfering with or obstructing patients’ everyday lives.
The main enabler at the forefront of contact-free physiological monitoring remains remote photoplethysmography (rPPG), which reflects cardiovascular activity and blood perfusion via color changes of the skin [2]. These can be captured using a simple setup comprising an RGB camera and a light source. State-of-the-art research uses reconstructed rPPG waveforms to estimate some highly expressed physiological parameters, such as HR, in a robust and accurate manner, achieving errors as low as 1–3 beats per minute (BPM) [5]. Other, more subtle parameters, such as BP, remain difficult to estimate while being highly valuable for both physicians and patients. Attempts at BP estimation include either analysis of rPPG waveform morphology or measurement of pulse transit time (PTT) between multiple sites (e.g., palm and forehead) [24]. PTT is a surrogate metric often used when BP cannot be measured directly (e.g., when only optical sensors are available), and is known to be well correlated with BP.
BP is the most commonly measured physiological parameter in both primary and secondary healthcare, as well as at home. Traditional widespread cuff-based measurement is cumbersome, induces subject stress (white-coat syndrome), requires following specific protocols, and can lead to inaccurate (home) measurements or to a lack of them entirely. Consequently, more unobtrusive, organic measurement is desirable. This is being addressed via the previously mentioned contact-free BP estimation, where rPPG is most commonly reconstructed from skin color changes using an RGB camera. If only a single rPPG signal is reconstructed, it is used to compute morphological features by either defining a feature vector based on cycle reference points or using a black-box model (e.g., a neural network) to derive features internally. These are then related to BP by a data-driven model. When two rPPG signals are reconstructed from different locations, the delay between reference points is measured, as this PTT is medically known to be correlated with BP. The latter is challenging, as two precisely synchronized sensors are required and two regions of interest must be precisely monitored at once. Most commonly, one of these is the face, which also raises privacy concerns. BP estimation from PPG, in general, remains an active research area in both contact and contact-free monitoring, and faces several challenges, both technical and fundamental. The former are being addressed successfully via sensor improvements and noise removal algorithms. Fundamentally, however, waveform analysis remains challenging, as the connection between features (often obtained from black-box models [5]) and BP is less clear [24]. Additionally, a general predictive model is difficult to obtain, as many researchers agree that the hemodynamics governing BP are highly person-specific due to the specifics of both blood and vascular structure, thus requiring (re)calibration of models to each individual [25]. PTT, on the other hand, is well established and its connection to BP is well understood; however, requiring several measurement sites and sensors is a limitation and a burden to the user.
Instead of using the traditional multi-site PTT measurement technique, we evaluate the feasibility of a single-site contact-free multi-wavelength (MW) approach in which we measure PTT between different depths of skin instead of different spatial locations. This would allow for privacy-preserving, unobtrusive monitoring that minimizes the user burden and uses more information from a single site by monitoring the blood flow through the skin layers. It also removes the requirement for two sensors at different locations and corresponding regions with skin exposure, circumventing both synchronization and the need for a specific subject position (e.g., palm raised next to the face) that monitoring two distant sites requires. Importantly, it also improves on existing MW contact approaches, since even slight skin contact causes some compression that influences the waveform [20, 21]. Such an approach fundamentally differs from existing work in BP estimation, which is based on either contact single-site MW PTT (traditional wearables approach) [18, 19], remote PPG morphology analysis (connection between features and BP not medically clear) [5], or remote multi-site PTT (requires several measuring sites with skin exposure, less feasible in practice) [22].
We also investigate the correlations between the observed PTTs and recorded ground-truth systolic (SBP) and diastolic (DBP) BP in an experiment involving 9 subjects in two scenarios, eliciting distinctly different BP values. Finally, we evaluate explicit BP prediction capabilities by using a regression model to predict SBP and DBP. Additionally, we propose a vital, yet to the best of our knowledge always overlooked, color channel separation method that is crucial for the observation of PTTs, which are inherently very short in a single-site MW approach, due to the short distance the blood traverses.
Such a contact-free single-site MW approach would offer numerous advantages compared to the current “gold standard” for BP measurement, which remains the cuff-based sphygmomanometer. The proposed alternative could allow for simple contact-free BP screenings, for example daily measurements when a person is in front of a bathroom mirror. It would omit the need for a trained person to place the cuff and monitor the measurement, be more comfortable, and potentially have less influence on the measurement result (e.g., white-coat syndrome), all while using just a single RGB sensor. Additionally, since the PTTs being measured are very short, the waveforms must not be influenced or distorted in any way. While subtle, such influence is present when using contact sensors due to slight skin compression [20]. Researchers in this field have also shown that leveraging information from the infrared (deeper skin layer) is valuable, as pulsatility is more stable [12, 21]. Due to these shortcomings, there is an apparent need for a novel contact-free approach that also uses the non-conventional infrared part of the spectrum, in addition to visible light, to reconstruct PPG waveforms. The relationship between PPG, PTT, and BP, on the other hand, is widely researched and well established [2, 13]. We discuss the shortcomings of existing methods in greater detail in Section 2. A pilot study describing such a recording setup has been reported at BHI 2022 (Slapničar et al., citation pending upon publication); however, this article focuses on an entirely different methodology validated in different experiments and does not overlap with it.
The rest of the article is organized as follows: in Section 2 we give a detailed overview of related work, focusing especially on remote and MW sensing while also looking at state-of-the-art BP estimation from PPG; in Section 3 we explain the physiological background and reasoning behind MW PTT and its relationship with BP, together with the camera physics and the model of our color channels; in Section 4 we detail our recording setup and describe the data collection protocol; in Section 5 we describe our data processing pipeline and correlation analysis; and in Section 6 we discuss our findings alongside the limitations, while also proposing some future work. We conclude our article with a summary in Section 7.
6 Results and Discussion
Our initial hypothesis, based on related work [10] and the physiological background described in Section 3, was that the cardiac wave from the deeper vessels in the NIR range should be observed first, while the waves in shallower layers, such as the papillary dermis, should be delayed. We define these delays as \(PTT = timestamp\_steepest_{\text{green}} - timestamp\_steepest_{\text{NIR}}\), and we expect them to be positive if they agree with our hypothesis. Since we have a single ground-truth SBP and DBP for the many cardiac cycles in a given 30-second recording, we computed a single PTT per recording as the average of all the PTTs detected in it. These PTTs between NIR and green for the subjects in our experiment are shown in Figure 4.
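The per-recording PTT defined above can be illustrated with a minimal sketch. It is not the processing pipeline of Section 5: it assumes that cycle boundaries are already known, takes the steepest point of a cycle to be the largest sample-to-sample rise, and uses hypothetical helper names (`steepest_point_time`, `recording_ptt`) and a synthetic pulse instead of recorded data.

```python
import math
from statistics import mean


def steepest_point_time(samples, start, end, fs):
    """Time in seconds of the largest sample-to-sample rise in samples[start:end]."""
    best_index, best_slope = start, float("-inf")
    for i in range(start, end - 1):
        slope = samples[i + 1] - samples[i]
        if slope > best_slope:
            best_index, best_slope = i, slope
    return best_index / fs


def recording_ptt(green, nir, cycle_bounds, fs):
    """Average green-minus-NIR delay (seconds) over all cycles of one recording."""
    ptts = [
        steepest_point_time(green, start, end, fs)
        - steepest_point_time(nir, start, end, fs)
        for start, end in cycle_bounds
    ]
    return mean(ptts)


# Synthetic check (assumed values): two 1-second cycles sampled at 1 kHz,
# with the green waveform constructed to lag the NIR waveform by 100 samples.
FS = 1000


def pulse(i):
    """Gaussian-shaped pulse repeated once per second."""
    return math.exp(-(((i % FS) - 300) / 80) ** 2)


nir = [pulse(i) for i in range(2 * FS)]
green = [pulse(i - 100) for i in range(2 * FS)]
ptt = recording_ptt(green, nir, [(0, FS), (FS, 2 * FS)], FS)
print(f"average PTT: {ptt * 1000:.1f} ms")
```

A positive result means that the NIR steepest point precedes the green one, which is the direction the hypothesis predicts.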
First, we can observe that the PTTs are indeed consistently positive, and thus in agreement with our hypothesis and related work [10]. Looking at the y axis, we can discern that the average PTT is also longer in the rest scenario, as HR and BP are lower, systemic vascular resistance (SVR) is higher, and blood traverses more slowly. Furthermore, these PTTs are in the expected range of 50–150 ms [4] and vary noticeably between subjects, which was not the case before applying channel separation. This confirms that channel separation is important, both at the physical sensor level (narrow-band triple band-pass filter) and at the signal processing level. Looking at subjects 7 and 8, our outliers in terms of age and cardiovascular state, we can observe overall lower PTTs compared to all other subjects. This again agrees with our expectation that they have stiffer vessels, which facilitate faster blood traversal and thus lower PTTs.
The second thing to validate is the effect of refined algorithmic channel separation. The average PTT between NIR and green across all subjects increases by 38.8 ms compared to the average duration before applying our algorithm. Additionally, under the hypothesis that the NIR steepest point appears before the reference steepest point in green, the number of cases in agreement increases by 19.3%. These statistics are summarized in Table 1.
The positive effect of channel separation is an increase in the average PTT, making the PTTs more easily observable and measurable. This effect can be observed when looking at the PPG waveforms in close detail (e.g., a couple of cycles), as shown in Figure 5. The effect is especially prominent between the NIR and green waveforms, while the blue waveform has some morphological irregularities that make it challenging to extract meaningful reference points and physiological information. This is because blue light, with its shorter wavelength, penetrates skin tissue only shallowly, and the epidermis it reaches is not perfused with blood, as it lacks vasculature.
We then checked the general correlations between PTTs and BP on all our recorded samples across all subjects, which can be inspected in the scatter plot shown in Figure 6. We also looked at the subject-specific correlations between PTTs and BP for both scenarios; however, even though these match those observed on all samples in direction and value, they are statistically less significant due to the low number of samples.
In both cases, we can observe a clear separation between the two groups of measurements: red dots in the upper left corner of each plot correspond to higher BP and shorter PTTs, while blue dots in the bottom right corners correspond to lower BP and longer PTTs. The regression lines and correlation coefficients show relatively high absolute correlations (R) between PTTs and BP, showing that these two biomarkers are indeed well correlated, even when PTT is obtained in such a novel remote MW manner. If we consider the average correlation across all subjects in Figure 6, we find higher correlations for SBP than for DBP (0.61 vs. 0.54), which might be attributed to the fact that DBP is generally more consistent and changes less than SBP between rest and activity. For these correlations, we obtained the corresponding p-values of 0.08/0.05 for SBP/DBP.
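For readers who wish to reproduce this kind of analysis, the following is a minimal sketch of a Pearson correlation between PTT and BP, assuming that R denotes the Pearson coefficient. The numbers are made up for illustration and are not data from our experiment, and the helper names are hypothetical. The p-value follows from the t statistic with n − 2 degrees of freedom; the sketch stops at the t statistic, since the Python standard library has no t-distribution.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t statistic for testing r against zero (n - 2 degrees of freedom, |r| < 1)."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Made-up illustrative values (NOT data from the study): PTT in ms, SBP in mmHg.
ptt_ms = [72, 110, 65, 121, 80, 98]
sbp = [131, 116, 138, 112, 120, 124]
r = pearson_r(ptt_ms, sbp)
print(f"R = {r:.2f}, |R| = {abs(r):.2f}, t = {t_statistic(r, len(sbp)):.2f}")
```

The sign of R is negative when shorter PTTs accompany higher BP, which is why absolute values are compared in the text.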
Finally, the predictive performance of our trained regression model in terms of MAE across all subjects is shown in Table 2.
We observed good stability between subjects, with average MAEs of 3.59 mmHg and 2.63 mmHg, which surpasses the baseline that always predicts the mean by 15.61 and 9.73 mmHg for SBP and DBP, respectively. The systolic error is higher, which could be attributed to larger variations in systolic pressure, while the diastolic pressure generally changes less. We also observe a slight degradation of performance for subjects 7 and 8, who differ from the others in terms of age and consequently also in their cardiovascular state (stiffer vessels). Investigating the feature importances, we found that the PTT feature dominates all others, achieving a mean decrease in impurity (defined as the total decrease in node impurity, weighted by the probability of reaching that node, averaged over all trees of the ensemble [1]) of 0.68, while the second largest was only 0.05.
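The comparison against a mean-predicting baseline can be sketched as follows. The values are invented for illustration only, the function names are hypothetical, and the sketch assumes that the baseline mean is taken over the training targets, which the text above does not state explicitly.

```python
def mae(y_true, y_pred):
    """Mean absolute error between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_baseline_mae(y_train, y_test):
    """MAE of a baseline that always predicts the mean of the training targets."""
    mean_value = sum(y_train) / len(y_train)
    return mae(y_test, [mean_value] * len(y_test))


# Made-up SBP values in mmHg (NOT results from the study).
sbp_train = [112, 118, 135, 141]
sbp_test = [115, 138]
model_predictions = [117, 134]  # stand-in for the outputs of a trained regressor

model_error = mae(sbp_test, model_predictions)
baseline_error = mean_baseline_mae(sbp_train, sbp_test)
print(f"model MAE: {model_error:.2f} mmHg, baseline MAE: {baseline_error:.2f} mmHg")
```

A model is only informative if its MAE is clearly below that of the baseline, since the baseline uses no PPG information at all.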
A comparison with related work is difficult, as the approaches are fundamentally different (remote vs. contact, single-site vs. multi-site, PTT vs. PPG morphology), the models are often neural networks [5] that require more data, and the experimental setups also differ. Despite this, our experiments confirm the importance and robustness of PTT compared to morphological waveform features and show the value of personalization [24], while also highlighting that, in the case of PTT, very little personalization is needed for good performance, as using only 2 instances decreases the error on average by 3.52 and 2.41 mmHg for SBP and DBP, respectively. Another important yet often overlooked aspect of black-box models (e.g., neural networks) is their lack of explainability and of grounding in known physical and physiological laws. While such models can yield lower errors, the questions of overfitting to specific datasets and of the reasoning behind predictions are often ignored, whereas the relationship of PTT to BP is well established in the literature [2].
7 Conclusion and Limitations
A synthesis of our results brings us to the conclusion that single-site remote MW BP estimation with a modified consumer camera is feasible and could offer an alternative to current cuff-based or optical contact approaches. We list the main contributions alongside important limitations in the following sections.
7.1 Contributions
The main contributions are as follows:
– Remote MW PTT estimation is feasible and an improvement over traditional multi-site PTT monitoring, since it uses just a single site and sensor, omitting the need for precise synchronization or high-resolution cameras (e.g., when the palm and forehead are captured in the same frame, which must contain enough pixels of each to reconstruct the rPPG waveform). More importantly, it can circumvent problems with MW contact sensors, which were reported to influence the waveforms via skin compression [21].
– A pitfall of other work that should always be considered is the camera-specific inter-channel influence. This should be resolved both at the physical level (narrow-band filter) and at the signal processing level (the proposed channel separation method), as not accounting for this influence might lead to difficulty in detecting MW PTTs or to incorrect conclusions based on their durations. Furthermore, using bands that are far apart (e.g., green and NIR) produces longer and more easily observable PTTs.
– A relatively simple regression model can achieve decent performance with relatively low MAEs for both SBP and DBP, also confirming the importance of the PTT feature compared to other commonly used morphological PPG features. Additionally, only a couple of instances used for personalization improve the performance compared to a general model.
We can also conclude, in agreement with related work [21], that using the NIR band is preferred in MW PTT monitoring over the traditional visible spectrum, since NIR light reaches deeper skin layers with arterioles, which exhibit better pulsatility than the shallow layers reachable by blue wavelengths, while also increasing the length of the blood traversal path and the delay between the waveforms. The feasibility of the proposed method was confirmed in controlled laboratory conditions, showing that the use of NIR and green wavelengths allows for the acquisition of stable rPPG waveforms suitable for careful signal processing [18, 21] and subsequent PTT calculation. Despite this, some challenges remain before practical application, which are listed and discussed in detail in the following section.
7.2 Limitations
We acknowledge that a non-clinical experiment with 9 subjects cannot be used to draw statistically significant general conclusions. A large-scale clinical validation would be required instead, but this requires access to more (varied) subjects and prolonged observations with continuous ground-truth BP. Continuous BP is always preferred to cuff measurements, as it offers much richer BP information, but it is measured using an intra-arterial catheter, which can only feasibly be used by trained medical personnel. Such a large-scale clinical validation (in a hospital) is exceptionally challenging to organize and makes more sense as a subsequent step beyond a feasibility study. Ideally, such an extensive evaluation would include hypotensive and hypertensive patients, with both mild (no medication) and more severe (using medication) cases.
We also made some simplifications when performing channel separation: we approximated some signal contributions, which might be refined further. Additionally, channel separation might be inaccurate when the light source spectrum does not cover the relevant spectral bands. In future work, we will consider strictly data-driven algorithms for channel separation that are independent of the quantum efficiency and light spectrum information, for instance a genetic algorithm in which the channels are initially random linear combinations of all three colors and the coefficients corresponding to each color are iteratively derived using the error of a BP regressor as the fitness function. Another option would be to use a projection method, generating a densely populated space of candidate coefficients and trying to find the best-performing ones.
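As a rough illustration of the second option, the sketch below scores every coefficient triple on a regular grid and keeps the best one. It is purely hypothetical future-work scaffolding and not a method we evaluated: the grid, the function names, and the toy fitness function are assumptions, and in practice the fitness would be the error of a BP regressor rather than a distance to a known target.

```python
from itertools import product


def mix(frames, coeffs):
    """Linearly combine the three raw channel values of every frame."""
    return [sum(c * v for c, v in zip(coeffs, frame)) for frame in frames]


def best_coefficients(frames, fitness, steps=5):
    """Score a grid of coefficient triples in [-1, 1]; a lower fitness is better."""
    grid = [2 * i / (steps - 1) - 1 for i in range(steps)]
    best, best_score = None, float("inf")
    for coeffs in product(grid, repeat=3):
        score = fitness(mix(frames, coeffs))
        if score < best_score:
            best, best_score = coeffs, score
    return best, best_score


# Toy example: four frames of three raw channel values (assumed, not recorded data).
frames = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
target = mix(frames, (0.0, -0.5, 1.0))  # the "clean" channel we pretend to know


def squared_error(candidate):
    """Stand-in fitness; a real one would train and score a BP regressor."""
    return sum((c - t) ** 2 for c, t in zip(candidate, target))


coeffs, score = best_coefficients(frames, squared_error)
print(coeffs, score)
```

An exhaustive grid grows cubically with the number of steps, which is one reason a genetic algorithm may be the more practical choice for fine-grained coefficients.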
Furthermore, we were quite strict in removing noisy data, since our interest was in fundamental feasibility rather than practical application. Signal noise (e.g., due to movement) remains a challenge when considering widespread use. During our lab experiments, we circumvented this by ensuring a stable sitting position and a fixed arm position immediately after the exercise finished. This, combined with the processing pipeline, allowed for very robust detection of reference points; however, the method used to detect them should always be verified for performance and stability when dealing with noisier waveforms, as it is vital for PTT computation. In practical BP screening applications, noise could be partially resolved with a similar setup using foam padding on which a subject places their hand. The setup would also include an enclosure that blocks ambient lighting while providing a light source with the desired spectrum. Generally, sunlight has a very broad uniform spectrum containing all the bands we are interested in, so it could be used for our proposed method; common LED bulbs, however, are not suitable, as their spectrum lacks the NIR and IR parts. Still, an LED array containing diodes that also emit longer-wavelength light is feasible and relatively inexpensive to construct [12]. Such a setup could be used in a doctor’s office, especially when screening many subjects (e.g., groups of school children). Even more feasible (due to the required personalization) is home use for telemedicine (e.g., by elderly people), as self-measurement of BP with a cuff-based device (most common in the general population) is very impractical and cumbersome, potentially causing incorrect measurements or even refusal to measure regularly.
Finally, as we decided to use the palm as the measurement site due to tissue properties, our work is palm-specific to a degree, although we expect these findings to hold for other well-perfused skin locations. This remains to be validated, as variations in both skin tissue (e.g., melanin content, thickness) and measurement location should be investigated. However, as our purpose was to validate the early feasibility of remote MW PTT measurement and subsequent BP estimation, we limited our prototype setup and scenarios to be robust and reproducible at the cost of variations in measurement sites.
We used a relatively simple regression model to check the feasibility of explicit BP prediction and get a baseline result to compare against related work. The evaluation framework was designed to use personalized calibration, meaning that a small part of each subject’s data was used in training in order to personalize the regression model. While this is not ideal, the consensus in the community is that building a robust generalized model is not feasible due to the subject-specific relationship between PPG features and BP [24], meaning that personalization or calibration is required (or at least desired) in all models for BP estimation, including those based on multi-site PTT [14]. More specifically, there can be subject specifics in many biological properties, such as blood thickness, arterial wall thickness, elasticity, and other parameters related to PPG and BP. While early results seem promising, this evaluation is still in its early stages and will be extended with more sophisticated models, additional features, and other robust evaluation schemes.
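One way to picture such a personalized evaluation is the split sketched below, in which a few instances of the held-out subject are moved into the training set. This is an assumed scheme for illustration: the function name, the record layout, and the choice of simply taking the first instances are hypothetical, and a real protocol might prefer one calibration instance from each scenario.

```python
def personalized_split(records, test_subject, n_calibration=2):
    """Split (subject_id, features, target) records for one held-out subject.

    All records of the other subjects, plus the first n_calibration records of
    the held-out subject, go to training; the remaining records are for testing.
    """
    train, test, used = [], [], 0
    for record in records:
        if record[0] != test_subject:
            train.append(record)
        elif used < n_calibration:
            train.append(record)
            used += 1
        else:
            test.append(record)
    return train, test


# Made-up records: (subject, [PTT in ms], SBP in mmHg), NOT data from the study.
records = [
    ("s1", [105], 114), ("s1", [70], 133), ("s1", [110], 112), ("s1", [66], 137),
    ("s2", [98], 118), ("s2", [75], 129), ("s2", [101], 117), ("s2", [72], 131),
]
train, test = personalized_split(records, "s1", n_calibration=2)
print(len(train), "training records,", len(test), "test records")
```

Setting `n_calibration=0` yields a plain leave-one-subject-out split, which gives the general model against which the benefit of personalization can be measured.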
Despite these limitations and further research opportunities, this work showed the importance of a fundamental understanding of both human physiology and camera physics for contact-free single-site MW measurement, while highlighting and addressing important challenges using novel methods. We showed that it is feasible to estimate PTT in such a way and confirmed that it correlates highly with BP, offering potential for remote MW BP estimation, which could become a useful component of a telemedicine system within the growing framework of AIoT.