Heart rate does not accurately predict metabolic intensity during variable intensity roller-skiing or cycling

Purpose: To examine the utility of heart rate (HR) and power output (PO) to predict metabolic rate (MR) and oxygen consumption (V̇O2) during variable intensity roller-skiing and cycling. Methods: National-level cyclists (n=8) and cross-country skiers (n=9) completed a preliminary session to determine V̇O2max, and a variable intensity protocol with three high-intensity (HI) stages at 90% V̇O2max for 3 min interspersed with three moderate-intensity (MI) stages at 70% V̇O2max for 6 min. Cardiorespiratory measures were recorded throughout. Linear regressions between MR and V̇O2 with HR and PO were computed from the preliminary session for all athletes and used to predict MR and V̇O2 from both HR and PO, separately, during the variable intensity protocol. Mean differences with 95% limits of agreement (LOA) between measured and predicted MR and V̇O2 during the variable intensity protocol were calculated. Results: MR and V̇O2 estimated from HR displayed an overall mean bias close to 0 but wide LOA. HR overestimated MR and V̇O2 during MI but underestimated MR and V̇O2 during HI, for both roller-skiing and cycling. MR and V̇O2 estimated from PO were more consistent across the time of the experimental trial, displaying a mean bias further from 0 but with tighter LOA. Conclusions: This study has demonstrated that HR has limited utility to predict metabolic intensity during variable intensity roller-skiing and cycling because of wide LOAs. On the other


Introduction
The basis of heart rate (HR) as a measure of internal exercise intensity is rooted in the assumption of a linear relationship with oxygen consumption (V̇O2) and metabolic rate (MR) during steady-state, sub-maximal intensity exercise, 1 where research has shown nearly perfect correlation coefficients (r = 0.99). 2 Therefore, HR monitoring has been promoted as a valid measure of internal exercise intensity during aerobic steady-state exercise, but not during intermittent activity or exercise which involves significant contributions from anaerobic energy systems. 3 For example, previous research has reported that HR can provide an accurate means of predicting MR and total energy expenditure over extended durations in free-living conditions 4 and during continuous steady-state exercise. 5 However, HR was poor at estimating V̇O2 during intermittent exercise, where V̇O2 predicted from HR underestimated measured V̇O2 by as much as 10% V̇O2max during competition and training in handball. 6 Inaccuracies of up to 10% V̇O2max could influence interpretation of data and exercise prescription.
Despite this limitation, HR monitoring might have utility to measure average internal exercise intensity because overestimation and underestimation of exercise intensity during intermittent activity likely even out. This enables the comparison of the average relative exercise intensity between athletes with varying levels of physical capacity and might provide a means to quantify exercise volume (e.g., Banister's TRIMP). 7 Information regarding average exercise intensity and exercise volume is valuable for coaches but has limited utility for tailoring competition-replicating training programs to achieve optimal performance improvements.
All taken together, HR can provide a useful measure to predict internal exercise intensity during steady-state exercise. However, there is limited evidence for the validity of HR to measure intensity during intermittent exercise. Furthermore, cardiovascular drift means that HR increases gradually during prolonged exercise, probably as a result of a declining stroke volume. 8 Accordingly, this further impacts the associations between HR, V̇O2, MR and external intensity, such as power output (PO). Despite these limitations, measurement of HR for monitoring exercise intensity remains commonplace in many intermittent and endurance sports. 9,10 For example, researchers have proposed calculating time in HR training zones for monitoring elite endurance athletes, 11 with some of the proposed zones being as narrow as 4% HRmax. 11,12 Misclassification of the training 'zone' when monitoring the athlete might lead to errors in calculating training demands and lead to either over- or under-training and/or increased injury risk.
Considering that most competitive sports are intermittent, it is of great importance to assess the accuracy of HR to predict V̇O2 and MR in these conditions. Therefore, this study aimed to examine the utility of HR to predict V̇O2 and MR during variable intensity roller-skiing and cycling. A secondary aim was to compare the utility of HR and PO to predict V̇O2 and MR during the same conditions.

Design
The present study consisted of two separate testing sessions: 1) a preliminary testing session which involved an incremental exercise protocol for the determination of cardiorespiratory responses during submaximal and maximal exercise; and 2) a variable intensity experimental protocol with continuous recordings of cardiorespiratory measures, completed within 2 weeks of the preliminary test.
All testing was completed in laboratory-controlled conditions in a manner as described previously (XXXX_reference_intentionally_withheld_for_blind_review). Briefly, participants were asked to consume a controlled diet. Respiratory variables were measured using an ergospirometry system (AMIS 2001 model C, Innovision A/S, Odense, Denmark), based on the mixed expired method using an inspiratory flowmeter with a sampling frequency of 0.1 Hz.
All cycling was performed on an SRM high-performance bicycle ergometer (Schoberer Rad Messtechnik, Julich, Germany). PO during cycling was calculated from the calibrated strain gauge fixed to the crank arms of the cycle ergometer.
All skiing was performed on roller-skis, using the diagonal stride technique on a motor-driven treadmill (Rodby RL 3000, Rodby Innovation AB, Vänge, Sweden). The PO during roller-skiing was calculated using an equation previously described by Andersson et al. 14 Briefly, PO was taken as the sum of the power exerted to elevate the total mass (m_tot; body + equipment) against gravity (g) and the power exerted to overcome rolling resistance (µ_R = 0.022, as previously described by Ainegren et al. 15).
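The two power components described above can be sketched in a few lines of Python. This is an illustration only: the exact equation of Andersson et al. is not reproduced in the text, so the decomposition below (gravitational term proportional to sin α, rolling term proportional to µ_R·cos α) is an assumed standard form, and the function name, the 80 kg total mass, and the value of g are hypothetical choices made for the example.

```python
import math

G = 9.81       # gravitational acceleration in m/s^2 (assumed value)
MU_R = 0.022   # rolling resistance coefficient reported in the text


def roller_ski_power(total_mass_kg, speed_kmh, incline_deg, mu_r=MU_R):
    """Treadmill roller-skiing power output in watts (assumed formulation).

    Sum of the power needed to raise the total mass against gravity and
    the power needed to overcome rolling resistance.
    """
    v = speed_kmh / 3.6                 # treadmill speed in m/s
    alpha = math.radians(incline_deg)   # treadmill incline in radians
    p_gravity = total_mass_kg * G * math.sin(alpha) * v
    p_rolling = mu_r * total_mass_kg * G * math.cos(alpha) * v
    return p_gravity + p_rolling


# Hypothetical skier (80 kg including equipment) at the first
# submaximal stage described in the text: 3 degrees, 8.5 km/h.
po_first_stage = roller_ski_power(80.0, 8.5, 3.0)
```

Because both terms scale with speed, PO in this sketch is zero at standstill, and on a flat treadmill it reduces to the rolling-resistance term alone.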

Preliminary Test

Cyclists
To establish V̇O2max and the PO for the variable intensity protocol, cyclists performed an incremental test, which was continued to volitional fatigue. The incremental test was performed according to the methods previously described by Padilla et al. 16 with a modification of the start intensity to 85 W. Each stage of the test was 4 min long with 35 W increments, interspersed with 1-min periods performed at 50 W. Cyclists were told to keep their cadence between 80 and 90 rev·min⁻¹ throughout the test. The test was performed to exhaustion or terminated if the cadence fell below 70 rev·min⁻¹. Maximal PO attained was determined as previously described. 17

Skiers
Two preliminary tests were performed in order to establish maximum oxygen consumption (V̇O2max) and the speed required during the variable intensity experimental protocol. For the first test, the initial exercise stage was set at an incline of 3° and a treadmill velocity of 8.5 km·h⁻¹. Thereafter, exercise intensity was increased by 1° and 0.5 km·h⁻¹ at every 4-minute increment. Skiers completed five to seven submaximal increments interspersed with 1-minute rest for capillary blood lactate sampling. The mean of the three highest consecutive V̇O2 values within the last minute of each submaximal increment was used to calculate the speed and inclination necessary to define the percentage of V̇O2max during the variable intensity experimental test protocol. Following a 10-min rest period, an incremental protocol for determination of V̇O2max was performed, starting at an incline of 4° with a progression of 1° each minute and with the initial speed set between 10 km·h⁻¹ and 11 km·h⁻¹. If participants were able to continue beyond an incline of 11°, speed was increased by 0.3 km·h⁻¹ every 30 s.

Data Analysis
Data from the preliminary testing session are presented in Table 1. Metabolic rate was calculated using the Weir equation, as previously described, 18 and expressed in joules per second (i.e., watts [W]). Linear regressions between HR and MR, HR and V̇O2, PO and MR, as well as PO and V̇O2 were computed for all athletes for all exercise stages except the final two. The final two stages were excluded from the linear regressions in order to remove the plateau of physiological variables as athletes approached V̇O2max, ensuring that only the linear portion of the relationships was included. The coefficients of determination for the regressions between HR and MR were nearly perfect for both skiers (R² = 0.99 ± 0.01) and cyclists (R² = 0.98 ± 0.01). The coefficients of determination for the regressions between PO and MR were nearly perfect for both skiers (R² = 1.00 ± 0.00) and cyclists (R² = 1.00 ± 0.00). The coefficients of determination for the regressions between HR and V̇O2 were nearly perfect for both skiers (R² = 0.99 ± 0.00) and cyclists (R² = 0.99 ± 0.00). The coefficients of determination for the regressions between PO and V̇O2 for skiers (R² = 0.99 ± 0.00) and cyclists (R² = 0.99 ± 0.00) were also nearly perfect. Individual linear relationships were then used to predict MR and V̇O2 from both HR and PO separately, permitting the comparison of measured MR and measured V̇O2 with predicted MR and predicted V̇O2 during the variable intensity exercise trial.
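The calculation chain described above (metabolic rate from gas exchange, an individual regression that omits the final two stages, then prediction) might be sketched as follows. Several points are assumptions rather than details taken from the manuscript: the abbreviated Weir equation without the urinary nitrogen term is used, the conversion 1 kcal = 4184 J is applied, all function names are hypothetical, and the stage data are invented purely for illustration.

```python
import numpy as np

KCAL_TO_J = 4184.0  # joules per kilocalorie


def weir_metabolic_rate_w(vo2_l_min, vco2_l_min):
    """Metabolic rate in watts from the abbreviated Weir equation.

    Assumed form: kcal/min = 3.941 * VO2 + 1.106 * VCO2 (both in L/min).
    """
    kcal_per_min = 3.941 * vo2_l_min + 1.106 * vco2_l_min
    return kcal_per_min * KCAL_TO_J / 60.0


def fit_individual_line(x, y, n_excluded=2):
    """Least-squares line through all stages except the final n_excluded."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n_used = len(x) - n_excluded
    slope, intercept = np.polyfit(x[:n_used], y[:n_used], 1)
    return slope, intercept


def predict(slope, intercept, x):
    """Apply the fitted line to new HR (or PO) values."""
    return slope * np.asarray(x, dtype=float) + intercept


# Invented incremental-test data for one athlete: HR (beats/min) and MR (W).
# The last two stages flatten out and are excluded from the fit.
hr_stages = [110, 125, 140, 155, 170, 178, 182]
mr_stages = [600, 750, 900, 1050, 1200, 1240, 1250]

slope, intercept = fit_individual_line(hr_stages, mr_stages)
mr_pred = predict(slope, intercept, [150, 175])  # HR values from a later trial
```

The same two helper functions apply unchanged when PO replaces HR as the predictor, or when V̇O2 replaces MR as the outcome.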

Experimental Test
The experimental test began with a 10-min warm-up period at an intensity corresponding to 50% of V̇O2max as determined during the preliminary test. Thereafter, the athletes performed a test protocol consisting of three high intensity (HI) stages corresponding with 90% of V̇O2max for 3 min each, interspersed with three moderate intensity (MI) stages corresponding with 70% V̇O2max for 6 min each (Figure 1). Cardiorespiratory variables were measured throughout the variable intensity protocol as described above for the incremental test, with the mean during the final minute of each stage used for analyses. For skiers, the speed and incline during the test protocol were established from the preliminary submaximal test so as to correspond to the two relative exercise intensities (90% and 70% of V̇O2max). For cyclists, the power outputs corresponding with 90% and 70% V̇O2max were computed from the preliminary test from the

that the assumption of normality was not violated and group data were expressed as mean ± standard deviation (SD). A repeated-measures mixed-model analysis of variance (within factor: intensity; between factor: exercise mode) was used to determine if exercise mode (cycling vs. skiing) influenced the pattern of physiological response to intensity (HI vs. MI) throughout the variable intensity protocol. For all ANOVAs, effect sizes are presented as the partial eta-squared statistic (ηp²). Significant interactions were followed up with simple main effect analyses with pairwise comparisons using Bonferroni correction. Further, mean differences between measured MR and predicted MR, as well as between measured %V̇O2max and predicted %V̇O2max from linear regressions with HR or PO during the experimental trial, were determined with 95% limits of agreement (LOA) according to the methods described by Bland and Altman. 19
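For readers unfamiliar with the agreement statistics, the mean bias and 95% LOA can be computed as in the minimal sketch below. It assumes the conventional Bland and Altman form (bias ± 1.96 × SD of the differences) and takes the difference as predicted minus measured, so that a negative bias indicates underestimation; the manuscript does not state its sign convention explicitly, and the function name and example values are hypothetical.

```python
import numpy as np


def bland_altman(measured, predicted):
    """Return (mean bias, lower LOA, upper LOA) for paired values.

    Differences are taken as predicted minus measured (assumed convention).
    """
    diff = np.asarray(predicted, dtype=float) - np.asarray(measured, dtype=float)
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Invented metabolic rates (W): errors of equal size and opposite sign
# cancel in the bias but still widen the limits of agreement.
measured_mr = [100.0, 200.0, 300.0, 400.0]
predicted_mr = [110.0, 190.0, 310.0, 390.0]
bias, loa_lower, loa_upper = bland_altman(measured_mr, predicted_mr)
```

The example illustrates why a mean bias near zero on its own says little about accuracy at any single time point: offsetting over- and underestimation yield zero bias alongside non-trivial LOA.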

Results
The power output and measured physiological variables during the experimental trial are displayed in Table 2. There was no interaction effect for any variable (F

3) MR and V̇O2 estimated from PO were more consistent across the time of the experimental trial, displaying a mean bias further from 0 but with tighter LOA.
The HR-V̇O2 relationship is commonly utilised in sports science for exercise prescription and monitoring. For example, HR was first proposed as a means of monitoring exercise intensity during steady-state submaximal exercise because of this relationship. 20,21 The results from this study have confirmed that nearly perfect regressions exist between HR and MR/V̇O2 during laboratory-controlled sub-maximal incremental roller-skiing and cycling exercise (R² ≥ 0.98).
However, during the variable intensity exercise protocol there was variability in the difference between measured and predicted MR/V̇O2 based on this relationship. This suggests that, although nearly perfect regressions exist in sub-maximal steady-state conditions, HR is less accurate at predicting MR and V̇O2 during variable intensity exercise, which is common in many sports and competitions. 6,22,23 Results from the present study might suggest that average HR displays good validity but poor reliability to estimate MR and V̇O2 during variable intensity cycling and roller-skiing. This is because, overall, HR displayed a mean bias close to 0 for both MR and %V̇O2max but with wide LOA. However, when investigating the mean bias and LOA for individual intensities and exercise modes (Table 3 and Table 4), HR tended to overestimate both MR and V̇O2 during MI, but underestimate MR and V̇O2 during HI. Accordingly, the under- and overestimation throughout the variable intensity exercise trial evened out, resulting in a mean bias close to zero.
The time-course change in the accuracy of HR to predict MR or V̇O2 might be related to cardiovascular drift. 8 Accordingly, during endurance exercise training or competitions (e.g., distance cycling or skiing), it is reasonable to expect that HR will become less accurate over time to predict MR and V̇O2. This provides further weighting to the notion that HR at any given point in time during variable intensity exercise does not necessarily reflect the corresponding metabolic intensity. 24
Quantifying the duration of exercise that athletes perform in various 'HR zones' is common practice. 11 In addition, a common measure of so-called 'training load' (or exercise volume) is the summated HR zones model, also known as 'Edwards' TRIMP', 25 which is the amalgamation of exercise duration and an intensity weighting factor based upon HR intensity zones. The results from the present study suggest that, given the error associated with HR, the MR and V̇O2 associated with a zone spanning just 4% HRmax could be markedly different. This error is in addition to the known day-to-day variation in the HR responses to exercise, 21,26 as well as the error associated with the actual HR measuring device itself. 27,28 Such differences, even if small, could still have an impact on the prediction of MR and V̇O2 and therefore the actual metabolic intensity of exercise. Further, hydration status and environmental factors, such as temperature and altitude, can also influence HR. 21
It is recognised that HR monitoring is undoubtedly a practical tool to provide an indication of exercise intensity. However, all taken together, it seems clear that HR is not necessarily able to reflect metabolic intensity with accuracy at any point in time during variable intensity exercise. Due to variability in the accuracy of HR to predict MR or V̇O2, practitioners are unable to be certain of the actual metabolic intensity associated with the HR response. This brings into question the utility of HR intensity 'zones' for exercise prescription and monitoring during variable intensity exercise.
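To make the dependence of zone-based load measures on HR concrete, the summated HR zones model mentioned above can be sketched as follows. The zone boundaries (50-60, 60-70, 70-80, 80-90 and 90-100% HRmax, weighted 1 to 5) follow the commonly cited description of Edwards' TRIMP and are not taken from this manuscript; the function name, the HRmax of 200 beats·min⁻¹ and the sample data are hypothetical.

```python
def edwards_trimp(hr_samples_bpm, hr_max_bpm, sample_interval_s=1.0):
    """Summated HR zones score: minutes in each zone times the zone weight.

    Assumed zones: 50-60% HRmax = 1, 60-70% = 2, 70-80% = 3,
    80-90% = 4, 90-100% = 5; below 50% HRmax contributes nothing.
    """
    total = 0.0
    for hr in hr_samples_bpm:
        pct = 100.0 * hr / hr_max_bpm
        if pct >= 90.0:
            weight = 5
        elif pct >= 80.0:
            weight = 4
        elif pct >= 70.0:
            weight = 3
        elif pct >= 60.0:
            weight = 2
        elif pct >= 50.0:
            weight = 1
        else:
            weight = 0
        total += weight * sample_interval_s / 60.0
    return total


# Invented session sampled once per minute, HRmax = 200 beats/min:
# 10 min at 170 beats/min (85% HRmax) and 5 min at 190 beats/min (95% HRmax).
session_hr = [170] * 10 + [190] * 5
load = edwards_trimp(session_hr, 200, sample_interval_s=60.0)
```

In this sketch a single minute at 179 versus 181 beats·min⁻¹ is scored 4 versus 5, which illustrates how a small HR error near a zone boundary changes the computed load even though the metabolic intensity may be essentially the same.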
Conversely, in the present study, PO had poor validity but good reliability to estimate MR and %V̇O2max in the same conditions. In demonstration of this, both MR and %V̇O2max predicted from PO displayed a mean bias further from 0, but with tighter LOA. Explanatory factors for the overestimated MR and %V̇O2max during the three MI stages of cycling and roller-skiing might be related to an increased metabolic cost due to blood lactate clearance. 29 Given that there were tight LOAs, a correction factor could possibly be used to allow a more valid estimate of V̇O2 or MR from PO during roller-skiing and cycling.

Practical Application
The results from this study suggest that PO might provide a better prediction of metabolic intensity during variable intensity cycling and skiing compared to HR. Practitioners could calculate individual relationships between PO and MR/V̇O2 during laboratory testing sessions. Subsequently, these relationships can be used to predict metabolic intensity during training sessions, likely with greater accuracy compared with using a HR monitor. Accordingly, the training demands of elite-level cyclists and skiers can be better monitored compared with using HR alone. It should be considered that measuring PO during cycling is a common and simple task given that modern power meters can be installed onto the crank. 30,31 Although computing PO during roller-skiing on a treadmill is also relatively simple, calculating PO during outdoor roller-skiing or on-snow skiing is a complicated task given variations in snow and weather conditions, as well as variations in air resistance from day to day. However, recently some researchers have proposed novel methods of measuring propulsive power during outdoor roller-skiing 32 and on-snow skiing 33,34

were members of the highest-ranked team within their nation and skiers who were either currently or previously competing at the International Ski Federation world cup or the Olympics. All participants provided written informed consent and completed all requirements of the study. Data collected for this research were part of a larger project, some of which has been previously published (XXXX_references_intentionally_withheld_for_blind review). The regional ethical review board in (XXXX_intentionally_withheld) (registration number: XXXX_intentionally_withheld) preapproved the research techniques and experimental protocol. All research was conducted in accordance with the Code of Ethics of the World Medical Association (Declaration of Helsinki).
linear regression equations.

***Figure 1***

Statistical Analyses
All statistical analyses were performed using IBM SPSS Statistics for Windows (Version 27.0; IBM Corporation, NY) with the level of significance set at α < 0.05. Shapiro-Wilk tests confirmed

This is potentially problematic for athletes who are following a training program where intensity is monitored based upon their HR. For example, at the beginning of a variable intensity training session, the actual metabolic intensity required to reach a given HR might be well above that intended. On the other hand, by the end of an extended training session, HR might severely misrepresent the metabolic intensity. Accordingly, this could lead to misclassification of the training 'zone' when monitoring the athlete, causing errors in calculating training demands and leading to either over- or under-training and/or increased injury risk and underperformance.

Fig. 1 Schematic of the research timeline including preliminary and experimental exercise trials. Numbers represent exercise intensity as a percentage of V̇O2max.

1,15 ≤ 3.303, p ≥ 0.089, ηp² ≤ 0.180), suggesting that the physiological response was similar between skiing and cycling. As expected, PO, MR, V̇O2, RER, HR, and Bla were greater for HI compared to MI (p < 0.05).

V̇O2max (95% LOA: -12.1 to 11.3% V̇O2max). The mean bias for MR predicted from PO was -33 W (95% LOA: -163 to 98 W). The mean bias for V̇O2 predicted from PO was -1.4 mL·kg⁻¹·min⁻¹ (95% LOA: -6.1 to 3.3 mL·kg⁻¹·min⁻¹) or -2.1% V̇O2max (95% LOA: -9.3 to 5.1% V̇O2max).

V̇O2 predicted from HR displayed a mean bias ranging from -5.9 -4.7% V̇O2max often with wide LOA. V̇O2 predicted from PO displayed a mean bias ranging from -4.8 -1.5% V̇O2max with tighter LOA.

Discussion
This study aimed to examine the utility of HR to predict MR and V̇O2 during variable intensity roller-skiing and cycling. A secondary aim was to compare the accuracy of HR and PO in the prediction of MR and V̇O2 during the same exercise conditions. The main findings from this study were: 1) MR and V̇O2 estimated from HR displayed an overall mean bias close to 0 but with wide LOA; 2) more specifically, HR tended to overestimate MR and V̇O2 during MI exercise but underestimate MR and V̇O2 during HI exercise, for both roller-skiing and cycling;

***Table 2***

The differences between measured and predicted MR and %V̇O2max from linear relationships with HR and PO during the MI and HI exercise bouts for both cycling and skiing are displayed in Figure 2. For both cycling and roller-skiing, HR underestimated measured MR and %V̇O2max in the first HI stage and overestimated measured MR and %V̇O2max in the final MI stage. During cycling, predicted MR and %V̇O2max from PO were consistently underestimated throughout the variable intensity trial, except for the first stage where MR was overestimated by just 0.6 ± 29 W.
For roller-skiing, MR predicted from PO overestimated measured MR by 35 ± 76 W (1.8 ± 2.3% V̇O2max) in the first HI stage and underestimated measured MR by 77 ± 66 W (4.7 ± 2.6% V̇O2max) in the final MI stage.

***Figure 2***

Figure 3 displays Bland and Altman plots demonstrating the agreement between measured MR and measured %V̇O2max with predicted MR and predicted %V̇O2max from linear relationships with HR and PO. The mean bias for MR predicted from HR was -4 W (95% LOA: -234 to 226 W). The mean bias for V̇O2 predicted from HR was -0.2 mL·kg⁻¹·min⁻¹ (95% LOA: -7.8 to 7.3 mL·kg⁻¹·min⁻¹) or -0.4 %

*** Table 4 here ***

using inertial sensors. Future research should assess the utility of these models of measuring PO to predict MR and V̇O2 during ecologically valid settings.

On average, MR and %V̇O2max predicted from HR had a low mean bias but wide LOA. More specifically, HR tended to overestimate both MR and %V̇O2max during MI exercise but underestimate them during HI exercise. As such, HR can provide a good estimate of the average metabolic exercise intensity during variable intensity exercise. However, this brings into question the utility of using HR and 'HR zones' for prescribing and monitoring exercise of variable intensity. Misclassification of the training 'zone' when monitoring the athlete could cause errors in calculating training demands and lead to either over- or under-training and/or increased injury risk and underperformance. On the other hand, MR and %V̇O2max predicted from PO had a mean bias further from 0, but with tighter LOA, suggesting better reliability to predict metabolic intensity.

Bland and Altman plots with 95% limits of agreement (LOA) for measured metabolic rate (top row) and measured %V̇O2max (bottom row) and MR and %V̇O2max predicted from heart rate (left) and power output (right). %V̇O2max = Percentage of maximum oxygen consumption; MI = Moderate Intensity; HI = High Intensity.

Table 1. Oxygen consumption, heart rate, metabolic rate and power output for skiers and cyclists during the preliminary tests. Mean ± SD. HR: Heart rate; R²: Coefficient of determination; V̇O2: Oxygen consumption.
Human Kinetics, 1607 N Market St, Champaign, IL 61825. International Journal of Sports Physiology and Performance.

Table 2. Power output and physiological variables for cycle and ski exercise during high intensity (HI) and moderate intensity (MI) exercise bouts. Mean ± SD. * = Different to HI (p < 0.05). HI: High intensity; MI: Moderate intensity; V̇O2max: Maximum oxygen consumption; HR: Heart rate; RER: Respiratory exchange ratio.

Table 3. Mean bias and 95% limits of agreement (LOA) for metabolic rate predicted from heart rate and power output during cycle and ski exercise for the high intensity (HI) and moderate intensity (MI) exercise bouts. HI: High intensity; MI: Moderate intensity; LOA: Limits of agreement; HR: Heart rate.

Table 4. Mean bias and 95% limits of agreement (LOA) for %V̇O2max predicted from heart rate and power output during cycle and ski exercise for the high intensity (HI) and moderate intensity (MI) exercise bouts. HI: High intensity; MI: Moderate intensity; LOA: Limits of agreement; HR: Heart rate.