Training apps and platforms

Evidence: limited

The one well-evidenced effect is a modest, often fading nudge to do more and stay consistent, and much of that evidence is about getting inactive people moving rather than improving an established runner. The parts the apps sell hardest are the weak parts: training-load scores and running power are model estimates, not measurements, and the adaptive AI plans have no independent validation. They earn their place for logging, prompts and a little social accountability, but an app running a generic algorithm over self-tracked metrics is no substitute for a coach’s individualised feedback.

Running apps fall into a few categories. Activity tracking with a social layer (Strava) records runs and adds segments and ‘kudos’. Training-load analytics (TrainingPeaks, intervals.icu) turn sessions into stress and fitness numbers. Adaptive plan apps (Runna, Nike Run Club, adidas Running, Coopah) generate and adjust training schedules, and the device ecosystems build the same thing in: Garmin Connect pairs tracking and analysis with its own adaptive coaching through Garmin Coach and Daily Suggested Workouts. Running-power tools (Stryd) add a real-time intensity number.

What the evidence supports

The behaviour-change case is genuine but modest. Smartphone apps on their own produce a small and often non-significant rise in daily activity that is largest in the first one to three months and dwindles after that (Romeo et al. 2019). Combine an app with a tracker and the effect is more solid, around 1,850 extra steps a day at roughly three months, with feedback, self-monitoring and goal-setting the active ingredients and personalisation making it work better (Laranjo et al. 2021). Wearable trackers raise steps and moderate-to-vigorous activity too, though at low certainty of evidence, and they do more when paired with coaching or support than alone (Brickwood et al. 2019).

The social and gamified features add their own nudge. Points, badges, challenges and leaderboards produce a small-to-moderate activity increase that weakens substantially, though it partly persists, once the programme ends (Mazéas et al. 2022). On Strava specifically, receiving kudos is associated with running more and more often, and runners’ mileage drifts toward that of the peers they interact with, although the study is observational and so cannot establish cause (Franken et al. 2023).

Where the numbers mislead

Training-load metrics are models, not measurements. Training Stress Score and the fitness, fatigue and form lines of the Performance Management Chart are built on the Banister impulse-response framework and fixed time constants, so equal scores from easy volume and from hard intervals do not represent equal biological strain (TrainingPeaks). They are useful for tracking your own trend, not for comparison between runners or as a literal measure of stress.
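
To make the point concrete, here is a minimal Python sketch of the arithmetic, not TrainingPeaks’ own implementation. It assumes the widely quoted simplified form of the score (hours × intensity factor squared × 100) and the commonly cited 42-day and 7-day time constants; the function names and the two example sessions are illustrative assumptions.

```python
import math

def stress_score(duration_h, intensity_factor):
    # Assumed simplified form: hours x IF^2 x 100,
    # where IF is the session's effort relative to threshold.
    return duration_h * intensity_factor ** 2 * 100

def performance_chart(daily_scores, fitness_days=42, fatigue_days=7):
    # Exponentially weighted averages with fixed time constants
    # (42 and 7 days are assumed defaults, not taken from the text above).
    fitness = fatigue = 0.0
    rows = []
    for score in daily_scores:
        fitness += (score - fitness) / fitness_days
        fatigue += (score - fatigue) / fatigue_days
        rows.append((fitness, fatigue, fitness - fatigue))  # form = fitness - fatigue
    return rows

easy_long = stress_score(2.0, math.sqrt(0.5))  # two hours at about 71% of threshold
hard_short = stress_score(1.0, 1.0)            # one hour at threshold

# Both sessions score about 100, so a week of either draws the same lines.
week_easy = performance_chart([easy_long] * 7)
week_hard = performance_chart([hard_short] * 7)
```

Under these assumptions the two weeks produce effectively identical fitness, fatigue and form values, even though a week of two-hour easy runs and a week of hour-long threshold efforts load the body very differently. The chart sees only the score, which is why it works as a personal trend line and not as a measure of strain.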

Running power has the same problem in sharper form: there is no agreed reference for it, so absolute accuracy cannot be judged and the brands disagree with each other. The Stryd foot-pod is the most repeatable device and tracks intensity well, but it underestimates absolute power (Cerezuela-Espejo et al. 2021; García-Pinillos et al. 2020). Treat it as a personal, device-specific intensity proxy, with the same caution that applies to the estimates on a GPS watch.

The adaptive AI training plans are the least supported of all. No independent peer-reviewed study tests whether the adaptive algorithms behind Runna, Nike Run Club, Coopah, adidas Running or Garmin’s Daily Suggested Workouts outperform a static plan or each other; the claims of science-backed personalisation are vendor self-report. The plans encode reasonable mainstream principles, but the ‘adaptive’ intelligence is unproven.

Using them well

The features that help are the ones that support consistency: logging, a visible streak, a goal, and a bit of social accountability. The traps are chasing the numbers and the leaderboards. Segment and record competition has driven documented real-world risk-taking (Flint v. Strava), and metric obsession can pull easy days too hard and turn a training tool into a source of anxiety. Let an app track and prompt; do not let it overrule how the run actually felt or sell you science it cannot back. Privacy is worth a thought too, since public segments and activity maps expose where and when you run.