Re #3, that’s an astute question. You’re right: larger gradients on earlier timestamps are a sufficient but not a necessary condition for capturing long-term dependencies. It is possible that the early timestamps just aren’t important to a model, even one that can capture long-term dependencies. For example, it may be possible to determine risk for sleep apnea after just three nights of sleep (most at-home sleep apnea tests require 1–3 nights of use), in which case the earlier days are unnecessary. That said, this is unlikely to be the case for our other two prediction tasks (diabetes and hypertension).
And yes, the initial model parameters were randomized.
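For anyone who wants to try the gradient check on a toy problem, here is a minimal sketch of how the gradient of the output with respect to each timestamp's input can be measured. It is not the model discussed above: it assumes a plain tanh RNN with randomly initialized weights and one row of (heart rate, step count) per day for a week. The names `forward`, `input_gradient_norms`, `W_x`, `W_h`, and `w_out` are hypothetical and chosen only for illustration.

```python
import numpy as np

def forward(x, W_x, W_h, w_out):
    """Toy RNN: h_t = tanh(W_x x_t + W_h h_{t-1}), output = w_out . h_T."""
    h = np.zeros(W_h.shape[0])
    for t in range(x.shape[0]):
        h = np.tanh(W_x @ x[t] + W_h @ h)
    return float(w_out @ h)

def input_gradient_norms(x, W_x, W_h, w_out):
    """L2 norm of d(output)/d(x_t) for every timestamp t, via backprop through time."""
    T = x.shape[0]
    h = np.zeros(W_h.shape[0])
    hs = []
    for t in range(T):                      # forward pass, keeping hidden states
        h = np.tanh(W_x @ x[t] + W_h @ h)
        hs.append(h)
    grad_h = w_out.copy()                   # d(output)/d(h_T)
    norms = np.zeros(T)
    for t in reversed(range(T)):            # backward pass
        grad_pre = grad_h * (1.0 - hs[t] ** 2)   # through the tanh
        norms[t] = np.linalg.norm(W_x.T @ grad_pre)
        grad_h = W_h.T @ grad_pre           # pass the gradient to h_{t-1}
    return norms

# Assumed setup: 7 days, 2 features per day, 8 hidden units, random weights.
rng = np.random.default_rng(0)
x = rng.normal(size=(7, 2))
W_x = rng.normal(scale=0.5, size=(8, 2))
W_h = rng.normal(scale=0.5, size=(8, 8))
w_out = rng.normal(size=8)

norms = input_gradient_norms(x, W_x, W_h, w_out)
for day, n in enumerate(norms, start=1):
    print(f"day {day}: gradient norm {n:.4f}")
```

Comparing the norms for the first few days against the last few gives a rough picture of how much the output depends on early inputs. As noted above, small early-day norms do not by themselves show that a model is unable to capture long-term dependencies.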
Re #4: Yes, that’s right. In this model, our input is one week of heart rate and step count data.