Sure, in production environments I have seen humans used in three places:
1. calibrating the pose data
2. cleaning up covariances (reducing blobbiness)
3. adding metadata for app usage
But, to your point, it's hard to say which of these, if any, apply without more info. I would be very, very impressed if there were no humans involved and it's 'just' a training-time issue, though!