We analyzed the objects in view in 868 hours of egocentric video from the BabyView dataset, covering 31 children (5-36 months old). We used an object detection model to extract object categories from >3M frames (pipeline below).
More about the BabyView Project here: babyview-project.github.io