Human pose estimation is extremely challenging with event-driven cameras. We have been able to do it with an ANN approach, but we are curious how one would approach it with an SNN. It is a 13-point skeleton regression problem: 2D pose from a single camera, in the camera reference frame. We have an event-converted version of the popular Human3.6M video dataset at good spatial and temporal resolution, as well as DHP19 in the same format.
The outcome can be either an implementation or a discussion of possible approaches.
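As a possible starting point for discussion, here is a minimal sketch of what the forward pass of an SNN regressor for this task might look like. It is not our existing ANN pipeline and is not a trained model. It assumes events arrive as rows of `(x, y, t, polarity)`, bins them into a few time steps, passes them through one layer of leaky integrate-and-fire (LIF) neurons, and reads the 13 x 2 joint coordinates from the membrane potential of a non-spiking leaky readout. The names `bin_events` and `TinySNNRegressor`, the layer sizes, and the decay and threshold values are all hypothetical choices made for illustration.

```python
import numpy as np

NUM_JOINTS = 13  # from the task description: 13-point 2D skeleton


def bin_events(events, height, width, num_bins):
    """Accumulate (x, y, t, polarity) events into a (num_bins, 2, H, W) count tensor.

    Assumption: polarity > 0 is an ON event, anything else is OFF.
    """
    frames = np.zeros((num_bins, 2, height, width), dtype=np.float32)
    if len(events) == 0:
        return frames
    x = events[:, 0].astype(int)
    y = events[:, 1].astype(int)
    t = events[:, 2]
    p = (events[:, 3] > 0).astype(int)
    span = max(t.max() - t.min(), 1e-9)  # avoid division by zero
    b = np.minimum(((t - t.min()) / span * num_bins).astype(int), num_bins - 1)
    np.add.at(frames, (b, p, y, x), 1.0)  # unbuffered add handles repeated pixels
    return frames


class TinySNNRegressor:
    """One hidden LIF layer followed by a non-spiking leaky integrator readout."""

    def __init__(self, in_features, hidden=64, beta=0.9, threshold=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.w_in = rng.normal(0.0, 1.0 / np.sqrt(in_features), (in_features, hidden))
        self.w_out = rng.normal(0.0, 1.0 / np.sqrt(hidden), (hidden, NUM_JOINTS * 2))
        self.beta = beta            # membrane decay per time step
        self.threshold = threshold  # firing threshold of hidden neurons

    def forward(self, frames):
        steps = frames.shape[0]
        x = frames.reshape(steps, -1)
        v_hidden = np.zeros(self.w_in.shape[1])
        v_out = np.zeros(self.w_out.shape[1])
        spike_count = 0
        for step in range(steps):
            # Leaky integration of the input current
            v_hidden = self.beta * v_hidden + x[step] @ self.w_in
            spikes = (v_hidden >= self.threshold).astype(np.float32)
            v_hidden = v_hidden - spikes * self.threshold  # soft reset
            # Readout integrates hidden spikes but never fires itself
            v_out = self.beta * v_out + spikes @ self.w_out
            spike_count += int(spikes.sum())
        # Final readout potential is interpreted as (x, y) per joint
        return v_out.reshape(NUM_JOINTS, 2), spike_count


# Toy usage with three synthetic events on a 4 x 5 sensor
events = np.array([[1, 2, 0.0, 1], [3, 0, 0.5, -1], [1, 2, 1.0, 1]])
frames = bin_events(events, height=4, width=5, num_bins=2)
model = TinySNNRegressor(in_features=frames[0].size)
pose, spike_count = model.forward(frames)
print(pose.shape, spike_count)
```

Reading the pose from a membrane potential rather than from spike counts is one common way to get continuous outputs for regression. Training is not shown: the spike threshold is not differentiable, so in practice one would likely need surrogate gradients (for example through a library such as snnTorch or Norse) or an ANN-to-SNN conversion, and a convolutional backbone rather than a single dense layer.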