Senior Researcher, Digital Therapeutics Research Center
Learning Objectives:
Describe the architecture and fundamental principles of an encoder-decoder convolutional neural network designed for predicting human joint positions from single RGB images.
Explain how depth (z) information can be encoded into 2D heatmap intensity and how a reverse-Gaussian deconvolution method can be utilized to retrieve joint depth from these heatmaps.
Evaluate the performance metrics (e.g., MSE, MAE, R², PCK@10, z-coordinate MAE) used to assess the accuracy of 2D joint position and reconstructed joint depth in depth-aware pose estimation from monocular images.
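
The depth-in-intensity idea in the second objective can be illustrated with a minimal sketch. The source does not specify the encoding or the reverse-Gaussian deconvolution procedure, so everything here is an assumption: depth z is taken to be normalised to (0, 1] and stored as the peak amplitude of a 2D Gaussian centred on the joint, and depth is recovered by fitting the amplitude of that same Gaussian around the heatmap peak. The function names `encode_joint` and `decode_joint`, the heatmap size, and `sigma` are hypothetical choices for illustration only.

```python
import numpy as np


def encode_joint(x, y, z, size=64, sigma=2.0):
    """Render one joint as a 2D Gaussian whose peak amplitude equals z.

    Assumption: z is already normalised to (0, 1].
    """
    ys, xs = np.mgrid[0:size, 0:size]
    gauss = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
    return z * gauss


def decode_joint(heatmap, sigma=2.0, radius=3):
    """Recover (x, y, z) from a single-joint heatmap.

    Position comes from the argmax; depth comes from a least-squares fit
    of the Gaussian amplitude on a small patch around the peak.
    """
    py, px = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    y0, y1 = max(py - radius, 0), min(py + radius + 1, heatmap.shape[0])
    x0, x1 = max(px - radius, 0), min(px + radius + 1, heatmap.shape[1])
    ys, xs = np.mgrid[y0:y1, x0:x1]
    gauss = np.exp(-((xs - px) ** 2 + (ys - py) ** 2) / (2.0 * sigma ** 2))
    patch = heatmap[y0:y1, x0:x1]
    # Closed-form amplitude that minimises ||patch - a * gauss||^2.
    z = float((patch * gauss).sum() / (gauss * gauss).sum())
    return int(px), int(py), z


heatmap = encode_joint(x=20, y=30, z=0.6)
x_hat, y_hat, z_hat = decode_joint(heatmap)
z_abs_error = abs(z_hat - 0.6)  # per-joint contribution to a z-coordinate MAE
```

Fitting the amplitude over a patch, rather than reading the single peak pixel, is one way to make the recovered depth less sensitive to noise in a predicted heatmap; whether the presented method does this is not stated in the source.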