It is a monocular iPhone video dataset captured from a fixed frontal view, covering a variety of actions such as head rotation, brief expressions, and speech. On this dataset, we aim to generate videos from a static frontal view with varying pose and expression.