In this thesis, we propose a 2.5D pose-guided human image generation method that integrates depth information with 2D poses. Given a specific 2.5D pose and an image of a person, our model generates a new image of that person in the target pose. To incorporate depth information into the pose structure, we present the Three-Layer Pose Space, which enables more accurate pose transfer than a regular 2D pose structure. Specifically, our pose space allows the generative model to resolve the occlusion problems that commonly occur in human image generation and makes the front-back spatial relations of limbs easy to recognize. Our approach is trained end-to-end on images and the corresponding 3D coordinates. In qualitative experiments on the DeepFashion and Human3.6M datasets, our model demonstrates significant improvement in visual quality with respect to depth of field.