It has been recently shown that neural networks can recover the geometric structure of a face from a single given image. A common denominator of most existing face geometry reconstruction methods is the restriction of the solution space to some low-dimensional subspace. While such a model significantly simplifies the reconstruction problem, it is inherently limited in its expressiveness. As an alternative, we propose an Image-to-Image translation network that maps the input image to a depth image and a facial correspondence map. This explicit pixel-based mapping can then be utilized to provide high quality reconstructions of diverse faces under extreme expressions. In the spirit of recent approaches, the network is trained only with synthetic data, and is then evaluated on “in-the-wild” facial images. Both qualitative and quantitative analyses demonstrate the accuracy and the robustness of our approach. As an additional analysis of the proposed network, we show that it can be used as a geometric constraint for facial image translation tasks.