Facial landmark localization is a fundamental module for face recognition. The current common approach to facial landmark detection is cascaded regression, which is composed of two steps: feature extraction and facial shape regression. Recent methods employ deep convolutional networks to extract robust features in each step, so the whole system can be regarded as a deep cascaded regression architecture. Unfortunately, this architecture is problematic. First, the parameters of the networks are optimized greedily, stage by stage, rather than jointly. Second, the network cannot efficiently merge landmark coordinate vectors with 2D convolutional layers. Third, the facial shape regression relies on a feature vector generated from the bottom layer of the convolutional neural network, which has recently been criticized for lacking the spatial resolution needed for pixel-wise localization tasks. We propose a globally optimized dual-pathway system (GoDP) to address the optimization and precision weaknesses of deep cascaded regression without resorting to high-level inference models or complex stacked architectures. This end-to-end system relies on distance-aware softmax functions and a dual-pathway proposal-refinement architecture. The proposed system outperforms state-of-the-art cascaded regression-based methods on multiple in-the-wild face alignment databases. Experiments on face identification demonstrate that GoDP significantly improves the quality of face frontalization in face recognition.
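For readers unfamiliar with the baseline, the two-step loop of generic cascaded regression (feature extraction followed by shape regression, repeated over stages) can be sketched as below. This is a minimal illustration of the general technique, not the GoDP system and not any particular published pipeline. All names (`cascaded_regression`, `toy_features`, `toy_regressor`) are hypothetical, and the toy "image" is simply the target shape itself so that the example runs without any image data or learned model.

```python
from typing import Callable, List, Sequence, Tuple

# A shape is a flat list of landmark coordinates: [x1, y1, x2, y2, ...].
Shape = List[float]
Stage = Tuple[Callable[[object, Shape], List[float]],   # feature extractor
              Callable[[List[float]], List[float]]]     # shape regressor


def cascaded_regression(image: object,
                        initial_shape: Sequence[float],
                        stages: Sequence[Stage]) -> Shape:
    """Refine a shape estimate by applying each stage in sequence."""
    shape = list(initial_shape)
    for extract_features, regress_update in stages:
        # Step 1: extract features indexed by the current shape estimate.
        features = extract_features(image, shape)
        # Step 2: regress a shape increment from those features.
        delta = regress_update(features)
        shape = [s + d for s, d in zip(shape, delta)]
    return shape


# --- Toy stand-ins (assumptions for illustration only) ---
def toy_features(image: object, shape: Shape) -> List[float]:
    # Hypothetical: the "image" is the true shape, so the feature is the
    # residual between the true shape and the current estimate.
    return [t - s for t, s in zip(image, shape)]


def toy_regressor(features: List[float]) -> List[float]:
    # Hypothetical: each stage corrects half of the observed residual.
    return [0.5 * f for f in features]


true_shape = [4.0, 8.0]
stages = [(toy_features, toy_regressor)] * 3
result = cascaded_regression(true_shape, [0.0, 0.0], stages)
print(result)  # [3.5, 7.0]
```

Each stage here is a separate extractor and regressor pair, which reflects the stage-wise structure that the abstract identifies as a limitation when the stages are optimized one at a time.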