Person re-identification is generally divided into two part: first how to represent a pedestrian by discriminative visual descriptors and second how to compare them by suitable distance metrics. Conventional methods isolate these two parts, the first part usually unsupervised and the second part supervised. The Bag-of-Words (BoW) model is a widely used image representing descriptor in part one. Its codebook is simply generated by clustering visual features in Euclidian space. In this paper, we propose to use part two metric learning techniques in the codebook generation phase of BoW. In particular, the proposed codebook is clustered under Mahalanobis distance which is learned supervised. Extensive experiments prove that our proposed method is effective. With several low level features extracted on superpixel and fused together, our method outperforms state-of-the-art on person re-identification benchmarks including VIPeR, PRID450S, and Market1501.