The robust PCA problem, wherein, given an input data matrix that is the superposition of a low-rank matrix and a sparse matrix, we aim to separate out the low-rank and sparse components, is a well-studied problem in machine learning. One natural question that arises is that, as in the inductive setting, if features are provided as input as well, can we hope to do better? Answering this in the affirmative, the main goal of this paper is to study the robust PCA problem while incorporating feature information. In contrast to previous works in which recovery guarantees are based on the convex relaxation of the problem, we propose a simple iterative algorithm based on hard-thresholding of appropriate residuals. Under weaker assumptions than previous works, we prove the global convergence of our iterative procedure; moreover, it admits a much faster convergence rate and lesser computational complexity per iteration. In practice, through systematic synthetic and real data simulations, we confirm our theoretical findings regarding improvements obtained by using feature information.