We introduce LAMP: the Linear Additive Markov Process. Transitions in LAMP may be influenced by states visited in the distant history of the process, but unlike higher-order Markov processes, LAMP retains an efficient parametrization. LAMP also allows the specific dependence on history to be learned efficiently from data. We characterize some theoretical properties of LAMP, including its steady-state and mixing time. We then give an algorithm based on alternating minimization to learn LAMP models from data. Finally, we perform a series of real-world experiments to show that LAMP is more powerful than first-order Markov processes, and even holds its own against deep sequential models (LSTMs) with a negligible increase in parameter complexity.