This paper presents a method for assessing skill of performance from video, for a variety of tasks, ranging from drawing to surgery and rolling dough. We formulate the problem as pairwise and overall ranking of video collections, and propose a supervised deep ranking model to learn discriminative features between pairs of videos exhibiting different amounts of skill. We utilise a two-stream Temporal Segment Network to capture both the type and quality of motions and the evolving task state. Results demonstrate our method is applicable to a variety of tasks, with the percentage of correctly ordered pairs of videos ranging from 70% to 82% for four datasets. We demonstrate the robustness of our approach via sensitivity analysis of its parameters. We see this work as effort toward the automated and objective organisation of how-to videos and overall, generic skill determination in video.