In this project, we design a real-time human-computer interaction (HCI) system based on hand gestures. The system consists of three components: hand detection, gesture recognition, and HCI driven by the recognition results. Together, they provide robust control of mouse and keyboard events with high gesture-recognition accuracy. Specifically, we use a convolutional neural network (CNN) to recognize gestures, which makes it possible to identify relatively complex gestures using only a single inexpensive monocular camera. We introduce a Kalman filter to estimate the hand position, and this estimate drives the mouse cursor so that it moves stably and smoothly. During the HCI stage, we apply a simple strategy to suppress false recognitions caused by noise, mostly transient false gestures, and thereby improve the reliability of interaction. The system is highly extensible and can be applied to human-robot or other human-machine interaction scenarios whose command formats are more complex than mouse and keyboard events.
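
As an illustration of the two stabilizing ideas mentioned above, the following minimal Python sketch shows one way they could look. It is not the system's actual implementation. It assumes a scalar, constant-position Kalman filter applied to each cursor axis independently and a "confirm after N consecutive frames" rule for rejecting transient gestures. The class names `ScalarKalman` and `GestureDebouncer`, the noise variances `q` and `r`, and the threshold `min_frames` are all hypothetical choices made for this example.

```python
class ScalarKalman:
    """Assumed 1-D constant-position Kalman filter for one cursor axis."""

    def __init__(self, q=1e-2, r=1.0):
        self.q = q      # process noise variance (hypothetical value)
        self.r = r      # measurement noise variance (hypothetical value)
        self.x = None   # current state estimate
        self.p = 1.0    # variance of the estimate

    def update(self, z):
        if self.x is None:
            # First measurement initializes the state.
            self.x = float(z)
            return self.x
        # Predict: position assumed unchanged, uncertainty grows by q.
        self.p += self.q
        # Correct: blend prediction and measurement using the Kalman gain.
        k = self.p / (self.p + self.r)
        self.x += k * (z - self.x)
        self.p *= (1.0 - k)
        return self.x


class GestureDebouncer:
    """Accept a gesture only after it is seen in min_frames consecutive frames."""

    def __init__(self, min_frames=5):
        self.min_frames = min_frames
        self.candidate = None   # label currently being counted
        self.count = 0          # consecutive frames with that label
        self.stable = None      # last confirmed gesture

    def update(self, label):
        if label == self.candidate:
            self.count += 1
        else:
            # A different label restarts the count, so one-frame
            # misclassifications never reach the threshold.
            self.candidate = label
            self.count = 1
        if self.count >= self.min_frames:
            self.stable = self.candidate
        return self.stable


if __name__ == "__main__":
    kx, ky = ScalarKalman(), ScalarKalman()
    debouncer = GestureDebouncer(min_frames=3)
    # Hypothetical per-frame detections: (x, y, recognized gesture label).
    frames = [(100, 200, "fist"), (104, 198, "fist"), (180, 260, "fist"),
              (108, 201, "palm"), (110, 203, "fist")]
    for x, y, label in frames:
        print(round(kx.update(x), 1), round(ky.update(y), 1),
              debouncer.update(label))
```

In this sketch, the outlier position in the third frame pulls the filtered cursor only part of the way toward it, and the single "palm" frame does not change the confirmed gesture. A larger `min_frames` would reject longer glitches at the cost of added response latency.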