Tracking human poses in video can be viewed as inferring information about the body joints. Among the various obstacles to this task, self-occlusion, in which one body part occludes another, is considered one of the most challenging. To tackle this problem, a model must represent the state of self-occlusion and perform inference efficiently, which is complicated by the depth order among body parts. In this paper, we propose an adaptive self-occlusion reasoning method. A Markov random field represents the occlusion relationships among human body parts through an occlusion state variable that encodes the depth order. To reduce the computational complexity, inference is divided into two steps: a body pose inference step and a depth order inference step. Experiments on the HumanEva dataset demonstrate that the proposed method can successfully track various human body poses in an image sequence.