In this paper, we describe a video playing system, named "Empatheater," that is controlled by multimodal interaction. As the video is played, the user must interact and emulate predefined video "events" through multimodal guidance and whole body interaction (e.g. following the main character's motion or gestures). Without the timely interaction, the video stops. The system shows guidance information as how to properly react and continue the video playing. The purpose of such a system is to provide indirect experience (of the given video content) by eliciting the user to mimic and empathize with the main character. The user is given the illusion (suspended disbelief) of playing an active role in the unraveling video content. We discuss various features of the newly proposed interactive medium. In addition, we report on the results of the pilot study that was carried out to evaluate its user experience compared to passive video viewing and keyboard based video control.