When controlling a robot in situ, the operator's attention is often divided between the scene and the interface, which often degrades control performance. One possible way to reduce this attention switching is to employ a camera (or sensor) view, even though the operator is on site, in which critical parts of the operating environment are shown side by side with the control interface so that the user is not distracted from either. In addition, when the user unavoidably shifts attention away from the control interface and then back to it, the interface can be configured so that the user can easily continue the task at hand without a momentary loss of context. In this paper, we describe the design of such an interface and investigate the possible user attentional behaviors when using it. In particular, we present an experiment that compares three forms of interaction: (1) Nominal (no camera view), (2) Fixed (a camera view is provided and the user is not allowed to look directly at the scene), and (3) Free (a camera view is provided but the user is free to look directly at the scene). The three approaches represent different balances among information availability, interface accessibility, and the amount of attentional shift. Experimental results showed that all three interaction models exhibited similar task performance, even though the Fixed type induced much less attentional shift. However, users much preferred the Nominal and Free types. Users mostly ignored the camera view and shifted their attention excessively to the operating scene, because the camera view lacked visual quality, realistic scale, and depth information.