Camera-based perception of dangerous road situations such as traffic accidents is an important task in modern autonomous driving and ADAS. Previous approaches have examined the spatio-temporal characteristics of traffic accidents in a sequence of images. However, we identify a limitation of these works: the spatio-temporal pattern is considered only in 2D, which discards contextual knowledge of the road situation in the 3D space where the accident actually occurs. In this study, we propose a novel approach to learning the spatio-temporal pattern of traffic accidents from a sequence of traffic scene images. First, we design a spatial feature extractor that represents the distances among traffic objects in 3D; by combining each object's location with its depth information, it describes the road situation with more context than a 2D representation. Second, we propose an accident detection model, which identified traffic accidents with an accuracy of 0.8560 and an F1 score of 0.9080. Lastly, we propose an accident anticipation model, which outperformed a previously proposed benchmark anticipation model on a challenging task. We expect that further improvement of our approach can contribute to safe vehicular technology for autonomous driving and ADAS development.
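
To make the idea of a depth-aware spatial feature concrete, the following is a minimal sketch and not the implementation used in this study. It assumes each detected traffic object is summarized by the pixel coordinates of its bounding-box center and a metric depth estimate, and that the camera follows a pinhole model with known intrinsics. The function names `back_project` and `pairwise_distances_3d` and the parameters `fx`, `fy`, `cx`, `cy` are hypothetical, introduced only for illustration.

```python
import math
from itertools import combinations


def back_project(u, v, depth, fx, fy, cx, cy):
    """Map a pixel (u, v) with metric depth to 3D camera coordinates.

    Assumes a pinhole camera with focal lengths (fx, fy) and
    principal point (cx, cy), all expressed in pixels.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def pairwise_distances_3d(objects, fx, fy, cx, cy):
    """Return a symmetric matrix of Euclidean distances between objects.

    `objects` is a list of (u, v, depth) tuples, one per detected object
    (hypothetical input format: bounding-box center plus depth estimate).
    """
    points = [back_project(u, v, d, fx, fy, cx, cy) for u, v, d in objects]
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = math.dist(points[i], points[j])
        dist[i][j] = d
        dist[j][i] = d
    return dist


# Illustrative values only: three objects seen by a camera with
# fx = fy = 500 px and principal point (100, 100).
example_objects = [(100, 100, 10.0), (100, 100, 20.0), (600, 100, 10.0)]
example_matrix = pairwise_distances_3d(example_objects, 500.0, 500.0, 100.0, 100.0)
```

In this example the first two objects share the same pixel location, so a purely 2D distance would treat them as coincident, while the 3D distance separates them by their 10 m difference in depth. A per-frame matrix of this kind is one plausible form of spatial feature that a temporal model could consume across the image sequence.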