Cystocele is a common disease in woman. Accurate assessment of cystocele severity is very important for treatment options. The transperineal ultrasound (US) has recently emerged as an alternative tool for cystocele grading. The cystocele severity is usually evaluated with the manual measurement of the maximal descent of the bladder (MDB) relative to the symphysis pubis (SP) during Valsalva maneuver. However,this process is time-consuming and operator-dependent. In this study,we propose an automatic scheme for csystocele grading from transperineal US video. A two-layer spatio-temporal regression model is proposed to identify the middle axis and lower tip of the SP,and segment the bladder,which are essential tasks for the measurement of the MDB. Both appearance and context features are extracted in the spatio-temporal domain to help the anatomy detection. Experimental results on 85 transperineal US videos show that our method significantly outperforms the state-of-theart regression method.