Accurate segmentation of anatomical structures in medical images is essential to neuroscience studies. Multi-atlas patch-based label fusion methods have recently achieved considerable success. These methods generally represent each target patch with an atlas patch dictionary in the image domain and then predict the latent label by directly applying the estimated representation coefficients in the label domain. However, because of the large gap between the two domains, coefficients estimated in the image domain may not be optimal for label fusion. To overcome this problem, we propose a novel label fusion framework that drives the weighting coefficients toward optimality for label fusion by progressively constructing a dynamic dictionary in a layer-by-layer manner, in which a sequence of intermediate patch dictionaries gradually encodes the transition from the patch representation coefficients in the image domain to the optimal weights for label fusion. The proposed framework is general and can augment the label fusion performance of current state-of-the-art methods. In our experiments, we apply the proposed method to hippocampus segmentation on the ADNI dataset and achieve more accurate labeling results than the counterpart methods that use a single-layer dictionary.
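
To make the single-layer baseline concrete, the following is a minimal sketch of patch-based label fusion for one target voxel. It is an illustration under stated assumptions, not the proposed layer-by-layer method: the weights are computed with a simple similarity kernel (non-local-means style) as a stand-in for the representation coefficients, and the function names `fusion_weights` and `fuse_label`, the bandwidth `h`, and the toy patches are all hypothetical. A sparse-coding solver could replace `fusion_weights` without changing the label-domain step.

```python
import math

def fusion_weights(target_patch, atlas_patches, h=1.0):
    """Image-domain step: one weight per atlas patch, normalized to sum to 1.

    Assumption: a Gaussian similarity kernel stands in for the
    representation coefficients discussed in the text.
    """
    raw = []
    for patch in atlas_patches:
        # Squared Euclidean distance between target and atlas patch intensities.
        dist2 = sum((t - a) ** 2 for t, a in zip(target_patch, patch))
        raw.append(math.exp(-dist2 / h))
    total = sum(raw)
    return [r / total for r in raw]

def fuse_label(weights, atlas_labels):
    """Label-domain step: reuse the image-domain weights to vote on the label."""
    score = sum(w * l for w, l in zip(weights, atlas_labels))
    # Binary decision (e.g., structure vs. background) at the 0.5 threshold.
    return (1 if score >= 0.5 else 0), score

# Toy data (hypothetical): flattened 3-voxel patches and their center labels.
target = [0.9, 1.0, 1.1]
atlas_patches = [[1.0, 1.0, 1.0], [0.0, 0.1, 0.0], [0.8, 1.1, 1.2]]
atlas_labels = [1, 0, 1]

weights = fusion_weights(target, atlas_patches)
label, score = fuse_label(weights, atlas_labels)
```

The sketch shows where the domain gap enters: `weights` are chosen only by intensity similarity, yet `fuse_label` applies them unchanged to the labels, so nothing guarantees they are the best weights for the voting step.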