In this paper, we study a joint spatial division and multiplexing (JSDM) beamforming scheme that enables large-scale spatial multiplexing gains in massive MIMO downlink systems. In contrast to conventional JSDM, which employs block diagonalization (BD) for the pre-beamformer, we aim to maximize the sum rate by applying minimum mean-squared error (MMSE) approaches to design the pre-beamformer and the multi-user precoder sequentially. First, to suppress inter-group interference, we design a pre-beamformer that minimizes an upper bound on the sum mean-squared error in the large-scale array regime. Then, to mitigate same-group interference, we present a multi-user precoder based on the weighted MMSE (WMMSE) optimization method, which requires the same channel state information overhead as conventional JSDM. Simulation results confirm that the proposed two-step beamforming method achieves substantial sum-rate gains over conventional JSDM schemes, especially in the low-to-medium signal-to-noise ratio (SNR) regime.
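For orientation, the two-stage structure described above can be sketched with the generic JSDM downlink signal model. All symbols here are notational assumptions introduced for illustration and are not taken from this abstract: $M$ transmit antennas, $G$ user groups, channel matrix $\mathbf{H}_g$ of group $g$, pre-beamformer $\mathbf{B}_g$, multi-user precoder $\mathbf{P}_g$, data vector $\mathbf{d}_g$, and noise $\mathbf{z}_g$.

```latex
% Received signal of group g (assumed notation, generic two-stage JSDM model)
\begin{align}
  \mathbf{y}_g
    &= \underbrace{\mathbf{H}_g^{\mathsf{H}} \mathbf{B}_g \mathbf{P}_g \mathbf{d}_g}_{\text{desired signal and same-group interference}}
     + \underbrace{\sum_{g' \neq g} \mathbf{H}_g^{\mathsf{H}} \mathbf{B}_{g'} \mathbf{P}_{g'} \mathbf{d}_{g'}}_{\text{inter-group interference}}
     + \mathbf{z}_g ,
  \\
  % Effective channel seen by the second-stage precoder of group g
  \bar{\mathbf{H}}_g &= \mathbf{B}_g^{\mathsf{H}} \mathbf{H}_g ,
  \qquad
  \mathbf{B}_g \in \mathbb{C}^{M \times b_g}, \quad
  \mathbf{P}_g \in \mathbb{C}^{b_g \times S_g} .
\end{align}
```

In this reading, the first design step chooses $\mathbf{B}_g$ to keep the inter-group term small, and the second step chooses $\mathbf{P}_g$ from the reduced-dimension effective channel $\bar{\mathbf{H}}_g$ to handle interference among users of the same group. The dimensions $b_g$ and $S_g$ (effective channel dimension and number of streams per group) are likewise assumed names.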