Conventional QR decomposition (QRD) hardware for large channel matrices suffers from low throughput and long latency. This paper presents a high-speed multi-dimensional (M-D) coordinate rotation digital computer (CORDIC) based QRD architecture. The high-speed M-D architecture is enabled by performing multiple annihilations in a single CORDIC operation and by removing the data dependency between the two CORDIC operations (the evaluation CORDIC and the application CORDIC) in the Householder-based QRD process. The proposed QRD architecture computes a 4×4 complex R matrix every 8 clock cycles. Our QRD hardware for a 4×4 channel matrix was implemented in a Samsung 0.13 μm CMOS process, and the experimental results show that the proposed architecture achieves a 4.74× speed-up over the conventional hybrid M-D based QRD.
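To make the two CORDIC roles concrete, the following is a minimal sketch of a conventional 2-D CORDIC, not the M-D Householder architecture proposed here. The evaluation step rotates a vector until its second component is annihilated and records the micro-rotation directions; the application step replays those directions on another pair of elements. The function names, the iteration count of 16, and the floating-point arithmetic are illustrative assumptions, not details from the paper.

```python
import math

ITERATIONS = 16  # assumed number of micro-rotations (not from the paper)

# Each micro-rotation scales the vector by sqrt(1 + 2^-2i); GAIN is the product.
GAIN = 1.0
for i in range(ITERATIONS):
    GAIN *= math.sqrt(1.0 + 2.0 ** (-2 * i))


def cordic_evaluate(x, y):
    """Evaluation (vectoring) CORDIC: drive y toward zero.

    Returns the rotated pair plus the control information (a half-turn flag
    and the micro-rotation directions) needed by the application step.
    """
    flip = x < 0              # pre-rotate by 180 degrees so that x >= 0
    if flip:
        x, y = -x, -y
    directions = []
    for i in range(ITERATIONS):
        d = -1 if y >= 0 else 1          # rotate toward the x-axis
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        directions.append(d)
    return x / GAIN, y / GAIN, flip, directions


def cordic_apply(a, b, flip, directions):
    """Application (rotation) CORDIC: replay the recorded rotation on (a, b)."""
    if flip:
        a, b = -a, -b
    for i, d in enumerate(directions):
        a, b = a - d * b * 2.0 ** -i, b + d * a * 2.0 ** -i
    return a / GAIN, b / GAIN


# Annihilate the second entry of (3, 4), then rotate (1, 2) by the same angle.
r, residual, flip, dirs = cordic_evaluate(3.0, 4.0)
a_rot, b_rot = cordic_apply(1.0, 2.0, flip, dirs)
```

In this sketch `cordic_apply` cannot start until `cordic_evaluate` has produced its directions. That is the kind of data dependency between evaluation and application that the abstract describes removing.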