This paper proposes an area-based method for estimating the position of unidentified moving objects by fusing heterogeneous sensor data collected from a distributed acoustic sensor network. The surveillance region contains a couple of transmitters and multiple binary sensors, which are assumed to be arranged in a lattice. Each binary sensor provides only two kinds of data: whether or not an object was detected, and the time difference of arrival (TDOA) between the transmitted signal and the signal reflected by the object. From these two types of data, the proposed method derives candidate regions in which the object may be located and progressively fuses them into a single common region. The position of the object is then estimated within this common candidate region by maximum likelihood estimation (MLE). Experimental results show that the proposed method outperforms conventional approaches.
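The fusion pipeline summarized above can be illustrated with a minimal sketch. It is not the implementation evaluated in this paper; it rests on several assumptions that the abstract does not state: a single transmitter, a circular detection range per sensor (`det_range`), a TDOA defined as the reflected path delay minus the direct path delay, Gaussian TDOA noise, and a grid search in place of a closed-form MLE. All names (`excess_path`, `estimate_position`, `tol`, `sigma`) and the propagation speed `C` are hypothetical.

```python
import numpy as np

C = 1500.0  # assumed propagation speed in m/s (hypothetical value)


def excess_path(points, tx, sensor):
    """Reflected path (tx -> point -> sensor) minus direct path (tx -> sensor)."""
    return (np.linalg.norm(points - tx, axis=-1)
            + np.linalg.norm(points - sensor, axis=-1)
            - np.linalg.norm(tx - sensor))


def estimate_position(tx, sensors, detected, tdoa, grid, det_range, tol, sigma):
    """Fuse per-sensor candidate regions on a grid, then pick the most likely cell."""
    mask = np.ones(len(grid), dtype=bool)   # common candidate region
    nll = np.zeros(len(grid))               # negative log-likelihood (Gaussian noise assumed)
    for s, hit, t in zip(sensors, detected, tdoa):
        dist = np.linalg.norm(grid - s, axis=-1)
        if hit:
            resid = excess_path(grid, tx, s) - C * t
            mask &= dist <= det_range        # detection region (assumed disc)
            mask &= np.abs(resid) <= tol     # TDOA region (elliptical band)
            nll += 0.5 * (resid / sigma) ** 2
        else:
            mask &= dist > det_range         # no detection excludes the disc
    if not mask.any():
        return None, mask                    # regions are inconsistent
    idx = np.flatnonzero(mask)
    return grid[idx[np.argmin(nll[idx])]], mask


# Hypothetical scenario: one transmitter, a 5 x 5 sensor lattice, noise-free data.
tx = np.array([0.0, 0.0])
axis = np.arange(0.0, 101.0, 25.0)
sensors = np.array([[x, y] for x in axis for y in axis])
truth = np.array([40.0, 60.0])
det_range = 30.0
detected = np.linalg.norm(sensors - truth, axis=1) <= det_range
tdoa = np.array([excess_path(truth, tx, s) / C for s in sensors])
g = np.arange(0.0, 101.0, 1.0)
grid = np.array([[x, y] for x in g for y in g])
est, region = estimate_position(tx, sensors, detected, tdoa, grid,
                                det_range, tol=1.0, sigma=1.0)
print(est, int(region.sum()))
```

In this sketch each detecting sensor contributes two regions, a disc from the binary detection and an elliptical band from the TDOA, while each non-detecting sensor removes its disc. Intersecting the boolean masks plays the role of progressive fusion, and the likelihood is evaluated only inside the resulting common region.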