In recent years, the problem of scene classification in remote sensing has attracted considerable attention. Solutions to this important problem based on deep convolutional neural networks (CNNs) are currently state-of-the-art. So far, however, these CNNs have used images of a fixed size (typically 224×224, as is common in other fields of computer vision). In this paper, we propose a multi-scale deep CNN architecture that accepts variable image sizes. We achieve this by using multiple CNNs that share some or all of their parameters, followed by a merge layer, fully connected layers, and finally a softmax layer for classification. In each epoch, we train the network with a batch of images at every scale. We have implemented this architecture using three SqueezeNet CNNs trained on three different scales of scene images, and we carried out experiments on three well-known datasets, namely UC Merced, KSA, and AID. Preliminary results show that this multi-scale CNN converges just as well as traditional single-scale training and leads to better testing accuracy.
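The architecture described above (parameter-sharing CNN branches over multiple scales, a merge layer, then fully connected layers and softmax) can be sketched as follows. This is a minimal, hypothetical PyTorch illustration, not the paper's actual implementation: the small convolutional trunk stands in for SqueezeNet, the number of classes (21, as in UC Merced) and the three scales are illustrative, and global average pooling is assumed as the mechanism that lets one shared trunk handle variable image sizes.

```python
import torch
import torch.nn as nn

class MultiScaleCNN(nn.Module):
    """Hypothetical sketch of a multi-scale CNN classifier:
    one fully shared convolutional trunk is applied to each image
    scale, global pooling makes the per-scale features size-invariant,
    a merge (concatenation) layer joins them, and a fully connected
    layer with softmax produces the class distribution."""

    def __init__(self, num_classes=21):
        super().__init__()
        # Shared trunk: same parameters applied to every scale
        # (stand-in for SqueezeNet in the paper's setup).
        self.trunk = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        # Global average pooling absorbs variable spatial sizes.
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Merge layer concatenates the three per-scale feature vectors.
        self.fc = nn.Linear(32 * 3, num_classes)

    def forward(self, scales):
        # scales: list of three batches, one per image scale.
        feats = [self.pool(self.trunk(x)).flatten(1) for x in scales]
        merged = torch.cat(feats, dim=1)   # merge layer
        return torch.softmax(self.fc(merged), dim=1)

model = MultiScaleCNN()
# One batch of two images at each of three illustrative scales.
batch = [torch.randn(2, 3, s, s) for s in (64, 128, 224)]
out = model(batch)
print(out.shape)  # torch.Size([2, 21])
```

Full parameter sharing is shown here; partial sharing, as the abstract also allows, would replace the single trunk with per-scale branches that share only their early layers.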