For a service robot to assist humans, it must interact with objects of varying sizes and shapes in indoor environments. 3D object detection is a prerequisite for this goal, since it gives the robot the ability to perceive visual information. Most existing methods are anchor-based: they predict the bounding box closest to the ground truth among multiple candidates. However, computing Intersection over Union (IoU) and applying Non-Maximum Suppression (NMS) for each anchor box is complex. We therefore propose keypoint-based monocular 3D object detection, in which only each object's center location is needed to reconstruct the predicted 3D bounding box, without the extra computation over anchor boxes. Our 3D object detection also works well even when images are rotated by the robot's head movement. To train our network properly, the object center is defined by the projected 3D location instead of the 2D box center, taking advantage of the exact center position of the object. Furthermore, we apply data augmentation using a perspective transformation, which adds a small random perturbation to the camera rotation angle. We use the SUN RGB-D dataset, whose images were captured in indoor scenes with camera rotations, for the training and test sets. In particular, our approach reduces the object center location errors on a single image by 15.4% and 24.2%, respectively, compared to the method without data augmentation.
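The perspective-transform augmentation mentioned above can be sketched as follows. A pure camera rotation R induces a homography H = K R K^{-1} on the image plane, so warping an image with H simulates a small perturbation of the camera rotation angle. This is a minimal sketch under assumed pinhole intrinsics; the function name `rotation_homography`, the example intrinsics, and the choice of a single in-plane (roll) axis are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def rotation_homography(K, roll_deg):
    """Homography simulating a small in-plane camera rotation.

    Warping an image with H = K @ R @ inv(K) is equivalent to rotating
    the camera by R about its optical center, since a pure rotation
    induces a homography on the image plane.
    """
    theta = np.deg2rad(roll_deg)
    c, s = np.cos(theta), np.sin(theta)
    # Rotation about the optical (z) axis; in augmentation, roll_deg
    # would be drawn from a small random range (an assumption here).
    R = np.array([[c, -s, 0.0],
                  [s,  c, 0.0],
                  [0.0, 0.0, 1.0]])
    return K @ R @ np.linalg.inv(K)

# Hypothetical pinhole intrinsics: 500 px focal length, principal
# point at (320, 240).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])

H = rotation_homography(K, roll_deg=5.0)

# Map a pixel through the homography (homogeneous coordinates);
# an image-warping routine would apply H to every pixel.
p = H @ np.array([100.0, 100.0, 1.0])
p = p[:2] / p[2]
```

In practice the same homography would also be applied to the projected 3D center annotations so that labels stay consistent with the warped image.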