Electrical network frequency (ENF) signals have common patterns that can be used as signatures for identifying recorded time and location of videos and sound. To enable cost-efficient, reliable and scalable location inference, we created a reference map of ENF signals representing hundreds of locations world wide - extracting real-world ENF signals from online multimedia streaming services (e.g., YouTube and Explore). Based on this reference map of ENF signals, we propose a novel side-channel attack that can identify the physical location of where a target video or sound was recorded or streamed from. Our attack does not require any expensive ENF signal receiver nor any software to be installed on a victim»s device - all we need is the recorded video or sound files to perform the attack and they are collected from world wide web. The evaluation results show that our attack can infer the intra-grid location of the recorded audio files with an accuracy of $76$% when those files are $5$ minutes or longer. We also showed that our proposed attack works well even when video and audio data are processed within a certain distortion range with audio codecs used in real VoIP applications.