A threshold adaptation based voice query transcription scheme for music retrieval

Byeong Jun Han, Seungmin Rho, Een Jun Hwang

Research output: Contribution to journal (article)

1 Citation (Scopus)

Abstract

This paper presents a threshold adaptation based voice query transcription scheme for music information retrieval. The proposed scheme analyzes a monophonic voice signal and generates its transcription for diverse music retrieval applications. For accurate transcription, we propose several advanced features: (i) the Energetic Feature extractor (EFX) for onset, peak, and transient area detection; (ii) Modified Windowed Average Energy (MWAE), which defines multiple small but coherent windows with local threshold values and serves as the offset detector; and (iii) the Circular Average Magnitude Difference Function (CAMDF) for accurate estimation of the fundamental frequency (F0) of each frame. To evaluate the performance of the proposed scheme, we implemented a prototype music transcription system called AMT2 (Automatic Music Transcriber version 2) and carried out various experiments. In the experiments, we used the QBSH corpus [1], which was adopted for the MIREX 2006 contest data set. The experimental results show that the proposed scheme can improve transcription performance.
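The abstract names CAMDF as the per-frame F0 estimator but gives no formula. The sketch below is a minimal illustration under stated assumptions, not the authors' AMT2 implementation. It uses the textbook circular AMDF, D(tau) = sum over n of |x[(n + tau) mod N] - x[n]|, and picks the first local minimum that falls close to the global minimum. The function names, the 80-800 Hz search range, and the `tolerance` rule are all hypothetical choices made for this example.

```python
import math

def camdf(frame):
    """Circular AMDF: D(tau) = sum_n |x[(n + tau) mod N] - x[n]| for tau = 0..N-1."""
    n = len(frame)
    return [
        sum(abs(frame[(i + tau) % n] - frame[i]) for i in range(n))
        for tau in range(n)
    ]

def estimate_f0(frame, sample_rate, f_min=80.0, f_max=800.0, tolerance=0.1):
    """Estimate F0 (Hz) of one frame from the CAMDF.

    Assumption for illustration: the pitch period is taken as the first lag
    whose CAMDF value is within `tolerance` of the global minimum (relative
    to the CAMDF range), refined to the nearest local minimum. This guards
    against picking a multiple of the true period.
    """
    n = len(frame)
    lag_min = max(1, int(sample_rate / f_max))
    lag_max = min(n // 2, int(sample_rate / f_min))
    if lag_min > lag_max:
        raise ValueError("frame too short for the requested F0 range")

    d = camdf(frame)
    window = d[lag_min:lag_max + 1]
    threshold = min(window) + tolerance * (max(window) - min(window))

    # First lag in the search range that dips below the threshold.
    tau = next(t for t in range(lag_min, lag_max + 1) if d[t] <= threshold)
    # Walk forward to the bottom of that dip.
    while tau + 1 <= lag_max and d[tau + 1] < d[tau]:
        tau += 1
    return sample_rate / tau
```

As a quick check, a 400-sample frame of a 200 Hz sine sampled at 8 kHz contains exactly ten periods, so the circular difference is essentially zero at a lag of 40 samples and `estimate_f0` returns 200 Hz. Real voice frames rarely hold a whole number of periods, so the estimate there is only as good as the lag resolution allows.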

Original language: English
Pages (from-to): 445-451
Number of pages: 7
Journal: Transactions of the Korean Institute of Electrical Engineers
Volume: 59
Issue number: 2
Publication status: Published - 2010 Feb 1

Fingerprint

Transcription
Information retrieval
Experiments
Detectors

Keywords

  • Audio signal analysis
  • Music transcription
  • Note onset detection
  • Query-by-humming

ASJC Scopus subject areas

  • Electrical and Electronic Engineering

Cite this

A threshold adaptation based voice query transcription scheme for music retrieval. / Han, Byeong Jun; Rho, Seungmin; Hwang, Een Jun.

In: Transactions of the Korean Institute of Electrical Engineers, Vol. 59, No. 2, 01.02.2010, p. 445-451.

@article{89641fd43d1945e8a6c25a28aca80404,
title = "A threshold adaptation based voice query transcription scheme for music retrieval",
abstract = "This paper presents a threshold adaptation based voice query transcription scheme for music information retrieval. The proposed scheme analyzes a monophonic voice signal and generates its transcription for diverse music retrieval applications. For accurate transcription, we propose several advanced features: (i) the Energetic Feature extractor (EFX) for onset, peak, and transient area detection; (ii) Modified Windowed Average Energy (MWAE), which defines multiple small but coherent windows with local threshold values and serves as the offset detector; and (iii) the Circular Average Magnitude Difference Function (CAMDF) for accurate estimation of the fundamental frequency (F0) of each frame. To evaluate the performance of the proposed scheme, we implemented a prototype music transcription system called AMT2 (Automatic Music Transcriber version 2) and carried out various experiments. In the experiments, we used the QBSH corpus [1], which was adopted for the MIREX 2006 contest data set. The experimental results show that the proposed scheme can improve transcription performance.",
keywords = "Audio signal analysis, Music transcription, Note onset detection, Query-by-humming",
author = "Han, Byeong Jun and Rho, Seungmin and Hwang, Een Jun",
year = "2010",
month = "2",
day = "1",
language = "English",
volume = "59",
pages = "445--451",
journal = "Transactions of the Korean Institute of Electrical Engineers",
issn = "1975-8359",
publisher = "Korean Institute of Electrical Engineers",
number = "2",
}

TY - JOUR

T1 - A threshold adaptation based voice query transcription scheme for music retrieval

AU - Han, Byeong Jun

AU - Rho, Seungmin

AU - Hwang, Een Jun

PY - 2010/2/1

Y1 - 2010/2/1

AB - This paper presents a threshold adaptation based voice query transcription scheme for music information retrieval. The proposed scheme analyzes a monophonic voice signal and generates its transcription for diverse music retrieval applications. For accurate transcription, we propose several advanced features: (i) the Energetic Feature extractor (EFX) for onset, peak, and transient area detection; (ii) Modified Windowed Average Energy (MWAE), which defines multiple small but coherent windows with local threshold values and serves as the offset detector; and (iii) the Circular Average Magnitude Difference Function (CAMDF) for accurate estimation of the fundamental frequency (F0) of each frame. To evaluate the performance of the proposed scheme, we implemented a prototype music transcription system called AMT2 (Automatic Music Transcriber version 2) and carried out various experiments. In the experiments, we used the QBSH corpus [1], which was adopted for the MIREX 2006 contest data set. The experimental results show that the proposed scheme can improve transcription performance.

KW - Audio signal analysis

KW - Music transcription

KW - Note onset detection

KW - Query-by-humming

UR - http://www.scopus.com/inward/record.url?scp=77149148410&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77149148410&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:77149148410

VL - 59

SP - 445

EP - 451

JO - Transactions of the Korean Institute of Electrical Engineers

JF - Transactions of the Korean Institute of Electrical Engineers

SN - 1975-8359

IS - 2

ER -