TY - GEN
T1 - Opening the black box
T2 - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2015
AU - Vidovic, Marina M.C.
AU - Görnitz, Nico
AU - Müller, Klaus Robert
AU - Rätsch, Gunnar
AU - Kloft, Marius
PY - 2015
Y1 - 2015
N2 - This work is in the context of kernel-based learning algorithms for sequence data. We present a probabilistic approach to automatically extract, from the output of such string-kernel-based learning algorithms, the subsequences, or motifs, truly underlying the machine’s predictions. The proposed framework views motifs as free parameters in a probabilistic model, which is solved through a global optimization approach. In contrast to prevalent approaches, the proposed method can discover even difficult, long motifs, and could be combined with any kernel-based learning algorithm that is based on an adequate sequence kernel. We show that, by using a discriminative kernel machine such as a support vector machine, the approach can reveal discriminative motifs underlying the kernel predictor. We demonstrate the efficacy of our approach through a series of experiments on synthetic and real data, including problems from handwritten digit recognition and a large-scale human splice site data set from the domain of computational biology.
UR - http://www.scopus.com/inward/record.url?scp=84959327515&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84959327515&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-23525-7_9
DO - 10.1007/978-3-319-23525-7_9
M3 - Conference contribution
AN - SCOPUS:84959327515
SN - 9783319235240
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 137
EP - 153
BT - Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2015
A2 - Costa, Vitor Santos
A2 - Soares, Carlos
A2 - Appice, Annalisa
A2 - Rodrigues, Pedro Pereira
A2 - Gama, João
A2 - Jorge, Alípio
PB - Springer Verlag
Y2 - 7 September 2015 through 11 September 2015
ER -