Multivariate stream data classification using simple text classifiers

Sungbo Seo, Jaewoo Kang, Dongwon Lee, Keun Ho Ryu

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    1 Citation (Scopus)

    Abstract

    We introduce a classification framework for continuous multivariate stream data. The proposed approach works in two steps. In the preprocessing step, it takes as input a sliding window of multivariate stream data and discretizes the data in the window into a string of symbols that characterize the signal changes. In the classification step, it uses a simple text classification algorithm to classify the discretized data in the window. We evaluated both supervised and unsupervised classification algorithms. For the supervised case, we tested a Naïve Bayes model and SVM; for the unsupervised case, we tested Jaccard, TF-IDF, Jaro, and Jaro-Winkler. In our experiments, SVM and TF-IDF outperformed the other classification methods. In particular, we observed that classification accuracy improves when the correlation of attributes is considered along with the n-gram tokens of symbols.
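
The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the three-symbol alphabet (`U`, `D`, `S`), the `eps` threshold, the per-attribute token prefix, and the nearest-reference decision rule are all assumptions chosen for illustration, and the function names are hypothetical. Only the overall shape (discretize a window into symbol strings, tokenize into n-grams, compare with a text similarity such as Jaccard) follows the abstract.

```python
def discretize(window, eps=0.05):
    """Turn each attribute's series into a string of change symbols.

    Assumed alphabet: U (up), D (down), S (steady within +/- eps).
    `window` is a list with one list of floats per attribute.
    """
    symbol_strings = []
    for series in window:
        symbols = ""
        for prev, cur in zip(series, series[1:]):
            delta = cur - prev
            if delta > eps:
                symbols += "U"
            elif delta < -eps:
                symbols += "D"
            else:
                symbols += "S"
        symbol_strings.append(symbols)
    return symbol_strings


def ngram_tokens(symbol_strings, n=2):
    """Collect the set of n-gram tokens, prefixed by attribute index (an assumption)."""
    tokens = set()
    for idx, symbols in enumerate(symbol_strings):
        for i in range(len(symbols) - n + 1):
            tokens.add(f"a{idx}:{symbols[i:i + n]}")
    return tokens


def jaccard(a, b):
    """Jaccard similarity of two token sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def classify(window, references, n=2, eps=0.05):
    """Return the label of the reference window most similar to `window`.

    `references` is a list of (label, window) pairs; this nearest-reference
    rule is an illustrative assumption, not the method from the paper.
    """
    query = ngram_tokens(discretize(window, eps), n)
    best_label, best_score = None, -1.0
    for label, ref_window in references:
        score = jaccard(query, ngram_tokens(discretize(ref_window, eps), n))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


references = [
    ("rising", [[0.0, 1.0, 2.0, 3.0], [5.0, 5.0, 5.0, 5.0]]),
    ("falling", [[3.0, 2.0, 1.0, 0.0], [5.0, 5.0, 5.0, 5.0]]),
]
query_window = [[10.0, 11.5, 12.0, 14.0], [1.0, 1.0, 1.01, 1.0]]
print(classify(query_window, references))
```

Because the symbols encode the direction of change rather than absolute values, the query window matches the "rising" reference even though its value range is different. The attribute correlation mentioned in the abstract is not modeled in this sketch.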

    Original language: English
    Title of host publication: Database and Expert Systems Applications - 17th International Conference, DEXA 2006, Proceedings
    Publisher: Springer Verlag
    Pages: 420-429
    Number of pages: 10
    ISBN (Print): 3540378715, 9783540378716
    DOIs
    Publication status: Published - 2006
    Event: 17th International Conference on Database and Expert Systems Applications, DEXA 2006 - Krakow, Poland
    Duration: 2006 Sep 4 to 2006 Sep 8

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 4080 LNCS
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Other

    Other: 17th International Conference on Database and Expert Systems Applications, DEXA 2006
    Country/Territory: Poland
    City: Krakow
    Period: 06/9/4 to 06/9/8

    ASJC Scopus subject areas

    • Theoretical Computer Science
    • Computer Science(all)
