Benchmark data set for in silico prediction of Ames mutagenicity

Katja Hansen, Sebastian Mika, Timon Schroeter, Andreas Sutter, Antonius ter Laak, Thomas Steger-Hartmann, Nikolaus Heinrich, Klaus-Robert Müller

Research output: Contribution to journal (Article)

169 Citations (Scopus)

Abstract

Up to now, publicly available data sets to build and evaluate Ames mutagenicity prediction tools have been very limited in terms of size and chemical space covered. In this report we describe a new unique public Ames mutagenicity data set comprising about 6500 nonconfidential compounds (available as SMILES strings and SDF) together with their biological activity. Three commercial tools (DEREK, MultiCASE, and an off-the-shelf Bayesian machine learner in Pipeline Pilot) are compared with four noncommercial machine learning implementations (Support Vector Machines, Random Forests, k-Nearest Neighbors, and Gaussian Processes) on the new benchmark data set.
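To make one of the noncommercial methods named in the abstract concrete, the following is a minimal sketch of k-Nearest Neighbors classification with Tanimoto similarity over binary fingerprints. It is an illustration only, not the authors' implementation: the abstract does not state which descriptors or similarity measure were used, so the Tanimoto choice, the function names (`tanimoto`, `knn_predict`, `accuracy`), and the toy fingerprints in `TOY_TRAIN` are all assumptions made for this example. Labels follow the usual convention of 1 for mutagenic and 0 for nonmutagenic.

```python
from collections import Counter


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def knn_predict(query: frozenset, train: list, k: int = 3) -> int:
    """Majority vote among the k training compounds most similar to `query`."""
    ranked = sorted(train, key=lambda item: tanimoto(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(test: list, train: list, k: int = 3) -> float:
    """Fraction of test compounds whose predicted label matches the true label."""
    hits = sum(knn_predict(fp, train, k) == label for fp, label in test)
    return hits / len(test)


# Hypothetical toy data: (fingerprint bits, Ames label). These are made-up
# values for illustration and are NOT taken from the benchmark data set.
TOY_TRAIN = [
    (frozenset({1, 2, 3}), 1),
    (frozenset({1, 2, 4}), 1),
    (frozenset({2, 3, 5}), 1),
    (frozenset({7, 8, 9}), 0),
    (frozenset({7, 9, 10}), 0),
    (frozenset({8, 9, 11}), 0),
]

if __name__ == "__main__":
    toy_test = [(frozenset({1, 2, 5}), 1), (frozenset({7, 8, 10}), 0)]
    print("toy accuracy:", accuracy(toy_test, TOY_TRAIN))
```

In practice the fingerprints would be computed from the SMILES strings with a cheminformatics toolkit, and performance would be estimated on held-out compounds rather than on a handful of hand-written examples.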

Original language: English
Pages (from-to): 2077-2081
Number of pages: 5
Journal: Journal of Chemical Information and Modeling
Volume: 49
Issue number: 9
DOIs: https://doi.org/10.1021/ci900161g
Publication status: Published - 2009 Sep 28
Externally published: Yes

ASJC Scopus subject areas

  • Chemistry(all)
  • Chemical Engineering(all)
  • Computer Science Applications
  • Library and Information Sciences

Cite this

Hansen, K., Mika, S., Schroeter, T., Sutter, A., ter Laak, A., Steger-Hartmann, T., Heinrich, N., & Müller, K.-R. (2009). Benchmark data set for in silico prediction of Ames mutagenicity. Journal of Chemical Information and Modeling, 49(9), 2077-2081. https://doi.org/10.1021/ci900161g