Estimation of sparse directed acyclic graphs for multivariate counts data

Sung Won Han, Hua Zhong

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

Next-generation sequencing data, also called high-throughput sequencing data, are recorded as counts, which are generally far from normally distributed. Under the assumption that the count data follow the Poisson log-normal distribution, this article provides an L1-penalized likelihood framework and an efficient search algorithm to estimate the structure of sparse directed acyclic graphs (DAGs) for multivariate counts data. In searching for the solution, we use iterative optimization procedures to estimate the adjacency matrix and the variance matrix of the latent variables. Simulation results show that the proposed method outperforms both the approach that assumes multivariate normal distributions and the log-transformation approach. They also show that the proposed method outperforms the rank-based PC method under sparse network or hub network structures. As a real data example, we demonstrate the efficiency of the proposed method in estimating gene regulatory networks from an ovarian cancer study.
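
The data-generating assumption named in the abstract can be illustrated with a minimal simulation sketch. It assumes one common parameterization of a Poisson log-normal model, in which each observed count is Poisson with log-rate equal to a latent Gaussian variable, and the latent variables follow linear structural equations along a DAG. The function name `simulate_pln_dag`, its parameters, and the example edge weights are hypothetical and are not taken from the article, whose exact parameterization is not given in the abstract.

```python
import numpy as np


def simulate_pln_dag(adjacency, latent_sd, intercept, n_samples, seed=0):
    """Draw counts from a Poisson log-normal model with a linear-DAG latent layer.

    Assumption (not from the article): adjacency[k, j] is the weight of the
    edge k -> j, and nodes are listed in topological order, so the matrix is
    strictly upper triangular.
    """
    rng = np.random.default_rng(seed)
    adjacency = np.asarray(adjacency, dtype=float)
    p = adjacency.shape[0]
    if not np.allclose(adjacency, np.triu(adjacency, k=1)):
        raise ValueError("adjacency must be strictly upper triangular")

    latent = np.zeros((n_samples, p))
    for j in range(p):
        # Each latent variable is a linear function of its parents plus noise.
        noise = rng.normal(0.0, latent_sd[j], size=n_samples)
        latent[:, j] = intercept[j] + latent[:, :j] @ adjacency[:j, j] + noise

    # Observed counts are conditionally Poisson with rate exp(latent).
    counts = rng.poisson(np.exp(latent))
    return latent, counts


# Hypothetical three-node chain: 0 -> 1 -> 2.
B = np.array([[0.0, 0.8, 0.0],
              [0.0, 0.0, -0.5],
              [0.0, 0.0, 0.0]])
latent, counts = simulate_pln_dag(
    B, latent_sd=[0.5, 0.5, 0.5], intercept=[1.0, 1.0, 1.0], n_samples=5000
)
print("mean:    ", counts.mean(axis=0))
print("variance:", counts.var(axis=0))
```

Under this assumed model the marginal variance of each count exceeds its mean (overdispersion), which is one reason such counts are far from normally distributed and why a log-transformation followed by a Gaussian model is only an approximation.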

Original language: English
Pages (from-to): 791-803
Number of pages: 13
Journal: Biometrics
Volume: 72
Issue number: 3
Publication status: Published - 2016 Sep 1
Externally published: Yes

Keywords

  • Bayesian network
  • Count data
  • Directed acyclic graph
  • Lasso estimation
  • Penalized likelihood estimation
  • Unknown variable ordering

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry, Genetics and Molecular Biology (all)
  • Immunology and Microbiology (all)
  • Agricultural and Biological Sciences (all)
  • Applied Mathematics
