An efficient method for maintaining data cubes incrementally

Ki Yong Lee, Yon Dohn Chung, Myoung Ho Kim

Research output: Contribution to journal › Article

8 Citations (Scopus)

Abstract

The data cube operator computes group-bys for all possible combinations of a set of dimension attributes. Since computing a data cube typically incurs a considerable cost, the data cube is often precomputed and stored as materialized views in data warehouses. A materialized data cube needs to be updated when the source relations are changed. The incremental maintenance of a data cube is to compute and propagate only its changes, rather than recompute the entire data cube from scratch. For n dimension attributes, the data cube consists of $2^n$ group-bys, each of which is called a cuboid. To incrementally maintain a data cube with $2^n$ cuboids, the conventional methods compute $2^n$ delta cuboids, each of which represents the change of a cuboid. In this paper, we propose an efficient incremental maintenance method that can maintain a data cube using only a subset of $2^n$ delta cuboids. We formulate an optimization problem to find the optimal subset of $2^n$ delta cuboids that minimizes the total maintenance cost, and propose a heuristic solution that allows us to maintain a data cube using only $\binom{n}{\lceil n/2 \rceil}$ delta cuboids. As a result, the cost of maintaining a data cube is substantially reduced. Through various experiments, we show the performance advantages of the proposed method over the conventional methods. We also extend the proposed method to handle partially materialized cubes and dimension hierarchies.
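The counts in the abstract, and the conventional scheme it describes, can be illustrated with a small sketch. It is not the authors' algorithm: it only enumerates the cuboids of a toy cube, applies one delta cuboid per cuboid (the conventional approach), and prints the number of delta cuboids the abstract states for the proposed heuristic. The dimension names, the `sales` measure, the SUM aggregate, and the insert-only change set are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations
from math import ceil, comb


def cuboids(dims):
    """Yield every subset of the dimension attributes (one subset per cuboid)."""
    for k in range(len(dims) + 1):
        yield from combinations(dims, k)


def aggregate(rows, group_by, measure):
    """Compute SUM(measure) grouped by the given attributes."""
    out = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in group_by)
        out[key] += row[measure]
    return dict(out)


def apply_delta(cuboid, delta):
    """Merge a delta cuboid into a materialized cuboid in place."""
    for key, change in delta.items():
        cuboid[key] = cuboid.get(key, 0) + change
    return cuboid


# Hypothetical dimension attributes and rows, chosen only for the example.
dims = ("product", "store", "day")
n = len(dims)
all_cuboids = list(cuboids(dims))

base = [
    {"product": "p1", "store": "s1", "day": "d1", "sales": 10},
    {"product": "p1", "store": "s2", "day": "d1", "sales": 5},
]
inserted = [{"product": "p1", "store": "s1", "day": "d2", "sales": 7}]

# Materialize the cube, then maintain it conventionally:
# one delta cuboid is computed from the inserted rows for every cuboid.
cube = {g: aggregate(base, g, "sales") for g in all_cuboids}
for g in all_cuboids:
    apply_delta(cube[g], aggregate(inserted, g, "sales"))

# Incremental maintenance must agree with recomputation from scratch.
recomputed = {g: aggregate(base + inserted, g, "sales") for g in all_cuboids}
assert cube == recomputed

# Delta cuboids computed conventionally vs. the count given in the abstract.
conventional = 2 ** n
stated_for_heuristic = comb(n, ceil(n / 2))
print(conventional, stated_for_heuristic)
```

For the three assumed dimensions the conventional scheme computes 8 delta cuboids, while the count stated in the abstract is 3. How the proposed method chooses its subset and propagates changes to the remaining cuboids is described in the paper, not in this sketch.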

Original language: English
Pages (from-to): 928-948
Number of pages: 21
Journal: Information Sciences
Volume: 180
Issue number: 6
DOI: 10.1016/j.ins.2009.11.037
Publication status: Published - 2010 Mar 15

Keywords

  • Data cube
  • Data warehouse
  • Materialized view
  • OLAP

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Theoretical Computer Science
  • Computer Science Applications
  • Information Systems and Management

Cite this

An efficient method for maintaining data cubes incrementally. / Lee, Ki Yong; Chung, Yon Dohn; Kim, Myoung Ho.

In: Information Sciences, Vol. 180, No. 6, 15.03.2010, p. 928-948.

@article{9575a5336f4a4d43902bed96eda9a16b,
title = "An efficient method for maintaining data cubes incrementally",
abstract = "The data cube operator computes group-bys for all possible combinations of a set of dimension attributes. Since computing a data cube typically incurs a considerable cost, the data cube is often precomputed and stored as materialized views in data warehouses. A materialized data cube needs to be updated when the source relations are changed. The incremental maintenance of a data cube is to compute and propagate only its changes, rather than recompute the entire data cube from scratch. For n dimension attributes, the data cube consists of $2^n$ group-bys, each of which is called a cuboid. To incrementally maintain a data cube with $2^n$ cuboids, the conventional methods compute $2^n$ delta cuboids, each of which represents the change of a cuboid. In this paper, we propose an efficient incremental maintenance method that can maintain a data cube using only a subset of $2^n$ delta cuboids. We formulate an optimization problem to find the optimal subset of $2^n$ delta cuboids that minimizes the total maintenance cost, and propose a heuristic solution that allows us to maintain a data cube using only $\binom{n}{\lceil n/2 \rceil}$ delta cuboids. As a result, the cost of maintaining a data cube is substantially reduced. Through various experiments, we show the performance advantages of the proposed method over the conventional methods. We also extend the proposed method to handle partially materialized cubes and dimension hierarchies.",
keywords = "Data cube, Data warehouse, Materialized view, OLAP",
author = "Lee, {Ki Yong} and Chung, {Yon Dohn} and Kim, {Myoung Ho}",
year = "2010",
month = "3",
day = "15",
doi = "10.1016/j.ins.2009.11.037",
language = "English",
volume = "180",
pages = "928--948",
journal = "Information Sciences",
issn = "0020-0255",
publisher = "Elsevier Inc.",
number = "6",

}

TY - JOUR

T1 - An efficient method for maintaining data cubes incrementally

AU - Lee, Ki Yong

AU - Chung, Yon Dohn

AU - Kim, Myoung Ho

PY - 2010/3/15

Y1 - 2010/3/15

AB - The data cube operator computes group-bys for all possible combinations of a set of dimension attributes. Since computing a data cube typically incurs a considerable cost, the data cube is often precomputed and stored as materialized views in data warehouses. A materialized data cube needs to be updated when the source relations are changed. The incremental maintenance of a data cube is to compute and propagate only its changes, rather than recompute the entire data cube from scratch. For n dimension attributes, the data cube consists of $2^n$ group-bys, each of which is called a cuboid. To incrementally maintain a data cube with $2^n$ cuboids, the conventional methods compute $2^n$ delta cuboids, each of which represents the change of a cuboid. In this paper, we propose an efficient incremental maintenance method that can maintain a data cube using only a subset of $2^n$ delta cuboids. We formulate an optimization problem to find the optimal subset of $2^n$ delta cuboids that minimizes the total maintenance cost, and propose a heuristic solution that allows us to maintain a data cube using only $\binom{n}{\lceil n/2 \rceil}$ delta cuboids. As a result, the cost of maintaining a data cube is substantially reduced. Through various experiments, we show the performance advantages of the proposed method over the conventional methods. We also extend the proposed method to handle partially materialized cubes and dimension hierarchies.

KW - Data cube

KW - Data warehouse

KW - Materialized view

KW - OLAP

UR - http://www.scopus.com/inward/record.url?scp=73149085051&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=73149085051&partnerID=8YFLogxK

U2 - 10.1016/j.ins.2009.11.037

DO - 10.1016/j.ins.2009.11.037

M3 - Article

VL - 180

SP - 928

EP - 948

JO - Information Sciences

JF - Information Sciences

SN - 0020-0255

IS - 6

ER -