A cost model for long-term compressed data retention
Liao, Kewen ; Moffat, Alistair ; Petri, Matthias ; Wirth, Anthony
Abstract
Vast amounts of data are collected and stored every day, as part of corporate knowledge bases and as a response to legislative compliance requirements. To reduce the cost of retaining such data, compression tools are often applied. But simply seeking the best compression ratio is not necessarily the most economical choice, and other factors also come into play, including compression and decompression throughput, the main memory required to support a given level of on-going access to the stored data, and the types of storage available. Here we develop a model for the total retention cost (TRC) of a data archiving regime, and by applying the charging rates associated with a cloud computing provider, we are able to derive dollar amounts for a range of compression options, and hence guide the development of new approaches that are more cost-effective than current mechanisms. In particular, we describe an enhancement to the Relative Lempel-Ziv (RLZ) compression scheme, and show that, measured by TRC, it provides more economical long-term data retention than previous approaches.
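The trade-off described in the abstract can be illustrated with a minimal sketch. This is not the TRC model from the paper: the three cost components (storage, compute, memory), the function name `retention_cost`, and every rate and workload figure below are hypothetical values chosen only to show how a method with a better compression ratio can still cost more to retain.

```python
def retention_cost(stored_gb, cpu_hours_once, cpu_hours_per_year, ram_gb, years,
                   *, storage_per_gb_year, cpu_per_hour, ram_per_gb_year):
    """Toy retention cost in dollars: storage + compute + memory.

    All parameters are illustrative assumptions, not the paper's model.
    """
    # Cost of keeping the compressed data on disk for the whole period.
    storage = stored_gb * storage_per_gb_year * years
    # One-off compression time plus recurring decompression time.
    compute = (cpu_hours_once + cpu_hours_per_year * years) * cpu_per_hour
    # Main memory held to support on-going access to the archive.
    memory = ram_gb * ram_per_gb_year * years
    return storage + compute + memory


# Hypothetical charging rates (not taken from any real provider).
rates = dict(storage_per_gb_year=0.25, cpu_per_hour=0.10, ram_per_gb_year=40.0)

# Hypothetical "strong" compressor: smallest output, but slow and memory-hungry.
strong = retention_cost(100, 500, 200, 4, years=5, **rates)

# Hypothetical "fast" compressor: 50% larger output, but cheap to run.
fast = retention_cost(150, 50, 20, 1, years=5, **rates)

print(f"strong: ${strong:.2f}  fast: ${fast:.2f}")
```

Under these assumed figures the memory term dominates, so the method with the weaker compression ratio is the cheaper one to retain; with different rates the ordering could reverse, which is why a cost model is needed rather than a single ratio.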
Date
2017
Type
Conference item
Journal
Proceedings of the Tenth ACM International Conference on Web Search and Data Mining (WSDM '17)
Page Range
241-249
ACU Department
Peter Faber Business School
Faculty of Law and Business
File Access
Controlled
