A cost model for long-term compressed data retention

Conference item


Liao, Kewen, Moffat, Alistair, Petri, Matthias and Wirth, Anthony. (2017). A cost model for long-term compressed data retention. 10th International Conference on Web Search and Data Mining. United States of America: Association for Computing Machinery. pp. 241 - 249 https://doi.org/10.1145/3018661.3018738
AuthorsLiao, Kewen, Moffat, Alistair, Petri, Matthias and Wirth, Anthony
Abstract

Vast amounts of data are collected and stored every day, as part of corporate knowledge bases and as a response to legislative compliance requirements. To reduce the cost of retaining such data, compression tools are often applied. But simply seeking the best compression ratio is not necessarily the most economical choice, and other factors also come in to play, including compression and decompression throughput, the main memory required to support a given level of on-going access to the stored data, and the types of storage available. Here we develop a model for the total retention cost (TRC) of a data archiving regime, and by applying the charging rates associated with a cloud computing provider, are able to derive dollar amounts for a range of compression options, and hence guide the development of new approaches that are more cost-effective than current mechanisms. In particular, we describe an enhancement to the Relative Lempel Ziv (RLZ) compression scheme, and show that in terms of TRC, it outperforms previous approaches in terms of providing economical long-term data retention

Year2017
JournalProceedings of the Tenth ACM International Conference on Web Search and Data Mining (WSDM '17)
PublisherAssociation for Computing Machinery
Digital Object Identifier (DOI)https://doi.org/10.1145/3018661.3018738
Publisher's version
File Access Level
Controlled
Page range241 - 249
Research GroupPeter Faber Business School
Place of publicationUnited States of America
Permalink -

https://acuresearchbank.acu.edu.au/item/863v2/a-cost-model-for-long-term-compressed-data-retention

Restricted files

Publisher's version

  • 6
    total views
  • 0
    total downloads
  • 0
    views this month
  • 0
    downloads this month
These values are for the period from 19th October 2020, when this repository was created.

Export as

Related outputs

Understanding the effects of real-time sentiment analysis and morale visualisation in backchannel systems : A case study
Wyeld, Theodor, Jiranantanagorn, Peerumporn, Shen, Haifeng, Liao, Kewen and Bednarz, Tomasz. (2021). Understanding the effects of real-time sentiment analysis and morale visualisation in backchannel systems : A case study. International Journal of Human-Computer Studies. 145, p. Article 102524. https://doi.org/10.1016/j.ijhcs.2020.102524
Human-AI interactive and continuous sensemaking : A case study of image classification using scribble attention maps
Shen, Haifeng, Liao, Kewen, Liao, Zhibin, Doornberg, Job, Qiao, Maoying, van den Hengel, Anton and Verjans, Johan W.. (2021). Human-AI interactive and continuous sensemaking : A case study of image classification using scribble attention maps. CHI Conference on Human Factors in Computing Systems. Virtual 08 - 13 May 2021 pp. 1-8 https://doi.org/10.1145/3411763.3451798
Inferring location types with geo-social-temporal pattern mining
Anwar, Tarique, Liao, Kewen, Goyal, Angelic, Sellis, Timos, Kayes, A. S. M. and Shen, Haifeng. (2020). Inferring location types with geo-social-temporal pattern mining. IEEE Access. 8, pp. 154789-154799. https://doi.org/10.1109/ACCESS.2020.3018997
Temporal betweenness centrality in dynamic graphs
Tsalouchidou, Ioanna, Baeza-Yates, Ricardo, Bonchi, Francesco, Liao, Kewen and Sellis, Timos. (2020). Temporal betweenness centrality in dynamic graphs. International Journal of Data Science and Analytics. 9(3), pp. 257 - 272. https://doi.org/10.1007/s41060-019-00189-x
Performance-aware cost-effective resource provisioning for future grid iot-cloud system
Li, Weiling, Liao, Kewen, He, Qiang and Xia, Yunni. (2019). Performance-aware cost-effective resource provisioning for future grid iot-cloud system. Journal of Energy Engineering. 145(5), pp. 2-13. https://doi.org/10.1061/(ASCE)EY.1943-7897.0000611
On effective and efficient graph edge labeling
Goonetilleke, Oshini, Koutra, Danai, Liao, Kewen and Sellis, Timos. (2019). On effective and efficient graph edge labeling. Distributed and Parallel Databases. 37(1), pp. 5-38. https://doi.org/10.1007/s10619-018-7234-4
Contextual community search over large social networks
Chen, Lu, Liu, Chengfei, Liao, Kewen, Li, Jianxin and Zhou, Rui. (2019). Contextual community search over large social networks. 35th International Conference on Data Engineering (ICDE). Macao, China: IEEE Computer Society. pp. 88 - 99 https://doi.org/10.1109/ICDE.2019.00017
Efficient algorithms for flexible sweep coverage in crowdsensing
Huang, Peihuang, Zhu, Wenxing, Liao, Kewen, Sellis, Timos, Yu, Zhiyong and Guo, Longkun. (2018). Efficient algorithms for flexible sweep coverage in crowdsensing. IEEE Access. 6, pp. 50055-50065. https://doi.org/10.1109/ACCESS.2018.2868931
Classification and annotation of open internet of things datastreams
Montori, Federico, Liao, Kewen, Jayaraman, Prem Prakash, Bononi, Luciano, Sellis, Timos and Georgakopoulos, Dimitrios. (2018). Classification and annotation of open internet of things datastreams. 19th International Conference, Web Information Systems Engineering, WISE 2018. Dubai, United Arab Emirates 12 2018 - 15 Nov 2019 Springer. pp. 209-224 https://doi.org/10.1007/978-3-030-02925-8_15
A fast algorithm for optimally finding partially disjoint shortest paths
Guo, Longkun, Deng, Yunyun, Liao, Kewen, He, Qiang, Sellis, Timos and Hu, Zheshan. (2018). A fast algorithm for optimally finding partially disjoint shortest paths. Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18), Stockholm, Sweden, July 13-19, 2018. Stockholm, Sweden: International Joint Conferences on Artificial Intelligence. pp. 1456 - 1462 https://doi.org/10.24963/ijcai.2018/202
Edge labeling schemes for graph data
Goonetilleke, Oshini, Koutra, Danai, Sellis, Timos and Liao, Kewen. (2017). Edge labeling schemes for graph data. 29th International Conference on Scientific and Statistical Database Management. United States of America: Association for Computing Machinery. pp. 1 - 12 https://doi.org/10.1145/3085504.3085516
Effective construction of relative Lempel-Ziv dictionaries
Liao, Kewen, Petri, Matthias, Moffat, Alistair and Wirth, Anthony. (2016). Effective construction of relative Lempel-Ziv dictionaries. WWW '16: 25th International World Wide Web Conference. Switzerland: Association for Computing Machinery. pp. 807 - 816 https://doi.org/10.1145/2872427.2883042
Improved approximation algorithms for constrained fault-tolerant resource allocation
Liao, Kewen, Shen, Hong and Guo, Longkun. (2015). Improved approximation algorithms for constrained fault-tolerant resource allocation. Theoretical Computer Science. 590, pp. 118-128. https://doi.org/10.1016/j.tcs.2015.02.029
Improved approximation algorithms for computing k disjoint paths subject to two constraints
Guo, Longkun, Shen, Hong and Liao, Kewen. (2015). Improved approximation algorithms for computing k disjoint paths subject to two constraints. Journal of Combinatorial Optimization. 29(1), pp. 153-164. https://doi.org/10.1007/s10878-013-9693-x
Brief announcement : Efficient approximation algorithms for computing k disjoint restricted shortest paths
Guo, Longkun, Liao, Kewen, Shen, Hong and Li, Peng. (2015). Brief announcement : Efficient approximation algorithms for computing k disjoint restricted shortest paths. 27th ACM Symposium on Parallelism in Algorithms and Architectures. Portland, Oregon, United States of America 13 - 15 Jun 2015 Association for Computing Machinery. pp. 62–64 https://doi.org/10.1145/2755573.2755608
LP-based approximation algorithms for reliable resource allocation
Liao, Kewen and Shen, Hong. (2014). LP-based approximation algorithms for reliable resource allocation. The Computer Journal. 57(1), pp. 154-164. https://doi.org/10.1093/comjnl/bxs164
On the shallow-light Steiner tree problem
Guo, Longkun, Liao, Kewen and Shen, Hong. (2014). On the shallow-light Steiner tree problem. 15th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT 2014). Hong Kong, China 09 - 14 Dec 2014 IEEE Computer Society. pp. 56-60 https://doi.org/10.1109/PDCAT.2014.17
Effect of off-zenith observations on reducing the impact of precipitation on ground-based microwave radiometer measurement accuracy
Xu, Guirong, Stick Ware, Randolph, Zhang, Wengang, Feng, Guangliu, Liao, Kewen and Liu, Yibing. (2014). Effect of off-zenith observations on reducing the impact of precipitation on ground-based microwave radiometer measurement accuracy. Atmospheric Research. 140-141, pp. 85 - 94. https://doi.org/10.1016/j.atmosres.2014.01.021