Hyperprov: Blockchain-based Data Provenance using Hyperledger Fabric
With data-intensive computing advancing the state of the art in varied fields, data provenance and lineage remain formidable challenges for integrity and reproducibility in research and applications. This is particularly challenging in distributed scenarios, where data may originate from decentralized sources without centralized control by a single trusted entity. To date, most data provenance systems are specific to particular domains and are often centralized. Distributed ledgers such as blockchains have proved popular and effective in addressing trust and consensus without central control. A few recent proposals employ blockchains for data provenance; however, they use public blockchains, where proposing transactions requires currency. We present HyperProv, a general framework for data provenance based on the permissioned blockchain Hyperledger Fabric (HLF) and, to the best of our knowledge, the first provenance system ported to ARM-based devices such as the Raspberry Pi (RPi). HyperProv records operation history and data lineage by tracking checksums, editors, timestamps, data pointers, dependencies, and more. Provenance data is stored and retrieved through a NodeJS client library that simplifies interactions with the blockchain. HyperProv has a set of built-in queries using smart contracts that enable lightweight retrieval of large collections of provenance data. We evaluate the throughput, latency, and resource consumption of HyperProv on x86-64 desktop machines as well as on RPi, demonstrating the feasibility of using HyperProv on RPi for tamper-proof data provenance, which is particularly useful for Internet of Things use cases.
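To make the recorded fields concrete, the following is a minimal sketch of a provenance record that tracks a checksum, editor, timestamp, data pointer, and dependencies, with a lineage query over the dependency links. It is not HyperProv's API: the names ProvenanceRecord, InMemoryLedger, store, verify, and lineage are hypothetical, SHA-256 is an assumed checksum choice, and a plain in-memory dictionary stands in for the Hyperledger Fabric ledger and its smart contracts.

```python
import hashlib
import time
from dataclasses import dataclass, field


@dataclass
class ProvenanceRecord:
    """One provenance entry (hypothetical layout, not HyperProv's schema)."""
    key: str            # identifier of the data item
    checksum: str       # SHA-256 hex digest of the data (assumed algorithm)
    editor: str         # who performed the operation
    timestamp: float    # seconds since the epoch
    pointer: str        # where the data itself is stored (kept off-ledger)
    dependencies: list = field(default_factory=list)  # keys this item derives from


class InMemoryLedger:
    """Append-only stand-in for the blockchain; keeps every version per key."""

    def __init__(self):
        self._history = {}

    def store(self, key, data, editor, pointer, dependencies=()):
        # Refuse dependencies that were never recorded.
        for dep in dependencies:
            if dep not in self._history:
                raise KeyError(f"unknown dependency: {dep}")
        record = ProvenanceRecord(
            key=key,
            checksum=hashlib.sha256(data).hexdigest(),
            editor=editor,
            timestamp=time.time(),
            pointer=pointer,
            dependencies=list(dependencies),
        )
        # Earlier versions are never overwritten, only appended to.
        self._history.setdefault(key, []).append(record)
        return record

    def latest(self, key):
        return self._history[key][-1]

    def history(self, key):
        return list(self._history[key])

    def verify(self, key, data):
        """Check data against the most recently recorded checksum."""
        return self.latest(key).checksum == hashlib.sha256(data).hexdigest()

    def lineage(self, key):
        """Return every key that `key` transitively depends on."""
        seen = []
        stack = list(self.latest(key).dependencies)
        while stack:
            dep = stack.pop()
            if dep not in seen:
                seen.append(dep)
                stack.extend(self.latest(dep).dependencies)
        return seen
```

In this sketch only the checksum and a pointer are recorded, so the ledger stays small regardless of the size of the data, and verifying a file amounts to recomputing its digest and comparing it with the latest record.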
Publisher: UiT Norges arktiske universitet
UiT The Arctic University of Norway