What is the Data Value Chain?
The data value chain describes the full data lifecycle from collection to analysis and usage. In other words, it categorizes all of the various steps required to transform raw data into useful insights. As explained on Open Data Watch, “The value chain describes connections between each step that change low-value inputs into high-value outputs. Although it has a logical flow, from start to finish, a value chain has no theory: it is a pragmatic construct.”
Though the terminology used to label the components of the data value chain varies from institution to institution, the chain is typically broken down into five key categories:
- Data Capture & Acquisition - This refers to the collection of raw data from both internal and external sources. The first phase of data collection involves identifying what data to collect and then establishing a process to do so (e.g. conducting a survey or retrieving automated IoT data). Decisions made here will affect the quality and usability of data throughout its lifecycle.
- Data Processing & Cleansing - Bad data in equals bad insights out, so once data is collected, it must be processed, organized and cleansed. This involves cleaning data (identifying and correcting corrupt, inaccurate, or irrelevant data) as well as converting raw data into a format that is usable, easy to integrate and machine readable.
- Data Curation, Integration & Enrichment - Data curation and integration refer to the collection of processes required to merge data from multiple sources into one cohesive dataset. During this process, data is also enriched, meaning that contextual metadata (the data that makes larger datasets discoverable) is added or updated.
- Data Analysis - Now that data has been cleansed, labeled and primed for usage, the real fun can begin. Datasets can now be analyzed to uncover trends, patterns and other insights that can enhance decision making.
- Data ROI or Monetization - The final step of the process is the application of data analytics to solve real-world problems and, in a business setting, increase revenue. This can be done either by using data analytics to optimize the efficiency of internal operations and decrease overhead costs or by using data-driven insights to identify and exploit new revenue streams.
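To make the middle stages of the chain more concrete, here is a minimal Python sketch of cleansing, integration, enrichment and analysis on a toy dataset. Everything in it is an illustrative assumption rather than a prescribed implementation: the sample survey and sensor rows, the field names (site_id, region, satisfaction, temperature_c) and the function names clean, integrate, enrich and analyze are all hypothetical.

```python
from statistics import mean

# Hypothetical captured data: survey rows and IoT sensor rows keyed by site_id.
SURVEY_ROWS = [
    {"site_id": "A1", "region": " north ", "satisfaction": "4"},
    {"site_id": "B2", "region": "South", "satisfaction": "n/a"},  # corrupt score
    {"site_id": "C3", "region": "south", "satisfaction": "2"},
]
SENSOR_ROWS = [
    {"site_id": "A1", "temperature_c": 21.5},
    {"site_id": "C3", "temperature_c": 25.0},
]


def clean(rows):
    """Cleansing: drop rows with non-numeric scores and normalize text fields."""
    cleaned = []
    for row in rows:
        if not row["satisfaction"].isdigit():
            continue  # skip corrupt or missing scores
        cleaned.append({
            "site_id": row["site_id"],
            "region": row["region"].strip().lower(),
            "satisfaction": int(row["satisfaction"]),
        })
    return cleaned


def integrate(survey, sensors):
    """Integration: merge both sources on site_id into one record per site."""
    temps = {s["site_id"]: s["temperature_c"] for s in sensors}
    return [{**row, "temperature_c": temps.get(row["site_id"])} for row in survey]


def enrich(records, source):
    """Enrichment: attach contextual metadata describing the dataset."""
    metadata = {
        "source": source,
        "row_count": len(records),
        "fields": sorted(records[0]) if records else [],
    }
    return {"metadata": metadata, "records": records}


def analyze(dataset):
    """Analysis: average satisfaction score per region."""
    by_region = {}
    for rec in dataset["records"]:
        by_region.setdefault(rec["region"], []).append(rec["satisfaction"])
    return {region: mean(scores) for region, scores in by_region.items()}


if __name__ == "__main__":
    dataset = enrich(integrate(clean(SURVEY_ROWS), SENSOR_ROWS), source="survey+iot")
    print(dataset["metadata"])
    print(analyze(dataset))
```

Keeping each stage as its own small function mirrors the chain itself: the output of one step is the input to the next, so a quality problem can be traced back to the stage that introduced it.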
In addition, the data value chain is more than just an outline of technical steps; achieving ROI with data requires significant cultural changes as well. Cultivating data literacy amongst non-technical users and promoting data democratization are also key parts of the success equation.
*Image sourced from "The Data Value Chain: Moving from Production to Impact" - https://opendatawatch.com/publications/the-data-value-chain-moving-from-production-to-impact/
Data Value Chain Optimization
Outlining and visualizing your own data value chain can help you identify performance gaps as well as establish a vision for your future state.
When it comes to unlocking the power of advanced analytics and artificial intelligence (AI), maximizing the performance of your data value chain is vital. The infrastructure required to support such initiatives must be high velocity and deliver low, predictable latency both in capturing data and in executing queries. It also has to handle very high transaction volumes, often in a distributed environment, and support flexible and dynamic data structures.