Big Data Pipeline Discovery through Process Mining: Challenges and Research Directions

Simone Agostinelli
2021-01-01

Abstract

Big Data pipelines are essential for leveraging Dark Data, i.e., data that is collected but never used or turned into value. However, tapping their potential requires going beyond the current approaches and frameworks for managing their life-cycle. In this paper, we present the challenges associated with the Pipeline Discovery task, which aims to learn the structure of a Big Data pipeline by extracting, processing, and interpreting the large amounts of event data produced by multiple data sources. We then discuss how traditional Process Mining solutions could be employed and customized to overcome these challenges, and outline a research agenda for future work in this area.
2021
Big Data Pipeline
Pipeline Discovery
Process Mining

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12606/22617