Data Provenance and its Applications
Lunch will be served at 11:45 am.
Data Provenance, also referred to as Data Lineage, is metadata that describes where a digital artifact came from. People have argued that such metadata is useful for myriad applications, such as reproducibility, forensic analysis, intrusion detection, data retention, and regulatory compliance. Unfortunately, the vast majority of work in the area focuses on standardization and collection, not applications. As a result, adoption of provenance in industry has been practically nonexistent.
I'll present a short background and history of research on data provenance, followed by a discussion of some real applications that we have developed or are developing, some challenges in building powerful provenance-based applications, and speculation about avenues for further research.
SPEAKER BIO SUMMARY:
Margo Seltzer (https://www.seltzer.com/margo/) is Canada 150 Research Chair in Computer Systems and the Cheriton Family Chair in Computer Science at the University of British Columbia. Her research interests are in systems, construed quite broadly: systems for capturing and accessing data provenance, file systems, databases, transaction processing systems, storage and analysis of graph-structured data, and systems for constructing optimal and interpretable machine learning models. Read more: https://bit.ly/dukecs-3apr2023