Wednesday, August 28, 2019 at 9:21pm.
DAMA PDX Special Session: Making Data Lakes more Reliable with Apache Spark and Delta Lakes
Axian
9600 SW Nimbus Ave
Suite 200
Beaverton, OR 97008
Description
Making Data Lakes more Reliable with Apache Spark and Delta Lakes
Delta Lake is an open-source storage layer that brings reliability to data lakes. It offers ACID transactions and scalable metadata handling, and it unifies streaming and batch data processing. It runs on top of your existing data lake and is fully compatible with Apache Spark APIs. In this talk, we will cover:
• The technical aspects of Delta Lake's features
• What’s coming
• How to get started using it
• How to contribute
About the Speaker
Tathagata Das is an Apache Spark committer and a member of the PMC. He’s the lead developer behind Spark Streaming and currently works on Delta Lake and Structured Streaming. Previously, he was a grad student at the UC Berkeley AMPLab, where he conducted research on data-center frameworks and networks with Scott Shenker and Ion Stoica.