
Data Lakes and Analytics on Cloud

December 3, 2020, 10:30 AM IST


For years, companies have dumped data into their data lakes and deferred organizing it until later. To be useful for modern analytics, that data must be carefully structured and cataloged. And as new data arrives rapidly from log files and IoT sources, with all the volume, velocity, and variety that implies, it must be structured in the same way as it lands in the data lake. Delta Lake on Databricks provides a way to streamline these data pipelines so the data is instantly available for analysis.

  • Understand what a raw data lake is and why ACID transactions, as provided by Delta Lake, have become important for the modern data lake.
  • Understand Delta Lake time travel and schema enforcement.
  • See how Delta Lake unifies streaming and batch processing.
  • Get an overview of Spark’s metadata management in Delta Lake.
  • Understand what can go wrong with a traditional data lake and how Delta Lake can prevent a data lake from becoming a data swamp.


Anirban Ghatak

Senior Consultant (Data Science and Big Data Engineering), StackRoute
