Data Lake vs Data Warehouse
I'd been using the terms data lake and data warehouse pretty much interchangeably until I recently learned the difference. In case others are in the same boat, I wanted to share what I learned:
A data lake refers to a digital location where raw data is stored in its original format, much of it unstructured or semi-structured.
A data warehouse, by contrast, is a digital store of structured data.
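To make the distinction a bit more concrete, here's a minimal sketch in Python. It's a toy illustration, not how any real lake or warehouse product works: the `raw_events` list stands in for files sitting in a lake, an in-memory SQLite table stands in for a warehouse, and the names `raw_events`, `purchases`, and `load_into_warehouse` are all made up for this example.

```python
import json
import sqlite3

# Hypothetical raw records as they might land in a "lake":
# nothing enforces a schema, so fields and types vary per record.
raw_events = [
    '{"customer": "ana", "amount": "19.99", "note": "first order"}',
    '{"customer": "ben", "amount": 5}',
    '{"customer": "cy"}',
]


def load_into_warehouse(lines):
    """Parse raw JSON lines and load only records that fit a fixed schema."""
    conn = sqlite3.connect(":memory:")
    # The "warehouse" side: structure is declared up front.
    conn.execute(
        "CREATE TABLE purchases (customer TEXT NOT NULL, amount REAL NOT NULL)"
    )
    for line in lines:
        record = json.loads(line)
        # Preprocessing step: skip records missing required fields,
        # coerce amount to a number, and drop fields outside the schema.
        if "customer" in record and "amount" in record:
            conn.execute(
                "INSERT INTO purchases (customer, amount) VALUES (?, ?)",
                (record["customer"], float(record["amount"])),
            )
    conn.commit()
    return conn


conn = load_into_warehouse(raw_events)
row_count = conn.execute("SELECT COUNT(*) FROM purchases").fetchone()[0]
total = conn.execute("SELECT SUM(amount) FROM purchases").fetchone()[0]
```

The raw list keeps everything, including the incomplete record and the free-text note, while the table only holds what fits the schema it was built for. That trade-off is roughly what the lake vs. warehouse distinction comes down to.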
Here's a link to the article that first laid out this difference for me:
Check it out and let me know what you think.
Do you agree with the other differences they draw between data lakes and warehouses, e.g. that data warehouses house data that's been preprocessed for a particular purpose, and are more likely to be used by business intelligence analysts than data scientists? In my experience data scientists use data warehouses all the time, so I'm skeptical about that one.
What about the difference in cost that they describe? Do you find data warehouses to be costlier to update than data lakes?
Do you think there are other key differences?
Field Data Scientist @ Domino