Summary

In this chapter, we brought together all the technologies and capabilities that we have discussed throughout Part 2 of this book. We tried to explain some important aspects with the whole Data Lake in mind. We introduced additional capabilities such as metadata management, governance, auditing, and traceability, which are very important for a typical implementation within an enterprise. We gave our technology opinions for each of these capabilities but intentionally avoided delving deep into them, to keep the book concise and focused on the main technologies and capabilities in a Data Lake.

After reading this chapter, you should now have a full picture of an operational Data Lake. You should also have a brief idea of some other capabilities needed for an enterprise Data Lake, which are usually omitted when a Data Lake is first implemented in an enterprise.

These additional capabilities are required for a true Data Lake, but to stay within the scope and length of this book we have covered them only at a high level. We haven't covered much code in this chapter. Some of the technology choices are just our opinions; please take them with a grain of salt. Having said that, we encourage you to build these capabilities into your Data Lake implementation rather than omit them.

This chapter was quite an ask, as we covered many diverse aspects in brief, and you may be feeling quite exhausted at this moment. Take a break, and then let's come back and complete the next two chapters quickly.
