Practical Enterprise Data Lake Insights

Delivery time: Available within 14 days

53,49 

Handle Data-Driven Challenges in an Enterprise Big Data Lake

ISBN: 1484235215
ISBN 13: 9781484235218
Author: Gupta, Saurabh/Giri, Venkata
Publisher: APress
Extent: xviii, 327 pp., 90 b/w illustrations
Publication date: 28.06.2018
Edition: 1/2018
Product form: Paperback
Binding: KT

Item number: 3549342

Description

Use this practical guide to successfully handle the challenges encountered when designing an enterprise data lake and learn industry best practices to resolve issues. When designing an enterprise data lake you often hit a roadblock when you must leave the comfort of the relational world and learn the nuances of handling non-relational data. Starting from sourcing data into the Hadoop ecosystem, you will go through stages that can bring up tough questions such as data processing, data querying, and security. Concepts such as change data capture and data streaming are covered. The book takes an end-to-end solution approach in a data lake environment that includes data security, high availability, data processing, data streaming, and more. Each chapter includes application of a concept, code snippets, and use case demonstrations to provide you with a practical approach. You will learn the concept, scope, application, and starting point.

What You'll Learn

- Get to know data lake architecture and design principles
- Implement data capture and streaming strategies
- Implement data processing strategies in Hadoop
- Understand the data lake security framework and availability model

Who This Book Is For

Big data architects and solution architects

About the Authors

Saurabh K. Gupta is a technology leader, published author, and database enthusiast with more than 11 years of industry experience in data architecture, engineering, development, and administration. Working as a Manager, Data & Analytics at GE Transportation, his focus lies with data lake analytics programs that build a digital solution for business stakeholders. In the past, he has worked extensively with Oracle Database design and development, PaaS and IaaS cloud service models, database development, database consolidation, and in-memory technologies. He has authored two books on advanced PL/SQL for Oracle versions 11g and 12c. He is a frequent speaker at numerous conferences organized by the user community and technical institutions. He tweets at @saurabhkg and blogs at sbhoracle.wordpress.com.

Venkata Giri currently works with GE Digital and has been involved with building resilient distributed services at a massive scale. He uses technologies such as Hadoop/HDFS, Hive, Pig, Oozie, Spark, SQOOP, and Presto to enable data science, analytics, and marketing teams to produce the best products with the required information. With over 20 years of experience in database technologies, he has in-depth knowledge of big data ecosystems, complex data ingestion pipelines, data processing, and operations. He has a good understanding of SDLC with strong technical skills in Oracle, SQL, and PL/SQL. He has designed, implemented, and managed LinkedIn's multi-colo (active-active) project (database) from proof-of-concept to production.