Author: Valliappa Lakshmanan
Publisher: O'Reilly Media
ISBN: 1492044431
Category : Computers
Languages : en
Pages : 522
Book Description
Work with petabyte-scale datasets while building a collaborative, agile workplace in the process. This practical book is the canonical reference to Google BigQuery, the query engine that lets you conduct interactive analysis of large datasets. BigQuery enables enterprises to efficiently store, query, ingest, and learn from their data in a convenient framework. With this book, you’ll examine how to analyze data at scale to derive insights from large datasets efficiently. Valliappa Lakshmanan, tech lead for Google Cloud Platform, and Jordan Tigani, engineering director for the BigQuery team, provide best practices for modern data warehousing within an autoscaled, serverless public cloud. Whether you want to explore parts of BigQuery you’re not familiar with or prefer to focus on specific tasks, this reference is indispensable.
Google BigQuery: The Definitive Guide
Author: Valliappa Lakshmanan
Publisher: O'Reilly Media
ISBN: 1492044431
Category : Computers
Languages : en
Pages : 522
Book Description
Work with petabyte-scale datasets while building a collaborative, agile workplace in the process. This practical book is the canonical reference to Google BigQuery, the query engine that lets you conduct interactive analysis of large datasets. BigQuery enables enterprises to efficiently store, query, ingest, and learn from their data in a convenient framework. With this book, you’ll examine how to analyze data at scale to derive insights from large datasets efficiently. Valliappa Lakshmanan, tech lead for Google Cloud Platform, and Jordan Tigani, engineering director for the BigQuery team, provide best practices for modern data warehousing within an autoscaled, serverless public cloud. Whether you want to explore parts of BigQuery you’re not familiar with or prefer to focus on specific tasks, this reference is indispensable.
Publisher: O'Reilly Media
ISBN: 1492044431
Category : Computers
Languages : en
Pages : 522
Book Description
Work with petabyte-scale datasets while building a collaborative, agile workplace in the process. This practical book is the canonical reference to Google BigQuery, the query engine that lets you conduct interactive analysis of large datasets. BigQuery enables enterprises to efficiently store, query, ingest, and learn from their data in a convenient framework. With this book, you’ll examine how to analyze data at scale to derive insights from large datasets efficiently. Valliappa Lakshmanan, tech lead for Google Cloud Platform, and Jordan Tigani, engineering director for the BigQuery team, provide best practices for modern data warehousing within an autoscaled, serverless public cloud. Whether you want to explore parts of BigQuery you’re not familiar with or prefer to focus on specific tasks, this reference is indispensable.
Data Science on the Google Cloud Platform
Author: Valliappa Lakshmanan
Publisher: "O'Reilly Media, Inc."
ISBN: 109811891X
Category : Computers
Languages : en
Pages : 429
Book Description
Learn how easy it is to apply sophisticated statistical and machine learning methods to real-world problems when you build using Google Cloud Platform (GCP). This hands-on guide shows data engineers and data scientists how to implement an end-to-end data pipeline with cloud native tools on GCP. Throughout this updated second edition, you'll work through a sample business decision by employing a variety of data science approaches. Follow along by building a data pipeline in your own project on GCP, and discover how to solve data science problems in a transformative and more collaborative way. You'll learn how to: Employ best practices in building highly scalable data and ML pipelines on Google Cloud Automate and schedule data ingest using Cloud Run Create and populate a dashboard in Data Studio Build a real-time analytics pipeline using Pub/Sub, Dataflow, and BigQuery Conduct interactive data exploration with BigQuery Create a Bayesian model with Spark on Cloud Dataproc Forecast time series and do anomaly detection with BigQuery ML Aggregate within time windows with Dataflow Train explainable machine learning models with Vertex AI Operationalize ML with Vertex AI Pipelines
Publisher: "O'Reilly Media, Inc."
ISBN: 109811891X
Category : Computers
Languages : en
Pages : 429
Book Description
Learn how easy it is to apply sophisticated statistical and machine learning methods to real-world problems when you build using Google Cloud Platform (GCP). This hands-on guide shows data engineers and data scientists how to implement an end-to-end data pipeline with cloud native tools on GCP. Throughout this updated second edition, you'll work through a sample business decision by employing a variety of data science approaches. Follow along by building a data pipeline in your own project on GCP, and discover how to solve data science problems in a transformative and more collaborative way. You'll learn how to: Employ best practices in building highly scalable data and ML pipelines on Google Cloud Automate and schedule data ingest using Cloud Run Create and populate a dashboard in Data Studio Build a real-time analytics pipeline using Pub/Sub, Dataflow, and BigQuery Conduct interactive data exploration with BigQuery Create a Bayesian model with Spark on Cloud Dataproc Forecast time series and do anomaly detection with BigQuery ML Aggregate within time windows with Dataflow Train explainable machine learning models with Vertex AI Operationalize ML with Vertex AI Pipelines
Learning Google Analytics
Author: Mark Edmondson
Publisher: "O'Reilly Media, Inc."
ISBN: 1098113055
Category : Computers
Languages : en
Pages : 342
Book Description
Why is Google Analytics 4 the most modern data model available for digital marketing analytics? Because rather than simply report what has happened, GA4's new cloud integrations enable more data activation—linking online and offline data across all your streams to provide end-to-end marketing data. This practical book prepares you for the future of digital marketing by demonstrating how GA4 supports these additional cloud integrations. Author Mark Edmondson, Google Developer Expert for Google Analytics and Google Cloud, provides a concise yet comprehensive overview of GA4 and its cloud integrations. Data, business, and marketing analysts will learn major facets of GA4's powerful new analytics model, with topics including data architecture and strategy, and data ingestion, storage, and modeling. You'll explore common data activation use cases and get guidance on how to implement them. You'll learn: How Google Cloud integrates with GA4 The potential use cases that GA4 integrations can enable Skills and resources needed to create GA4 integrations How much GA4 data capture is necessary to enable use cases The process of designing dataflows from strategy though data storage, modeling, and activation
Publisher: "O'Reilly Media, Inc."
ISBN: 1098113055
Category : Computers
Languages : en
Pages : 342
Book Description
Why is Google Analytics 4 the most modern data model available for digital marketing analytics? Because rather than simply report what has happened, GA4's new cloud integrations enable more data activation—linking online and offline data across all your streams to provide end-to-end marketing data. This practical book prepares you for the future of digital marketing by demonstrating how GA4 supports these additional cloud integrations. Author Mark Edmondson, Google Developer Expert for Google Analytics and Google Cloud, provides a concise yet comprehensive overview of GA4 and its cloud integrations. Data, business, and marketing analysts will learn major facets of GA4's powerful new analytics model, with topics including data architecture and strategy, and data ingestion, storage, and modeling. You'll explore common data activation use cases and get guidance on how to implement them. You'll learn: How Google Cloud integrates with GA4 The potential use cases that GA4 integrations can enable Skills and resources needed to create GA4 integrations How much GA4 data capture is necessary to enable use cases The process of designing dataflows from strategy though data storage, modeling, and activation
The Definitive Guide to Data Integration
Author: Pierre-Yves BONNEFOY
Publisher: Packt Publishing Ltd
ISBN: 1837634777
Category : Computers
Languages : en
Pages : 490
Book Description
Learn the essentials of data integration with this comprehensive guide, covering everything from sources to solutions, and discover the key to making the most of your data stack Key Features Learn how to leverage modern data stack tools and technologies for effective data integration Design and implement data integration solutions with practical advice and best practices Focus on modern technologies such as cloud-based architectures, real-time data processing, and open-source tools and technologies Purchase of the print or Kindle book includes a free PDF eBook Book DescriptionThe Definitive Guide to Data Integration is an indispensable resource for navigating the complexities of modern data integration. Focusing on the latest tools, techniques, and best practices, this guide helps you master data integration and unleash the full potential of your data. This comprehensive guide begins by examining the challenges and key concepts of data integration, such as managing huge volumes of data and dealing with the different data types. You’ll gain a deep understanding of the modern data stack and its architecture, as well as the pivotal role of open-source technologies in shaping the data landscape. Delving into the layers of the modern data stack, you’ll cover data sources, types, storage, integration techniques, transformation, and processing. The book also offers insights into data exposition and APIs, ingestion and storage strategies, data preparation and analysis, workflow management, monitoring, data quality, and governance. Packed with practical use cases, real-world examples, and a glimpse into the future of data integration, The Definitive Guide to Data Integration is an essential resource for data eclectics. By the end of this book, you’ll have the gained the knowledge and skills needed to optimize your data usage and excel in the ever-evolving world of data.What you will learn Discover the evolving architecture and technologies shaping data integration Process large data volumes efficiently with data warehousing Tackle the complexities of integrating large datasets from diverse sources Harness the power of data warehousing for efficient data storage and processing Design and optimize effective data integration solutions Explore data governance principles and compliance requirements Who this book is for This book is perfect for data engineers, data architects, data analysts, and IT professionals looking to gain a comprehensive understanding of data integration in the modern era. Whether you’re a beginner or an experienced professional enhancing your knowledge of the modern data stack, this definitive guide will help you navigate the data integration landscape.
Publisher: Packt Publishing Ltd
ISBN: 1837634777
Category : Computers
Languages : en
Pages : 490
Book Description
Learn the essentials of data integration with this comprehensive guide, covering everything from sources to solutions, and discover the key to making the most of your data stack Key Features Learn how to leverage modern data stack tools and technologies for effective data integration Design and implement data integration solutions with practical advice and best practices Focus on modern technologies such as cloud-based architectures, real-time data processing, and open-source tools and technologies Purchase of the print or Kindle book includes a free PDF eBook Book DescriptionThe Definitive Guide to Data Integration is an indispensable resource for navigating the complexities of modern data integration. Focusing on the latest tools, techniques, and best practices, this guide helps you master data integration and unleash the full potential of your data. This comprehensive guide begins by examining the challenges and key concepts of data integration, such as managing huge volumes of data and dealing with the different data types. You’ll gain a deep understanding of the modern data stack and its architecture, as well as the pivotal role of open-source technologies in shaping the data landscape. Delving into the layers of the modern data stack, you’ll cover data sources, types, storage, integration techniques, transformation, and processing. The book also offers insights into data exposition and APIs, ingestion and storage strategies, data preparation and analysis, workflow management, monitoring, data quality, and governance. Packed with practical use cases, real-world examples, and a glimpse into the future of data integration, The Definitive Guide to Data Integration is an essential resource for data eclectics. By the end of this book, you’ll have the gained the knowledge and skills needed to optimize your data usage and excel in the ever-evolving world of data.What you will learn Discover the evolving architecture and technologies shaping data integration Process large data volumes efficiently with data warehousing Tackle the complexities of integrating large datasets from diverse sources Harness the power of data warehousing for efficient data storage and processing Design and optimize effective data integration solutions Explore data governance principles and compliance requirements Who this book is for This book is perfect for data engineers, data architects, data analysts, and IT professionals looking to gain a comprehensive understanding of data integration in the modern era. Whether you’re a beginner or an experienced professional enhancing your knowledge of the modern data stack, this definitive guide will help you navigate the data integration landscape.
Practical MLOps
Author: Noah Gift
Publisher: "O'Reilly Media, Inc."
ISBN: 1098102967
Category : Computers
Languages : en
Pages : 467
Book Description
Getting your models into production is the fundamental challenge of machine learning. MLOps offers a set of proven principles aimed at solving this problem in a reliable and automated way. This insightful guide takes you through what MLOps is (and how it differs from DevOps) and shows you how to put it into practice to operationalize your machine learning models. Current and aspiring machine learning engineers--or anyone familiar with data science and Python--will build a foundation in MLOps tools and methods (along with AutoML and monitoring and logging), then learn how to implement them in AWS, Microsoft Azure, and Google Cloud. The faster you deliver a machine learning system that works, the faster you can focus on the business problems you're trying to crack. This book gives you a head start. You'll discover how to: Apply DevOps best practices to machine learning Build production machine learning systems and maintain them Monitor, instrument, load-test, and operationalize machine learning systems Choose the correct MLOps tools for a given machine learning task Run machine learning models on a variety of platforms and devices, including mobile phones and specialized hardware
Publisher: "O'Reilly Media, Inc."
ISBN: 1098102967
Category : Computers
Languages : en
Pages : 467
Book Description
Getting your models into production is the fundamental challenge of machine learning. MLOps offers a set of proven principles aimed at solving this problem in a reliable and automated way. This insightful guide takes you through what MLOps is (and how it differs from DevOps) and shows you how to put it into practice to operationalize your machine learning models. Current and aspiring machine learning engineers--or anyone familiar with data science and Python--will build a foundation in MLOps tools and methods (along with AutoML and monitoring and logging), then learn how to implement them in AWS, Microsoft Azure, and Google Cloud. The faster you deliver a machine learning system that works, the faster you can focus on the business problems you're trying to crack. This book gives you a head start. You'll discover how to: Apply DevOps best practices to machine learning Build production machine learning systems and maintain them Monitor, instrument, load-test, and operationalize machine learning systems Choose the correct MLOps tools for a given machine learning task Run machine learning models on a variety of platforms and devices, including mobile phones and specialized hardware
The Definitive Guide to Google Vertex AI
Author: Jasmeet Bhatia
Publisher: Packt Publishing Ltd
ISBN: 1801813329
Category : Computers
Languages : en
Pages : 422
Book Description
Implement machine learning pipelines with Google Cloud Vertex AI Key Features Understand the role of an AI platform and MLOps practices in machine learning projects Get acquainted with Google Vertex AI tools and offerings that help accelerate the creation of end-to-end ML solutions Implement Vision, NLP, and recommendation-based real-world ML models on Google Cloud Platform Purchase of the print or Kindle book includes a free PDF eBook Book DescriptionWhile AI has become an integral part of every organization today, the development of large-scale ML solutions and management of complex ML workflows in production continue to pose challenges for many. Google’s unified data and AI platform, Vertex AI, directly addresses these challenges with its array of MLOPs tools designed for overall workflow management. This book is a comprehensive guide that lets you explore Google Vertex AI’s easy-to-advanced level features for end-to-end ML solution development. Throughout this book, you’ll discover how Vertex AI empowers you by providing essential tools for critical tasks, including data management, model building, large-scale experimentations, metadata logging, model deployments, and monitoring. You’ll learn how to harness the full potential of Vertex AI for developing and deploying no-code, low-code, or fully customized ML solutions. This book takes a hands-on approach to developing u deploying some real-world ML solutions on Google Cloud, leveraging key technologies such as Vision, NLP, generative AI, and recommendation systems. Additionally, this book covers pre-built and turnkey solution offerings as well as guidance on seamlessly integrating them into your ML workflows. By the end of this book, you’ll have the confidence to develop and deploy large-scale production-grade ML solutions using the MLOps tooling and best practices from Google.What you will learn Understand the ML lifecycle, challenges, and importance of MLOps Get started with ML model development quickly using Google Vertex AI Manage datasets, artifacts, and experiments Develop no-code, low-code, and custom AI solution on Google Cloud Implement advanced model optimization techniques and tooling Understand pre-built and turnkey AI solution offerings from Google Build and deploy custom ML models for real-world applications Explore the latest generative AI tools within Vertex AI Who this book is for If you are a machine learning practitioner who wants to learn end-to-end ML solution development on Google Cloud Platform using MLOps best practices and tools offered by Google Vertex AI, this is the book for you.
Publisher: Packt Publishing Ltd
ISBN: 1801813329
Category : Computers
Languages : en
Pages : 422
Book Description
Implement machine learning pipelines with Google Cloud Vertex AI Key Features Understand the role of an AI platform and MLOps practices in machine learning projects Get acquainted with Google Vertex AI tools and offerings that help accelerate the creation of end-to-end ML solutions Implement Vision, NLP, and recommendation-based real-world ML models on Google Cloud Platform Purchase of the print or Kindle book includes a free PDF eBook Book DescriptionWhile AI has become an integral part of every organization today, the development of large-scale ML solutions and management of complex ML workflows in production continue to pose challenges for many. Google’s unified data and AI platform, Vertex AI, directly addresses these challenges with its array of MLOPs tools designed for overall workflow management. This book is a comprehensive guide that lets you explore Google Vertex AI’s easy-to-advanced level features for end-to-end ML solution development. Throughout this book, you’ll discover how Vertex AI empowers you by providing essential tools for critical tasks, including data management, model building, large-scale experimentations, metadata logging, model deployments, and monitoring. You’ll learn how to harness the full potential of Vertex AI for developing and deploying no-code, low-code, or fully customized ML solutions. This book takes a hands-on approach to developing u deploying some real-world ML solutions on Google Cloud, leveraging key technologies such as Vision, NLP, generative AI, and recommendation systems. Additionally, this book covers pre-built and turnkey solution offerings as well as guidance on seamlessly integrating them into your ML workflows. By the end of this book, you’ll have the confidence to develop and deploy large-scale production-grade ML solutions using the MLOps tooling and best practices from Google.What you will learn Understand the ML lifecycle, challenges, and importance of MLOps Get started with ML model development quickly using Google Vertex AI Manage datasets, artifacts, and experiments Develop no-code, low-code, and custom AI solution on Google Cloud Implement advanced model optimization techniques and tooling Understand pre-built and turnkey AI solution offerings from Google Build and deploy custom ML models for real-world applications Explore the latest generative AI tools within Vertex AI Who this book is for If you are a machine learning practitioner who wants to learn end-to-end ML solution development on Google Cloud Platform using MLOps best practices and tools offered by Google Vertex AI, this is the book for you.
Architecting Data and Machine Learning Platforms
Author: Marco Tranquillin
Publisher: "O'Reilly Media, Inc."
ISBN: 1098151585
Category : Computers
Languages : en
Pages : 361
Book Description
All cloud architects need to know how to build data platforms that enable businesses to make data-driven decisions and deliver enterprise-wide intelligence in a fast and efficient way. This handbook shows you how to design, build, and modernize cloud native data and machine learning platforms using AWS, Azure, Google Cloud, and multicloud tools like Snowflake and Databricks. Authors Marco Tranquillin, Valliappa Lakshmanan, and Firat Tekiner cover the entire data lifecycle from ingestion to activation in a cloud environment using real-world enterprise architectures. You'll learn how to transform, secure, and modernize familiar solutions like data warehouses and data lakes, and you'll be able to leverage recent AI/ML patterns to get accurate and quicker insights to drive competitive advantage. You'll learn how to: Design a modern and secure cloud native or hybrid data analytics and machine learning platform Accelerate data-led innovation by consolidating enterprise data in a governed, scalable, and resilient data platform Democratize access to enterprise data and govern how business teams extract insights and build AI/ML capabilities Enable your business to make decisions in real time using streaming pipelines Build an MLOps platform to move to a predictive and prescriptive analytics approach
Publisher: "O'Reilly Media, Inc."
ISBN: 1098151585
Category : Computers
Languages : en
Pages : 361
Book Description
All cloud architects need to know how to build data platforms that enable businesses to make data-driven decisions and deliver enterprise-wide intelligence in a fast and efficient way. This handbook shows you how to design, build, and modernize cloud native data and machine learning platforms using AWS, Azure, Google Cloud, and multicloud tools like Snowflake and Databricks. Authors Marco Tranquillin, Valliappa Lakshmanan, and Firat Tekiner cover the entire data lifecycle from ingestion to activation in a cloud environment using real-world enterprise architectures. You'll learn how to transform, secure, and modernize familiar solutions like data warehouses and data lakes, and you'll be able to leverage recent AI/ML patterns to get accurate and quicker insights to drive competitive advantage. You'll learn how to: Design a modern and secure cloud native or hybrid data analytics and machine learning platform Accelerate data-led innovation by consolidating enterprise data in a governed, scalable, and resilient data platform Democratize access to enterprise data and govern how business teams extract insights and build AI/ML capabilities Enable your business to make decisions in real time using streaming pipelines Build an MLOps platform to move to a predictive and prescriptive analytics approach
Big Data for Big Decisions
Author: Krishna Pera
Publisher: CRC Press
ISBN: 1000816966
Category : Business & Economics
Languages : en
Pages : 282
Book Description
Building a data-driven organization (DDO) is an enterprise-wide initiative that may consume and lock up resources for the long term. Understandably, any organization considering such an initiative would insist on a roadmap and business case to be prepared and evaluated prior to approval. This book presents a step-by-step methodology in order to create a roadmap and business case, and provides a narration of the constraints and experiences of managers who have attempted the setting up of DDOs. The emphasis is on the big decisions – the key decisions that influence 90% of business outcomes – starting from decision first and reengineering the data to the decisions process-chain and data governance, so as to ensure the right data are available at the right time, every time. Investing in artificial intelligence and data-driven decision making are now being considered a survival necessity for organizations to stay competitive. While every enterprise aspires to become 100% data-driven and every Chief Information Officer (CIO) has a budget, Gartner estimates over 80% of all analytics projects fail to deliver intended value. Most CIOs think a data-driven organization is a distant dream, especially while they are still struggling to explain the value from analytics. They know a few isolated successes, or a one-time leveraging of big data for decision making does not make an organization data-driven. As of now, there is no precise definition for data-driven organization or what qualifies an organization to call itself data-driven. Given the hype in the market for big data, analytics and AI, every CIO has a budget for analytics, but very little clarity on where to begin or how to choose and prioritize the analytics projects. Most end up investing in a visualization platform like Tableau or QlikView, which in essence is an improved version of their BI dashboard that the organization had invested into not too long ago. The most important stakeholders, the decision-makers, are rarely kept in the loop while choosing analytics projects. This book provides a fail-safe methodology for assured success in deriving intended value from investments into analytics. It is a practitioners’ handbook for creating a step-by-step transformational roadmap prioritizing the big data for the big decisions, the 10% of decisions that influence 90% of business outcomes, and delivering material improvements in the quality of decisions, as well as measurable value from analytics investments. The acid test for a data-driven organization is when all the big decisions, especially top-level strategic decisions, are taken based on data and not on the collective gut feeling of the decision makers in the organization.
Publisher: CRC Press
ISBN: 1000816966
Category : Business & Economics
Languages : en
Pages : 282
Book Description
Building a data-driven organization (DDO) is an enterprise-wide initiative that may consume and lock up resources for the long term. Understandably, any organization considering such an initiative would insist on a roadmap and business case to be prepared and evaluated prior to approval. This book presents a step-by-step methodology in order to create a roadmap and business case, and provides a narration of the constraints and experiences of managers who have attempted the setting up of DDOs. The emphasis is on the big decisions – the key decisions that influence 90% of business outcomes – starting from decision first and reengineering the data to the decisions process-chain and data governance, so as to ensure the right data are available at the right time, every time. Investing in artificial intelligence and data-driven decision making are now being considered a survival necessity for organizations to stay competitive. While every enterprise aspires to become 100% data-driven and every Chief Information Officer (CIO) has a budget, Gartner estimates over 80% of all analytics projects fail to deliver intended value. Most CIOs think a data-driven organization is a distant dream, especially while they are still struggling to explain the value from analytics. They know a few isolated successes, or a one-time leveraging of big data for decision making does not make an organization data-driven. As of now, there is no precise definition for data-driven organization or what qualifies an organization to call itself data-driven. Given the hype in the market for big data, analytics and AI, every CIO has a budget for analytics, but very little clarity on where to begin or how to choose and prioritize the analytics projects. Most end up investing in a visualization platform like Tableau or QlikView, which in essence is an improved version of their BI dashboard that the organization had invested into not too long ago. The most important stakeholders, the decision-makers, are rarely kept in the loop while choosing analytics projects. This book provides a fail-safe methodology for assured success in deriving intended value from investments into analytics. It is a practitioners’ handbook for creating a step-by-step transformational roadmap prioritizing the big data for the big decisions, the 10% of decisions that influence 90% of business outcomes, and delivering material improvements in the quality of decisions, as well as measurable value from analytics investments. The acid test for a data-driven organization is when all the big decisions, especially top-level strategic decisions, are taken based on data and not on the collective gut feeling of the decision makers in the organization.
Fundamentals of Data Engineering
Author: Joe Reis
Publisher: "O'Reilly Media, Inc."
ISBN: 1098108272
Category : Computers
Languages : en
Pages : 446
Book Description
Data engineering has grown rapidly in the past decade, leaving many software engineers, data scientists, and analysts looking for a comprehensive view of this practice. With this practical book, you'll learn how to plan and build systems to serve the needs of your organization and customers by evaluating the best technologies available through the framework of the data engineering lifecycle. Authors Joe Reis and Matt Housley walk you through the data engineering lifecycle and show you how to stitch together a variety of cloud technologies to serve the needs of downstream data consumers. You'll understand how to apply the concepts of data generation, ingestion, orchestration, transformation, storage, and governance that are critical in any data environment regardless of the underlying technology. This book will help you: Get a concise overview of the entire data engineering landscape Assess data engineering problems using an end-to-end framework of best practices Cut through marketing hype when choosing data technologies, architecture, and processes Use the data engineering lifecycle to design and build a robust architecture Incorporate data governance and security across the data engineering lifecycle
Publisher: "O'Reilly Media, Inc."
ISBN: 1098108272
Category : Computers
Languages : en
Pages : 446
Book Description
Data engineering has grown rapidly in the past decade, leaving many software engineers, data scientists, and analysts looking for a comprehensive view of this practice. With this practical book, you'll learn how to plan and build systems to serve the needs of your organization and customers by evaluating the best technologies available through the framework of the data engineering lifecycle. Authors Joe Reis and Matt Housley walk you through the data engineering lifecycle and show you how to stitch together a variety of cloud technologies to serve the needs of downstream data consumers. You'll understand how to apply the concepts of data generation, ingestion, orchestration, transformation, storage, and governance that are critical in any data environment regardless of the underlying technology. This book will help you: Get a concise overview of the entire data engineering landscape Assess data engineering problems using an end-to-end framework of best practices Cut through marketing hype when choosing data technologies, architecture, and processes Use the data engineering lifecycle to design and build a robust architecture Incorporate data governance and security across the data engineering lifecycle