The Definitive Guide to Apache Flink
Next Generation Data Processing
(Language: English)
Book
36.33 €
Product details
Product information for "The Definitive Guide to Apache Flink"
Blurb for "The Definitive Guide to Apache Flink"
Data processing is one of the core functionalities of distributed and cloud computing. There is high demand for low-latency, high-performance computing, as well as for data processing engines that support abstract processing methods such as SQL querying, analytic frameworks, or graph processing.
The Definitive Guide to Apache Flink by Papp starts with the history of Big Data processing with Hadoop and explains the shortcomings of MapReduce. It shows how YARN and Hadoop 2.x changed the game and how new technologies started competing to become the successor to MapReduce.
After some detailed information on Tez and Spark and how they try to solve the shortcomings of MapReduce, the book covers architectural patterns for building a solid data processing engine, such as advanced pipelining methods and in-memory caching, and shows how Flink uses these concepts.
Flink programming is introduced with a hands-on approach. The book starts with how to create a ten-minute build and how to run the first "Word Count" with Flink, then continues with more advanced topics such as writing more complex programs. All samples are written in Java or Scala.
The book shows that Apache Flink has the potential to become one of the key technologies for distributed computing. Flink aims to replace many small technologies with a more powerful one that covers many aspects of Hadoop programming.
Table of contents for "The Definitive Guide to Apache Flink"
Chapter 1: Data Processing
Chapter Goal: The reader gets an overview of data processing in distributed environments
Sub-Topics:
History of Data Processing
Shortcomings of MapReduce
I/O problems
Why YARN changed the game
Chapter 2: Next Generation Data Processing Platform
Chapter Goal: Introduce the data processing platforms
Sub-Topics:
Tez
Spark
In-Memory processing
Pipelines
Chapter 3: Ten Minutes Build
Chapter Goal: The reader can install Flink, create a simple build, and set up a Flink project in general.
Sub-Topics:
Basic Setup
How to get started in a local environment
How to get started in a Hadoop Environment
Word Count
Chapter 4: Programming Essentials
Chapter Goal: The reader can write basic Flink applications, understands how to set them up, and has a good understanding of the data types
Sub-Topics:
Your first Flink Application with Java
Your first Flink Application with Scala
Six steps to create a Flink program
Understanding ExecutionEnvironment
Understanding DataSets and Tuples
Chapter 5: Transformation
Chapter Goal: List all types of transformations. The reader gets a comprehensive overview of how to transform data with Flink.
Sub-Topics:
Filtering, Joining
Aggregation
Chapter 6: Data Preparation with Flink
Chapter Goal: The reader learns how to prepare data for later analysis
Sub-Topics:
ETL with Flink - Overview
How to access HCatalog
How to ingest several data types with Flink into Hadoop (JSON, CSV, XML)
Chapter 7: Data Analytics Basics
Chapter Goal: How to analyze data with Flink
Sub-Topics:
K-Means and other statistical methods
Graph Analytics
Aggregation and statistics on weather data
Text Analytics: Sentiment Analysis with Flink
Chapter 8: Visualization
Chapter Goal: The reader learns how to display analysis results
Sub-Topics:
Different types of charts
How to visualize results
Chapter 9: Streaming
Chapter Goal: The reader learns how to stream data
Sub-Topics:
Streaming: how it works
Differences from Storm
Performance
Chapter 10: Outlook
Chapter Goal: The future of Flink
Sub-Topics:
Overview of the future
Author portrait of Stefan Papp
Stefan Papp is an IT professional with 20 years of experience who has dedicated his professional career to Big Data and Data Science. He focuses on Hadoop technologies and consults for major companies.
Bibliographic details
- Author: Stefan Papp
- 2016, 1st ed., 400 pages, dimensions: 17.8 x 25.4 cm, paperback, English
- Publisher: Apress
- ISBN-10: 1484214080
- ISBN-13: 9781484214084
- Publication date: 8 June 2016
Language: English