Moving Data
How to Move, Share, and Integrate SQL, NoSQL, and Big Data
(Language: English)
Product Details
Product Information on "Moving Data"
Blurb for "Moving Data"
Databases today are not stand-alone silos of discrete data. Instead, data is moved and shared between multiple databases and systems. Some of this is performed for operational reasons, such as when you need to take advantage of NoSQL databases for web performance, while at the same time storing the long-term data in a traditional RDBMS. At other times, you need to migrate data between systems. Finally - and more often now than ever - you need to move data between systems to take advantage of specific features, such as merging data into Big Data stores (like Hadoop) for analytics, then moving the data back into a traditional RDBMS or NoSQL environment for display and analysis. Moving data can be difficult. Yet, as Moving Data: How to Move, Share, and Integrate SQL, NoSQL, and Big Data shows, there's always a method to accomplish the task at hand. Database expert MC Brown covers a range of solutions and techniques for moving, migrating, sharing, and integrating data between different databases and environments. He also shows how to transform the data between different representation formats, and how to automate or set up live replication of this data between the different database systems.
The information couldn't be more timely, as the need to move data around to accomplish specific tasks continues to grow with no letup in sight. As a result, the specialized skills you'll learn from this book will soon be required of database administrators, IT professionals, and database architects, among others. You'll learn how to:
- Migrate data to and from RDBMS
- Move data from RDBMS to NoSQL databases
- Move or migrate data into or out of Big Data stores
- Integrate Big Data sources with applications and databases
Best of all, the book is specially designed to provide both generic advice for each database solution and more specific advice for different combinations. For example, there is generic advice about how to handle the idiomatic structure of an RDBMS, but also more specific advice for permanently migrating data from an RDBMS to a NoSQL store, for replicating data between these two stores, and finally for using both systems concurrently.
Moving Data covers territory found in no other book; it is an essential go-to reference for anyone charged with moving data among systems successfully.
Table of Contents for "Moving Data"
Chapter 1: Understanding the Challenges of Data Migration
This chapter helps the reader understand the different components of any migration of data. This includes changing the format, changing the way the data is referenced and referred to internally, and the basic mechanics of getting the data in, in the first place.
Chapter 2: Data Mapping and Transformations
There are two key elements to the exchange of any information between databases. One is the data structure used for the exchange, and the other is the transformation required to reach those structures. Some of these considerations are driven by the source database, and others by the target database. Moving data from an RDBMS to a NoSQL database, for example, generally requires constructing documents from what might be tabular, or joined-tabular, data. The other aspect is the difference between source and target data types. Document databases and Big Data stores tend not to care about the data type, whereas an RDBMS cannot live without them. In this chapter we'll examine some of the key differences and problems with transferring data that transcend the mechanics of the process, and how to deal with them effectively.
Chapter 3: Moving Data for RDBMS
RDBMS systems have a unique place in the world of data; they are universally accepted and very popular (Oracle, MySQL, Microsoft SQL Server). Their tabular structure makes them look easily exchangeable, but they require careful techniques when sharing the data. In this chapter, we'll examine some of the key issues with migrating data to and from RDBMS, including:
- Exchanging table data
- Exchanging complex queries
- Preparing for two-way transfers
- Row, Statement, or Other
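Exchanging table data of this kind can be sketched in a few lines. The following is a minimal, illustrative example only - it uses SQLite and an invented `users` table rather than any schema from the book - showing a table exported to both CSV and JSON:

```python
import csv
import io
import json
import sqlite3

# Build a throwaway source database with an invented example table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany("INSERT INTO users (name, email) VALUES (?, ?)",
                 [("Alice", "alice@example.com"), ("Bob", "bob@example.com")])

cursor = conn.execute("SELECT id, name, email FROM users ORDER BY id")
columns = [desc[0] for desc in cursor.description]
rows = cursor.fetchall()

# CSV export: one header row, then one line per table row.
csv_buffer = io.StringIO()
writer = csv.writer(csv_buffer)
writer.writerow(columns)
writer.writerows(rows)

# JSON export: a list of objects, one per row, keyed by column name.
json_export = json.dumps([dict(zip(columns, row)) for row in rows])
```

The same row data yields two very different exchange formats, which is exactly the kind of format-versus-structure decision the chapter discusses.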
Export and import is the simplest and most readily used method for sharing and exchanging data, but there is more to it than just dumping the information. You need to consider formatting and structure, and whether you want identical tables or a complex structure exchanged. We'll examine a variety of techniques, from basic exports and character-separated/delimited types to structured formats like JSON. The chapter also covers choosing a file format and dealing with raw table data or joined structures.
Chapter 5: Sharing Data for RDBMS through Replication
Replication is the purest and simplest form of sharing data, but it is not without its limitations or problems. Most replication is designed to handle scale-out environments, not data sharing, but there are solutions and tricks that make replication a suitable alternative for exchanging data. But how do you cope with changes to the original schema once the data reaches its target database? How do you make it usable and match the target environment? That's what this chapter explains. It will cover:
- Replicating between RDBMS
- Replicating out of RDBMS
- Replicating into RDBMS
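The schema problem described here can be reduced to a toy sketch: rows replicated out of one database often need reshaping before they fit the target. The example below is purely illustrative - both sides are SQLite, and the `orders` tables and column names are invented - but it shows rows being copied while the schema is transformed to match the target:

```python
import sqlite3

# Invented source database: amounts stored as integer cents.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (order_id INTEGER, total_cents INTEGER)")
source.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 1999), (2, 4500)])

# Invented target database: different column names, decimal amounts.
target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE orders (id INTEGER, amount REAL)")

# "Replicate" each source row, transforming it to match the target schema.
for order_id, total_cents in source.execute(
        "SELECT order_id, total_cents FROM orders"):
    target.execute("INSERT INTO orders (id, amount) VALUES (?, ?)",
                   (order_id, total_cents / 100.0))

copied = target.execute("SELECT id, amount FROM orders ORDER BY id").fetchall()
```

Real replication pipelines stream changes continuously rather than copying a snapshot, but the rename-and-convert step in the loop is the part that must match the target environment.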
NoSQL databases cause problems for traditional RDBMS types because the data is often not defined or stored in the same structured format. NoSQL encompasses everything from key/value stores through to document and graph databases designed to find relationships and distances between data points. Replicating data into NoSQL requires knowing what you want to keep and how to structure it, and migrating out is about organizing the data to be usable and recognizable by the target database. For example, is a document one table, or multiple tables? This chapter covers the basics of the NoSQL platform and the data challenges it presents before getting to the specifics.
Chapter 8: Migrating Data for NoSQL
If you are permanently migrating data into NoSQL, then you must determine how that information should be transferred, transformed, and ultimately used. There are different ways to do this depending on whether it is a one-time move or a temporary move to make use of a special feature. Also of special note is that many NoSQL databases have very specific or very limited methods for searching and extracting the information that has been inserted; careful selection of the data as it is migrated will make it more usable in the NoSQL store. In this chapter, we will examine some of the key considerations, different environments, and limitations of each specific NoSQL database.
Chapter 9: Sharing for NoSQL
Regularly transferring or exchanging information between another database and NoSQL can be achieved in different ways, depending on your use case and environment. For example, Couchbase, CouchDB, and MongoDB all provide solutions for watching changes to the underlying database that make sharing the data straightforward. Others, like Memcached or Riak, do not provide such ready access, so different techniques need to be employed. But NoSQL is rarely used as the only solution for data storage, so we can usually take advantage of the application structure to do some of the hard work for us. This chapter will examine these techniques, along with specific data flow migrations.
Chapter 10: Integrating for NoSQL
NoSQL generally has some performance advantages over RDBMS solutions, so we can use that to our advantage and combine information from the RDBMS and NoSQL environments. For example, why not use a caching system like Memcached with a MySQL backing store? As this chapter shows, applications can handle the basics, but they can also be modified to handle a more efficient workflow for writing, storing, and executing updates. These same techniques also apply to NoSQL-like databases such as object stores and large columnar stores, including BigTable, Cassandra, and HBase.
Chapter 11: Moving Data for Big Data Sources
Big Data sources encompass a very wide range of different databases and stores, but many share a similar set of goals and structures. Best known is the underlying technology behind Hadoop (including HBase and Pig) and Google's BigTable, while others look more like a very large RDBMS with a SQL interface, such as HP's Vertica, or Hive and Impala. This chapter explains that all of these different solutions require careful handling of the data and structure to make the data usable when it is moved. You need to consider the data structure, format, and usability as the data is moved. You also need to think about how it might be used and integrated with other sources to make it usable in the typical environment.
Chapter 12: Migrating Data for Big Data Sources
If you are migrating data permanently into a Big Data store, you have a wide variety of considerations. With Big Data, this is mostly about how you will get the data back out again and, more importantly, how to take advantage of the Big Data architecture to get the best out of the structure. For example, writing data into Hadoop is easy. Writing data into Hadoop so that it can be efficiently processed and distributed across the Hadoop cluster is a different matter. In this chapter, we'll examine methods for permanently moving data into Big Data stores for archival, storage, and long-term analysis needs. We'll also look at whether a simple dump/export and import is the easiest and most efficient method, or whether there are better solutions for direct data exchange.
Chapter 13: Sharing for Big Data Sources
Sharing data for Big Data sources, such as regularly replicating information from an existing database into a Big Data store, has specific problems. For example, Big Data encompasses both structured and unstructured storage formats. Knowing how to use these environments to your advantage, and how the data can be efficiently transferred, is critical to making the Big Data database work for you and not against you. In this chapter we'll examine specific tricks, such as using specialist tools like Sqoop, specialist replication tools like Oracle GoldenGate or Tungsten Replicator, and also tools and methods for handling incremental and staged data both in and out of Big Data sources.
Chapter 14: Integrating for Big Data Sources
Big Data sources such as Hadoop are no longer distant silos at the end of an existing data chain. Frequently, Hadoop is being brought up to the same architectural level as the RDBMS stores that used to be the source of its data. In this chapter, we're going to look at the most effective ways to integrate a Big Data store into your applications and database needs. This will enable you to use Big Data both as a data store and to process short- and long-term information, leveraging the data transformations we have already used to make the Big Data compatible with solutions such as Spark and MillWheel. These techniques enable data to be readily swapped to and from Big Data sources, making them a cohesive part of a heterogeneous environment.
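One concrete, deliberately simplified illustration of preparing relational rows for this kind of pipeline: many Hadoop-side tools consume newline-delimited JSON, where each line is a complete record, so files can be split at line boundaries and processed across a cluster. The record fields below are invented for the example:

```python
import json

# Invented relational-style rows to be handed to a Big Data pipeline.
rows = [
    {"id": 1, "event": "login", "user": "alice"},
    {"id": 2, "event": "purchase", "user": "bob"},
]

# Newline-delimited JSON: each line is a self-contained record, so a
# cluster can split the file at line boundaries and process chunks in
# parallel without any record spanning a split point.
ndjson = "\n".join(json.dumps(row, sort_keys=True) for row in rows)
```

The format choice here is exactly the "usable when it is moved" concern the Big Data chapters describe: the same rows dumped as one large JSON array would be far harder to distribute.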
Author Portrait: Martin Brown
A professional writer for over 15 years, Martin (MC) Brown is the author of, and contributor to, more than 26 books covering an array of topics, including the recently published Getting Started with CouchDB. His expertise spans myriad development languages and platforms: Perl, Python, Java, JavaScript, Basic, Pascal, Modula-2, C, C++, Rebol, Gawk, shell script, Windows, Solaris, Linux, BeOS, Microsoft WP, Mac OS, and more. He currently works as the director of documentation for Continuent.
Bibliographic Details
- Author: Martin Brown
- 2016, 1st ed., 350 pages, dimensions: 19.1 x 23.5 cm, paperback, English
- Publisher: APress
- ISBN-10: 1484201973
- ISBN-13: 9781484201978