
Title Data lakes / edited by Anne Laurent, Dominique Laurent, Cédrine Madera. [O'Reilly electronic resources]

Imprint London : ISTE, Ltd. ; Hoboken : Wiley, 2020.
Description 1 online resource (249 pages)
Series Computer engineering series, databases and big data set ; volume 2
Computer engineering series. Databases and big data set ; volume 2.
Contents Cover -- Half-Title Page -- Dedication -- Title Page -- Copyright Page -- Contents -- Preface -- 1. Introduction to Data Lakes: Definitions and Discussions -- 1.1. Introduction to data lakes -- 1.2. Literature review and discussion -- 1.3. The data lake challenges -- 1.4. Data lakes versus decision-making systems -- 1.5. Urbanization for data lakes -- 1.6. Data lake functionalities -- 1.7. Summary and concluding remarks -- 2. Architecture of Data Lakes -- 2.1. Introduction -- 2.2. State of the art and practice -- 2.2.1. Definition -- 2.2.2. Architecture -- 2.2.3. Metadata
2.2.4. Data quality -- 2.2.5. Schema-on-read -- 2.3. System architecture -- 2.3.1. Ingestion layer -- 2.3.2. Storage layer -- 2.3.3. Transformation layer -- 2.3.4. Interaction layer -- 2.4. Use case: the Constance system -- 2.4.1. System overview -- 2.4.2. Ingestion layer -- 2.4.3. Maintenance layer -- 2.4.4. Query layer -- 2.4.5. Data quality control -- 2.4.6. Extensibility and flexibility -- 2.5. Concluding remarks -- 3. Exploiting Software Product Lines and Formal Concept Analysis for the Design of Data Lake Architectures -- 3.1. Our expectations -- 3.2. Modeling data lake functionalities
3.3. Building the knowledge base of industrial data lakes -- 3.4. Our formalization approach -- 3.5. Applying our approach -- 3.6. Analysis of our first results -- 3.7. Concluding remarks -- 4. Metadata in Data Lake Ecosystems -- 4.1. Definitions and concepts -- 4.2. Classification of metadata by NISO -- 4.2.1. Metadata schema -- 4.2.2. Knowledge base and catalog -- 4.3. Other categories of metadata -- 4.3.1. Business metadata -- 4.3.2. Navigational integration -- 4.3.3. Operational metadata -- 4.4. Sources of metadata -- 4.5. Metadata classification -- 4.6. Why metadata are needed
4.6.1. Selection of information (re)sources -- 4.6.2. Organization of information resources -- 4.6.3. Interoperability and integration -- 4.6.4. Unique digital identification -- 4.6.5. Data archiving and preservation -- 4.7. Business value of metadata -- 4.8. Metadata architecture -- 4.8.1. Architecture scenario 1: point-to-point metadata architecture -- 4.8.2. Architecture scenario 2: hub and spoke metadata architecture -- 4.8.3. Architecture scenario 3: tool of record metadata architecture -- 4.8.4. Architecture scenario 4: hybrid metadata architecture
4.8.5. Architecture scenario 5: federated metadata architecture -- 4.9. Metadata management -- 4.10. Metadata and data lakes -- 4.10.1. Application and workload layer -- 4.10.2. Data layer -- 4.10.3. System layer -- 4.10.4. Metadata types -- 4.11. Metadata management in data lakes -- 4.11.1. Metadata directory -- 4.11.2. Metadata storage -- 4.11.3. Metadata discovery -- 4.11.4. Metadata lineage -- 4.11.5. Metadata querying -- 4.11.6. Data source selection -- 4.12. Metadata and master data management -- 4.13. Conclusion -- 5. A Use Case of Data Lake Metadata Management -- 5.1. Context
5.1.1. Data lake definition
Bibliography Includes bibliographical references and index.
Summary The concept of a data lake is less than 10 years old, but data lakes are already widely implemented within large companies. Their goal is to deal efficiently with ever-growing volumes of heterogeneous data while also meeting varied and sophisticated user needs. However, defining and building a data lake is still a challenge, as no consensus has been reached so far. Data Lakes presents recent outcomes and trends in the field of data repositories. The main topics discussed are the data-driven architecture of a data lake; the management of metadata - supplying key information about the stored data, master data and reference data; the roles of linked data and fog computing in a data lake ecosystem; and how gravity principles apply in the context of data lakes. A variety of case studies are also presented, providing the reader with practical examples of data lake management.
Subject Big data.
Databases.
Données volumineuses.
Added Author Laurent, Anne, 1976-
Laurent, Dominique.
Madera, Cédrine.
Other Form: Print version: Laurent, Anne. Data Lakes. Newark : John Wiley & Sons, Incorporated, ©2020 9781786305855
ISBN 9781119720430 (electronic bk. ; oBook)
1119720435 (electronic bk. ; oBook)
1119720427
9781119720423 (electronic bk.)