Author Gates, Alan.

Title Programming Pig. [O'Reilly electronic resource]

Imprint Sebastopol : O'Reilly Media, 2011.
Description 1 online resource (222 pages)
text file rda
Contents Table of Contents; Preface; Data Addiction; Who Should Read This Book; Conventions Used in This Book; Code Examples in This Book; Using Code Examples; Safari® Books Online; How to Contact Us; Acknowledgments; Chapter 1. Introduction; What Is Pig?; Pig on Hadoop; MapReduce's hello world; Pig Latin, a Parallel Dataflow Language; Comparing query and dataflow languages; How Pig differs from MapReduce; What Is Pig Useful For?; Pig Philosophy; Pig's History; Chapter 2. Installing and Running Pig; Downloading and Installing Pig; Downloading the Pig Package from Apache; Downloading Pig from Cloudera.
Downloading Pig Artifacts from Maven; Downloading the Source; Running Pig; Running Pig Locally on Your Machine; Running Pig on Your Hadoop Cluster; Running Pig in the Cloud; Command-Line and Configuration Options; Return Codes; Chapter 3. Grunt; Entering Pig Latin Scripts in Grunt; HDFS Commands in Grunt; Controlling Pig from Grunt; Chapter 4. Pig's Data Model; Types; Scalar Types; Complex Types; Map; Tuple; Bag; Nulls; Schemas; Casts; Chapter 5. Introduction to Pig Latin; Preliminary Matters; Case Sensitivity; Comments; Input and Output; Load; Store; Dump; Relational Operations; foreach.
Expressions in foreach; UDFs in foreach; Naming fields in foreach; Filter; Group; Order by; Distinct; Join; Limit; Sample; Parallel; User Defined Functions; Registering UDFs; Registering Python UDFs; define and UDFs; Calling Static Java Functions; Chapter 6. Advanced Pig Latin; Advanced Relational Operations; Advanced Features of foreach; flatten; Nested foreach; Using Different Join Implementations; Joining small to large data; Joining skewed data; Joining sorted data; cogroup; union; cross; Integrating Pig with Legacy Code and MapReduce; stream; mapreduce; Nonlinear Data Flows.
Controlling Execution; set; Setting the Partitioner; Pig Latin Preprocessor; Parameter Substitution; Macros; Including Other Pig Latin Scripts; Chapter 7. Developing and Testing Pig Latin Scripts; Development Tools; Syntax Highlighting and Checking; describe; explain; illustrate; Pig Statistics; MapReduce Job Status; Debugging Tips; Testing Your Scripts with PigUnit; Chapter 8. Making Pig Fly; Writing Your Scripts to Perform Well; Filter Early and Often; Project Early and Often; Set Up Your Joins Properly; Use Multiquery When Possible; Choose the Right Data Type.
Select the Right Level of Parallelism; Writing Your UDF to Perform; Tune Pig and Hadoop for Your Job; Using Compression in Intermediate Results; Data Layout Optimization; Bad Record Handling; Chapter 9. Embedding Pig Latin in Python; Compile; Bind; Binding Multiple Sets of Variables; Run; Running Multiple Bindings; Utility Methods; Chapter 10. Writing Evaluation and Filter Functions; Writing an Evaluation Function in Java; Where Your UDF Will Run; Evaluation Function Basics; Interacting with Pig values; Input and Output Schemas; Error Handling and Progress Reporting.
Note Constructors and Passing Data from Frontend to Backend.
Summary This guide is an ideal learning tool and reference for Apache Pig, the open source engine for executing parallel data flows on Hadoop. With Pig, you can batch-process data without having to create a full-fledged application--making it easy for you to experiment with new datasets. Programming Pig introduces new users to Pig, and provides experienced users with comprehensive coverage on key features such as the Pig Latin scripting language, the Grunt shell, and User Defined Functions (UDFs) for extending Pig. If you need to analyze terabytes of data, this book shows you how to do it efficiently wi.
Note Includes index.
Subject Pig Latin (Computer program language)
Apache Pig (Computer file)
Apache Hadoop (Computer file)
Programming languages (Electronic computers) -- Handbooks, manuals, etc.
Programming languages (Electronic computers)
Genre Handbooks and manuals
Other Form: Print version: Gates, Alan. Programming Pig. Sebastopol : O'Reilly Media, ©2011 9781449302641
ISBN 9781449317690 (electronic bk.)
1449317693 (electronic bk.)
9781449317683 (electronic bk.)
1449317685 (electronic bk.)
9781449317881
144931788X