Description |
1 online resource (284 pages) : color illustrations |
Contents |
Intro -- Title page -- Copyright and Credits -- Dedication -- Contributors -- Table of Contents -- Preface -- Section 1 -- Data Parallelism -- Chapter 1: Splitting Input Data -- Single-node training is too slow -- The mismatch between data loading bandwidth and model training bandwidth -- Single-node training time on popular datasets -- Accelerating the training process with data parallelism -- Data parallelism -- the high-level bits -- Stochastic gradient descent -- Model synchronization -- Hyperparameter tuning -- Global batch size -- Learning rate adjustment -- Model synchronization schemes -- Summary -- Chapter 2: Parameter Server and All-Reduce -- Technical requirements -- Parameter server architecture -- Communication bottleneck in the parameter server architecture -- Sharding the model among parameter servers -- Implementing the parameter server -- Defining model layers -- Defining the parameter server -- Defining the worker -- Passing data between the parameter server and worker -- Issues with the parameter server -- The parameter server architecture introduces a high coding complexity for practitioners -- All-Reduce architecture -- Reduce -- All-Reduce -- Ring All-Reduce -- Collective communication -- Broadcast -- Gather -- All-Gather -- Summary -- Chapter 3: Building a Data Parallel Training and Serving Pipeline -- Technical requirements -- The data parallel training pipeline in a nutshell -- Input pre-processing -- Input data partition -- Data loading -- Training -- Model synchronization -- Model update -- Single-machine multi-GPUs and multi-machine multi-GPUs -- Single-machine multi-GPU -- Multi-machine multi-GPU -- Checkpointing and fault tolerance -- Model checkpointing -- Load model checkpoints -- Model evaluation and hyperparameter tuning -- Model serving in data parallelism -- Summary -- Chapter 4: Bottlenecks and Solutions -- Communication bottlenecks in data parallel training -- Analyzing the communication workloads -- Parameter server architecture -- The All-Reduce architecture -- The inefficiency of state-of-the-art communication schemes -- Leveraging idle links and host resources -- Tree All-Reduce -- Hybrid data transfer over PCIe and NVLink -- On-device memory bottlenecks -- Recomputation and quantization -- Recomputation -- Quantization -- Summary -- Section 2 -- Model Parallelism -- Chapter 5: Splitting the Model -- Technical requirements -- Single-node training error -- out of memory -- Fine-tuning BERT on a single GPU -- Trying to pack a giant model inside one state-of-the-art GPU -- ELMo, BERT, and GPT -- Basic concepts -- RNN -- ELMo -- BERT -- GPT -- Pre-training and fine-tuning -- State-of-the-art hardware -- P100, V100, and DGX-1 -- NVLink -- A100 and DGX-2 -- NVSwitch -- Summary -- Chapter 6: Pipeline Input and Layer Split -- Vanilla model parallelism is inefficient -- Forward propagation -- Backward propagation -- GPU idle time between forward and backward propagation -- Pipeline input -- Pros and cons of pipeline parallelism. |
Subject |
Machine learning. |
Python (Computer program language) |
Other Form: |
Print version: Wang, Guanhua. Distributed Machine Learning with Python. Birmingham : Packt Publishing, Limited, ©2022 |
ISBN |
1801817219 |
9781801817219 (electronic bk.) |
(pbk.) |