
Using Amazon SageMaker Debugger on DeepRacer stack — Part 1

5 min read · Jul 10, 2020

My initial goal was to try out Amazon SageMaker Debugger and see whether it could give me useful information beyond what the DeepRacer stack already provides.

However, after much trial and error, I found that it's not as easy as the AWS sample code suggests. Still, I think my journey is a good example of how to make SageMaker Debugger work in customised environments.

What is Amazon SageMaker Debugger?

Amazon SageMaker Debugger is a tool for debugging ML training. It does much of the heavy lifting for us, such as collecting data, monitoring the training process, and detecting abnormal behaviour.

How does Amazon SageMaker Debugger work?

Amazon SageMaker Debugger consists of two parts: Collections/Hooks and Rules.

Collections/Hooks

Collections are groups of artifacts (a.k.a. tensors) generated by the training job. They can be the tensors that store model losses, weights, etc.

To debug a model, we need to retrieve those artifacts. SageMaker Debugger therefore uses hooks to emit the tensors from the SageMaker container to external storage (most commonly S3).
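To make the collection/hook idea concrete, here is a minimal toy sketch in plain Python. It is not the SageMaker Debugger API: `ToyHook`, `on_step`, `run_toy_training` and the JSON file layout are all hypothetical names made up for illustration, and a local temporary directory stands in for S3.

```python
import json
import tempfile
from pathlib import Path


class ToyHook:
    """Illustrative stand-in for a debugger hook (hypothetical, not the real API)."""

    def __init__(self, out_dir, collections, save_interval=1):
        self.out_dir = Path(out_dir)          # stands in for an S3 location
        self.collections = collections        # collection name -> tensor names
        self.save_interval = save_interval    # emit every N-th step only

    def on_step(self, step, tensors):
        """Write the tensors of every configured collection for this step."""
        if step % self.save_interval != 0:
            return
        for name, tensor_names in self.collections.items():
            values = {t: tensors[t] for t in tensor_names if t in tensors}
            if not values:
                continue
            target = self.out_dir / name / f"step_{step:06d}.json"
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(json.dumps(values))


def run_toy_training(hook, steps=4):
    """Fake training loop that hands its tensors to the hook at each step."""
    for step in range(steps):
        tensors = {"loss": 1.0 / (step + 1), "weights": [0.1 * step, 0.2 * step]}
        hook.on_step(step, tensors)


out_dir = tempfile.mkdtemp()
hook = ToyHook(out_dir, {"losses": ["loss"], "weights": ["weights"]}, save_interval=2)
run_toy_training(hook)

# List what was emitted, relative to the output directory.
saved = sorted(p.relative_to(out_dir).as_posix() for p in Path(out_dir).rglob("*.json"))
print(saved)
```

The point of the sketch is the separation of concerns: the training loop only exposes its tensors, while the hook decides which collections to save, how often, and where.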

Rules

Besides the training job itself, SageMaker spins up a separate processing job to evaluate the rules if you choose to use SageMaker Debugger for that training job.
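As a rough illustration of what such a job might check, the sketch below is a simplified, hypothetical rule written in plain Python. It only borrows the idea of a "loss not decreasing" check; the function name, the `patience` and `min_delta` parameters, and the logic are my own assumptions, not the implementation SageMaker uses.

```python
def loss_not_decreasing(losses, patience=3, min_delta=0.0):
    """Return the first step at which the loss has failed to improve
    for `patience` consecutive steps, or None if that never happens."""
    best = float("inf")
    stale = 0
    for step, loss in enumerate(losses):
        if loss < best - min_delta:
            best = loss   # new best loss: reset the stale counter
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return step
    return None


healthy = [1.0, 0.8, 0.6, 0.5, 0.4]
stuck = [1.0, 0.9, 0.9, 0.95, 0.9, 0.9]

print(loss_not_decreasing(healthy))  # None
print(loss_not_decreasing(stuck))    # 4
```

A rule of this kind only needs the emitted loss values, not the training process itself, which is why it can run in a separate job.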


Written by Richard Fan
