Document data encoding / decoding in inference.md (#884)
Custom endpoints need to encode and decode the data passed to their callback functions.

This is unclear in the current documentation, and one may assume the data already arrives in `dict` format (as I did). I therefore updated the examples to make the encoding and decoding of the JSON string arguments explicit.
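As a minimal illustration of the point (a hedged sketch using Python's standard `json` module in place of the toolkit's `decoder_encoder`, with a made-up payload):

```python
import json

# what a callback actually receives: a JSON string, not a dict
raw = '{"inputs": "I love this!"}'

# indexing the raw string as if it were a dict (raw["inputs"]) raises TypeError;
# the request body must be decoded explicitly first
data = json.loads(raw)
print(data["inputs"])  # I love this!

# and the response must be encoded back to a string before returning it
response = json.dumps({"label": "POSITIVE"})
```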
timsteuer authored Nov 23, 2023
1 parent ccaa63c commit 09dd417
Showing 1 changed file with 32 additions and 7 deletions.
docs/sagemaker/inference.md

The `inference.py` file contains your custom inference module, and the `requirements.txt` file contains additional dependencies.
Here is an example of a custom inference module with `model_fn`, `input_fn`, `predict_fn`, and `output_fn`:

```python
from sagemaker_huggingface_inference_toolkit import decoder_encoder

def model_fn(model_dir):
    # implement custom code to load the model
    loaded_model = ...

    return loaded_model

def input_fn(input_data, content_type):
    # decode the input data (e.g. JSON string -> dict)
    data = decoder_encoder.decode(input_data, content_type)
    return data

def predict_fn(data, model):
    # call your custom model with the data
    outputs = model(data, ...)
    return outputs

def output_fn(prediction, accept):
    # convert the model output to the desired output format (e.g. dict -> JSON string)
    response = decoder_encoder.encode(prediction, accept)
    return response
```
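To see how the four handlers fit together, here is a self-contained sketch of one request's flow. The toy upper-casing "model" and the plain-`json` stand-ins for `decoder_encoder.decode`/`encode` are assumptions for illustration, not the toolkit's actual implementation:

```python
import json

# stand-ins for decoder_encoder.decode / .encode, covering only the JSON case
def decode(body, content_type):
    assert content_type == "application/json"
    return json.loads(body)

def encode(data, accept):
    assert accept == "application/json"
    return json.dumps(data)

def model_fn(model_dir):
    # toy "model": upper-cases every input string
    return lambda data: {"outputs": [s.upper() for s in data["inputs"]]}

def input_fn(input_data, content_type):
    return decode(input_data, content_type)

def predict_fn(data, model):
    return model(data)

def output_fn(prediction, accept):
    return encode(prediction, accept)

# the toolkit calls the handlers in this order for each request:
model = model_fn("/opt/ml/model")
request_body = json.dumps({"inputs": ["hello", "world"]})
data = input_fn(request_body, "application/json")
prediction = predict_fn(data, model)
response = output_fn(prediction, "application/json")
print(response)  # {"outputs": ["HELLO", "WORLD"]}
```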

Alternatively, customize your inference module with only `model_fn` and `transform_fn`:

```python
from sagemaker_huggingface_inference_toolkit import decoder_encoder

def model_fn(model_dir):
    # implement custom code to load the model
    loaded_model = ...

    return loaded_model

def transform_fn(model, input_data, content_type, accept):
    # decode the input data (e.g. JSON string -> dict)
    data = decoder_encoder.decode(input_data, content_type)

    # call your custom model with the data
    outputs = model(data, ...)

    # convert the model output to the desired output format (e.g. dict -> JSON string)
    response = decoder_encoder.encode(outputs, accept)

    return response
```
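A single `transform_fn` simply collapses the decode, predict, and encode steps into one callback. The following runnable sketch makes that concrete; the toy string-reversing model and the plain-`json` stand-ins for `decoder_encoder` are assumptions for illustration only:

```python
import json

def model_fn(model_dir):
    # toy "model" standing in for real model loading: reverses each input string
    return lambda data: {"outputs": [s[::-1] for s in data["inputs"]]}

def transform_fn(model, input_data, content_type, accept):
    # decode -> predict -> encode in a single handler
    data = json.loads(input_data)   # stand-in for decoder_encoder.decode
    outputs = model(data)
    return json.dumps(outputs)      # stand-in for decoder_encoder.encode

model = model_fn("/opt/ml/model")
response = transform_fn(model, '{"inputs": ["abc"]}',
                        "application/json", "application/json")
print(response)  # {"outputs": ["cba"]}
```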
