Bug: payload validation is skipped when item is returned via `ReturnValuesOnConditionCheckFailure` #3781

dreamorosi · 2024-02-16T16:48:32Z

Expected Behaviour

When making a function idempotent via the @idempotent decorator and enabling payload validation the Idempotency utility should raise a payload validation exception when the payload in subsequent requests differs from the one stored.

With this configuration:

persistence_layer = DynamoDBPersistenceLayer(table_name=os.environ["TABLE_NAME"])
config = IdempotencyConfig(
    event_key_jmespath='["rcn", "subscriptionType"]',
    payload_validation_jmespath="customerAccounts",
)

we are using two fields from the payload to generate the idempotency hash (rcn and subscriptionType) but we are using a different key (customerAccounts) for payload validation.

If we send the following two requests:

{
  "rcn": "RCN-000-000-000-000",
  "customerAccounts": {
    "customerAccount": [
      {
        "accountId": "12345",
        "accountType": "someaccounttype"
      }
    ]
  },
  "subscriptionType": "somesubscriptiontype"
}

and

{
  "rcn": "RCN-000-000-000-000",
  "customerAccounts": {
    "customerAccount": [
      {
        "accountId": "54321",
        "accountType": "someaccounttype"
      }
    ]
  },
  "subscriptionType": "somesubscriptiontype"
}

These two requests should be considered idempotent because the compound value of rcn and subscriptionType is the same across requests.

At the same time, the values under customerAccounts have changed between requests (note customerAccounts.customerAccount[0].accountId being different). This means the payload validation should fail in the second request.

Current Behaviour

The function does not throw a validation error even though payload validation is enabled and parts of the payload differ from the stored one.

See screenshot below, where I send both requests and both are successful:

Code snippet

import os
from aws_lambda_powertools.utilities.idempotency import (
    DynamoDBPersistenceLayer,
    IdempotencyConfig,
    idempotent,
)
from aws_lambda_powertools.utilities.typing import LambdaContext

persistence_layer = DynamoDBPersistenceLayer(table_name=os.environ["TABLE_NAME"])
config = IdempotencyConfig(
    event_key_jmespath='["rcn", "subscriptionType"]',
    payload_validation_jmespath="customerAccounts",
)

@idempotent(persistence_store=persistence_layer, config=config)
def handler(event: dict, context: LambdaContext):
    return True

Possible Solution

As part of a feature added in #3446 we are using DynamoDB ReturnValuesOnConditionCheckFailure whenever available. This feature allows the DynamoDB to return an item that caused a conditional operation to fail, which in turn allows the Idempotency utility to get the IdempotencyRecord of a request without having to do an additional read.

In the base idempotency class (here), we have added an if/else statement that uses the returned item when available, or otherwise it calls the _get_idemptoency_record() method when one is not available - this last case can happen when customers are using older versions of boto3.

With the current implementation, the logic that validates the payload is executed only in the else clause, which causes payload validation to be skipped.

A potential fix would be to make the _validate_payload() method in the BasePersistenceLayer class and then call it on the exc.old_data_record, similar to this:

record = exc.old_data_record
if exc.old_data_record is not None:
  self.persistence_store.validate_payload(
    self.data,
    record
  )
  else:
    record = self._get_idempotency_record()

Steps to Reproduce

See steps above.

Powertools for AWS Lambda (Python) version

latest

AWS Lambda function runtime

3.12

Packaging format used

PyPi

Debugging logs

No response

The text was updated successfully, but these errors were encountered:

dreamorosi · 2024-02-16T16:52:20Z

I have tried implementing the fix I was suggesting in the OP but I wasn't able to make the tests pass :/

rubenfonseca · 2024-02-19T13:59:54Z

Thank you for opening this detailed issue Andrea! I'll try to finish the PR

rubenfonseca · 2024-02-20T09:20:29Z

So acording to

powertools-lambda-python/aws_lambda_powertools/utilities/idempotency/persistence/dynamodb.py

Lines 251 to 267 in c135cec

    
           except ClientError as exc: 
        
               error_code = exc.response.get("Error", {}).get("Code") 
        
               if error_code == "ConditionalCheckFailedException": 
        
                   old_data_record = self._item_to_data_record(exc.response["Item"]) if "Item" in exc.response else None 
        
                   if old_data_record is not None: 
        
                       logger.debug( 
        
                           f"Failed to put record for already existing idempotency key: " 
        
                           f"{data_record.idempotency_key} with status: {old_data_record.status}, " 
        
                           f"expiry_timestamp: {old_data_record.expiry_timestamp}, " 
        
                           f"and in_progress_expiry_timestamp: {old_data_record.in_progress_expiry_timestamp}", 
        
                       ) 
        
                       self._save_to_cache(data_record=old_data_record) 
        
                       try: 
        
                           self._validate_payload(data_payload=data_record, stored_data_record=old_data_record) 
        
                       except IdempotencyValidationError as idempotency_validation_error: 
        
                           raise idempotency_validation_error from exc

we are already validating the payload when the conditional exception fails. So I need to dig deeper to understand why it isn't working. I'll start with a test.

heitorlessa · 2024-02-20T12:31:46Z

couldn't find the test for this enhancement (that potentially triggered this regression) -- looking at the history now

heitorlessa · 2024-02-20T12:54:26Z

found the test that was modified with this feature but seems to be working as expected, will look at E2E and create one if there isn't one to be absolutely sure.

Tasks

Reproduce
Create E2E for payload validation (in addition to functional to be extra safe)
Create a test for nested field validation error
Create functional test

E2E Handler for payload validation

import os
import uuid

from aws_lambda_powertools.utilities.idempotency import (
    DynamoDBPersistenceLayer,
    IdempotencyConfig,
    idempotent,
)

TABLE_NAME = os.getenv("IdempotencyTable", "")
persistence_layer = DynamoDBPersistenceLayer(table_name=TABLE_NAME)
config = IdempotencyConfig(event_key_jmespath='["refund_id", "customer_id"]', payload_validation_jmespath="amount")


@idempotent(config=config, persistence_store=persistence_layer)
def lambda_handler(event, context):
    return {"request": str(uuid.uuid4())}

E2E test for payload validation tampering

@pytest.mark.xdist_group(name="idempotency")
def test_payload_tampering_validation(payload_tampering_validation_fn_arn: str):
    # GIVEN a transaction with the idempotency key on refund and customer IDs
    transaction = {
        "refund_id": "ffd11882-d476-4598-bbf1-643f2be5addf",
        "customer_id": "9e9fc440-9e65-49b5-9e71-1382ea1b1658",
        "amount": 100,
    }

    # AND a second transaction with the exact idempotency key but different amount
    second_transaction = {**transaction, "amount": 1000}

    # WHEN we make both requests to a Lambda Function that enabled payload validation
    data_fetcher.get_lambda_response(
        lambda_arn=payload_tampering_validation_fn_arn, payload=json.dumps(transaction)
    )

    # THEN we should receive a payload validation error in the second request
    with pytest.raises(RuntimeError, match="Payload does not match stored record"):
        data_fetcher.get_lambda_response(
            lambda_arn=payload_tampering_validation_fn_arn,
            payload=json.dumps(second_transaction),
        )

heitorlessa · 2024-02-20T14:46:09Z

okay managed to reproduce - validation at nested fields aren't working, that's why tests worked but @dreamorosi payload didn't. Docs also use a single key/value, compound would also work, but nested fields didn't.

NOTE. This is not related to ReturnValueOnConditionCheckFailure. Our tests caught the change before, and that's why @rubenfonseca wasn't able to validate.

I've created a functional test along w/ some basic tooling (we sorely need testing tooling) to make it easier to identify the root cause. I suspect it might be down to ordering as hashing it's occurring.

def test_idempotent_lambda_already_completed_with_validation_bad_nested_payload(
    persistence_store: DynamoDBPersistenceLayer,
    timestamp_future,
    lambda_context,
    request: FixtureRequest,
):
    # GIVEN an idempotency config with a compound idempotency key (refund, customer_id)
    # AND with payload validation key to prevent tampering

    validation_key = "details"
    idempotency_config = IdempotencyConfig(
        event_key_jmespath='["refund_id", "customer_id"]', payload_validation_jmespath=validation_key
    )

    # AND a previous transaction already processed in the persistent store
    transaction = {
        "refund_id": "ffd11882-d476-4598-bbf1-643f2be5addf",
        "customer_id": "9e9fc440-9e65-49b5-9e71-1382ea1b1658",
        "details": {"company_name": "Parker, Johnson and Rath", "currency": "Turkish Lira"},
    }

    stubber = stub.Stubber(persistence_store.client)
    ddb_response = build_idempotency_put_item_response_stub(
        data=transaction,
        expiration=timestamp_future,
        status="COMPLETED",
        request=request,
        validation_data=transaction[validation_key],
    )

    stubber.add_client_error("put_item", "ConditionalCheckFailedException", modeled_fields=ddb_response)
    stubber.activate()

    # AND an upcoming tampered transaction
    tampered_transaction = copy.deepcopy(transaction)
    tampered_transaction["details"]["currency"] = "Euro"

    @idempotent(config=idempotency_config, persistence_store=persistence_store)
    def lambda_handler(event, context):
        return event

    # WHEN the tampered request is made
    # THEN we should raise
    with pytest.raises(IdempotencyValidationError):
        lambda_handler(tampered_transaction, lambda_context)

    stubber.assert_no_pending_responses()
    stubber.deactivate()

heitorlessa · 2024-02-20T15:53:01Z

Turns out I only managed to reproduce because of fat finger ... I used the wrong validation key after editing the wrong file (same name but local cache for E2E).

I can't reproduce with both functional and E2E tests. I'm gonna have another look at the original snippet see if I can spot anything, and push a PR to improve our tests regardless.

That said, TS has another issue I thought it affected us... ordering. We deep sort payloads both incoming and for validation before generating a hash.

Phew!

heitorlessa · 2024-02-20T16:17:57Z

Waiting for E2E to run with this new case (functional tests are passing) -- I can't for the love of me reproduce it. Added exact same nesting and change in functional and E2E test above in the PR -- I'll try one last time with a manual Lambda tomorrow.

https://github.com/aws-powertools/powertools-lambda-python/actions/runs/7976558714

dreamorosi · 2024-02-20T17:37:50Z

I'll try one last time with a manual Lambda tomorrow.

No need.

I have deployed the manual stack from scratch & ran the two payloads again and the second request is throwing as expected.

Given this I can only assume that I must have made a mistake. Apologies for making you run in circles and wasting your time @rubenfonseca & @heitorlessa.

I'll be more thorough in the future.

heitorlessa · 2024-02-21T07:59:46Z

not at all. First, better safe than sorry. Second, this was an useful exercise to improve our test coverage on payload validation as it was overly simplistic :)

If anything, keep it coming @dreamorosi. It was also useful exercise to discover the ordering issue in TS nevertheless.

In case I haven't said it this month... it's a pleasure working with you :)

github-actions · 2024-02-21T08:00:15Z

⚠️COMMENT VISIBILITY WARNING⚠️

This issue is now closed. Please be mindful that future comments are hard for our team to see.

If you need more assistance, please either tag a team member or open a new issue that references this one.

If you wish to keep having a conversation with other community members under this issue feel free to do so.

dreamorosi added bug Something isn't working triage Pending triage from maintainers labels Feb 16, 2024

github-project-automation bot added this to Powertools for AWS Lambda (Python) Feb 16, 2024

github-project-automation bot moved this to Triage in Powertools for AWS Lambda (Python) Feb 16, 2024

dreamorosi added the idempotency Idempotency utility label Feb 16, 2024

dreamorosi mentioned this issue Feb 16, 2024

fix(idempotency): validate idempotency record returned in conditional write aws-powertools/powertools-lambda-typescript#2083

Merged

9 tasks

rubenfonseca removed the triage Pending triage from maintainers label Feb 19, 2024

rubenfonseca moved this from Triage to Working on it in Powertools for AWS Lambda (Python) Feb 19, 2024

rubenfonseca self-assigned this Feb 19, 2024

heitorlessa added cant-reproduce Any issues that cannot be reproduced and removed bug Something isn't working labels Feb 20, 2024

heitorlessa mentioned this issue Feb 20, 2024

chore(tests): increase idempotency coverage with nested payload tampering tests #3809

Merged

7 tasks

github-actions bot added the pending-release Fix or implementation already in dev waiting to be released label Feb 20, 2024

heitorlessa closed this as completed Feb 21, 2024

github-project-automation bot moved this from Working on it to Coming soon in Powertools for AWS Lambda (Python) Feb 21, 2024

heitorlessa moved this from Coming soon to Shipped in Powertools for AWS Lambda (Python) Feb 21, 2024

heitorlessa removed the pending-release Fix or implementation already in dev waiting to be released label Feb 21, 2024

dreamorosi mentioned this issue Feb 21, 2024

Feature request: deep sort payload hashing aws-powertools/powertools-lambda-typescript#2120

Closed

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bug: payload validation is skipped when item is returned via `ReturnValuesOnConditionCheckFailure` #3781

Bug: payload validation is skipped when item is returned via `ReturnValuesOnConditionCheckFailure` #3781

dreamorosi commented Feb 16, 2024 •

edited

Loading

dreamorosi commented Feb 16, 2024

rubenfonseca commented Feb 19, 2024

rubenfonseca commented Feb 20, 2024

heitorlessa commented Feb 20, 2024

heitorlessa commented Feb 20, 2024 •

edited

Loading

heitorlessa commented Feb 20, 2024 •

edited

Loading

heitorlessa commented Feb 20, 2024

heitorlessa commented Feb 20, 2024

dreamorosi commented Feb 20, 2024

heitorlessa commented Feb 21, 2024

github-actions bot commented Feb 21, 2024

Bug: payload validation is skipped when item is returned via ReturnValuesOnConditionCheckFailure #3781

Bug: payload validation is skipped when item is returned via ReturnValuesOnConditionCheckFailure #3781

Comments

dreamorosi commented Feb 16, 2024 • edited Loading

Expected Behaviour

Current Behaviour

Code snippet

Possible Solution

Steps to Reproduce

Powertools for AWS Lambda (Python) version

AWS Lambda function runtime

Packaging format used

Debugging logs

dreamorosi commented Feb 16, 2024

rubenfonseca commented Feb 19, 2024

rubenfonseca commented Feb 20, 2024

heitorlessa commented Feb 20, 2024

heitorlessa commented Feb 20, 2024 • edited Loading

heitorlessa commented Feb 20, 2024 • edited Loading

heitorlessa commented Feb 20, 2024

heitorlessa commented Feb 20, 2024

dreamorosi commented Feb 20, 2024

heitorlessa commented Feb 21, 2024

github-actions bot commented Feb 21, 2024

⚠️COMMENT VISIBILITY WARNING⚠️

Bug: payload validation is skipped when item is returned via `ReturnValuesOnConditionCheckFailure` #3781

Bug: payload validation is skipped when item is returned via `ReturnValuesOnConditionCheckFailure` #3781

dreamorosi commented Feb 16, 2024 •

edited

Loading

heitorlessa commented Feb 20, 2024 •

edited

Loading

heitorlessa commented Feb 20, 2024 •

edited

Loading