Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Heap-Use-After-Free (UAF) in tv-casting-app when deleting a fabric after casting #36920

Open
BoB13-Matter opened this issue Dec 20, 2024 · 8 comments · May be fixed by #37199
Open

[BUG] Heap-Use-After-Free (UAF) in tv-casting-app when deleting a fabric after casting #36920

BoB13-Matter opened this issue Dec 20, 2024 · 8 comments · May be fixed by #37199
Assignees
Labels

Comments

@BoB13-Matter
Copy link
Contributor

Reproduction steps

  1. Build the tv-casting-app with AddressSanitizer (ASAN) enabled.

  2. Open Terminal 1 and start the tv-casting-app:

    $ ./tv-casting-app
  3. Open Terminal 2 and start the tv-app:

    $ ./tv-app
  4. In the tv-casting-app shell, send a cast request:

    tv-casting-app > cast request 0
  5. In the tv-app shell, respond with:

    tv-app > controller ux ok
  6. In the tv-casting-app, print the list of fabrics and delete a specific fabric:

    tv-casting-app > cast print-fabrics
    tv-casting-app > cast delete-fabric <fabricId>
  7. Observe the ASAN output in tv-casting-app indicating a heap-use-after-free issue.
    tv-casting-app UAF.txt

Summary

A heap-use-after-free (UAF) issue occurs in the tv-casting-app when the delete-fabric command is executed. After removing a fabric, the system attempts to access a freed ReadClient object.

Analysis and Description

The issue arises from a lifecycle mismatch between fabric cleanup and the ReadClient object references. The sequence of events leading to the issue is as follows:

  1. Fabric Removal (FabricTable.cpp):

    • The Delete() method of FabricTable is called, which iterates through its delegates and invokes OnFabricRemoved():

      CHIP_ERROR FabricTable::Delete(FabricIndex fabricIndex)
      {
          if (mDelegateListRoot != nullptr)
          {
              FabricTable::Delegate * delegate = mDelegateListRoot;
              while (delegate)
              {
                  FabricTable::Delegate * nextDelegate = delegate->next;
                  delegate->OnFabricRemoved(*this, fabricIndex);
                  delegate = nextDelegate;
              }
          }
      }
      
  2. ReadClient Cleanup (InteractionModelEngine.cpp):

    • The OnFabricRemoved() method loops through active ReadClient objects and calls Close():

      void InteractionModelEngine::OnFabricRemoved(const FabricTable & fabricTable, FabricIndex fabricIndex)
      {
          for (auto * readClient = mpActiveReadClientList; readClient != nullptr; readClient = readClient->GetNextClient())
          {
              if (readClient->GetFabricIndex() == fabricIndex)
              {
                  readClient->Close(CHIP_ERROR_IM_FABRIC_DELETED, false);
              }
          }
      }
      
  3. ReadClient Deallocation (ReadClient.cpp):

    • The Close() method triggers the callback mpCallback.OnDone(this), marking the ReadClient for destruction:

      void ReadClient::Close(CHIP_ERROR aError, bool allowResubscription)
      {
          ...
          mpCallback.OnDone(this);
      }
      
  4. Use of Freed Memory:

    • Despite being destroyed, the ReadClient remains in the mpActiveReadClientList. Subsequent iterations in OnFabricRemoved() access the dangling pointer. Specifically, the readClient->GetNextClient() call accesses the mpNext pointer of the destroyed ReadClient object, leading to a UAF:

      ReadClient * GetNextClient() { return mpNext; }
    • This happens because GetNextClient() does not verify the validity of the mpNext pointer, which now points to a deallocated memory region.

Bug prevalence

always

GitHub hash of the SDK that was being used

04e6a68

Platform

core

Platform Version(s)

all versions

Anything else?

No response

@bzbarsky-apple
Copy link
Contributor

@BoB13-Matter This would be an issue for any fabric removal with a live ReadClient, right? Not specific to tv-casting?

@yunhanw-google yunhanw-google self-assigned this Jan 3, 2025
@yunhanw-google
Copy link
Contributor

yunhanw-google commented Jan 8, 2025

Just looking at this issue, application would take care of the delete for readClient inside ReadClient::OnDone, in the destructor of readClient, we will take care of the object removal in mpActiveReadClientList via mpImEngine->RemoveReadClient(this); this issue should be fine?

maybe tv-casting-app does not delete readClient object in ReadClient::onDone?

ReadClient::~ReadClient()
{
...
if (mpImEngine)
{
mpImEngine->RemoveReadClient(this);
}
}
}

one improvement in readClient we can do is to reset mPeer in readClient::Close so that the below GetFabricIndex would get the wrong value.
if (readClient->GetFabricIndex() == fabricIndex)
{
readClient->Close(CHIP_ERROR_IM_FABRIC_DELETED, false);
}

@bzbarsky-apple
Copy link
Contributor

The problem is we are holding a reference to the now-deleted thing in the loop in InteractionModelEngine::OnFabricRemoved, no? @yunhanw-google

@yunhanw-google
Copy link
Contributor

yunhanw-google commented Jan 19, 2025

@bzbarsky-apple
you mean that we hold the reference for now deleted readClient in the below loop code in InteractionModelEngine::OnFabricRemoved ?

    for (auto * readClient = mpActiveReadClientList; readClient != nullptr; readClient = readClient->GetNextClient())
    {
        if (readClient->GetFabricIndex() == fabricIndex)
        {
            ChipLogProgress(InteractionModel, "Fabric removed, deleting obsolete read client with FabricIndex: %u", fabricIndex);
            readClient->Close(CHIP_ERROR_IM_FABRIC_DELETED, false);
        }
    }

I think mpImEngine->RemoveReadClient(this) in ~ReadClient() has removed that readClient reference?

void InteractionModelEngine::RemoveReadClient(ReadClient * apReadClient)
{
    ReadClient * pPrevListItem = nullptr;
    ReadClient * pCurListItem  = mpActiveReadClientList;

    while (pCurListItem != apReadClient)
    {
        //
        // Item must exist in this tracker list. If not, there's a bug somewhere.
        //
        VerifyOrDie(pCurListItem != nullptr);

        pPrevListItem = pCurListItem;
        pCurListItem  = pCurListItem->GetNextClient();
    }

    if (pPrevListItem)
    {
        pPrevListItem->SetNextClient(apReadClient->GetNextClient());
    }
    else
    {
        mpActiveReadClientList = apReadClient->GetNextClient();
    }

    apReadClient->SetNextClient(nullptr);
}

@bzbarsky-apple
Copy link
Contributor

you mean that we hold the reference for now deleted readClient in the below loop code

Yes.

I think mpImEngine->RemoveReadClient(this) in ~ReadClient() has removed that readClient reference?

How could it? The reference is held on the stack in the readClient variable.

@andy31415
Copy link
Contributor

So the stack trace shows that Close eventually calls delete:

    #1 0x5dc1137ca60d in chip::Platform::MemoryFree(void*) /home/kcw/Matter/connectedhomeip/out/linux-x64-tv-casting-app-asan-ubsan-clang/../../examples/tv-casting-app/linux/third_party/connectedhomeip/src/lib/support/CHIPMem-Malloc.cpp:116:5
...
   #17 0x5dc11438fa8d in chip::app::ReadClient::Close(chip::ChipError, bool) /home/kcw/Matter/connectedhomeip/out/linux-x64-tv-casting-app-asan-ubsan-clang/../../examples/tv-casting-app/linux/third_party/connectedhomeip/src/app/ReadClient.cpp:235:16
    #18 0x5dc114292c19 in chip::app::InteractionModelEngine::OnFabricRemoved(chip::FabricTable const&, unsigned char) /home/kcw/Matter/connectedhomeip/out/linux-x64-tv-casting-app-asan-ubsan-clang/../../examples/tv-casting-app/linux/third_party/connectedhomeip/src/app/InteractionModelEngine.cpp:1943:25

And the loop is

    for (auto * readClient = mpActiveReadClientList; readClient != nullptr; readClient = readClient->GetNextClient())
    {
        if (readClient->GetFabricIndex() == fabricIndex)
        {
            ChipLogProgress(InteractionModel, "Fabric removed, deleting obsolete read client with FabricIndex: %u", fabricIndex);
            readClient->Close(CHIP_ERROR_IM_FABRIC_DELETED, false);
        }
    }

So we are deleting readClient and then trying to do readClient->GetNextClient(). This looks buggy for sure.

@yunhanw-google
Copy link
Contributor

Agree, the loop variable is pointing to deleted memory.... And the body of the for will delete the object readClient points to. So then readClient->GetNextClient() will be use-after-free.

@yunhanw-google
Copy link
Contributor

Just create the PR for the fix and corresponding unit test to cover, thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Todo
Development

Successfully merging a pull request may close this issue.

4 participants