Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gin.Context data race in most trivial situation #4117

Open
TelpeNight opened this issue Dec 16, 2024 · 1 comment
Open

gin.Context data race in most trivial situation #4117

TelpeNight opened this issue Dec 16, 2024 · 1 comment

Comments

@TelpeNight
Copy link

TelpeNight commented Dec 16, 2024

Description

gin.Context implements context.Context. In my opinion, it should follows context.Context package's thread safety guaranties.

However gin doc states, that gin.Context is valid only in a request scope. And one should call Copy to avoid pitfalls.

Copy returns a copy of the current context that can be safely used outside the request's scope. This has to be used when the context has to be passed to a goroutine. 

But consider following example.

gn.GET("/", func(ctx *gin.Context) {
      req, err := http.NewRequestWithContext(ctx, http.MethodGet, remote.URL, nil)
      if err != nil {
	      ctx.AbortWithStatus(500)
	      return
      }
      resp, err := http.DefaultClient.Do(req)
      if err != nil {
	      ctx.AbortWithStatus(500)
	      return
      }
      discardResp(resp)
      ctx.String(http.StatusOK, "ok")
})

This should work according to the documentation and general expectations. However, with ContextWithFallback we would get following result on context cancellation:

WARNING: DATA RACE
Write at 0x00c00026c020 by goroutine 306:
  github.com/gin-gonic/gin.(*Engine).ServeHTTP()
      /Users/nick/go/pkg/mod/github.com/gin-gonic/[email protected]/gin.go:586 +0xe7
  net/http.serverHandler.ServeHTTP()
      /Users/nick/go/pkg/mod/golang.org/[email protected]/src/net/http/server.go:3142 +0x2a1
  net/http.(*conn).serve()
      /Users/nick/go/pkg/mod/golang.org/[email protected]/src/net/http/server.go:2044 +0x13c4
  net/http.(*Server).Serve.gowrap3()
      /Users/nick/go/pkg/mod/golang.org/[email protected]/src/net/http/server.go:3290 +0x4f

Previous read at 0x00c00026c020 by goroutine 234:
  github.com/gin-gonic/gin.(*Context).hasRequestContext()
      /Users/nick/go/pkg/mod/github.com/gin-gonic/[email protected]/context.go:1220 +0x1b3
  github.com/gin-gonic/gin.(*Context).Value()
      /Users/nick/go/pkg/mod/github.com/gin-gonic/[email protected]/context.go:1263 +0x24a
  context.value()
      /Users/nick/go/pkg/mod/golang.org/[email protected]/src/context/context.go:787 +0x34d
  context.(*valueCtx).Value()
      /Users/nick/go/pkg/mod/golang.org/[email protected]/src/context/context.go:756 +0x8d
  net/http/httptrace.ContextClientTrace()
      /Users/nick/go/pkg/mod/golang.org/[email protected]/src/net/http/httptrace/trace.go:25 +0x4a7
  net/http.(*persistConn).readLoop()
      /Users/nick/go/pkg/mod/golang.org/[email protected]/src/net/http/transport.go:2152 +0x423
  net/http.(*Transport).dialConn.gowrap2()
      /Users/nick/go/pkg/mod/golang.org/[email protected]/src/net/http/transport.go:1799 +0x33

Goroutine 306 (running) created at:
  net/http.(*Server).Serve()
      /Users/nick/go/pkg/mod/golang.org/[email protected]/src/net/http/server.go:3290 +0x8ec
  net/http/httptest.(*Server).goServe.func1()
      /Users/nick/go/pkg/mod/golang.org/[email protected]/src/net/http/httptest/server.go:310 +0xbb

Goroutine 234 (finished) created at:
  net/http.(*Transport).dialConn()
      /Users/nick/go/pkg/mod/golang.org/[email protected]/src/net/http/transport.go:1799 +0x2773
  net/http.(*Transport).dialConnFor()
      /Users/nick/go/pkg/mod/golang.org/[email protected]/src/net/http/transport.go:1485 +0x124
  net/http.(*Transport).queueForDial.gowrap1()
      /Users/nick/go/pkg/mod/golang.org/[email protected]/src/net/http/transport.go:1449 +0x44

I won't dive deeply into reasoning. Default http library leaks context to other gorouting outside of request scope. The main problem is that programmer shouldn't know implementation details of http lib! It should just work!

Consider other example in the code below. Most of the time we can't even call Copy in some deep nested code.

So, now the only way of safe usage of gin.Context is to call Copy on every conversion to the context.Context. But this makes the whole point of implementing context.Context useless.

But even this is not always possible in middleware/generated code world. So we are abandoning usage of gin in out tech stack.

But I hope this will help to improve gin, as it is truly good http framework.

My proposals:

  1. Either disable context pooling when ContextWithFallback is enabled.
  2. Or stop implementing context.Context and provide explicit GetContext/MakeContext methods, that will return threadsafe context. As this is the breaking change, consider it for gin/v2

How to reproduce

go test -race -bench=.

package tests

import (
	"context"
	"errors"
	"io"
	"math/rand/v2"
	"net/http"
	"net/http/httptest"
	"testing"
	"time"

	"github.com/gin-gonic/gin"
)

func BenchmarkGinDataRace(b *testing.B) {
	remote := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		time.Sleep(randDuration(100*time.Millisecond, 500*time.Millisecond))
		w.WriteHeader(http.StatusOK)
		_, _ = io.WriteString(w, "ok")
	}))
	defer remote.Close()

	makeSomeInnerWork := func(ctx context.Context) error {
		//can't copy here
		req, err := http.NewRequestWithContext(ctx, http.MethodGet, remote.URL, nil)
		if err != nil {
			return err
		}
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			return err
		}
		discardResp(resp)
		return nil
	}

	thisMaybeDeepnested := func(ctx context.Context) error {
		var value struct{}
		cxt := context.WithValue(ctx, "my", value)
		return makeSomeInnerWork(cxt)
	}

	gin.SetMode(gin.ReleaseMode)
	gn := gin.New()
	gn.ContextWithFallback = true
	gn.GET("/", func(ctx *gin.Context) {
		err := thisMaybeDeepnested(ctx)
		if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
			//ok
			err = nil
		}
		if err != nil {
			ctx.AbortWithStatus(500)
			b.Error(err)
			return
		}
		ctx.String(http.StatusOK, "ok")
	})

	gn.GET("/direct", func(ctx *gin.Context) {
		req, err := http.NewRequestWithContext(ctx, http.MethodGet, remote.URL, nil)
		if err != nil {
			ctx.AbortWithStatus(500)
			b.Error(err)
			return
		}
		resp, err := http.DefaultClient.Do(req)
		if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
			//ok
			err = nil
		}
		if err != nil {
			ctx.AbortWithStatus(500)
			b.Error(err)
			return
		}
		discardResp(resp)
		ctx.String(http.StatusOK, "ok")
	})

	ginServer := httptest.NewServer(gn)
	defer ginServer.Close()

	testCall := func() {
		ctx := context.Background()
		// our client doesn't want to wait for long
		ctx, cancel := context.WithTimeout(ctx, randDuration(0, 200*time.Millisecond))
		defer cancel()

		req, err := http.NewRequestWithContext(ctx, http.MethodGet, ginServer.URL, nil)
		if err != nil {
			b.Error(err)
			return
		}
		resp, err := http.DefaultClient.Do(req)
		if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
			return
		}
		if err != nil {
			b.Error(err)
			return
		}

		discardResp(resp)

		if resp.StatusCode != http.StatusOK {
			b.Fail()
		}
	}

	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			testCall()
		}
	})
}

func randDuration(from, to time.Duration) time.Duration {
	delta := to - from
	return from + time.Duration(rand.Int64N(int64(delta)))
}

func discardResp(resp *http.Response) {
	if resp == nil {
		return
	}
	_, _ = io.Copy(io.Discard, resp.Body)
	_ = resp.Body.Close()
}

Expectations

Just works

Actual result

DATA RACE as above

Environment

  • go version: go1.22.7 darwin/amd64
  • gin version: v1.10.0
@jerryyummy
Copy link

jerryyummy commented Dec 21, 2024

sync.Pool is not a cache in the true sense. It might be more appropriate to call it a "recycle bin". The data put into it is logically considered to have been deleted, but physically, the data still exists. This data can survive for two rounds of garbage collection time. i tested the code and i guess this will fail in high concurrency situation?
截屏2024-12-22 00 42 18
it seems that one goroutine has finished and the other get the same context and want to set sth, but if the routine has finished it should has returned the context to the pool so that the next can get the context

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants