@delner delner commented Dec 19, 2025

Problem

Many of my new tests were failing locally with WebMock errors.

The failures correlate with test duration, but the root cause is orphaned background threads that outlive the stubs registered by the test that spawned them:
                                                                                                                                                                                                                                                              
The sequence:                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                              
1. Test A creates State with api_key: "test-key" (no org_id) → spawns background login thread                                                                                                                                                                
2. Login fails (stub returns 500, or cassette doesn't have login stub) → thread enters exponential backoff retry                                                                                                                                             
3. Test A completes → WebMock.reset! clears stubs (or VCR cassette is ejected)                                                                                                                                                                               
4. Test B starts → registers new stub with different Authorization header (e.g., Bearer <BRAINTRUST_API_KEY>)                                                                                                                                                
5. Orphan thread wakes up, tries login with Authorization: Bearer test-key                                                                                                                                                                                   
6. Stub doesn't match → WebMock::NetConnectNotAllowedError                                                                                                                                                                                                   
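
The sequence above can be reproduced without WebMock. The following is a minimal sketch, not the library's code: `FakeStubRegistry` is a hypothetical stand-in for WebMock's stub registry, `"Bearer other-key"` is a placeholder for Test B's header, and the retry loop only approximates the login thread's backoff.

```ruby
# Hypothetical stand-in for WebMock's stub registry:
# maps an Authorization header to a canned HTTP status.
class FakeStubRegistry
  class NoMatchError < StandardError; end

  def initialize
    @stubs = {}
    @mutex = Mutex.new
  end

  def stub(auth_header, status)
    @mutex.synchronize { @stubs[auth_header] = status }
  end

  # Analogous to WebMock.reset! between tests.
  def reset!
    @mutex.synchronize { @stubs.clear }
  end

  def request(auth_header)
    @mutex.synchronize do
      @stubs.fetch(auth_header) { raise NoMatchError, "no stub for #{auth_header}" }
    end
  end
end

registry = FakeStubRegistry.new
attempts = Queue.new # statuses seen by the login thread
errors = Queue.new   # errors raised inside the login thread

# Test A: the login stub returns 500, so the thread backs off and retries.
registry.stub("Bearer test-key", 500)
login_thread = Thread.new do
  delay = 0.05
  3.times do
    begin
      status = registry.request("Bearer test-key")
      attempts << status
      break if status == 200
    rescue FakeStubRegistry::NoMatchError => e
      errors << e
      break
    end
    sleep delay
    delay *= 2 # exponential backoff
  end
end

first_status = attempts.pop # wait until the first login attempt has failed

# Test A completes: stubs are cleared while the thread is still sleeping.
registry.reset!

# Test B starts: registers a stub for a different Authorization header.
registry.stub("Bearer other-key", 200)

# The orphan thread wakes up, retries with the old header, and finds no stub.
login_thread.join
leaked_error = errors.empty? ? nil : errors.pop
```

The error is raised during Test B even though the request belongs to Test A, which is why the failures look unrelated to the test that reports them.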
                                                                                                                                                                                                                                                              
Why some CI jobs are more affected:                                                                                                                                                                                                                       
                                                                                                                                                                                                                                                                                              
- More tests = more opportunities for:                                                                                                                                                                                                                       
  - Threads to be spawned                                                                                                                                                                                                                                    
  - Threads to be in backoff when stubs change                                                                                                                                                                                                               
  - New tests to register incompatible stubs                                                                                                                                                                                                                 
- Longer test duration increases window for race condition      

Solution

There's a magic test key in the code:                                                                                                                                                                                                                        
                                                                                                                                                                                                                                                               
# In lib/braintrust/api/internal/auth.rb                                                                                                                                                                                                                     
if api_key == "test-api-key"                                                                                                                                                                                                                                 
  Log.debug("Login: using test API key, returning fake auth")                                                                                                                                                                                                
  return AuthResult.new(org_id: "test-org-id", ...)                                                                                                                                                                                                          
end                                                                                                                                                                                                                                                          
                                                                                                                                                                                                                                                              
So api_key: "test-api-key" skips HTTP entirely and returns fake auth.
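
To illustrate the effect, here is a toy model of that check. It is only a sketch: `ToyAuth` and its `http_calls` counter are hypothetical, and `AuthResult` is reduced to the one field visible in the snippet above (the real struct has more, elided there).

```ruby
# Hypothetical, reduced stand-in for the real AuthResult.
AuthResult = Struct.new(:org_id, keyword_init: true)

class ToyAuth
  attr_reader :http_calls

  def initialize
    @http_calls = 0
  end

  def login(api_key)
    # Magic key: return fake auth without touching the network.
    return AuthResult.new(org_id: "test-org-id") if api_key == "test-api-key"

    @http_calls += 1 # stands in for the real HTTP login request
    AuthResult.new(org_id: "org-from-server")
  end
end

auth = ToyAuth.new
fake = auth.login("test-api-key")
calls_after_magic_key = auth.http_calls # no request was made

auth.login("test-key")
calls_after_plain_key = auth.http_calls # one request was made
```

With no request to make, there is nothing to fail and therefore nothing to retry in the background.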

This also happens to short-circuit the login thread. Since this is the current idiomatic approach, apply it for now. In the future, we should eliminate this "magic" behavior and keep test behavior isolated to the test suite.
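
One possible direction for isolating this in the test suite, sketched under assumptions: a teardown step that kills any thread started during the test. `kill_threads_started_since` is a hypothetical helper, not something that exists in the repo, and a real version would need to spare threads the test framework itself owns.

```ruby
# Kills every thread that was started after the `before` snapshot was taken
# and is still alive. Returns the number of threads killed.
def kill_threads_started_since(before)
  leaked = (Thread.list - before).select(&:alive?)
  leaked.each(&:kill)
  leaked.each(&:join) # wait for each killed thread to finish terminating
  leaked.size
end

before = Thread.list          # snapshot taken in test setup
orphan = Thread.new { sleep } # stands in for a login thread stuck in backoff
killed = kill_threads_started_since(before) # called from test teardown
```

Called from a teardown hook, this would keep a login thread from one test from waking up inside the next.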

@delner delner self-assigned this Dec 19, 2025
@delner delner requested review from clutchski and realark December 19, 2025 00:35
@delner delner merged commit aeab52d into main Dec 19, 2025
7 checks passed
@delner delner deleted the fix/login_thread_leak branch December 19, 2025 00:40
  # Note: "test-api-key" triggers fake auth to avoid HTTP requests
  state = Braintrust.init(
-   api_key: "test-key",
+   api_key: "test-api-key",
constant?
