-
Notifications
You must be signed in to change notification settings - Fork 30
FIX: Fixing segmentation fault issue due to double free on SqlHandle::free in Linux #361
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request fixes a critical segmentation fault issue that occurs during Python interpreter shutdown when freeing ODBC handles. The fix prevents double-free errors and invalid memory access by skipping handle cleanup for both STMT (Type 3) and DBC (Type 2) handles during Python shutdown, as their parent handles may already be destructed.
Key Changes:
- Extended the shutdown protection in
SqlHandle::free()to skip freeing both DBC and STMT handles during Python shutdown (previously only STMT handles were protected) - Added comprehensive test suite with 13 test cases covering various shutdown scenarios including aggressive segfault reproduction, GC finalization order, exception handling, and circular references
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| mssql_python/pybind/ddbc_bindings.cpp | Updated SqlHandle::free() to skip freeing both Type 2 (DBC) and Type 3 (STMT) handles during Python shutdown, preventing segfaults from accessing already-freed parent handles |
| tests/test_013_SqlHandle_free_shutdown.py | Added comprehensive test suite with 13 test cases using subprocess isolation to verify handle cleanup behavior during Python shutdown across all handle types (ENV, DBC, STMT) |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
📊 Code Coverage Report
Diff CoverageDiff: main...HEAD, staged and unstaged changes
Summary
mssql_python/init.pyLines 85-104 85 This prevents resource leaks during interpreter shutdown by ensuring
86 all ODBC handles are freed in the correct order before Python finalizes.
87 """
88 # Make a copy of the connections to avoid modification during iteration
! 89 connections_to_close = list(_active_connections)
90
! 91 for conn in connections_to_close:
! 92 try:
93 # Check if connection is still valid and not closed
! 94 if hasattr(conn, "_closed") and not conn._closed:
95 # Close will handle both cursors and the connection
! 96 conn.close()
! 97 except Exception:
98 # Silently ignore errors during shutdown cleanup
99 # We're prioritizing crash prevention over error reporting
! 100 pass
101
102
103 # Register cleanup function to run before Python exits
104 atexit.register(_cleanup_connections)mssql_python/connection.pyLines 245-255 245 import mssql_python
246
247 if hasattr(mssql_python, "_register_connection"):
248 mssql_python._register_connection(self)
! 249 except (ImportError, AttributeError):
250 # If registration fails, continue - cleanup will still happen via __del__
! 251 pass
252
253 def _construct_connection_string(self, connection_str: str = "", **kwargs: Any) -> str:
254 """
255 Construct the connection string by parsing, validating, and merging parameters.📋 Files Needing Attention📉 Files with overall lowest coverage (click to expand)mssql_python.pybind.logger_bridge.cpp: 59.2%
mssql_python.row.py: 66.2%
mssql_python.helpers.py: 67.5%
mssql_python.pybind.connection.connection.cpp: 73.6%
mssql_python.pybind.ddbc_bindings.h: 76.9%
mssql_python.ddbc_bindings.py: 79.6%
mssql_python.pybind.connection.connection_pool.cpp: 79.6%
mssql_python.connection.py: 82.2%
mssql_python.cursor.py: 83.8%
mssql_python.logging.py: 85.3%🔗 Quick Links
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have put some comments to it - for the change in behavior.
@bewithgaurav since you had worked on this change earlier, please review this PR.
| from .pooling import PoolingManager | ||
|
|
||
| # Global registry for tracking active connections (using weak references) | ||
| _active_connections = weakref.WeakSet() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per my understanding, using WeakSet allows connections to be garbage collected before atexit cleanup runs, which means the segfault still occurs during GC?
The atexit cleanup runs too late and finds nothing?
I can imagine following scenario:
def query():
conn = connect(...) # Added to WeakSet
return conn.cursor(). fetchall()
# conn goes out of scope >> GC collects it
# SqlHandle::free() called >> SEGFAULT!
# atexit runs later, finds nothing (weakref died)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case should not be a concern, since GC doesn't run when things go out of scope
The scenario describes normal operation (not shutdown), where GC runs
During normal operation, pythonShuttingDown is False, so handles are freed properly anyways
The atexit mechanism only matters during actual interpreter shutdown
However, the WeakSet design means connections collected BEFORE shutdown won't be in the set
So, during:
Normal GC (during runtime): Works fine, handles freed normally
Shutdown GC: WeakSet may miss some connections, but those would be caught by the C++ shutdown protection anyway
@subrata-ms - please add if my analysis here is incorrect
Please also add a small description in comment of why we're using WeakRefs here - and what purpose are they serving (similar to whats added in ddbc bindings)
bewithgaurav
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some comments requesting changes
restructuring, adding tests covering multithreaded scenarios and some cleanup needed
| from .pooling import PoolingManager | ||
|
|
||
| # Global registry for tracking active connections (using weak references) | ||
| _active_connections = weakref.WeakSet() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case should not be a concern, since GC doesn't run when things go out of scope
The scenario describes normal operation (not shutdown), where GC runs
During normal operation, pythonShuttingDown is False, so handles are freed properly anyways
The atexit mechanism only matters during actual interpreter shutdown
However, the WeakSet design means connections collected BEFORE shutdown won't be in the set
So, during:
Normal GC (during runtime): Works fine, handles freed normally
Shutdown GC: WeakSet may miss some connections, but those would be caught by the C++ shutdown protection anyway
@subrata-ms - please add if my analysis here is incorrect
Please also add a small description in comment of why we're using WeakRefs here - and what purpose are they serving (similar to whats added in ddbc bindings)
| This module initializes the mssql_python package. | ||
| """ | ||
|
|
||
| import atexit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move connection lifecycle management to connection.py
The current implementation spreads connection lifecycle logic across two files (__init__.py and connection.py), requiring a circular import workaround with the lazy import mssql_python pattern inside Connection.__init__().
Recommend: Move _active_connections, _register_connection(), and _cleanup_connections() to connection.py as module-level code. This would:
- Eliminate the circular dependency pattern
- Keep all connection lifecycle logic in one place
- Reduce bloat in
__init__.py - Make the code more maintainable and easier to understand
The __init__.py should only handle module initialization and exports, not resource cleanup logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should avoid calling this as “_active_connections” in naming, because what we are actually tracking here are active ODBC handles that remain open until Python shutdown.
Also, if we move this into connection.py, it gives the impression that the list is tied to individual Connection instances, which isn’t accurate. These are module-level driver resources. With clearer naming (e.g., _active_odbc_handles or _open_driver_handles), it may make more sense to keep this in init.py.
@bewithgaurav what do you think of this?
| This module initializes the mssql_python package. | ||
| """ | ||
|
|
||
| import atexit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should avoid calling this as “_active_connections” in naming, because what we are actually tracking here are active ODBC handles that remain open until Python shutdown.
Also, if we move this into connection.py, it gives the impression that the list is tied to individual Connection instances, which isn’t accurate. These are module-level driver resources. With clearer naming (e.g., _active_odbc_handles or _open_driver_handles), it may make more sense to keep this in init.py.
@bewithgaurav what do you think of this?
Work Item / Issue Reference
Summary
Problem: Segmentation fault occurs during Python's garbage collection when freeing ODBC handles.
Stack Trace Analysis:
0-2: Signal handler (SIGSEGV)
3: libmsodbcsql-18.5.so.1.1 - CRASH LOCATION
4: SQLFreeHandle() from ODBC driver
5: SqlHandle::free() from ddbc_bindings
6-11: pybind11 call stack
12: Python method call
20: slot_tp_finalize during del
21-27: Python GC finalizing objects
Root Cause: The crash occurs when;
Fix Details:
This pull request introduces a critical fix to the handle cleanup logic in the SQL bindings for Python, specifically addressing potential segfaults during interpreter shutdown. The change ensures that both statement and database connection handles are not freed if Python is shutting down, preventing invalid memory access.
Handle cleanup logic improvements:
SqlHandle::free()method inmssql_python/pybind/ddbc_bindings.cppto skip freeing both statement (SQL_HANDLE_STMT) and database connection (SQL_HANDLE_DBC) handles during Python shutdown, rather than only statement handles. This prevents segfaults caused by freeing handles in the wrong order when their parent resources may have already been released.