
Python 3.12 xgboost.core.XGBoostError: Invalid Parameter format for nthread expect int but value='-1' when DMatrix used with import googlecloudprofiler. #144

@hwlodarczyk-rtbh

This issue was originally posted in the xgboost repo as dmlc/xgboost#10224.

Hi

I have a very peculiar error that appeared after I updated the versions of Python and the libraries in a project I'm working on.

A minimal example that reproduces the case:

# file.py
import googlecloudprofiler
from xgboost import DMatrix

DMatrix([[]])
print("works")

# requirements.txt
xgboost==2.0.3
google-cloud-profiler==4.1.0
#
numpy==1.26.4
scipy==1.13.0
google-api-python-client==2.125.0
google-auth==2.29.0
google-auth-httplib2==0.2.0
protobuf==4.25.3
requests==2.31.0
#
cachetools==5.3.3
certifi==2024.2.2
charset-normalizer==3.3.2
google-api-core==2.18.0
httplib2==0.22.0
idna==3.6
pyasn1==0.6.0
pyasn1_modules==0.4.0
pyparsing==3.1.2
rsa==4.9
uritemplate==4.1.1
urllib3==2.2.1

Python 3.12.2

Install with

pip install -r requirements.txt --no-deps

Run with

python file.py

Results in

Traceback (most recent call last):
  File "/project/path/file.py", line 4, in <module>
    DMatrix([[]])
  File "/venv/path/lib/python3.12/site-packages/xgboost/core.py", line 730, in inner_f
    return func(**kwargs)
           ^^^^^^^^^^^^^^
  File "/venv/path/lib/python3.12/site-packages/xgboost/core.py", line 857, in __init__
    handle, feature_names, feature_types = dispatch_data_backend(
                                           ^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/path/lib/python3.12/site-packages/xgboost/data.py", line 1081, in dispatch_data_backend
    return _from_list(data, missing, threads, feature_names, feature_types)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/path/lib/python3.12/site-packages/xgboost/data.py", line 1011, in _from_list
    return _from_numpy_array(array, missing, n_threads, feature_names, feature_types)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/path/lib/python3.12/site-packages/xgboost/data.py", line 207, in _from_numpy_array
    _check_call(
  File "/venv/path/lib/python3.12/site-packages/xgboost/core.py", line 282, in _check_call
    raise XGBoostError(py_str(_LIB.XGBGetLastError()))
xgboost.core.XGBoostError: Invalid Parameter format for nthread expect int but value='-1'

Removing import googlecloudprofiler from file.py "solves" the problem. I have no idea why merely importing the library causes it; it would make more sense if it happened only after googlecloudprofiler.start is called.

Moreover, the code works with xgboost==1.7.6 and fails from xgboost==2.0.0 onward.
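To compare the two cases side by side, each variant can be run in a fresh interpreter so that one import cannot affect the other. The following is only a minimal sketch: the helper name runs_cleanly is hypothetical (not part of xgboost or googlecloudprofiler), and it assumes both packages are installed in the active environment.

```python
import subprocess
import sys


def runs_cleanly(preimports, statement):
    """Run `statement` in a fresh interpreter after importing `preimports`.

    Returns (ok, last_stderr_line), where last_stderr_line is a list holding
    the final line of stderr, or an empty list if stderr was empty.
    NOTE: hypothetical helper, written for this report only.
    """
    lines = [f"import {name}" for name in preimports] + [statement]
    proc = subprocess.run(
        [sys.executable, "-c", "\n".join(lines)],
        capture_output=True,
        text=True,
    )
    return proc.returncode == 0, proc.stderr.strip().splitlines()[-1:]


if __name__ == "__main__":
    # Same statement as file.py, with and without the profiler import.
    stmt = "from xgboost import DMatrix; DMatrix([[]])"
    for pre in ([], ["googlecloudprofiler"]):
        ok, last = runs_cleanly(pre, stmt)
        print(pre, "ok" if ok else "failed", last)
```

Running this under different xgboost versions (for example 1.7.6 and 2.0.3) should show whether only the variant with the profiler import fails.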

A maintainer of xgboost mentioned:

loading the _profiler.cpython-312-x86_64-linux-gnu.so inside google profiler extension causes the error

dmlc/xgboost#10224 (comment)

This is why I've opened an issue here.
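To confirm that the extension module named by the maintainer is loaded on import alone, the shared objects mapped into the process can be listed. This is a rough sketch that assumes Linux (it reads /proc/self/maps and returns an empty list elsewhere); loaded_shared_objects is a hypothetical helper name, and the "_profiler" filter is taken from the file name quoted above.

```python
import os


def loaded_shared_objects(needle=""):
    """Return sorted paths of mapped .so files whose path contains `needle`.

    Linux only: reads /proc/self/maps. On other platforms returns [].
    NOTE: hypothetical helper, written for this report only.
    """
    maps_path = "/proc/self/maps"
    if not os.path.exists(maps_path):
        return []
    paths = set()
    with open(maps_path) as fh:
        for line in fh:
            # Fields: address perms offset dev inode [pathname]
            parts = line.split(None, 5)
            if len(parts) == 6:
                path = parts[5].strip()
                if ".so" in path and needle in path:
                    paths.add(path)
    return sorted(paths)


if __name__ == "__main__":
    try:
        import googlecloudprofiler  # noqa: F401  (import only, start() is not called)
    except ImportError:
        print("googlecloudprofiler is not installed")
    print(loaded_shared_objects("_profiler"))
```

If the printed list contains the _profiler extension right after the import, the library is loaded before googlecloudprofiler.start is ever called, which would be consistent with the import alone triggering the error.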
