20 changes: 7 additions & 13 deletions lightllm/server/api_start.py
@@ -337,13 +337,11 @@ def normal_or_p_d_start(args):
],
)

# Start gunicorn
# Start Hypercorn
command = [
"gunicorn",
"hypercorn",
"--workers",
f"{args.httpserver_workers}",
"--worker-class",
"uvicorn.workers.UvicornWorker",
"--bind",
f"{args.host}:{args.port}",
"--log-level",
@@ -353,7 +351,7 @@ def normal_or_p_d_start(args):
"--error-logfile",
"-",
"lightllm.server.api_http:app",
"--timeout",
"--read-timeout",
f"{get_lightllm_gunicorn_time_out_seconds()}",
Comment on lines +354 to 355
high

There are two potential issues with this timeout configuration change:

  1. Behavioral Change: Hypercorn's --read-timeout is not a direct replacement for Gunicorn's --timeout.

    • Gunicorn's --timeout is a worker inactivity timeout. It kills and restarts a worker if it's silent for too long, which helps prevent workers from getting stuck on long-running requests.
    • Hypercorn's --read-timeout only limits the time spent reading the request headers and body. It does not apply to the time spent processing the request in your application code.
      This means that long-running inference tasks will no longer be timed out, which could leave workers occupied indefinitely. Please confirm whether this is the intended behavior. If you need to time out long-running requests, you may need to implement a timeout mechanism within your FastAPI application logic.
  2. Misleading Naming: The function get_lightllm_gunicorn_time_out_seconds() and its underlying environment variable LIGHTLMM_GUNICORN_TIME_OUT are now misleading since Gunicorn is no longer used. For better maintainability, these should be renamed to reflect their new purpose with Hypercorn, for example, get_lightllm_hypercorn_read_timeout_seconds() and LIGHTLLM_HYPERCORN_READ_TIMEOUT.
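For the first point, an application-level timeout could look roughly like the sketch below. It uses only `asyncio.wait_for`; the names `run_with_timeout`, `RequestTimedOut`, `fake_inference`, and the 180-second default are hypothetical and not taken from the lightllm code base.

```python
import asyncio

# Hypothetical default; the real value would come from the existing timeout setting.
REQUEST_TIMEOUT_SECONDS = 180.0


class RequestTimedOut(Exception):
    """Raised when a handler exceeds its processing budget."""


async def run_with_timeout(coro, timeout=REQUEST_TIMEOUT_SECONDS):
    """Await `coro`, cancelling it if it runs longer than `timeout` seconds."""
    try:
        return await asyncio.wait_for(coro, timeout=timeout)
    except asyncio.TimeoutError:
        # wait_for has already cancelled the inner task at this point.
        raise RequestTimedOut(f"request exceeded {timeout} seconds") from None


async def fake_inference(delay):
    # Stand-in for a long-running generation call (assumption for illustration).
    await asyncio.sleep(delay)
    return "done"
```

A route handler would wrap its inference call in `run_with_timeout(...)` and translate `RequestTimedOut` into an error response. Streaming responses would probably need a different approach, since the handler returns before generation finishes.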

This same issue applies to the other places where gunicorn was replaced (lines 421-422 and 461-462).
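For the rename in the second point, one option is to read the new variable first and fall back to the legacy one, so existing deployments keep working. This is only a sketch: the new variable name follows the suggestion above, while the 180-second default and the `environ` parameter are assumptions added for illustration.

```python
import os
import warnings

# The new name and the default value are assumptions for illustration.
NEW_ENV = "LIGHTLLM_HYPERCORN_READ_TIMEOUT"
OLD_ENV = "LIGHTLMM_GUNICORN_TIME_OUT"  # legacy name as quoted in this review
DEFAULT_SECONDS = 180


def get_lightllm_hypercorn_read_timeout_seconds(environ=None):
    """Return the read timeout, preferring the new variable over the legacy one."""
    env = os.environ if environ is None else environ
    if NEW_ENV in env:
        return int(env[NEW_ENV])
    if OLD_ENV in env:
        # Keep honoring the old variable, but tell operators to migrate.
        warnings.warn(
            f"{OLD_ENV} is deprecated; use {NEW_ENV}",
            DeprecationWarning,
            stacklevel=2,
        )
        return int(env[OLD_ENV])
    return DEFAULT_SECONDS
```

All three call sites could then share this helper, and the fallback branch could be removed once the legacy variable is retired.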

"--keep-alive",
f"{get_lightllm_gunicorn_keep_alive()}",
@@ -407,11 +405,9 @@ def pd_master_start(args):
)

command = [
"gunicorn",
"hypercorn",
"--workers",
"1",
"--worker-class",
"uvicorn.workers.UvicornWorker",
"--bind",
f"{args.host}:{args.port}",
"--log-level",
@@ -422,7 +418,7 @@ def pd_master_start(args):
"-",
"--preload",
"lightllm.server.api_http:app",
"--timeout",
"--read-timeout",
f"{get_lightllm_gunicorn_time_out_seconds()}",
"--keep-alive",
f"{get_lightllm_gunicorn_keep_alive()}",
@@ -449,11 +445,9 @@ def config_server_start(args):
set_env_start_args(args)

command = [
"gunicorn",
"hypercorn",
"--workers",
"1",
"--worker-class",
"uvicorn.workers.UvicornWorker",
"--bind",
f"{args.config_server_host}:{args.config_server_port}",
"--log-level",
@@ -464,7 +458,7 @@ def config_server_start(args):
"-",
"--preload",
"lightllm.server.config_server.api_http:app",
"--timeout",
"--read-timeout",
f"{get_lightllm_gunicorn_time_out_seconds()}",
"--keep-alive",
f"{get_lightllm_gunicorn_keep_alive()}",
6 changes: 5 additions & 1 deletion lightllm/server/req_id_generator.py
@@ -125,10 +125,14 @@ def _find_sibling_processes():

# Find sibling processes
sibling_processes = []
for proc in psutil.process_iter(["pid", "name"]):
for proc in psutil.process_iter(["pid", "name", "cmdline"]):
try:
# Check whether this is a sibling process (same parent process, and not the current process)
if proc.pid != current_pid and proc.ppid() == parent_process.pid:
# Filter out the multiprocessing.resource_tracker process
cmdline = proc.cmdline()
if cmdline and "multiprocessing.resource_tracker" in " ".join(cmdline):
continue
sibling_processes.append(proc)
except (psutil.NoSuchProcess, psutil.AccessDenied):
continue
2 changes: 1 addition & 1 deletion requirements.txt
@@ -79,7 +79,7 @@ ujson==5.10.0
frozendict==2.4.6
atomics==1.0.3
easydict==1.13
gunicorn==23.0.0
hypercorn==0.18.0
flashinfer-python==0.2.4
sgl-kernel==0.3.7.post1
httpx==0.28.1