Feature Request: Way to kill an inference without killing the server #11173

Open
4 tasks done
paoletto opened this issue Jan 10, 2025 · 4 comments
Labels
enhancement New feature or request

Comments

@paoletto

Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

Inference requests sent with curl occasionally get stuck, either in llama-server or in whisper-server.
Because other requests are queued behind them, killing the server would make every queued request fail.
So I wonder: is there a way to kill one of the ongoing tasks (there could be several if the server is configured for more than one concurrent job) without killing the server itself, so that the queued tasks can move on?

Motivation

Requests occasionally get stuck and block the server

Possible Implementation

An additional API endpoint that takes a task ID?
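
To make the idea concrete, here is a minimal sketch of how such an endpoint could work internally. Everything in it is hypothetical: `TaskRegistry`, `run_inference`, and `step_fn` are illustrative names, not part of llama-server or whisper-server. It assumes the inference loop can check a flag between decode steps.

```python
import threading


class TaskRegistry:
    """Hypothetical registry mapping task ids to cancellation flags."""

    def __init__(self):
        self._lock = threading.Lock()
        self._flags = {}
        self._next_id = 0

    def register(self):
        # Called when a request is accepted; returns the id the client could later cancel.
        with self._lock:
            task_id = self._next_id
            self._next_id += 1
            self._flags[task_id] = threading.Event()
            return task_id

    def cancel(self, task_id):
        # What the handler of a hypothetical cancel endpoint would call.
        with self._lock:
            flag = self._flags.get(task_id)
        if flag is None:
            return False  # unknown or already finished task
        flag.set()
        return True

    def is_cancelled(self, task_id):
        with self._lock:
            flag = self._flags.get(task_id)
        return flag is not None and flag.is_set()

    def finish(self, task_id):
        # Drop the flag once the task ends, cancelled or not.
        with self._lock:
            self._flags.pop(task_id, None)


def run_inference(registry, task_id, max_steps, step_fn):
    """Run up to max_steps steps, checking the cancel flag before each one."""
    steps_done = 0
    try:
        for _ in range(max_steps):
            if registry.is_cancelled(task_id):
                break  # release the slot so queued tasks can proceed
            step_fn()
            steps_done += 1
    finally:
        registry.finish(task_id)
    return steps_done
```

Note that this only helps if the stuck task still reaches the check between steps. A step that hangs internally would not be interrupted by a flag like this.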

@paoletto paoletto added the enhancement New feature or request label Jan 10, 2025
@paoletto
Author

paoletto commented Jan 10, 2025

This could be a duplicate of #6421, although I believe a dedicated way to kill running jobs could have merit on its own.

@ngxson
Collaborator

ngxson commented Jan 10, 2025

I mentioned the same problem a while ago here: #9273

Will revisit it in a few days.

@ngxson
Collaborator

ngxson commented Jan 12, 2025

FYI, this now depends on the upstream library httplib: yhirose/cpp-httplib#2017

As soon as they have a solution, we can resolve this problem.

@paoletto
Author

> FYI, this now depends on the upstream library httplib: yhirose/cpp-httplib#2017
>
> As soon as they have a solution, we can resolve this problem.

Thank you. If I understand the httplib ticket correctly, this would then work by killing the HTTP client, correct?
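
If that reading is right (the thread does not confirm it), the server side might look roughly like the sketch below: the generation loop asks the HTTP layer whether the client is still connected and stops producing tokens once it is gone. The names `stream_until_disconnect` and `is_connection_closed` are assumptions made for illustration, not the actual llama-server or httplib API.

```python
from typing import Callable, Iterable, Iterator


def stream_until_disconnect(
    tokens: Iterable[str],
    is_connection_closed: Callable[[], bool],
) -> Iterator[str]:
    """Yield tokens until the (hypothetical) connection check reports a disconnect."""
    for token in tokens:
        if is_connection_closed():
            # Client went away (e.g. curl was killed): stop generating
            # so the slot is freed for the next queued request.
            return
        yield token
```

With something like this, killing the stuck curl process would be enough to cancel its task, and no separate cancel endpoint would be needed.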
