Feature request : Support Pueue as executor #5642

Open
GhisF opened this issue Jan 6, 2025 · 8 comments

Comments

@GhisF

GhisF commented Jan 6, 2025

New feature: Add an executor for Pueue.

Suggestion: Add Pueue as a local executor.

Pueue is a very simple client/server queue manager for long-running tasks. It is simple (single user, runs in user space, no hardware or cluster awareness ...) but feature-rich and dead simple to install. It could be very useful in some scenarios, for example for developing/testing, or for running workflows on a small infrastructure dedicated to instruments, where the advanced features of a batch manager like Slurm are not required.

Regards.

@pditommaso
Member

This actually sounds interesting. What's your use case?

@pditommaso
Member

Let me rephrase the question above: what would be the main advantage of using Pueue as an executor for Nextflow over the standard local executor?

@GhisF
Author

GhisF commented Jan 6, 2025

Pueue is closer to a simplified Slurm than to the local executor. It has the notions of tasks, groups of tasks, task order, and task dependencies, and of how many tasks can run in parallel on the system or in a given group. You can interact with the list of tasks to change their order/priority, and you can also send tasks to a remote node...

There is a features list on the GitHub page of the project.
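
To make the groups and dependencies mentioned above concrete, here is a minimal sketch of how a wrapper could build a `pueue add` command line. The helper name `pueue_add_argv` and the example group name are hypothetical; the sketch only assembles the argument list and does not run anything, and the flags should be checked against the installed Pueue version.

```python
from typing import List, Optional, Sequence


def pueue_add_argv(
    command: Sequence[str],
    group: Optional[str] = None,
    after: Sequence[int] = (),
) -> List[str]:
    """Build the argv for `pueue add` (hypothetical helper, nothing is executed)."""
    # --print-task-id makes pueue print only the new task id,
    # which is convenient when chaining dependencies.
    argv = ["pueue", "add", "--print-task-id"]
    if group is not None:
        # Put the task into a named group (the group must already exist).
        argv += ["--group", group]
    if after:
        # Only start this task once the listed task ids have finished.
        argv += ["--after", *[str(task_id) for task_id in after]]
    # "--" separates pueue's own options from the command to run.
    argv.append("--")
    argv += list(command)
    return argv


if __name__ == "__main__":
    # Assumed example: a task in a "gpu" group that depends on tasks 3 and 4.
    print(pueue_add_argv(["bash", ".command.run"], group="gpu", after=[3, 4]))
```

The resulting list could be passed to `subprocess.run` on a machine where the Pueue daemon is running.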

I am currently moving the pipeline used to process the data from our sequencers to Nextflow. It runs on an infrastructure with heterogeneous hardware (dragen nodes, GPU nodes, CPU nodes) with no users. Pueue is a very convenient tool to manage queues of tasks (and Slurm is definitely overkill, and cannot be installed on some nodes).

The port using the local executor is almost finished, but I had to use helper scripts to do things I would consider part of the workflow:

  • Managing queues. Since there are Slurm and HyperQueue executors, why not Pueue? (This ticket)
  • Sending the tasks to the correct node (dragen tasks to a dragen node, DeepVariant tasks to a GPU node...) and waiting for completion. Perhaps this can be delegated to ssh/pueue if there is no better option?
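
As a rough illustration of the second point, the kind of helper script described here could look like the sketch below. The routing table, the host names (`dragen-node`, `gpu-node`, `cpu-node`) and the function name are all hypothetical; the sketch assumes passwordless ssh and a Pueue daemon on each node, and only builds the command instead of running it.

```python
import shlex
from typing import Dict, List

# Hypothetical routing table: tool name -> host that should run it.
ROUTES: Dict[str, str] = {
    "dragen": "dragen-node",
    "deepvariant": "gpu-node",
}
# Hypothetical fallback host for everything else.
DEFAULT_HOST = "cpu-node"


def remote_submit_argv(tool: str, command: str) -> List[str]:
    """Build an ssh command that queues `command` in Pueue on the right node."""
    host = ROUTES.get(tool, DEFAULT_HOST)
    # Quote the command so the remote shell hands it to pueue as one argument.
    remote = "pueue add -- " + shlex.quote(command)
    return ["ssh", host, remote]


if __name__ == "__main__":
    print(remote_submit_argv("dragen", "bash run_dragen.sh sample1"))
    print(remote_submit_argv("fastqc", "bash run_fastqc.sh sample1"))
```

Waiting for completion is left out on purpose; it would need a second call that polls or blocks on the remote task.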

Thank you.

@bentsherman
Member

The Pueue readme mentions that it is intended only for interactive use:

Even though it can be scripted to some degree, it hasn't been built for this and there's no official support!

So I would rather not implement it as a core executor in Nextflow, but you could certainly do it in a plugin.

@GhisF
Author

GhisF commented Jan 7, 2025

Yes, the Pueue client is a command-line tool that you are supposed to use directly from the terminal (but it works well from scripts).

I looked at the hyperqueue (1) and slurm (2) executor source files, and they interact with the hq or sbatch/squeue command lines. Doing the same with pueue shouldn't be different at all.

If you think writing an executor for Pueue isn't worth it, is it possible to write an executor similar to HyperQueueExecutor.groovy as a module? The documentation mentions only nf scripts or externally interpreted scripts (sh, py...).

1 : modules/nextflow/src/main/groovy/nextflow/executor/HyperQueueExecutor.groovy
2 : modules/nextflow/src/main/groovy/nextflow/executor/SlurmExecutor.groovy
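
The submit-then-poll pattern of those executors can be sketched for Pueue as below. This is written in Python for brevity and is not Nextflow's executor API. It assumes that `pueue add --print-task-id` prints only the task id and that `pueue status --json` returns a top-level `tasks` map; the exact JSON layout of `status` differs between Pueue versions, so the sample document here is a made-up illustration and the parser is deliberately tolerant.

```python
import json
from typing import Dict


def parse_task_id(stdout: str) -> int:
    """Parse the id printed by `pueue add --print-task-id` (assumed: just a number)."""
    return int(stdout.strip())


def task_states(status_json: str) -> Dict[int, str]:
    """Map task id -> coarse state name from `pueue status --json` output.

    Assumption: each task has a "status" field that is either a plain string
    (e.g. "Running") or an object whose single key is the state name
    (e.g. {"Done": ...}). Verify against the Pueue version actually in use.
    """
    doc = json.loads(status_json)
    states: Dict[int, str] = {}
    for key, task in doc.get("tasks", {}).items():
        status = task["status"]
        if isinstance(status, dict):
            # Keep only the state name, drop the nested details.
            status = next(iter(status))
        states[int(key)] = status
    return states


if __name__ == "__main__":
    # Hypothetical sample, not captured from a real daemon.
    sample = '{"tasks": {"0": {"status": "Running"}, "1": {"status": {"Done": "Success"}}}}'
    print(parse_task_id("7\n"))
    print(task_states(sample))
```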

Regards.

@pditommaso
Member

I believe supporting it could be quite straightforward, though I'm still not fully getting the added value (from a Nextflow pipeline execution point of view) compared with the standard local execution.

@bentsherman
Member

@GhisF you could implement a Pueue executor in a plugin as shown here. But I agree with Paolo that I'm not sure what benefit Pueue would provide over the local executor in Nextflow

@GhisF
Author

GhisF commented Jan 8, 2025

I think Pueue adds more powerful and convenient task management than the local executor, such as: groups of tasks, the number of tasks allowed to run in parallel globally or in a specific group of tasks, and the ability to manage the queue (for example, reordering tasks in the queue live).
In addition, the queue is managed at the machine level, so, for example, several Nextflow instances can send tasks requiring exclusive access to a specific piece of hardware (dragen, GPU) without conflict.
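
One way to picture the exclusive-access case: a dedicated Pueue group limited to a single parallel task, shared by every Nextflow instance on the machine. The sketch below only builds the two setup commands; the helper name and the group name `dragen` are hypothetical, and the subcommands should be checked against the installed Pueue version.

```python
from typing import List


def exclusive_group_argvs(group: str, slots: int = 1) -> List[List[str]]:
    """Build the commands that create a group and cap its parallel tasks.

    With slots=1, tasks added to the group run one at a time, no matter
    which client (for example, which Nextflow instance) queued them.
    """
    if slots < 1:
        raise ValueError("slots must be at least 1")
    return [
        # Create the group on the daemon.
        ["pueue", "group", "add", group],
        # Limit how many tasks of that group may run at once.
        ["pueue", "parallel", str(slots), "--group", group],
    ]


if __name__ == "__main__":
    for argv in exclusive_group_argvs("dragen"):
        print(" ".join(argv))
```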

Thank you for pointing me at the correct documentation.


3 participants