Is there a way to run independent tasks in parallel #626

Open
theHamsta opened this issue May 20, 2020 · 52 comments

Comments

@theHamsta

I wanted to ask whether we could extend just so that it runs independent dependencies in parallel, similar to make. Or is this feature hidden somewhere?

@casey
Owner

casey commented May 20, 2020

It isn't possible at the moment, but I think this would be a cool feature. I think the best way to implement it would be to add annotations, and then define an annotation that makes a recipe run in the background.

@roblav96

@casey I'm absolutely in love with using a justfile, coming from a package.json npm scripts background.

Almost every npm script I write uses npm-run-all --parallel.

Keep up the great work friend! Cheers 🍻

@casey
Owner

casey commented Jul 16, 2020

Thanks Robert, I appreciate the kind words!

I'm glad to learn about npm-run-all --parallel, and agree that this would be a very worthwhile feature.

I think that there are a few features that are languishing, awaiting annotations. I've been dragging my heels on adding annotations, but since there are a bunch of worthy features that need them, hopefully I'll get around to it sooner rather than later.

@roblav96

No worries, nothing but time around here. lol

I ended up replicating somewhat of the same workflow in combination with Nukesor/pueue 😂

@casey
Owner

casey commented Jul 16, 2020

Definitely less optimal than having it built into Just, but glad you found something that works! I'll have to check out pueue, it looks dope.

@theHamsta
Author

My solution is to have a just command that's invoking make that's invoking just. 😄 It works. Dunno whether it should work.

@casey
Owner

casey commented Jul 16, 2020

> My solution is to have a just command that's invoking make that's invoking just. 😄 It works. Dunno whether it should work.

That sounds like a highly reasonable solution :)

@casey
Owner

casey commented Sep 5, 2020

I think I misunderstood the motivation behind this feature, and thus how it might be implemented.

Is the desire to run some recipes in a justfile in parallel, or all recipes in parallel? Let's call the former selective parallelism, and the latter universal parallelism.

I was thinking that people wanted selective parallelism, so perhaps you could annotate a recipe to say that its dependencies should run in parallel. So, for example, to run a, b, and c in parallel when running foo:

#[parallel]
foo: a b c

But I think people actually want universal parallelism: they want to run all recipes in a justfile in parallel, or at least be able to pass a flag saying that all recipes should run in parallel.

I think this latter behavior is probably more useful, since it requires fewer annotations and thought on the part of users, and since parallelism could be selectively limited through the use of dependency constraints.

Currently, Just runs dependencies of a recipe in-order (barring dependencies between those dependencies), which users might have come to rely on.

There are a few possibilities:

  1. A cli flag that lets you run recipes in parallel, like --parallel. This has the downside that if a justfile relies on dependency ordering, it will break when run with the flag.

  2. A setting that says "this justfile expects its recipes to run in parallel", maybe parallel := true, that enables universal parallelism.

Does this seem like a good summary of what people want?

@mortoray

mortoray commented Sep 5, 2020

For my use case, from #676 I want selective parallelism, and I'd prefer to have this specified in a target. Command line parameters remove some of the usefulness of "just" remembering what I want.

@casey
Owner

casey commented Sep 5, 2020

> For my use case, from #676 I want selective parallelism, and I'd prefer to have this specified in a target. Command line parameters remove some of the usefulness of "just" remembering what I want.

Can you elaborate on why universal parallelism is undesirable, or why it would break the justfile?

@mortoray

mortoray commented Sep 5, 2020

> Can you elaborate on why universal parallelism is undesirable, or why it would break the justfile?

I collect many different types of tasks in my justfile, these have different running requirements.

  • Watching build tasks, several that watch files and rebuild things as they change (parallel)
  • Packaging tasks, these need to be stepwise, since one can depend on the previous (sequence)
  • One-off commands, these have no real ordering, and will be invoked one at a time as needed

That said, I guess it depends on how the parallelism is specified. My sequential tasks don't use dependencies, they instead do recursive invocation of "just".

Thinking about that, perhaps a --parallel flag would be okay for my case, since I would have a target that invokes "just" with that flag, specifying the targets I want.

That is, I think the details of how this is implemented will decide whether it's an issue or not.

@casey
Owner

casey commented Sep 5, 2020

The question that I'm most curious about is whether or not people are depending on the fact that dependencies of a recipe without interdependencies run in order. I'm guessing that they don't, and most people use explicit dependencies to order recipes that cannot run in parallel.

The reason I'm interested in that is because if people don't rely on this implicit dependency ordering, then command line flags or config options make a lot more sense, since justfiles wouldn't be likely to break if they suddenly ran in parallel.

I definitely agree that command-line flags are less convenient. I think a command-line flag would be good to start with, just as a simple way to prototype the feature.

@roblav96

roblav96 commented Sep 6, 2020

Problem

Concurrently run multiple commands in parallel via one single command definition.

Example

Using npm-run-all --parallel in my package.json:

"scripts": {
	"watch": "del dist; npm-run-all --silent --parallel watch:*",
	"watch:nodemon": "wait-for-change dist/index.js && delay 0.1 && nodemon dist/index.js",
	"watch:tsc": "tsc --watch --preserveWatchOutput"
},

@jrop

jrop commented Feb 22, 2021

For me, parallelized tasks would help speed up my build. Here is a simple case, but in my real-world use-case, I have around 15 modules, forming a complex dependency graph:

a:
  #!/usr/bin/env bash
  cd a
  ./build.sh
b:
  #!/usr/bin/env bash
  cd b
  ./build.sh
c: a b
  #!/usr/bin/env bash
  cd c
  ./build.sh

In this case, I want the build to happen like:

a     b
|     |
 \   /
   v
   c

If a and b build at the same time, that will speed up the build.
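
Until just can schedule this itself, the diamond above can be approximated in a shell. This is only a minimal sketch: `build` is a hypothetical stand-in for each module's `./build.sh`, and a real recipe would probably call `just a`, `just b`, and so on in its place.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for "cd <module> && ./build.sh".
build() { sleep 0.2; echo "built $1"; }

run_graph() {
  local status=0
  build a & local pid_a=$!
  build b & local pid_b=$!
  # Wait for both prerequisites and remember whether either one failed.
  wait "$pid_a" || status=1
  wait "$pid_b" || status=1
  [ "$status" -eq 0 ] || return 1
  build c   # runs only after a and b have both succeeded
}

run_graph
```

The order of the first two output lines is not deterministic, but the line for c always comes last. With 15 modules this hand-written bookkeeping gets unwieldy quickly, which is the argument for having the task runner do it.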

@mbodmer
Contributor

mbodmer commented Nov 2, 2021

I agree with @jrop's use case, but I also have e.g. an all recipe, which depends on configure, build, packaging, and deploy recipes. Here I need the sequence in order.
But when I deploy to multiple hosts, the deploy recipe for each host could run in parallel.

@hartmannr76

Just a thought, wouldn't something like Make's -j flag fit for this? https://www.gnu.org/software/make/manual/make.html#Parallel

That seems to be the way to support it that would be consistent with the idea behind the project.

@jrop

jrop commented Nov 4, 2021

The trick seems to be that sometimes, some will want the tasks to run in series, and sometimes in parallel. It seems to me that some extra syntax would need to be defined for dependencies. Say:

a:
  ..
b:
  ..

c: a > b # where `>` means series
# or
c: a | b # where `|` means parallel
# and if there were "groupings":
c: (a > b) | u | v

I'm not proposing this as the final syntax, but something like this would be useful in a task runner.

@madig

madig commented Nov 13, 2021

> I was thinking that people wanted selective parallelism, so perhaps you could annotate a recipe to say that its dependencies should run in parallel. So, for example, to run a, b, and c in parallel when running foo:

This is something I'd like to see. I have four recipes that use the same sources but do slightly different things and are independent from one another, so something like foo: a b c d launching the tasks in parallel would be nice.

@saskenuba

Of course, it is not an ideal solution but works fine for tasks that don't end until manual termination, such as spinning servers.

In my justfile below, running default opens another terminal with a task that doesn't end and, in parallel, runs my development server.
Perhaps this helps someone 😄

set dotenv-load

default: run-meilisearch watch-jq

run-meilisearch:
	setsid alacritty --working-directory=. -e docker run -it -p 7700:7700 -e "MEILI_MASTER_KEY=$MEILISEARCH_MASTER_KEY" -v data-ms:/app/.data-ms  getmeili/meilisearch:v0.26.0rc0 &

watch:
	~/.cargo/bin/systemfd --no-pid -s http::5001 -- cargo watch -x run -q | jq

watch-jq:
	@echo Waiting 5 seconds to ensure meilisearch starts
	sleep 5 && setsid alacritty --working-directory=. -e just watch &

release-jq:
	~/.cargo/bin/systemfd --no-pid -s http::5001 -- cargo run --release | jq

@runeimp

runeimp commented Feb 25, 2022

Couldn't you just do...

a:
	# Long running process

b:
	# Long running process


parallel-sh:
	just a & # runs in the background so errors (probably?) get ignored by this recipe
	just b # runs in the foreground and treated normally regarding errors


parallel-cmd:
	start /b just a
	just b

...in most cases?

My-Machine:best-project-ever account$ just parallel-sh

-or-

C:\Users\account\Projects\Best-Ever> just parallel-cmd

Or is more consistent error handling necessary?
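
On the error-handling question: in a POSIX-style shell the status of a backgrounded command is not lost, it is simply never checked unless something waits for it. A rough sketch of that, where `task_a` and `task_b` are hypothetical stand-ins for `just a` and `just b`. Since just runs each line of a plain recipe in a fresh shell, this would need to live in a single shell invocation, for example a shebang recipe.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for "just a" and "just b".
task_a() { sleep 0.1; return 3; }   # simulated failure in the background task
task_b() { sleep 0.2; return 0; }

run_both() {
  local rc_a=0 rc_b=0
  task_a & local pid_a=$!
  task_b || rc_b=$?          # foreground task, checked as usual
  wait "$pid_a" || rc_a=$?   # collect the background task's exit status
  echo "a=$rc_a b=$rc_b"
  [ "$rc_a" -eq 0 ] && [ "$rc_b" -eq 0 ]
}

run_both || echo "at least one task failed"
```

This covers Unix-like shells only; the `start /b` variant for CMD would need its own mechanism.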

@wearpants

To add to the use cases, I have daydreams about using just to replace Apache Airflow (a data engineering orchestration tool).

@k3d3

k3d3 commented Oct 9, 2022

Adding in my use case for something like this, I have two commands: one to run a vite (JS) dev server, and one command to run a cargo backend web server.

I'd like one command that runs both at the same time, and kills them both when I hit Ctrl+C.

@whyboris

It seems like this is already possible. My justfile:

pdf:
  hugo serve -D & sleep 5 && cd pdf && npm start

Starts my Hugo server, and at the same time waits 5 seconds, changes directory, and runs npm start

The secret is & which seems to run things in parallel 🤔

@k3d3

k3d3 commented Jan 12, 2023

Good to know! I see a lot of people using & to delegate the task to the shell.

Now, when you run just pdf, does it also stop the hugo server when you Ctrl+C the command?

@runeimp

runeimp commented Jan 14, 2023

Just want to point out that the & thing only works on Unix type shells (Linux, macOS, etc.). On Windows systems this does not work in PowerShell or CMD (Command Prompt).

@xavierzwirtz

Something similar to the docker compose UX would be nice. Some of the projects I use rely on docker compose simply as a background task runner because the UX is good. Being able to run just up -d and launch native processes in the background instead would be fantastic.

@timdp

timdp commented Jan 25, 2023

When I first adopted just, I expected dependencies to run in parallel. It bums me out that this is so difficult to achieve.

For the sequential case,

parent: child1 child2
  stuff

is merely syntactic sugar for:

parent:
  just child1
  just child2
  stuff

which really doesn't add all that much. Conversely, getting child1 and child2 to run in parallel involves introducing additional tooling and less readable configuration files. This is strange to me.

Hence, I would argue that just can make a way bigger difference by enabling parallel execution than by solving an already solved problem. There's a big incentive to add it—even for the base case of running all dependencies in parallel, because that alone would already unlock composition of more complex flows.

I also want to add that in the JS ecosystem, a long time ago, task runners like Grunt and Gulp struggled with basically the same challenge.

@huyz

huyz commented Feb 2, 2023

Has anyone taken a look at Taskfile?

It both runs dependencies in parallel and supports a command line flag to run the specified tasks in parallel:
https://taskfile.dev/usage/#task-dependencies

@casey
Owner

casey commented Feb 7, 2023

@timdp Definitely agree this is important and one of the biggest missing features! I actually took a crack at this, but ran into weird lifetime/sync/send issues, and the code was really ugly, so I tabled it, but if someone else wants to take a shot, they definitely could. I created a project that does NFTs on Bitcoin called Ordinals, and it's popping off, so my review bandwidth is extremely limited, just a heads up.

@syphar

syphar commented Mar 4, 2023

I created a draft PR to implement this feature, following the pattern from Taskfile (parallel execution of dependencies, parallel task execution when given multiple tasks on the command line).

More work to be done, but depends on answers by maintainers.

@ravenclaw900

ravenclaw900 commented Mar 11, 2023

You can get it to run in parallel and stop all processes at the end fairly easily, assuming you're using bash:

dev:
  #!/bin/bash -eux
  cmd1 &
  cmd2 &
  trap 'kill $(jobs -pr)' EXIT
  wait

The wait is necessary to prevent it from just ending after starting both processes. However, when Ctrl+Cing Just, it will force exit the script, stopping both processes.
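
A variation on the same idea, for the case where the whole group should stop as soon as any one process exits. This is only a sketch: it assumes bash 4.3 or newer for `wait -n`, and the two background commands are placeholders for `cmd1` and `cmd2`.

```shell
#!/usr/bin/env bash
# The function body is a subshell, so the EXIT trap fires when it finishes.
run_pair() (
  # Install the trap first so the jobs are cleaned up however we exit.
  trap 'kill $(jobs -pr) 2>/dev/null' EXIT
  sleep 30 &                 # placeholder for a long-running cmd1
  (sleep 0.2; exit 7) &      # placeholder for a cmd2 that fails early
  rc=0
  wait -n || rc=$?           # returns as soon as the first job exits
  echo "first exit status: $rc"
)

run_pair
```

Setting the trap before starting the jobs also avoids a window in which an early failure would leave a process behind.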

@timdp

timdp commented Mar 11, 2023

Yeah, but then you might as well create a scripts folder and do everything in pure Bash. That's what I'm trying to avoid, personally. Just has a real opportunity to improve the experience.

@syphar

syphar commented Mar 11, 2023

btw, while I don't have any answer from any maintainer yet, #1562 already works for the things I wanted to work.

Sadly cargo install from git doesn't work from this branch,
but in any case I would highly appreciate more people testing what I did, and getting feedback.

These things would work with my PR:

1. Run the given recipes on the command line in parallel:

$ just --parallel recipe_1 recipe_2 recipe_3
[...]

2. Using the [parallel] attribute, task dependencies are allowed to run in parallel:

recipe_1:
  sleep 1
recipe_2:
  sleep 2
[parallel]
foo: recipe_1 recipe_2
  echo hello

Locally I'm using both ways already.

@iovis

iovis commented Jun 12, 2023

One workaround working for me if you use tmux is to make it launch in different windows. That way you can also monitor separately:

full:
    tmux new-window 'just server'
    tmux new-window 'just worker'

@srid

srid commented Jul 26, 2023

I think what we want is Procfile-like support in justfile, so we don't have to use yet another tool like honcho for it. @syphar Does your PR interleave process output like these Procfile runners do? Does it work for long-running processes?

@syphar

syphar commented Jul 26, 2023

> I think what we want is a Procfile like support in justfile, so we don't have to use yet another tool like honcho for it. @syphar Does your PR interleave process output like these Procfile runners do? Does it work for long-running processes?

Now that is a PR I didn't think about for a long time ;)

From what I remember, it works for long-running processes, and does interleave the output.

A major difference to heroku local -f or probably honcho is that the output isn't prefixed with the process / task, which could be added at a future point in time.

@srid

srid commented Jul 26, 2023

> From what I remember, it works for long-running processes, and does interleave the output.

Nice.

> the output isn't prefixed with the process / task

This would 'seal the deal' and distinguish just greatly as an alternative to all those Procfile-based runners. Looking forward to it! (I'd implement myself if only I had the time for it ...)
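
For reference, the prefixing that Procfile runners do can be imitated in a shell today. A minimal sketch, not taken from the PR: `prefix` is a hypothetical helper, and the two `printf` commands are placeholders for long-running recipes.

```shell
#!/usr/bin/env bash
# prefix NAME CMD [ARG...]: run CMD and tag each output line with [NAME].
prefix() {
  local name=$1 line
  shift
  "$@" 2>&1 | while IFS= read -r line || [ -n "$line" ]; do
    printf '[%s] %s\n' "$name" "$line"
  done
}

# Two placeholder "processes" running side by side; line order across
# tasks is not deterministic, but every line stays attributable.
prefix web    printf 'listening\nready\n' &
prefix worker printf 'polling\n' &
wait
```

One limitation of this sketch is that `prefix` returns the status of the read loop rather than of the wrapped command, so failures would have to be reported separately.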

@Ekleog

Ekleog commented Jan 17, 2024

I'll add one tidbit around this: it'd be awesome if just used the jobserver crate to implement the make jobserver for downstream programs. In particular, it would make just able to parallelize basically all invocations of cargo to exactly the number of cores of the machine, rather than exploding parallelism and spawning more rustc processes than cores :)

@gsemet

gsemet commented Mar 28, 2024

I love the '[parallel]' idea to declare tasks that can be parallelized safely. For instance in CI, I usually perform all checks in parallel (make checks -j4), but using this special syntax for just, I would do something like just parallel-checks. This would allow me to "continue" with other non-parallelizable tasks, for instance.

Something like this chain in a just file could be feasible

stylechecks: style checks

[parallel]
checks: bandit pylint ....

A new parameter --parallel N would still be required to set how many workers are started.

My main problem with make -j is "identifying who really fails in case of error".
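
A worker limit like the proposed `--parallel N` can be approximated with xargs in the meantime. This is a sketch under assumptions: `run_limited` is a hypothetical helper, and `echo` stands in for `just` so that the example runs anywhere.

```shell
#!/usr/bin/env bash
# run_limited N CMD ARG...: run "CMD ARG" once per ARG, at most N at a time.
# Uses xargs -0 and -P, which GNU and BSD xargs provide but POSIX does not.
run_limited() {
  local n=$1 cmd=$2
  shift 2
  printf '%s\0' "$@" | xargs -0 -n 1 -P "$n" "$cmd"
}

# Placeholder: "echo" stands in for "just"; output order may vary.
run_limited 2 echo bandit pylint mypy
```

The exit status only says that some invocation failed, not which one, so this does not help with the "who really fails" problem either.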

@hauleth

hauleth commented Apr 10, 2024

Instead, I would prefer that each task is independent by default, so I do not need to write anything extra to run tasks in parallel, with an option to mark that a task conflicts with another. With a [parallel] attribute, does it mean that just this task is parallelizable, or that all dependencies of this task are parallelizable? What if a task is parallelizable, but only when it does not run together with some other task? Does [parallelizable] mean that this task can run in parallel with others, or that this task's dependencies will be run in parallel?

@chaoky

chaoky commented Apr 25, 2024

I've been using just with concurrently in the meantime, it's pretty good.

@hauleth

hauleth commented Apr 25, 2024

Unfortunately, concurrently does not fully resolve the problem, as it runs only top-level tasks in parallel. That will fail if two tasks have a common dependency that cannot be serialised.

@W1M0R

W1M0R commented Jun 27, 2024

Another option might be to introduce additional syntax for dependencies:

# dependencies executed sequentially
tasks: task1 task2 task3

# task1 executes, then task body and then task2 and task 3 (already implemented)
# https://just.systems/man/en/chapter_42.html?highlight=middl#running-recipes-at-the-end-of-a-recipe
tasks: task1 && task2 task3

# execute task1 and task2 in parallel, and when task2 finishes continue with task3
tasks: task1 & task2 task3

# execute task1 and task2 and task3 in parallel
tasks: task1 & task2 & task3

So occurrences of & indicate parallel tasks, similar to the syntax for background jobs in a shell, but with more power (e.g. no zombie processes).

@hauleth

hauleth commented Jun 27, 2024

@W1M0R I do not understand why the dependent should define the order of the tasks to be run. This also introduces a problem when, for example, task1 and task2 both depend on task0: will it be run twice or once?

@W1M0R

W1M0R commented Jun 28, 2024

If both depend on task0 then it should run once.

In this example, the author of the tasks recipe knows that the individual tasks can be executed in parallel without interfering with each other.

There may be other tasks that shouldn't be run in parallel, i.e. one that deletes a folder and another one that creates that folder. The recipe author should get to decide which tasks it wants to have executed in parallel.

@W1M0R

W1M0R commented Aug 22, 2024

@theHamsta

For long running tasks that need to run in parallel, I call into the following Taskfile.yaml:

version: '3'

interval: 2s

tasks:

  # Also see: https://taskfile.dev/usage/#watch-tasks
  dev-templ: just dev-templ
  dev-astro: just dev-astro
  dev-go: just dev-go

  dev:
    desc: Run the long-running watches in parallel (Just can't do parallel tasks yet)
    deps: [dev-templ, dev-astro, dev-go] 

The justfile:

dev-up: 
  task dev

Running just dev-up will call task dev. The Taskfile calls back into long-running just recipes, running those recipes in parallel, and stopping them with ctrl+c. It would be great if it wasn't necessary to shell out to another task runner (or tool, such as gnu parallel, watchexec, etc) to accomplish this.

@yonas

yonas commented Aug 22, 2024

@W1M0R I also like using Goreman for this. Hopefully this will be possible in just soon.

@W1M0R

W1M0R commented Aug 22, 2024

Thanks for the tip @yonas

@ostrolucky

I'm not sure whether I'm veering too far off-topic, but since this is still in a conceptual phase, I want to say I never liked how these concurrent task runners display progress by mixing the output of all the tasks into one garbled stream. I much prefer a pane split where each task's output is appended to an individual panel, like this: https://github.com/jamespan/tmux-parallel/tree/master

@runeimp

runeimp commented Dec 13, 2024

@ostrolucky that might be fine for 2 – 8 tasks if we rely on the presence of Tmux. But even in that case what if there are 200 concurrent tasks? Without resorting to something like an external tool that can tie into the output muxing within Just it's going to be a complete mess anyway. I don't see a lot of good options beyond said external type of tool.

@casey added the coveted label Dec 21, 2024

@hmvp

hmvp commented Dec 23, 2024

My two cents: I currently use make to run different tools in a number of subdirectories, and I like that I can add -j4 to run them in parallel.

My top level makefile looks like this:

# Register all subdirectories in the project's root directory.
SUBDIRS := $(dir $(wildcard */Makefile))

all lint mypy test: $(SUBDIRS)
.PHONY: all lint mypy test

# Recurse `make` into each subdirectory
# Pass along targets specified at command-line (if any).
$(SUBDIRS):
	$(MAKE) -C $@ $(MAKECMDGOALS)

The subdirs all have all lint mypy test as recipes. And all is defined as all: lint mypy test.
With make this would run all tools sequentially when not using the -j option.
With -jX this happens to run all directories in parallel, but within each directory the tools still run sequentially if the number of directories equals the number of parallel threads. This is because make honors the order of the dependencies as the order of execution.

So the output would be something like:

dir1 lint
dir2 lint
dir2 mypy
dir1 mypy
dir2 test
dir1 test

I would like to be able to replicate this in Just... And it would be nice if the number of parallel threads is independent from the number of subdirs or recipes...

TL;DR: I would want to set the parallelism globally with a CLI parameter --parallel, and I expect the order of the dependencies to matter. I don't expect it to run all recipes, just the ones given on the command line.
I expect the output to be a mess, but it would be nice if that was not the case...

@fzyzcjy

fzyzcjy commented Dec 30, 2024

Hi, are there any updates? Thanks!
