Add image_math tool #107

Merged
merged 5 commits into from
Mar 9, 2024
8 changes: 8 additions & 0 deletions tools/image_math/.shed.yml
@@ -0,0 +1,8 @@
categories:
- Imaging
description: Process images using arithmetic expressions
long_description: Process images using arithmetic expressions.
name: image_math
owner: imgteam
homepage_url: https://github.com/bmcv
remote_repository_url: https://github.com/BMCV/galaxy-image-analysis/tree/master/tools/image_math
90 changes: 90 additions & 0 deletions tools/image_math/image_math.py
@@ -0,0 +1,90 @@
import argparse
import ast
import operator

import numpy as np
import skimage.io


supported_operators = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.FloorDiv: operator.floordiv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}


supported_functions = {
    'sqrt': np.sqrt,
    'abs': abs,
}


def eval_ast_node(node, inputs):
    """
    Evaluates a node of the syntax tree.
    """

    # Numeric constants evaluate to numeric values.
    if isinstance(node, ast.Constant):
        assert type(node.value) in (int, float)
        return node.value

    # Variables are looked up from the inputs and resolved.
    if isinstance(node, ast.Name):
        assert node.id in inputs.keys()
        return inputs[node.id]

    # Binary operators are evaluated based on the `supported_operators` dictionary.
    if isinstance(node, ast.BinOp):
        assert type(node.op) in supported_operators.keys(), node.op
        op = supported_operators[type(node.op)]
        return op(eval_ast_node(node.left, inputs), eval_ast_node(node.right, inputs))

    # Unary operators are evaluated based on the `supported_operators` dictionary.
    if isinstance(node, ast.UnaryOp):
        assert type(node.op) in supported_operators.keys(), node.op
        op = supported_operators[type(node.op)]
        return op(eval_ast_node(node.operand, inputs))

    # Function calls are evaluated based on the `supported_functions` dictionary.
    if isinstance(node, ast.Call):
        assert len(node.args) == 1 and len(node.keywords) == 0
        assert node.func.id in supported_functions.keys(), node.func.id
        func = supported_functions[node.func.id]
        return func(eval_ast_node(node.args[0], inputs))

    # The node is unsupported and could not be evaluated.
    raise TypeError(f'Unsupported node type: "{node}"')


def eval_expression(expr, inputs):
    return eval_ast_node(ast.parse(expr, mode='eval').body, inputs)


if __name__ == '__main__':

    parser = argparse.ArgumentParser()
    parser.add_argument('--expression', type=str, required=True)
    parser.add_argument('--output', type=str, required=True)
    parser.add_argument('--input', default=list(), action='append', required=True)
    args = parser.parse_args()

    inputs = dict()
    im_shape = None
    for input in args.input:
        name, filepath = input.split(':')
        im = skimage.io.imread(filepath)
        assert name not in inputs, f'Input name "{name}" is ambiguous.'
        inputs[name] = im
        if im_shape is None:
            im_shape = im.shape
        else:
            assert im.shape == im_shape, 'Input images differ in size and/or number of channels.'

    result = eval_expression(args.expression, inputs)

    skimage.io.imsave(args.output, result)
131 changes: 131 additions & 0 deletions tools/image_math/image_math.xml
@@ -0,0 +1,131 @@
<tool id="image_math" name="Process images using arithmetic expressions" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="23.0">
<description>with NumPy</description>
<macros>
<token name="@TOOL_VERSION@">1.26.4</token>
<token name="@VERSION_SUFFIX@">0</token>
</macros>
<edam_operations>
<edam_operation>operation_3443</edam_operation>
</edam_operations>
<requirements>
<requirement type="package" version="1.26.4">numpy</requirement>
<requirement type="package" version="0.22.0">scikit-image</requirement>
</requirements>
<command><![CDATA[

## Inputs

python '$__tool_directory__/image_math.py'
--expression='$expression'
#for $item in $inputs:
--input='$item.name:$item.image'
#end for

## Outputs

--output='./result.tiff'

]]>
</command>
<inputs>
<param argument="--expression" type="text" label="Expression" optional="false">
Collaborator

I'm a little bit worried about the security of this tool.

Can you please look at this tool: https://github.com/galaxyproject/tools-iuc/blob/main/tools/column_maker/column_maker.xml#L5

and see if you can get some inspiration on how to avoid arbitrary code execution?

For example, we could also allow just a few operations (abs/sqr etc.) and add more if there is a real need.
The linked tool above also lists what is allowed in its help, so people know what they can use.

Am I being overly careful here?

Member Author

@kostrykin Mar 9, 2024

I understand your concerns; I was also a bit concerned at first. However, there are also Jupyter notebooks running on Galaxy, which permit arbitrary code execution too, right? Or are these apples and oranges somehow?

Collaborator

Apples and oranges: Jupyter runs only in Docker and is locked down. We also do not allow running Jupyter 100 times with a lot of resources, etc.

Member Author

Ok, makes sense. Then I will look for some other solution which avoids eval.
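
To make the concern concrete, here is a minimal sketch of why passing a user-supplied string to `eval()` is risky. It is not the purged implementation (which is not shown in this thread); the `user_expr` string and all variable names are made up for illustration. The point is only that `eval()` runs whatever it is given, while `ast.parse()` merely builds a syntax tree that can be inspected before anything is executed.

```python
import ast

# Hypothetical user input: not arithmetic at all, but eval() would run it anyway.
user_expr = "__import__('os').getcwd()"

# eval() executes the string as Python code with full interpreter access.
# Here it merely reads the working directory, but it could do anything.
leaked = eval(user_expr)
print(type(leaked).__name__)

# ast.parse() only parses; nothing is executed. The resulting tree can be
# inspected and rejected before any evaluation takes place.
tree = ast.parse(user_expr, mode='eval')
is_call = isinstance(tree.body, ast.Call)
# The callee is an attribute access (….getcwd), not a plain function name,
# so a checker that accepts only plain, known names can refuse it.
is_plain_name = is_call and isinstance(tree.body.func, ast.Name)
print(is_call, is_plain_name)
```

A validator on the form field helps, but inspecting the parsed tree is what allows the tool itself to refuse such input.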

Member Author

@kostrykin Mar 9, 2024

I have purged the eval-based implementation and replaced it with a custom implementation using abstract syntax trees in 080ffe7. I think this should be safe, because it permits only a small set of operators, only integer and float literals, and only variables passed in via --input.

And the only allowed function calls are abs and sqrt. I have updated the help section in 01eaf51. The help section now lists all supported operators and functions:

  • Addition, subtraction, multiplication, and division (+, -, *, /)
  • Integer division (e.g., input // 2)
  • Power (e.g., input ** 2)
  • Negation (e.g., -input)
  • Absolute values (e.g., abs(input))
  • Square root (e.g., sqrt(input))
  • Combinations of the above (also using parentheses)

If you prefer that, I think we could leave out Integer division, Power, Absolute values, and Square root for the moment, but I don't think that including these is an issue.
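
The whitelist idea described above can be sketched in a few lines. This is a simplified stand-in, not the code from `image_math.py`: it works on plain numbers instead of NumPy arrays, uses `math.sqrt`, raises `ValueError` instead of asserting, and the names `OPERATORS`, `FUNCTIONS`, `evaluate`, and `safe_eval` are assumptions chosen for illustration.

```python
import ast
import math
import operator

# Whitelists, mirroring the idea (not the exact tables) of image_math.py.
OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}
FUNCTIONS = {'sqrt': math.sqrt, 'abs': abs}


def evaluate(node, variables):
    """Evaluate a parsed expression, rejecting anything not whitelisted."""
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.Name) and node.id in variables:
        return variables[node.id]
    if isinstance(node, ast.BinOp) and type(node.op) in OPERATORS:
        return OPERATORS[type(node.op)](
            evaluate(node.left, variables), evaluate(node.right, variables))
    if isinstance(node, ast.UnaryOp) and type(node.op) in OPERATORS:
        return OPERATORS[type(node.op)](evaluate(node.operand, variables))
    if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
            and node.func.id in FUNCTIONS
            and len(node.args) == 1 and not node.keywords):
        return FUNCTIONS[node.func.id](evaluate(node.args[0], variables))
    # Everything else (attribute access, strings, lambdas, ...) is refused.
    raise ValueError(f'Unsupported expression element: {ast.dump(node)}')


def safe_eval(expr, variables):
    return evaluate(ast.parse(expr, mode='eval').body, variables)


# A permitted arithmetic expression is evaluated normally.
ok = safe_eval('(abs(x) + 1) / 2', {'x': -3})

# A non-arithmetic expression is rejected instead of being executed.
try:
    safe_eval("__import__('os').getcwd()", {})
    rejected = False
except ValueError:
    rejected = True
```

Because every node type must be matched explicitly, anything the whitelist does not mention falls through to the final `raise`, which is what makes adding further functions later a one-line change to the table.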

Collaborator

Please add any functions you like; I was just anxious about arbitrary eval() execution. My abs/sqr was just a stupid example.

Thanks a lot for changing the implementation so fast.

<validator type="regex">^[a-zA-Z0-9-_\*\+ \(\)/]+$</validator>
</param>
<repeat name="inputs" title="Input images" min="1">
<param name="image" type="data" format="png,tiff" label="Image" />
<param name="name" type="text" label="Variable for representation of the image within the expression" optional="false">
<validator type="regex">^[a-zA-Z_][a-zA-Z0-9_]*$</validator>
</param>
</repeat>
</inputs>
<outputs>
<data format="tiff" name="result" from_work_dir="result.tiff" />
</outputs>
<tests>
<!-- Multiplication with a scalar -->
<test>
<param name="expression" value="input1 * 2" />
<repeat name="inputs">
<param name="image" value="input1.tiff" />
<param name="name" value="input1" />
</repeat>
<output name="result" value="input1_times_2.tiff" ftype="tiff" compare="sim_size" delta="0" />
</test>
<!-- Unary negation operator -->
<test>
<param name="expression" value="-input1" />
<repeat name="inputs">
<param name="image" value="input1.tiff" />
<param name="name" value="input1" />
</repeat>
<output name="result" value="minus_input1.tiff" ftype="tiff" compare="sim_size" delta="0" />
</test>
<!-- Binary addition, neutral element, addition with scalar -->
<test>
<param name="expression" value="input1 + input2 + 1" />
<repeat name="inputs">
<param name="image" value="input1.tiff" />
<param name="name" value="input1" />
</repeat>
<repeat name="inputs">
<param name="image" value="minus_input1.tiff" />
<param name="name" value="input2" />
</repeat>
<output name="result" value="ones.tiff" ftype="tiff" compare="sim_size" delta="0" />
</test>
<!-- Parentheses -->
<test>
<param name="expression" value="(input1 + input2) / 2" />
<repeat name="inputs">
<param name="image" value="input1.tiff" />
<param name="name" value="input1" />
</repeat>
<repeat name="inputs">
<param name="image" value="ones.tiff" />
<param name="name" value="input2" />
</repeat>
<output name="result" value="half_of_input1_plus_one.tiff" ftype="tiff" compare="sim_size" delta="0" />
</test>
<!-- Abs -->
<test>
<param name="expression" value="abs(input)" />
<repeat name="inputs">
<param name="image" value="input1.tiff" />
<param name="name" value="input" />
</repeat>
<output name="result" value="input1_abs.tiff" ftype="tiff" compare="sim_size" delta="0" />
</test>
</tests>
<help>

This tool processes images according to pixel-wise arithmetic expressions.

The supported pixel-wise expressions are:

- Addition, subtraction, multiplication, and division (``+``, ``-``, ``*``, ``/``)
- Integer division (e.g., ``input // 2``)
- Power (e.g., ``input ** 2``)
- Negation (e.g., ``-input``)
- Absolute values (e.g., ``abs(input)``)
- Square root (e.g., ``sqrt(input)``)
- Combinations of the above (also using parentheses)

Examples:

- **Negate an image.**
Expression: ``-image``
where ``image`` is an arbitrary input image.

- **Mean of two images.**
Expression: ``(image1 + image2) / 2``
where ``image1`` and ``image2`` are two arbitrary input images.

- **Perform division avoiding division-by-zero.**
Expression: ``image1 / (abs(image2) + 1e-8)``
where ``image1`` and ``image2`` are two arbitrary input images.

</help>
<citations>
<citation type="doi">10.1038/s41586-020-2649-2</citation>
</citations>
</tool>
Binary file not shown.
Binary file added tools/image_math/test-data/input1.tiff
Binary file not shown.
Binary file added tools/image_math/test-data/input1_abs.tiff
Binary file not shown.
Binary file added tools/image_math/test-data/input1_times_2.tiff
Binary file not shown.
Binary file added tools/image_math/test-data/minus_input1.tiff
Binary file not shown.
Binary file added tools/image_math/test-data/ones.tiff
Binary file not shown.