Merge branch 'dev' into shahrokh_misc
shahrokhDaijavad committed Jan 13, 2025
2 parents 329b859 + 6e8a3f8 commit 8cd75b4
Showing 30 changed files with 235 additions and 178 deletions.
2 changes: 1 addition & 1 deletion examples/kfp-pipelines/superworkflows/ray/kfp_v2/Makefile
@@ -1,4 +1,4 @@
-REPOROOT=${CURDIR}/../../../..
+REPOROOT=${CURDIR}/../../../../..
WORKFLOW_VENV_ACTIVATE=${REPOROOT}/transforms/venv/bin/activate
include $(REPOROOT)/transforms/.make.workflows
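
The `REPOROOT` change above adds one more `..` so that the path climbs out of `examples/` to the repository root. A minimal sketch of that arithmetic, assuming the Makefile directory is exactly the path shown in the diff header (the helper name `resolve` is hypothetical and exists only for this illustration):

```python
import posixpath

# Directory of the Makefile relative to the repository root (taken from the diff header).
MAKEFILE_DIR = "examples/kfp-pipelines/superworkflows/ray/kfp_v2"


def resolve(relative: str) -> str:
    """Normalize a path that is given relative to MAKEFILE_DIR."""
    return posixpath.normpath(posixpath.join(MAKEFILE_DIR, relative))


old_root = resolve("../../../..")     # four levels up stops at examples/
new_root = resolve("../../../../..")  # five levels up reaches the repository root

print(old_root)  # examples
print(new_root)  # .
print(resolve("../../../../../transforms/.make.workflows"))  # transforms/.make.workflows
```

With the old value, the `include` line would look for `examples/transforms/.make.workflows`; with the new value it resolves to `transforms/.make.workflows` at the top level.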

6 changes: 3 additions & 3 deletions examples/kfp-pipelines/superworkflows/ray/kfp_v2/README.md
@@ -1,8 +1,8 @@
# Chaining transforms using KFP V2

-As in [super pipelines of KFP v1](../../../doc/multi_transform_pipeline.md), we want to offer an option of running a series of transforms one after the other on the data. But, in KFP v2 we can make it easier to chain transforms using the [nested pipelines](https://www.kubeflow.org/docs/components/pipelines/user-guides/components/compose-components-into-pipelines/#pipelines-as-components) that KFP v2 offers.
+As in [super pipelines of KFP v1](../../../../../kfp/doc/multi_transform_pipeline.md), we want to offer an option of running a series of transforms one after the other on the data. But, in KFP v2 we can make it easier to chain transforms using the [nested pipelines](https://www.kubeflow.org/docs/components/pipelines/user-guides/components/compose-components-into-pipelines/#pipelines-as-components) that KFP v2 offers.

-One example of chaining `noop` and `document id` transforms can be found [here](superpipeline_noop_docId_v2.py). When running this pipeline it appears as hierarchical graph with two nested pipelines, one for each transform as shown in the following screenshots.
+One example of chaining `noop` and `document id` transforms can be found [here](superpipeline_noop_docId_v2_wf.py). When running this pipeline it appears as hierarchical graph with two nested pipelines, one for each transform as shown in the following screenshots.

`root` Layer
![nested_pipeline](nested_pipeline.png)
@@ -27,6 +27,6 @@ Another useful feature of the KFP v2 is the `Json` editor for the `dict` type in
cd examples/kfp/superworkflows/ray/kfp_v2/
make clean
export KFPv2=1
-export PYTHONPATH=../../../../transforms
+export PYTHONPATH=../../../../../transforms
make workflow-build
```
14 changes: 8 additions & 6 deletions kfp/kfp_ray_components/Dockerfile
@@ -2,6 +2,11 @@ ARG BASE_IMAGE=docker.io/rayproject/ray:2.36.1-py312

FROM ${BASE_IMAGE}

+# see https://docs.openshift.com/container-platform/4.17/openshift_images/create-images.html#use-uid_create-images
+USER root
+RUN chown ray:root /home/ray && chmod 775 /home/ray
+USER ray

# install libraries
COPY requirements.txt requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
@@ -10,13 +15,13 @@ ARG DPK_WHEEL_FILE_NAME

# Copy and install data processing libraries
# These are expected to be placed in the docker context before this is run (see the make image).
-COPY --chown=ray:users data-processing-dist data-processing-dist
+COPY --chmod=775 --chown=ray:root data-processing-dist data-processing-dist
RUN pip install data-processing-dist/${DPK_WHEEL_FILE_NAME}[ray]

-COPY --chown=ray:users shared_workflow_support_lib shared_workflow_support_lib/
+COPY --chmod=775 --chown=ray:root shared_workflow_support_lib shared_workflow_support_lib/
RUN cd shared_workflow_support_lib && pip install --no-cache-dir -e .

-COPY --chown=ray:users workflow_support_lib workflow_support_lib/
+COPY --chmod=775 --chown=ray:root workflow_support_lib workflow_support_lib/
RUN cd workflow_support_lib && pip install --no-cache-dir -e .

# overwriting the installation of old versions of pydantic
@@ -30,9 +35,6 @@ COPY ./src /pipelines/component/src
# Set environment
ENV KFP_v2=$KFP_v2

-# Grant non-root users the necessary permissions to the ray directory
-RUN chmod 755 /home/ray

# Put these at the end since they seem to upset the docker cache.
ARG BUILD_DATE
ARG GIT_COMMIT
13 changes: 9 additions & 4 deletions tools/ingest2parquet/Dockerfile
@@ -2,23 +2,28 @@ ARG BASE_IMAGE=docker.io/rayproject/ray:2.24.0-py310

FROM ${BASE_IMAGE}

+# see https://docs.openshift.com/container-platform/4.17/openshift_images/create-images.html#use-uid_create-images
+USER root
+RUN chown ray:root /home/ray && chmod 775 /home/ray
+USER ray

# install pytest
RUN pip install --no-cache-dir pytest
ARG DPK_WHEEL_FILE_NAME

# Copy and install data processing libraries
# These are expected to be placed in the docker context before this is run (see the make image).
-COPY --chown=ray:users data-processing-dist data-processing-dist
+COPY --chmod=775 --chown=ray:root data-processing-dist data-processing-dist
RUN pip install data-processing-dist/${DPK_WHEEL_FILE_NAME}[ray]

COPY requirements.txt requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

RUN rm requirements.txt
# copy source
-COPY --chown=ray:users ./src .
+COPY --chmod=775 --chown=ray:root ./src .
# copy test
-COPY --chown=ray:users test/ test/
-COPY --chown=ray:users test-data/ test-data/
+COPY --chmod=775 --chown=ray:root test/ test/
+COPY --chmod=775 --chown=ray:root test-data/ test-data/
# Set environment
ENV PYTHONPATH /home/ray
14 changes: 8 additions & 6 deletions transforms/Dockerfile.ray.template
@@ -1,6 +1,11 @@
ARG BASE_IMAGE=docker.io/rayproject/ray:2.24.0-py310
FROM ${BASE_IMAGE}

+# see https://docs.openshift.com/container-platform/4.17/openshift_images/create-images.html#use-uid_create-images
+USER root
+RUN chown ray:root /home/ray && chmod 775 /home/ray
+USER ray

RUN pip install --upgrade --no-cache-dir pip

# install pytest
@@ -10,17 +15,14 @@ ARG TRANSFORM_NAME

# Copy and install data processing libraries
# These are expected to be placed in the docker context before this is run (see the make image).
-COPY --chown=ray:users data-processing-dist data-processing-dist
+COPY --chmod=775 --chown=ray:root data-processing-dist data-processing-dist
RUN pip install data-processing-dist/${DPK_WHEEL_FILE_NAME}[ray]


-COPY --chown=ray:users dpk_${TRANSFORM_NAME}/ dpk_${TRANSFORM_NAME}/
-COPY --chown=ray:users requirements.txt requirements.txt
+COPY --chmod=775 --chown=ray:root dpk_${TRANSFORM_NAME}/ dpk_${TRANSFORM_NAME}/
+COPY --chmod=775 --chown=ray:root requirements.txt requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

-# Grant non-root users the necessary permissions to the ray directory
-RUN chmod 755 /home/ray

# Set environment
ENV PYTHONPATH /home/ray

16 changes: 9 additions & 7 deletions transforms/code/code2parquet/ray/Dockerfile
@@ -2,6 +2,11 @@ ARG BASE_IMAGE=docker.io/rayproject/ray:2.24.0-py310

FROM ${BASE_IMAGE}

+# see https://docs.openshift.com/container-platform/4.17/openshift_images/create-images.html#use-uid_create-images
+USER root
+RUN chown ray:root /home/ray && chmod 775 /home/ray
+USER ray

RUN pip install --upgrade --no-cache-dir pip

# install pytest
@@ -11,15 +16,15 @@ ARG DPK_WHEEL_FILE_NAME

# Copy and install data processing libraries
# These are expected to be placed in the docker context before this is run (see the make image).
-COPY --chown=ray:users data-processing-dist data-processing-dist
+COPY --chmod=775 --chown=ray:root data-processing-dist data-processing-dist
RUN pip install data-processing-dist/${DPK_WHEEL_FILE_NAME}[ray]

-COPY --chown=ray:users python-transform/ python-transform/
+COPY --chmod=775 --chown=ray:root python-transform/ python-transform/
RUN cd python-transform && pip install --no-cache-dir -e .

# Install ray project source
-COPY --chown=ray:users src/ src/
-COPY --chown=ray:users pyproject.toml pyproject.toml
+COPY --chmod=775 --chown=ray:root src/ src/
+COPY --chmod=775 --chown=ray:root pyproject.toml pyproject.toml
RUN pip install --no-cache-dir -e .

# copy the main() entry point to the image
@@ -32,9 +37,6 @@ COPY src/code2parquet_local_ray.py local/
COPY test/ test/
COPY test-data/ test-data/

-# Grant non-root users the necessary permissions to the ray directory
-RUN chmod 755 /home/ray

# Set environment
ENV PYTHONPATH /home/ray

14 changes: 8 additions & 6 deletions transforms/code/code_profiler/Dockerfile.ray
@@ -2,6 +2,11 @@ ARG BASE_IMAGE=docker.io/rayproject/ray:2.24.0-py310

FROM ${BASE_IMAGE}

+# see https://docs.openshift.com/container-platform/4.17/openshift_images/create-images.html#use-uid_create-images
+USER root
+RUN chown ray:root /home/ray && chmod 775 /home/ray
+USER ray

RUN pip install --upgrade --no-cache-dir pip

# install pytest
@@ -10,17 +15,14 @@ ARG DPK_WHEEL_FILE_NAME

# Copy and install data processing libraries
# These are expected to be placed in the docker context before this is run (see the make image).
-COPY --chown=ray:users data-processing-dist data-processing-dist
+COPY --chmod=775 --chown=ray:root data-processing-dist data-processing-dist
RUN pip install data-processing-dist/${DPK_WHEEL_FILE_NAME}[ray]

## Copy the python version of the tansform
-COPY --chown=ray:users dpk_code_profiler/ dpk_code_profiler/
-COPY --chown=ray:users requirements.txt requirements.txt
+COPY --chmod=775 --chown=ray:root dpk_code_profiler/ dpk_code_profiler/
+COPY --chmod=775 --chown=ray:root requirements.txt requirements.txt
RUN pip install -r requirements.txt

-# Grant non-root users the necessary permissions to the ray directory
-RUN chmod 755 /home/ray

# Set environment
ENV PYTHONPATH /home/ray

16 changes: 9 additions & 7 deletions transforms/code/code_quality/ray/Dockerfile
@@ -2,6 +2,11 @@ ARG BASE_IMAGE=docker.io/rayproject/ray:2.24.0-py310

FROM ${BASE_IMAGE}

+# see https://docs.openshift.com/container-platform/4.17/openshift_images/create-images.html#use-uid_create-images
+USER root
+RUN chown ray:root /home/ray && chmod 775 /home/ray
+USER ray

RUN pip install --upgrade --no-cache-dir pip

# install pytest
@@ -14,17 +19,17 @@ ARG DPK_WHEEL_FILE_NAME

# Copy and install data processing libraries
# These are expected to be placed in the docker context before this is run (see the make image).
-COPY --chown=ray:users data-processing-dist data-processing-dist
+COPY --chmod=775 --chown=ray:root data-processing-dist data-processing-dist
RUN pip install data-processing-dist/${DPK_WHEEL_FILE_NAME}[ray]

-COPY --chown=ray:users python-transform/ python-transform/
+COPY --chmod=775 --chown=ray:root python-transform/ python-transform/
RUN cd python-transform && pip install --no-cache-dir -e .

#COPY requirements.txt requirements.txt
#RUN pip install --no-cache-dir -r requirements.txt

-COPY --chown=ray:users src/ src/
-COPY --chown=ray:users pyproject.toml pyproject.toml
+COPY --chmod=775 --chown=ray:root src/ src/
+COPY --chmod=775 --chown=ray:root pyproject.toml pyproject.toml
RUN pip install --no-cache-dir -e .

# copy the main() entry point to the image
@@ -37,9 +42,6 @@ COPY ./src/code_quality_local_ray.py local/
COPY test/ test/
COPY test-data/ test-data/

-# Grant non-root users the necessary permissions to the ray directory
-RUN chmod 755 /home/ray

# Set environment
ENV PYTHONPATH /home/ray

16 changes: 9 additions & 7 deletions transforms/code/header_cleanser/ray/Dockerfile
@@ -1,20 +1,25 @@
FROM docker.io/rayproject/ray:2.24.0-py310

+# see https://docs.openshift.com/container-platform/4.17/openshift_images/create-images.html#use-uid_create-images
+USER root
+RUN chown ray:root /home/ray && chmod 775 /home/ray
+USER ray

# install pytest
RUN pip install --no-cache-dir pytest

ARG DPK_WHEEL_FILE_NAME

# Copy and install data processing libraries
# These are expected to be placed in the docker context before this is run (see the make image).
-COPY --chown=ray:users data-processing-dist data-processing-dist
+COPY --chmod=775 --chown=ray:root data-processing-dist data-processing-dist
RUN pip install data-processing-dist/${DPK_WHEEL_FILE_NAME}[ray]

-COPY --chown=ray:users python-transform/ python-transform
+COPY --chmod=775 --chown=ray:root python-transform/ python-transform
RUN cd python-transform && pip install --no-cache-dir -e .

-COPY --chown=ray:users src/ src/
-COPY --chown=ray:users pyproject.toml pyproject.toml
+COPY --chmod=775 --chown=ray:root src/ src/
+COPY --chmod=775 --chown=ray:root pyproject.toml pyproject.toml
RUN pip install --no-cache-dir -e .

# Install system dependencies, including libgomp1
@@ -32,9 +37,6 @@ COPY src/header_cleanser_local_ray.py local/
COPY test/ test/
COPY test-data/ test-data/

-# Grant non-root users the necessary permissions to the ray directory
-RUN chmod 755 /home/ray

# Set environment
ENV PYTHONPATH /home/ray

18 changes: 10 additions & 8 deletions transforms/code/license_select/ray/Dockerfile
@@ -2,6 +2,11 @@ ARG BASE_IMAGE=docker.io/rayproject/ray:2.24.0-py310

FROM ${BASE_IMAGE}

+# see https://docs.openshift.com/container-platform/4.17/openshift_images/create-images.html#use-uid_create-images
+USER root
+RUN chown ray:root /home/ray && chmod 775 /home/ray
+USER ray

RUN pip install --upgrade --no-cache-dir pip

# install pytest
@@ -10,15 +15,15 @@ ARG DPK_WHEEL_FILE_NAME

# Copy and install data processing libraries
# These are expected to be placed in the docker context before this is run (see the make image).
-COPY --chown=ray:users data-processing-dist data-processing-dist
+COPY --chmod=775 --chown=ray:root data-processing-dist data-processing-dist
RUN pip install data-processing-dist/${DPK_WHEEL_FILE_NAME}[ray]

-COPY --chown=ray:users python-transform/ python-transform/
+COPY --chmod=775 --chown=ray:root python-transform/ python-transform/
RUN cd python-transform && pip install --no-cache-dir -e .

-COPY --chown=ray:users src/ src/
-COPY --chown=ray:users pyproject.toml pyproject.toml
-COPY --chown=ray:users README.md README.md
+COPY --chmod=775 --chown=ray:root src/ src/
+COPY --chmod=775 --chown=ray:root pyproject.toml pyproject.toml
+COPY --chmod=775 --chown=ray:root README.md README.md
RUN pip install --no-cache-dir -e .

# copy source data
@@ -29,9 +34,6 @@ COPY src/license_select_local_ray.py local/
COPY test/ test/
COPY test-data/ test-data/

-# Grant non-root users the necessary permissions to the ray directory
-RUN chmod 755 /home/ray

# Put these at the end since they seem to upset the docker cache.
ARG BUILD_DATE
ARG GIT_COMMIT
16 changes: 9 additions & 7 deletions transforms/code/malware/ray/Dockerfile
@@ -2,6 +2,11 @@ ARG BASE_IMAGE=docker.io/rayproject/ray:2.24.0-py310

FROM ${BASE_IMAGE} AS base

+# see https://docs.openshift.com/container-platform/4.17/openshift_images/create-images.html#use-uid_create-images
+USER root
+RUN chown ray:root /home/ray && chmod 775 /home/ray
+USER ray

RUN pip install --upgrade --no-cache-dir pip

RUN pip install --no-cache-dir pytest
@@ -40,14 +45,14 @@ ARG DPK_WHEEL_FILE_NAME

# Copy and install data processing libraries
# These are expected to be placed in the docker context before this is run (see the make image).
-COPY --chown=ray:users data-processing-dist data-processing-dist
+COPY --chmod=775 --chown=ray:root data-processing-dist data-processing-dist
RUN pip install data-processing-dist/${DPK_WHEEL_FILE_NAME}[ray]

-COPY --chown=ray:users python-transform/ python-transform/
+COPY --chmod=775 --chown=ray:root python-transform/ python-transform/
RUN cd python-transform && pip install --no-cache-dir -e .

-COPY --chown=ray:users src/ src/
-COPY --chown=ray:users pyproject.toml pyproject.toml
+COPY --chmod=775 --chown=ray:root src/ src/
+COPY --chmod=775 --chown=ray:root pyproject.toml pyproject.toml
RUN pip install --no-cache-dir -e .

# copy the main() entry point to the image
@@ -59,9 +64,6 @@ COPY src/malware_local_ray.py local/
COPY test/ test/
COPY test-data/ test-data/

-# Grant non-root users the necessary permissions to the ray directory
-RUN chmod 755 /home/ray

ENV PYTHONPATH /home/ray

USER root
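
The Dockerfile changes in this commit all follow one pattern: `/home/ray` and the copied files move from group `users` with mode 755 to group `root` with mode 775. The OpenShift page linked in the added comment describes containers running under an arbitrary UID that belongs to the root group (GID 0). A minimal sketch of the classic Unix write check shows why the group bits matter in that case. All numeric IDs are illustrative assumptions, the pre-change ownership of `/home/ray` is assumed to be `ray:users`, and `can_write` is a hypothetical helper, not part of the repository:

```python
import stat


def can_write(mode: int, owner_uid: int, group_gid: int,
              uid: int, gids: set[int]) -> bool:
    """Simplified Unix write check: owner bits, else group bits, else other bits."""
    if uid == 0:  # root bypasses the permission bits
        return True
    if uid == owner_uid:
        return bool(mode & stat.S_IWUSR)
    if group_gid in gids:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)


# Illustrative values only; none of these are read from the actual image.
RAY_UID = 1000
USERS_GID = 100
ROOT_GID = 0
ARBITRARY_UID = 1000650000  # stand-in for a UID assigned by the platform
POD_GIDS = {ROOT_GID}       # the arbitrary user is a member of the root group

# Before: directory assumed to be ray:users with mode 755.
before = can_write(0o755, RAY_UID, USERS_GID, ARBITRARY_UID, POD_GIDS)
# After: chown ray:root and chmod 775, as in the added RUN line.
after = can_write(0o775, RAY_UID, ROOT_GID, ARBITRARY_UID, POD_GIDS)

print(f"before: {before}, after: {after}")  # before: False, after: True
```

The same reasoning applies to the `COPY --chmod=775 --chown=ray:root` lines: files owned by group `root` with group write permission remain writable for whichever UID the platform assigns.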