Merge branch 'database'
- Integrated event sourcing with PostgreSQL for tracking biomero
  workflows and tasks.
- Added database views for job progress, workflow statistics, and
  task-to-job mapping.
- Enhanced testing with in-memory SQLite support and custom mocks
  for event sourcing.
- Migrated to a SQLAlchemy backend with scoped sessions and
  improved configurability.
- Improved logging, documentation, and test coverage for new and
  existing components.
TorecLuik committed Dec 3, 2024
2 parents eed9fd3 + baf5ae3 commit e1494ed
Showing 14 changed files with 3,884 additions and 158 deletions.
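The headline change is the new SQLAlchemy-backed persistence layer (biomero/database.py, below). A minimal sketch of how it might be configured, assuming a hypothetical PostgreSQL target; EngineManager falls back to the SQLALCHEMY_URL environment variable when no URL is passed:

import os
from biomero.database import EngineManager

# Hypothetical PostgreSQL target for production use:
os.environ['SQLALCHEMY_URL'] = 'postgresql+psycopg2://biomero:secret@localhost:5432/biomero'
# ...or the in-memory SQLite that the test suite relies on:
# os.environ['SQLALCHEMY_URL'] = 'sqlite:///:memory:'

# Creates the view tables if needed and returns the scoped-session topic
topic = EngineManager.create_scoped_session()
session = EngineManager.get_session()
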
8 changes: 6 additions & 2 deletions .github/workflows/python-package.yml
@@ -15,7 +15,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        python-version: ["3.7", "3.9", "3.10"]
+        python-version: ["3.8", "3.9", "3.10"]

     steps:
     - uses: actions/checkout@v4
@@ -36,4 +36,8 @@ jobs:
         flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
     - name: Test with pytest
       run: |
-        python -m pytest
+        python -m pytest --cov=biomero --cov-report=xml
+    - name: Coveralls GitHub Action
+      uses: coverallsapp/[email protected]


2 changes: 1 addition & 1 deletion .github/workflows/python-publish.yml
@@ -27,7 +27,7 @@ jobs:
     - name: Set up Python
       uses: actions/setup-python@v5
       with:
-        python-version: '3.7'
+        python-version: '3.8'
     - name: Install dependencies
       run: |
         python -m pip install --upgrade pip
5 changes: 5 additions & 0 deletions .github/workflows/sphinx.yml
@@ -13,6 +13,11 @@ jobs:
     - uses: actions/checkout@v4
     - name: Build HTML
       uses: ammaraskar/sphinx-action@master
+      with:
+        pre-build-command: |
+          # Install necessary dependencies
+          apt-get update --allow-releaseinfo-change -y && apt-get install -y gcc python3-dev libpq-dev postgresql-client
+          pg_config --version
       env:
         SETUPTOOLS_SCM_PRETEND_VERSION: 1
     - name: Upload artifacts
4 changes: 2 additions & 2 deletions README.md
@@ -1,5 +1,5 @@
 # BIOMERO - BioImage analysis in OMERO
-[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0) [![DOI](https://zenodo.org/badge/638954891.svg)](https://zenodo.org/badge/latestdoi/638954891) [![PyPI - Version](https://img.shields.io/pypi/v/biomero)](https://pypi.org/project/biomero/) [![PyPI - Python Versions](https://img.shields.io/pypi/pyversions/biomero)](https://pypi.org/project/biomero/) ![Slurm](https://img.shields.io/badge/Slurm-21.08.6-blue.svg) ![OMERO](https://img.shields.io/badge/OMERO-5.6.8-blue.svg) [![fair-software.eu](https://img.shields.io/badge/fair--software.eu-%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F-green)](https://fair-software.eu) [![OpenSSF Best Practices](https://bestpractices.coreinfrastructure.org/projects/7530/badge)](https://bestpractices.coreinfrastructure.org/projects/7530) [![Sphinx build](https://github.com/NL-BioImaging/biomero/actions/workflows/sphinx.yml/badge.svg?branch=main)](https://github.com/NL-BioImaging/biomero/actions/workflows/sphinx.yml) [![pages-build-deployment](https://github.com/NL-BioImaging/biomero/actions/workflows/pages/pages-build-deployment/badge.svg)](https://github.com/NL-BioImaging/biomero/actions/workflows/pages/pages-build-deployment) [![python-package build](https://github.com/NL-BioImaging/biomero/actions/workflows/python-package.yml/badge.svg)](https://github.com/NL-BioImaging/biomero/actions/workflows/python-package.yml) [![python-publish build](https://github.com/NL-BioImaging/biomero/actions/workflows/python-publish.yml/badge.svg?branch=main)](https://github.com/NL-BioImaging/biomero/actions/workflows/python-publish.yml)
+[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0) [![DOI](https://zenodo.org/badge/638954891.svg)](https://zenodo.org/badge/latestdoi/638954891) [![PyPI - Version](https://img.shields.io/pypi/v/biomero)](https://pypi.org/project/biomero/) [![PyPI - Python Versions](https://img.shields.io/pypi/pyversions/biomero)](https://pypi.org/project/biomero/) ![Slurm](https://img.shields.io/badge/Slurm-21.08.6-blue.svg) ![OMERO](https://img.shields.io/badge/OMERO-5.6.8-blue.svg) [![fair-software.eu](https://img.shields.io/badge/fair--software.eu-%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F-green)](https://fair-software.eu) [![OpenSSF Best Practices](https://bestpractices.coreinfrastructure.org/projects/7530/badge)](https://bestpractices.coreinfrastructure.org/projects/7530) [![Sphinx build](https://github.com/NL-BioImaging/biomero/actions/workflows/sphinx.yml/badge.svg?branch=main)](https://github.com/NL-BioImaging/biomero/actions/workflows/sphinx.yml) [![pages-build-deployment](https://github.com/NL-BioImaging/biomero/actions/workflows/pages/pages-build-deployment/badge.svg)](https://github.com/NL-BioImaging/biomero/actions/workflows/pages/pages-build-deployment) [![python-package build](https://github.com/NL-BioImaging/biomero/actions/workflows/python-package.yml/badge.svg)](https://github.com/NL-BioImaging/biomero/actions/workflows/python-package.yml) [![python-publish build](https://github.com/NL-BioImaging/biomero/actions/workflows/python-publish.yml/badge.svg?branch=main)](https://github.com/NL-BioImaging/biomero/actions/workflows/python-publish.yml) [![Coverage Status](https://coveralls.io/repos/github/NL-BioImaging/biomero/badge.svg?branch=main)](https://coveralls.io/github/NL-BioImaging/biomero?branch=main)

 The **BIOMERO** framework, for **B**io**I**mage analysis in **OMERO**, allows you to run (FAIR) bioimage analysis workflows directly from OMERO on a high-performance compute (HPC) cluster, remotely through SSH.

@@ -64,7 +64,7 @@ Your Slurm cluster/login node needs to have:
 Your OMERO _processing_ node needs to have:
 1. SSH client and access to the Slurm cluster (w/ private key / headless)
 2. SCP access to the Slurm cluster
-3. Python3.7+
+3. Python3.8+
 4. This library installed
    - Latest release on PyPI `python3 -m pip install biomero`
    - or latest Github version `python3 -m pip install 'git+https://github.com/NL-BioImaging/biomero'`
19 changes: 7 additions & 12 deletions biomero/__init__.py
@@ -1,14 +1,9 @@
 from .slurm_client import SlurmClient

+import importlib.metadata
 try:
-    import importlib.metadata
-    try:
-        __version__ = importlib.metadata.version(__package__)
-    except importlib.metadata.PackageNotFoundError:
-        __version__ = "Version not found"
-except ModuleNotFoundError:  # Python 3.7
-    try:
-        import pkg_resources
-        __version__ = pkg_resources.get_distribution(__package__).version
-    except pkg_resources.DistributionNotFound:
-        __version__ = "Version not found"
+    __version__ = importlib.metadata.version(__package__)
+except importlib.metadata.PackageNotFoundError:
+    __version__ = "Version not found"
+
+from .eventsourcing import *
+from .views import *
14 changes: 13 additions & 1 deletion biomero/constants.py
@@ -16,6 +16,7 @@

 IMAGE_EXPORT_SCRIPT = "_SLURM_Image_Transfer.py"
 IMAGE_IMPORT_SCRIPT = "SLURM_Get_Results.py"
+CONVERSION_SCRIPT = "SLURM_Remote_Conversion.py"
 RUN_WF_SCRIPT = "SLURM_Run_Workflow.py"


@@ -106,4 +107,15 @@ class transfer:
     FORMAT_OMETIFF = 'OME-TIFF'
     FORMAT_ZARR = 'ZARR'
     FOLDER = "Folder_Name"
-    FOLDER_DEFAULT = 'SLURM_IMAGES_'
+    FOLDER_DEFAULT = 'SLURM_IMAGES_'
+
+
+class workflow_status:
+    INITIALIZING = "INITIALIZING"
+    TRANSFERRING = "TRANSFERRING"
+    CONVERTING = "CONVERTING"
+    RETRIEVING = "RETRIEVING"
+    DONE = "DONE"
+    FAILED = "FAILED"
+    RUNNING = "RUNNING"
+    JOB_STATUS = "JOB_"
198 changes: 198 additions & 0 deletions biomero/database.py
@@ -0,0 +1,198 @@
# -*- coding: utf-8 -*-
# Copyright 2024 Torec Luik
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from eventsourcing.utils import get_topic, clear_topic_cache
import logging
from sqlalchemy import create_engine, text, Column, Integer, String, URL, DateTime, Float
from sqlalchemy.orm import sessionmaker, declarative_base, scoped_session
from sqlalchemy.dialects.postgresql import UUID as PGUUID
import os

logger = logging.getLogger(__name__)

# --------------------- VIEWS DB tables/classes ---------------------------- #

# Base class for declarative class definitions
Base = declarative_base()


class JobView(Base):
    """
    SQLAlchemy model for the 'biomero_job_view' table.

    Attributes:
        slurm_job_id (Integer): The unique identifier for the Slurm job.
        user (Integer): The ID of the user who submitted the job.
        group (Integer): The group ID associated with the job.
        task_id (UUID): The unique identifier for the BIOMERO task.
    """
    __tablename__ = 'biomero_job_view'

    slurm_job_id = Column(Integer, primary_key=True)
    user = Column(Integer, nullable=False)
    group = Column(Integer, nullable=False)
    task_id = Column(PGUUID(as_uuid=True))


class JobProgressView(Base):
    """
    SQLAlchemy model for the 'biomero_job_progress_view' table.

    Attributes:
        slurm_job_id (Integer): The unique identifier for the Slurm job.
        status (String): The current status of the Slurm job.
        progress (String, optional): The progress status of the Slurm job.
    """
    __tablename__ = 'biomero_job_progress_view'

    slurm_job_id = Column(Integer, primary_key=True)
    status = Column(String, nullable=False)
    progress = Column(String, nullable=True)


class WorkflowProgressView(Base):
    """
    SQLAlchemy model for the 'biomero_workflow_progress_view' table.

    Attributes:
        workflow_id (PGUUID): The unique identifier for the workflow (primary key).
        status (String, optional): The current status of the workflow.
        progress (String, optional): The progress status of the workflow.
        user (Integer, optional): The ID of the user who initiated the workflow.
        group (Integer, optional): The group associated with the workflow.
        name (String, optional): The name of the workflow.
        task (String, optional): The current task of the workflow.
        start_time (DateTime): The time when the workflow started.
    """
    __tablename__ = 'biomero_workflow_progress_view'

    workflow_id = Column(PGUUID(as_uuid=True), primary_key=True)
    status = Column(String, nullable=True)
    progress = Column(String, nullable=True)
    user = Column(Integer, nullable=True)
    group = Column(Integer, nullable=True)
    name = Column(String, nullable=True)
    task = Column(String, nullable=True)
    start_time = Column(DateTime, nullable=False)


class TaskExecution(Base):
    """
    SQLAlchemy model for the 'biomero_task_execution' table.

    Attributes:
        task_id (PGUUID): The unique identifier for the task.
        task_name (String): The name of the task.
        task_version (String): The version of the task.
        user_id (Integer, optional): The ID of the user who initiated the task.
        group_id (Integer, optional): The group ID associated with the task.
        status (String): The current status of the task.
        start_time (DateTime): The time when the task started.
        end_time (DateTime, optional): The time when the task ended.
        error_type (String, optional): Type of error encountered during execution, if any.
    """
    __tablename__ = 'biomero_task_execution'

    task_id = Column(PGUUID(as_uuid=True), primary_key=True)
    task_name = Column(String, nullable=False)
    task_version = Column(String)
    user_id = Column(Integer, nullable=True)
    group_id = Column(Integer, nullable=True)
    status = Column(String, nullable=False)
    start_time = Column(DateTime, nullable=False)
    end_time = Column(DateTime, nullable=True)
    error_type = Column(String, nullable=True)


class EngineManager:
    """
    Manages the SQLAlchemy engine and session lifecycle.

    Class Attributes:
        _engine: The SQLAlchemy engine used to connect to the database.
        _scoped_session_topic: The topic of the scoped session.
        _session: The scoped session used for database operations.
    """
    _engine = None
    _scoped_session_topic = None
    _session = None

    @classmethod
    def create_scoped_session(cls, sqlalchemy_url: str = None):
        """
        Creates and returns a scoped session for interacting with the database.

        If the engine doesn't already exist, it initializes the SQLAlchemy
        engine and sets up the scoped session.

        Args:
            sqlalchemy_url (str, optional): The SQLAlchemy database URL. If not
                provided, the method will retrieve the value from the
                'SQLALCHEMY_URL' environment variable.

        Returns:
            str: The topic of the scoped session adapter class.
        """
        if cls._engine is None:
            # Note, we only allow the sqlalchemy eventsourcing module
            if not sqlalchemy_url:
                sqlalchemy_url = os.getenv('SQLALCHEMY_URL')
            cls._engine = create_engine(sqlalchemy_url)

            # Set up tables if they don't exist yet
            Base.metadata.create_all(cls._engine)

            # Create a scoped_session object.
            cls._session = scoped_session(
                sessionmaker(autocommit=False, autoflush=True, bind=cls._engine)
            )

            class MyScopedSessionAdapter:
                def __getattribute__(self, item: str) -> None:
                    return getattr(cls._session, item)

            # Produce the topic of the scoped session adapter class.
            cls._scoped_session_topic = get_topic(MyScopedSessionAdapter)

        return cls._scoped_session_topic

    @classmethod
    def get_session(cls):
        """
        Retrieves the current scoped session.

        Returns:
            Session: The SQLAlchemy session for interacting with the database.
        """
        return cls._session()

    @classmethod
    def commit(cls):
        """
        Commits the current transaction in the scoped session.
        """
        cls._session.commit()

    @classmethod
    def close_engine(cls):
        """
        Closes the database engine and cleans up the session.

        This method disposes of the SQLAlchemy engine, removes the session,
        and resets all associated class attributes to `None`.
        """
        if cls._engine is not None:
            cls._session.remove()
            cls._engine.dispose()
            cls._engine = None
            cls._session = None
            cls._scoped_session_topic = None
            clear_topic_cache()
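
A short usage sketch for the classes above, assuming create_scoped_session() has already been called (see the configuration sketch near the top of this commit) and that a Slurm job with ID 42 exists; both assumptions are illustrative:

from biomero.database import EngineManager, JobProgressView, TaskExecution

session = EngineManager.get_session()
try:
    # Look up progress for one Slurm job (ID 42 is hypothetical)
    job = session.query(JobProgressView).filter_by(slurm_job_id=42).one_or_none()
    if job is not None:
        print(job.status, job.progress)

    # List failed task executions and their error types
    for task in session.query(TaskExecution).filter_by(status="FAILED").all():
        print(task.task_name, task.error_type)
finally:
    EngineManager.close_engine()
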
