refactored distribution handling adding Flatcar and Fedora support (#17)
* refactoring crawler

- split Ubuntu handling from updater/service.py as first to make code more readable
- note: debugging code still in place

Signed-off-by: Christian Otto Stelter <[email protected]>

* refactoring crawler

    - split Debian handling from updater/service.py
    - split Alma Linux handling from updater/service.py

Signed-off-by: Christian Otto Stelter <[email protected]>

* Added support for crawling Flatcar Container Linux

Signed-off-by: Christian Otto Stelter <[email protected]>

* - removed no longer used release_update_check from updater/service.py
- added error message for unsupported distributions
- updated changelog
- updated README.md with supported distributions

Signed-off-by: Christian Otto Stelter <[email protected]>

* - make use of loguru for simple configuration of logging
- AlmaLinux crawling not yet fully functional again

Signed-off-by: Christian Otto Stelter <[email protected]>

* - AlmaLinux crawling improved and working again - it no longer simply fetches the first hit
- extended requirements.txt for loguru

Signed-off-by: Christian Otto Stelter <[email protected]>

* - added Debian 12 aka bookworm to image-sources.yaml
- fixed debug output for checksums

Signed-off-by: Christian Otto Stelter <[email protected]>

* - added first try on Fedora
- template not yet finished
- last checksum query must be adjusted for distributions like Fedora
- exporter must be adjusted

Signed-off-by: Christian Otto Stelter <[email protected]>

* - adjusted checksum query for Fedora - could be more generic for Distributions with no "minor" release updates
- adjusted database queries - get distribution_version needed for Fedora
- updated template for export

Signed-off-by: Christian Otto Stelter <[email protected]>

* - added --debug as argument
- added first try on Fedora Linux Support to crawler
- added Debian Linux 12 aka bookworm to sample image-sources.yaml

Signed-off-by: Christian Otto Stelter <[email protected]>

* - removed old branch from Dockerfile

Signed-off-by: Christian Otto Stelter <[email protected]>

---------

Signed-off-by: Christian Otto Stelter <[email protected]>
stelterlab authored Jun 12, 2023
1 parent 3cd65b1 commit dc62071
Showing 18 changed files with 1,013 additions and 120 deletions.
10 changes: 10 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,15 @@
# Changelog

## 2023-06-11
- refactoring of updater parts - split into single files for each distribution
- removed generic release_update_check from updater/service.py
- added support for Flatcar Container Linux
- updated README.md
- added loguru for easy configuration of logging
- added optional debug output (via loglevel)
- added --debug as argument
- added first try on Fedora Linux Support to crawler
- added Debian Linux 12 aka bookworm to sample image-sources.yaml

## 2023-06-01
- updated example Dockerfile to new repos
15 changes: 13 additions & 2 deletions README.md
@@ -2,6 +2,16 @@

OpenStack Image Crawler for checking image sources, gathering update information and generating image catalog files for the [OpenStack Image Manager](https://github.com/osism/openstack-image-manager) (or similar tools).

Supported distributions:

- Ubuntu Linux
- Debian Linux
- AlmaLinux
- Flatcar Container Linux
- Fedora Linux

Note: Flatcar Container Linux offers only zipped images, so a direct upload via OpenStack Image Manager/Glance is not supported (yet).

## Requirements
### Git repository for holding the image catalogs (optional)

@@ -13,7 +23,7 @@ If there is no remote_repository entry in the config, the git actions are disabl

## Installation

Tested on Ubuntu 20.04 LTS + 22.04 LTS (should work with Python 3.8+ on other OSs, too)
Tested on Ubuntu 20.04 LTS + 22.04 LTS. Should work with Python 3.8+ on other OSs, too. Optional: build the docker container.

```
sudo apt install git python3-pip python3-venv
@@ -43,7 +53,7 @@ Usage:
```
./image-crawler.py -h
plusserver Image Crawler v0.1
plusserver Image Crawler v0.4.0
usage: image-crawler.py [-h] [--config CONFIG] [--sources SOURCES] [--init-db] [--export-only] [--updates-only]
@@ -56,6 +66,7 @@ optional arguments:
--init-db initialize image catalog database
--export-only export only existing image catalog
--updates-only check only for updates, do not export catalog
--debug give more output for debugging
```

### Helper: Historian
8 changes: 5 additions & 3 deletions crawler/core/config.py
@@ -1,4 +1,6 @@
import yaml

from loguru import logger
from pathlib import Path


@@ -10,12 +12,12 @@ def config_read(name, msg="config"):
try:
config = yaml.safe_load(Path(name).read_text())
except PermissionError:
print("ERROR: could not open config - please check file permissions")
logger.error("could not open config - please check file permissions")
return None
except yaml.YAMLError as error:
print("ERROR: %s" % error)
logger.error(error)
return None

print("Successfully read %s from %s" % (msg, name))
logger.info("Successfully read %s from %s" % (msg, name))

return config
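
A minimal sketch of how the `--debug` argument listed in the README usage might select the log level for the loguru calls introduced in this commit. The names `log_level_from_args` and `configure_logging` are hypothetical and not taken from the repository, and the fallback to the standard `logging` module is an assumption that only exists so the sketch also runs where loguru is not installed.

```python
import argparse
import sys


def log_level_from_args(argv):
    """Map the --debug flag to a level name (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="image-crawler.py")
    parser.add_argument(
        "--debug", action="store_true", help="give more output for debugging"
    )
    args = parser.parse_args(argv)
    return "DEBUG" if args.debug else "INFO"


def configure_logging(level):
    """Install a single stderr sink at the requested level."""
    try:
        from loguru import logger
    except ImportError:
        # Assumption: fall back to the standard library if loguru is missing.
        import logging

        logging.basicConfig(level=getattr(logging, level))
        return logging.getLogger("crawler")
    logger.remove()  # drop loguru's default sink
    logger.add(sys.stderr, level=level)
    return logger


if __name__ == "__main__":
    log = configure_logging(log_level_from_args(sys.argv[1:]))
    log.debug("only visible when --debug is given")
    log.info("always visible")
```

Keeping the flag parsing separate from the sink setup makes the level decision easy to check without touching any logger state.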
85 changes: 63 additions & 22 deletions crawler/core/database.py
@@ -1,5 +1,7 @@
import sys
import sqlite3

from loguru import logger
from pathlib import Path


@@ -9,7 +11,7 @@ def database_connect(name):
try:
connection = sqlite3.connect(name)
except sqlite3.OperationalError as error:
print("ERROR: %s" % error)
logger.error(error)
return None
return connection
else:
@@ -23,7 +25,7 @@ def database_disconnect(connection):
def database_initialize(name, prog_dirname):
path = Path(name)
if path.is_file():
print("WARNING: database %s already exists. Cowardly refusing action." % name)
logger.warning("database %s already exists. Cowardly refusing action." % name)
else:
create_statement_fqfn = prog_dirname + "/lib/initialize-image-catalog.sql"
create_statement_file_path = Path(create_statement_fqfn)
@@ -32,21 +34,45 @@ def database_initialize(name, prog_dirname):
create_statement = db_init_file.read()
db_init_file.close()
else:
raise SystemError("Template initialize-image-catalog.sql not found")
logger.error("Template initialize-image-catalog.sql not found")
raise SystemExit(1)

try:
connection = sqlite3.connect(name)
except sqlite3.OperationalError as error:
print("ERROR: %s" % error)
logger.error(error)
database_cursor = connection.cursor()
try:
database_cursor.execute(create_statement)
except Exception as error:
print('ERROR: create table failed with the following error "%s"' % error)
logger.error('create table failed with the following error "%s"' % error)

connection.close()
print("New database created under %s" % name)
logger.info("New database created under %s" % name)
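
As the hunk above reads, a failed `sqlite3.connect` is logged but execution still continues to `connection.cursor()`, where `connection` would be unbound. The following is a sketch of one way to return early instead. The function name `initialize_database` and its boolean return value are assumptions made for illustration; they are not part of the commit.

```python
import sqlite3
from pathlib import Path


def initialize_database(db_path, create_statement):
    """Create a new SQLite database; return True only if the schema was created."""
    if Path(db_path).is_file():
        # Mirrors the "already exists" refusal in the diff.
        return False
    try:
        connection = sqlite3.connect(db_path)
    except sqlite3.OperationalError:
        # Return early: there is no connection object to work with.
        return False
    try:
        connection.execute(create_statement)
        connection.commit()
    except sqlite3.Error:
        # Note: the (empty) database file may already exist at this point.
        return False
    finally:
        connection.close()
    return True
```

A caller could log the failure and raise `SystemExit(1)` on a `False` result, which would keep the exit behaviour used elsewhere in this commit.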

def db_get_last_checksum_fedora(connection, distribution):
try:
database_cursor = connection.cursor()
database_cursor.execute(
"SELECT checksum FROM image_catalog "
"WHERE distribution_name = '%s' "
"ORDER BY id DESC LIMIT 1" % distribution
)
except sqlite3.OperationalError as error:
logger.error(error)
raise SystemExit(1)

row = database_cursor.fetchone()

if row is None:
logger.debug("no previous entries found")
last_checksum = "sha256:none"
else:
last_checksum = row[0]

database_cursor.close()

return last_checksum
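
The commit message notes that the Fedora checksum query "could be more generic". One possible shape, sketched here under assumptions, is a single function with an optional release that also binds values through SQLite placeholders instead of `%` string formatting. The name `get_last_checksum` and the `release=None` convention are hypothetical; the table and column names are the ones visible in the diff.

```python
import sqlite3


def get_last_checksum(connection, distribution, release=None):
    """Return the newest checksum, or "sha256:none" if nothing is stored."""
    query = "SELECT checksum FROM image_catalog WHERE distribution_name = ?"
    params = [distribution]
    if release is not None:
        # Only distributions with per-release tracking add this filter.
        query += " AND distribution_release = ?"
        params.append(release)
    query += " ORDER BY id DESC LIMIT 1"
    row = connection.execute(query, params).fetchone()
    return "sha256:none" if row is None else row[0]
```

With bound parameters, a distribution or release name containing a quote character cannot alter the statement.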

def db_get_last_checksum(connection, distribution, release):
try:
@@ -58,12 +84,13 @@ def db_get_last_checksum(connection, distribution, release):
"ORDER BY id DESC LIMIT 1" % (distribution, release)
)
except sqlite3.OperationalError as error:
raise SystemError("SQLite error: %s" % error)
logger.error(error)
raise SystemExit(1)

row = database_cursor.fetchone()

if row is None:
# print("no previous entries found")
logger.debug("no previous entries found")
last_checksum = "sha256:none"
else:
last_checksum = row[0]
@@ -87,7 +114,7 @@ def db_get_release_versions(connection, distribution, release, limit):
"ORDER BY id DESC LIMIT %d" % (distribution, release, limit)
)
except sqlite3.OperationalError as error:
print("SQLite error: %s" % error)
logger.error(error)
sys.exit(1)
row = database_cursor.fetchone()

@@ -119,7 +146,8 @@ def read_version_from_catalog(connection, distribution, release, version):
"ORDER BY ID" % (distribution, release, version)
)
except sqlite3.OperationalError as error:
raise SystemError("SQLite error: %s" % error)
logger.error(error)
raise SystemExit(1)

image_catalog = {}
image_catalog["versions"] = {}
@@ -154,7 +182,8 @@ def write_catalog_entry(connection, update):
)
connection.commit()
except sqlite3.OperationalError as error:
raise SystemError("SQLite error: %s" % error)
logger.error(error)
raise SystemExit(1)

database_cursor.close()

@@ -170,7 +199,8 @@ def update_catalog_entry(connection, update):
)
connection.commit()
except sqlite3.OperationalError as error:
raise SystemError("SQLite error: %s" % error)
logger.error(error)
raise SystemExit(1)

database_cursor.close()

@@ -186,7 +216,7 @@ def write_or_update_catalog_entry(connection, update):
)

if update["version"] in existing_entry["versions"]:
print("Updating version " + update["version"])
logger.info("Updating version " + update["version"])
return update_catalog_entry(connection, update)
else:
return write_catalog_entry(connection, update)
@@ -195,16 +225,26 @@ def write_or_update_catalog_entry(connection, update):
def read_release_from_catalog(connection, distribution, release, limit):
try:
database_cursor = connection.cursor()
database_cursor.execute(
"SELECT version,checksum,url,release_date "
"FROM (SELECT * FROM image_catalog "
"WHERE distribution_name = '%s' "
"AND distribution_release = '%s' "
"ORDER BY id DESC LIMIT %d) "
"ORDER BY ID" % (distribution, release, limit)
)
if release == "all":
database_cursor.execute(
"SELECT version,checksum,url,release_date,distribution_release "
"FROM (SELECT * FROM image_catalog "
"WHERE distribution_name = '%s' "
"ORDER BY id DESC LIMIT %d) "
"ORDER BY ID" % (distribution, limit)
)
else:
database_cursor.execute(
"SELECT version,checksum,url,release_date,distribution_release "
"FROM (SELECT * FROM image_catalog "
"WHERE distribution_name = '%s' "
"AND distribution_release = '%s' "
"ORDER BY id DESC LIMIT %d) "
"ORDER BY ID" % (distribution, release, limit)
)
except sqlite3.OperationalError as error:
raise SystemError("SQLite error: %s" % error)
logger.error(error)
raise SystemExit(1)

image_catalog = {}
image_catalog["versions"] = {}
@@ -215,5 +255,6 @@ def read_release_from_catalog(connection, distribution, release, limit):
image_catalog["versions"][version]["checksum"] = image[1]
image_catalog["versions"][version]["url"] = image[2]
image_catalog["versions"][version]["release_date"] = image[3]
image_catalog["versions"][version]["distribution_release"] = image[4]

return image_catalog
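
The two branches added to `read_release_from_catalog` differ only in the release filter. A sketch of how they might be folded into one statement follows, assuming the same `image_catalog` columns as in the diff. The function name `read_release` is hypothetical, and only the fixed filter text is interpolated into the SQL while all values are bound as parameters.

```python
import sqlite3


def read_release(connection, distribution, release, limit):
    """Return the newest `limit` entries, oldest first, keyed by version."""
    where = "distribution_name = ?"
    params = [distribution]
    if release != "all":
        # "all" skips the release filter, as in the new branch of the diff.
        where += " AND distribution_release = ?"
        params.append(release)
    params.append(limit)
    query = (
        "SELECT version, checksum, url, release_date, distribution_release "
        "FROM (SELECT * FROM image_catalog WHERE {} "
        "ORDER BY id DESC LIMIT ?) ORDER BY id"
    ).format(where)
    catalog = {"versions": {}}
    for version, checksum, url, release_date, dist_release in connection.execute(
        query, params
    ):
        catalog["versions"][version] = {
            "checksum": checksum,
            "url": url,
            "release_date": release_date,
            "distribution_release": dist_release,
        }
    return catalog
```

The inner query picks the most recent rows by `id`, and the outer `ORDER BY id` restores ascending order, matching the nesting used in the original statements.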
20 changes: 12 additions & 8 deletions crawler/core/exporter.py
@@ -2,6 +2,7 @@
import os

from crawler.core.database import read_release_from_catalog
from loguru import logger


def export_image_catalog(
@@ -11,18 +12,19 @@ def export_image_catalog(
# create directory (once) - only necessary when not created by git clone
if not os.path.exists(local_repository):
try:
print("Creating repository directory (%s)" % local_repository)
logger.info("Creating repository directory (%s)" % local_repository)
os.makedirs(local_repository)
except os.error as error:
raise SystemExit(
"FATAL: Creating directory %s failed with %s"
logger.error(
"Creating directory %s failed with %s"
% (local_repository, error)
)
raise SystemExit(1)

for source in sources_catalog["sources"]:
if source["name"] in updated_sources:
distribution = source["name"]
print("Exporting image catalog for " + distribution)
logger.info("Exporting image catalog for " + distribution)
header_file = open(template_path + "/header.yml")
catalog_export = header_file.read()
header_file.close()
@@ -44,6 +46,7 @@ def export_image_catalog(
limit = release["limit"]
else:
limit = 3

release_catalog = read_release_from_catalog(
connection, distribution, release["name"], limit
)
@@ -74,17 +77,18 @@ def export_image_catalog_all(
# create directory (once) - only necessary when not created by git clone
if not os.path.exists(local_repository):
try:
print("Creating repository directory (%s)" % local_repository)
logger.info("Creating repository directory (%s)" % local_repository)
os.makedirs(local_repository)
except os.error as error:
raise SystemExit(
"FATAL: Creating directory %s failed with %s"
logger.error(
"Creating directory %s failed with %s"
% (local_repository, error)
)
raise SystemExit(1)

for source in sources_catalog["sources"]:
distribution = source["name"]
print("Exporting image catalog for " + distribution)
logger.info("Exporting image catalog for " + distribution)
header_file = open(template_path + "/header.yml")
catalog_export = header_file.read()
header_file.close()
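
Both export functions check `os.path.exists` before calling `os.makedirs`. A sketch of a shared helper that relies on `exist_ok=True` instead is shown below. The name `ensure_directory` is hypothetical, and passing the message directly to `SystemExit` is a simplification of the log-then-exit pattern used in the commit.

```python
import os


def ensure_directory(path):
    """Create `path` (and parents) if missing; exit with a message on failure."""
    try:
        # exist_ok=True makes the call a no-op when the directory is already there.
        os.makedirs(path, exist_ok=True)
    except OSError as error:
        # Still raised when, for example, `path` exists but is a regular file.
        raise SystemExit("Creating directory %s failed with %s" % (path, error))
    return path
```

This removes the gap between the existence check and the creation call, and lets both exporters share one code path.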
3 changes: 2 additions & 1 deletion crawler/core/main.py
@@ -1,11 +1,12 @@
from crawler.updater.service import image_update_service
from loguru import logger


def crawl_image_sources(image_source_catalog, database):

updated_sources = {}
for source in image_source_catalog["sources"]:
print("\nChecking updates for Distribution " + source["name"])
logger.info("Checking updates for Distribution " + source["name"])
updated_releases = image_update_service(database, source)
if updated_releases:
updated_sources[source["name"]] = {}