Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
thanasornsawan committed Dec 19, 2024
1 parent e0f39a2, commit 23e1d9f
Showing 12 changed files with 574 additions and 7 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
name: ETL CI/CD Pipeline | ||
|
||
on: | ||
  push: | ||
    branches: | ||
      - main | ||
  pull_request: | ||
    branches: | ||
      - main | ||
|
||
jobs: | ||
  build: | ||
    runs-on: ubuntu-latest | ||
|
||
    steps: | ||
      - name: Checkout the code | ||
        uses: actions/checkout@v4 | ||
|
||
      - name: Set up Python | ||
        uses: actions/setup-python@v5 | ||
        with: | ||
          python-version: "3.9" | ||
|
||
      - name: Install Docker and Docker Compose | ||
        run: | | ||
          sudo apt-get update | ||
          sudo apt-get install -y docker.io | ||
          sudo curl -L "https://github.com/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose | ||
          sudo chmod +x /usr/local/bin/docker-compose | ||
      - name: Build and start Docker containers with docker-compose | ||
        run: | | ||
          docker-compose -f docker-compose.yml up -d | ||
          sleep 10 # Wait for the DB to start properly (adjust if needed) | ||
      - name: Install dependencies | ||
        run: | | ||
          python -m pip install --upgrade pip | ||
          pip install -r requirements.txt | ||
      - name: Run database setup | ||
        run: python sql/sqlite_db/setup_db.py | ||
|
||
      - name: Load data into database | ||
        run: python tests/load_data.py | ||
|
||
      - name: Run tests | ||
        run: | | ||
          pytest tests/test_etl.py | ||
|
||
      - name: Clean up Docker containers | ||
        if: always() # Run cleanup even when an earlier step fails | ||
        run: | | ||
          docker-compose -f docker-compose.yml down |
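The workflow above waits for the database with a fixed `sleep 10`. Polling until the database can actually be opened is one alternative. The following is a minimal sketch, assuming that "ready" simply means the SQLite file can be opened read-only and queried; the function name `wait_for_sqlite` and its parameters are hypothetical and do not appear in the commit.

```python
import sqlite3
import time
from contextlib import closing
from pathlib import Path


def wait_for_sqlite(db_path, timeout=30.0, interval=0.5):
    """Poll until the SQLite file at db_path can be opened read-only.

    Returns True once a trivial query succeeds, or False if the timeout
    expires first. Hypothetical helper, not part of the commit.
    """
    # mode=ro makes SQLite fail instead of silently creating a missing file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    deadline = time.monotonic() + timeout
    while True:
        try:
            with closing(sqlite3.connect(uri, uri=True)) as conn:
                conn.execute("SELECT count(*) FROM sqlite_master").fetchone()
            return True
        except sqlite3.OperationalError:
            # Missing or locked file: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A workflow step could call this with the path used by `setup_db.py` (`sql/sqlite_db/etl.db`) and fail the job when it returns False, instead of sleeping for a fixed time.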
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
__pycache__ | ||
etl.db | ||
.venv | ||
.pytest_cache |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
# Dockerfile | ||
|
||
# Use a base Python image | ||
FROM python:3.9-slim | ||
|
||
# Set the working directory | ||
WORKDIR /app | ||
|
||
# Copy the requirements file into the container | ||
COPY requirements.txt . | ||
|
||
# Install the dependencies | ||
RUN pip install --no-cache-dir -r requirements.txt | ||
|
||
# Copy the project files into the container | ||
COPY . . | ||
|
||
# Set environment variables (if necessary) | ||
ENV DB_PATH="/opt/airflow/sqlite_db/etl.db" | ||
|
||
# Command to run when the container starts | ||
CMD ["pytest", "tests/test_etl.py"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
version: '3.8' | ||
|
||
services: | ||
  sqlite_db: | ||
    build: ./sql # Path where your Dockerfile is located | ||
    container_name: sqlite_db | ||
    volumes: | ||
      - ./sql/sqlite_db:/opt/sqlite_db # Map the local folder to the container's folder | ||
    ports: | ||
      - "8081:8080" # Adjust if needed | ||
    networks: | ||
      - sqlite_network | ||
|
||
networks: | ||
  sqlite_network: | ||
    driver: bridge |
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
# For setting up the environment | ||
apache-airflow==2.5.0 # If using Airflow for orchestration | ||
pandas==1.5.3 # For handling data manipulation | ||
pytest==7.2.2 # For running tests | ||
openpyxl==3.0.10 # For reading and writing Excel files (e.g., orders_test_data.xlsx) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
FROM nouchka/sqlite3:latest | ||
|
||
# Set working directory to where the database will reside | ||
WORKDIR /opt/sqlite_db | ||
|
||
# Initialize or create the SQLite database | ||
RUN sqlite3 /opt/sqlite_db/etl.db "CREATE TABLE IF NOT EXISTS Orders (Order_ID INTEGER PRIMARY KEY AUTOINCREMENT, Product_Name TEXT, Quantity INTEGER);" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,102 @@ | ||
# Query to find duplicate (Customer_ID, Order_Date) pairs, i.e. a customer with more than one order on the same date | ||
def validate_customer_id_unique(): | ||
    return """ | ||
    SELECT Customer_ID, Order_Date, COUNT(*) AS Order_Count | ||
    FROM Orders | ||
    GROUP BY Customer_ID, Order_Date | ||
    HAVING COUNT(*) > 1 | ||
    """ | ||
|
||
# Query to Validate Correct Date Format (YYYY-MM-DD, digits only, real calendar date) | ||
def validate_order_date_format(): | ||
    return """ | ||
    SELECT Order_ID, Order_Date | ||
    FROM Orders | ||
    WHERE Order_Date IS NULL | ||
    OR NOT (Order_Date GLOB '[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]' | ||
    AND LENGTH(Order_Date) = 10 | ||
    AND CAST(substr(Order_Date, 1, 4) AS INTEGER) > 0 | ||
    AND substr(Order_Date, 6, 2) BETWEEN '01' AND '12' | ||
    AND CASE | ||
        WHEN substr(Order_Date, 6, 2) IN ('01', '03', '05', '07', '08', '10', '12') THEN substr(Order_Date, 9, 2) BETWEEN '01' AND '31' | ||
        WHEN substr(Order_Date, 6, 2) IN ('04', '06', '09', '11') THEN substr(Order_Date, 9, 2) BETWEEN '01' AND '30' | ||
        WHEN substr(Order_Date, 6, 2) = '02' THEN ( | ||
            CASE | ||
                WHEN (CAST(substr(Order_Date, 1, 4) AS INTEGER) % 4 = 0 | ||
                AND CAST(substr(Order_Date, 1, 4) AS INTEGER) % 100 != 0) | ||
                OR CAST(substr(Order_Date, 1, 4) AS INTEGER) % 400 = 0 THEN substr(Order_Date, 9, 2) BETWEEN '01' AND '29' | ||
                ELSE substr(Order_Date, 9, 2) BETWEEN '01' AND '28' | ||
            END | ||
        ) | ||
        ELSE 0 | ||
    END = 1 | ||
    ); | ||
    """ | ||
|
||
# Query to find orders with negative quantities | ||
def get_orders_with_negative_quantity(): | ||
    return """ | ||
    SELECT Order_ID, Customer_ID, Product_ID, Quantity | ||
    FROM Orders | ||
    WHERE Quantity < 0 | ||
    """ | ||
|
||
# Query to find orders with missing Customer_Name | ||
def get_orders_with_missing_customer_name(): | ||
    return """ | ||
    SELECT Order_ID, Customer_ID, Customer_Name, Product_ID, Quantity | ||
    FROM Orders | ||
    WHERE Customer_Name IS NULL | ||
    """ | ||
|
||
# Query to ensure unique Product_ID (no duplicates allowed in Orders) | ||
def get_orders_with_duplicate_product_id(): | ||
    return """ | ||
    SELECT Product_ID, COUNT(*) | ||
    FROM Orders | ||
    GROUP BY Product_ID | ||
    HAVING COUNT(*) > 1 | ||
    """ | ||
|
||
# Query to ensure Product_Name cannot be NULL in Products | ||
def get_orders_with_null_product_name(): | ||
    return """ | ||
    SELECT * | ||
    FROM Products | ||
    WHERE Product_Name IS NULL | ||
    """ | ||
|
||
# Query to find orders whose customer email is malformed | ||
def get_invalid_email_customers(): | ||
    """ | ||
    Query to find customers with invalid email format. | ||
    Returns rows where the email does not match the expected pattern (rows with a NULL Email are not returned, because NOT LIKE evaluates to NULL for them). | ||
    """ | ||
    query = """ | ||
    SELECT * | ||
    FROM Orders | ||
    WHERE Email NOT LIKE '%_@__%.__%'; | ||
    """ | ||
    return query | ||
|
||
def get_orders_with_invalid_date_range(): | ||
    """ | ||
    Query to find orders where the Order_Date is outside the range '2024-01-01' to '2024-12-31'. | ||
    """ | ||
    query = """ | ||
    SELECT * | ||
    FROM Orders | ||
    WHERE Order_Date < '2024-01-01' OR Order_Date > '2024-12-31'; | ||
    """ | ||
    return query | ||
|
||
def get_invalid_product_references(): | ||
    """ | ||
    Returns the SQL query to check for invalid Product_ID references in the Orders table. | ||
    """ | ||
    return """ | ||
    SELECT o.Order_ID, o.Product_ID | ||
    FROM Orders o | ||
    LEFT JOIN Products p ON o.Product_ID = p.Product_ID | ||
    WHERE p.Product_ID IS NULL; | ||
    """ |
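The query functions in the file above only return SQL text, so it may help to see how a few of them behave against real rows. The following is a minimal sketch, assuming the Orders and Products schema from `setup_db.py`; the query strings are copied from the diff, while the sample rows and the helpers `build_sample_db` and `run_check` are hypothetical and invented for illustration.

```python
import sqlite3

# Query text copied from the functions in the diff above.
NEGATIVE_QUANTITY = """
    SELECT Order_ID, Customer_ID, Product_ID, Quantity
    FROM Orders
    WHERE Quantity < 0
"""
INVALID_PRODUCT_REFERENCES = """
    SELECT o.Order_ID, o.Product_ID
    FROM Orders o
    LEFT JOIN Products p ON o.Product_ID = p.Product_ID
    WHERE p.Product_ID IS NULL;
"""
INVALID_EMAIL = """
    SELECT *
    FROM Orders
    WHERE Email NOT LIKE '%_@__%.__%';
"""

# Hypothetical sample rows, invented for this illustration.
SAMPLE_PRODUCTS = [(1, "Widget"), (2, "Gadget")]
SAMPLE_ORDERS = [
    (1, 10, "Alice", "2024-03-01", 1, 2, "alice@example.com"),      # clean row
    (2, 11, "Bob", "2024-03-02", 2, -1, "bob@example.com"),         # negative quantity
    (3, 12, "Carol", "2024-03-03", 99, 1, "carol-at-example.com"),  # unknown product, bad email
    (4, 13, "Dan", "2024-03-04", 1, 1, None),                       # NULL email
]


def build_sample_db():
    """Create an in-memory database with the schema used by setup_db.py."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Orders (Order_ID INTEGER PRIMARY KEY, Customer_ID INTEGER, "
        "Customer_Name TEXT, Order_Date TEXT, Product_ID INTEGER, "
        "Quantity INTEGER, Email TEXT)"
    )
    conn.execute(
        "CREATE TABLE Products (Product_ID INTEGER PRIMARY KEY, Product_Name TEXT)"
    )
    conn.executemany("INSERT INTO Products VALUES (?, ?)", SAMPLE_PRODUCTS)
    conn.executemany("INSERT INTO Orders VALUES (?, ?, ?, ?, ?, ?, ?)", SAMPLE_ORDERS)
    conn.commit()
    return conn


def run_check(conn, query):
    """Run one validation query and return the offending rows."""
    return conn.execute(query).fetchall()


conn = build_sample_db()
negative = run_check(conn, NEGATIVE_QUANTITY)
orphans = run_check(conn, INVALID_PRODUCT_REFERENCES)
bad_email_ids = [row[0] for row in run_check(conn, INVALID_EMAIL)]
conn.close()

print("negative quantity:", negative)
print("invalid product references:", orphans)
print("invalid email Order_IDs:", bad_email_ids)
```

With these rows, order 4 (NULL email) is not flagged by the email check, since `NOT LIKE` yields NULL rather than true for a NULL value. A test that needs to catch missing emails would have to check `Email IS NULL` separately.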
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,39 @@ | ||
import sqlite3 | ||
|
||
# Path to SQLite database | ||
DB_PATH = 'sql/sqlite_db/etl.db' | ||
|
||
# Establish a connection | ||
conn = sqlite3.connect(DB_PATH) | ||
cursor = conn.cursor() | ||
|
||
# Drop tables if they exist to ensure schema updates | ||
cursor.execute('DROP TABLE IF EXISTS Orders;') | ||
cursor.execute('DROP TABLE IF EXISTS Products;') | ||
|
||
# Create the Orders table with the updated schema (including Email column) | ||
cursor.execute(''' | ||
CREATE TABLE Orders ( | ||
Order_ID INTEGER PRIMARY KEY, | ||
Customer_ID INTEGER, | ||
Customer_Name TEXT, | ||
Order_Date TEXT, | ||
Product_ID INTEGER, | ||
Quantity INTEGER, | ||
Email TEXT | ||
); | ||
''') | ||
|
||
# Create the Products table | ||
cursor.execute(''' | ||
CREATE TABLE Products ( | ||
Product_ID INTEGER PRIMARY KEY, | ||
Product_Name TEXT | ||
); | ||
''') | ||
|
||
# Commit changes and close the connection | ||
conn.commit() | ||
conn.close() | ||
|
||
print("Database and tables set up successfully.") |
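The script above runs its statements at import time against a hard-coded path. For tests it can be convenient to wrap the same DDL in a function that takes the path as an argument, so a test can point it at a temporary or in-memory database. The following is a minimal sketch under that assumption; `setup_database` and `DDL_STATEMENTS` are hypothetical names that do not appear in the commit.

```python
import sqlite3
from contextlib import closing

# Same statements as the script above, collected in one place.
DDL_STATEMENTS = (
    "DROP TABLE IF EXISTS Orders;",
    "DROP TABLE IF EXISTS Products;",
    """
    CREATE TABLE Orders (
        Order_ID INTEGER PRIMARY KEY,
        Customer_ID INTEGER,
        Customer_Name TEXT,
        Order_Date TEXT,
        Product_ID INTEGER,
        Quantity INTEGER,
        Email TEXT
    );
    """,
    """
    CREATE TABLE Products (
        Product_ID INTEGER PRIMARY KEY,
        Product_Name TEXT
    );
    """,
)


def setup_database(db_path):
    """Recreate the Orders and Products tables at db_path.

    Returns the sorted list of table names found afterwards, so a caller
    can verify the result. Hypothetical helper, not part of the commit.
    """
    # closing() guarantees the connection is closed even if a statement fails.
    with closing(sqlite3.connect(db_path)) as conn:
        for statement in DDL_STATEMENTS:
            conn.execute(statement)
        conn.commit()
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


# Example: build the schema in a throwaway in-memory database.
tables = setup_database(":memory:")
print(tables)
```

Calling `setup_database("sql/sqlite_db/etl.db")` from the repository root would reproduce what the original script does, while a pytest fixture could pass a temporary path instead.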