Commit

Merge pull request #428 from stellar/master
[PRODUCTION] Update production Airflow environment
harsha-stellar-data authored Jul 12, 2024
2 parents cae5ca8 + 1475901 commit 7fce5eb
Showing 9 changed files with 441 additions and 390 deletions.
6 changes: 6 additions & 0 deletions README.md
@@ -31,6 +31,7 @@ This repository contains the Airflow DAGs for the [Stellar ETL](https://github.c
- [build_time_task](#build_time_task)
- [build_export_task](#build_export_task)
- [build_gcs_to_bq_task](#build_gcs_to_bq_task)
- [build_del_ins_from_gcs_to_bq_task](#build_del_ins_from_gcs_to_bq_task)
- [build_apply_gcs_changes_to_bq_task](#build_apply_gcs_changes_to_bq_task)
- [build_batch_stats](#build_batch_stats)
- [bq_insert_job_task](#bq_insert_job_task)
@@ -543,6 +544,7 @@ This section contains information about the Airflow setup. It includes our DAG d
- [build_export_task](#build_export_task)
- [build_gcs_to_bq_task](#build_gcs_to_bq_task)
- [build_apply_gcs_changes_to_bq_task](#build_apply_gcs_changes_to_bq_task)
- [build_del_ins_from_gcs_to_bq_task](#build_del_ins_from_gcs_to_bq_task)
- [build_batch_stats](#build_batch_stats)
- [bq_insert_job_task](#bq_insert_job_task)
- [cross_dependency_task](#cross_dependency_task)
@@ -669,6 +671,10 @@ This section contains information about the Airflow setup. It includes our DAG d

[This file](https://github.com/stellar/stellar-etl-airflow/blob/master/dags/stellar_etl_airflow/build_gcs_to_bq_task.py) contains methods for creating tasks that append information from a Google Cloud Storage file to a BigQuery table. These tasks will create a new table if one does not exist. These tasks are used for history archive data structures, as Stellar wants to keep a complete record of the ledger's entire history.

### **build_del_ins_from_gcs_to_bq_task**

[This file](https://github.com/stellar/stellar-etl-airflow/blob/master/dags/stellar_etl_airflow/build_del_ins_from_gcs_to_bq_task.py) contains methods for creating tasks that delete data from a specified BigQuery table according to the batch interval and then import data from GCS into the corresponding BigQuery table. These tasks will create a new table if one does not exist. These tasks are used for history and state data structures, as Stellar wants to keep a complete record of the ledger's entire history.
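
The delete-then-insert pattern described above can be illustrated with a small in-memory sketch. This is not the code in `build_del_ins_from_gcs_to_bq_task.py`: the function name `delete_then_insert`, the `closed_at` timestamp key, and the sample rows are all hypothetical, and plain Python lists stand in for the BigQuery table and the GCS file. One likely benefit of the pattern (an assumption here, not something the README states) is that re-running a batch does not duplicate its rows.

```python
from datetime import datetime


def delete_then_insert(table_rows, batch_rows, batch_start, batch_end, ts_key="closed_at"):
    """Hypothetical stand-in for the delete + load steps.

    1. Drop every existing row whose timestamp falls inside the batch interval
       [batch_start, batch_end).
    2. Append the rows of the batch (the stand-in for the GCS file).
    """
    kept = [row for row in table_rows if not (batch_start <= row[ts_key] < batch_end)]
    return kept + list(batch_rows)


# Hypothetical batch interval and data.
start = datetime(2024, 7, 12, 0, 0)
end = datetime(2024, 7, 12, 0, 30)
table = [{"id": 1, "closed_at": datetime(2024, 7, 11, 23, 50)}]
batch = [{"id": 2, "closed_at": datetime(2024, 7, 12, 0, 10)}]

first_run = delete_then_insert(table, batch, start, end)
# Re-running the same batch replaces its rows instead of appending them again.
second_run = delete_then_insert(first_run, batch, start, end)
```

In this sketch a plain append would leave two copies of row `2` after the second run, while deleting the interval first leaves the table unchanged by the rerun.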

### **build_apply_gcs_changes_to_bq_task**

[This file](https://github.com/stellar/stellar-etl-airflow/blob/master/dags/stellar_etl_airflow/build_apply_gcs_changes_to_bq_task.py) contains methods for creating apply tasks. Apply tasks are used to merge a file from Google Cloud Storage into a BigQuery table. Unlike the tasks that append, apply tasks apply changes: they update, delete, and insert rows. These tasks are used for accounts, offers, and trustlines, as the BigQuery table represents the point-in-time state of these data structures. This means that, for example, a merge task could alter the account balance field in the table if a user performed a transaction, delete a row in the table if a user deleted their account, or add a new row if a new account was created.
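
The update, delete, and insert behavior of an apply task can be sketched in plain Python. This is only an illustration of merge semantics, not the repository's implementation: the `apply_changes` helper, the `account_id` key, and the `deleted` flag are hypothetical names chosen for the example.

```python
def apply_changes(table, changes, key="account_id"):
    """Merge a list of change rows into a table keyed by `key`.

    A change marked with a truthy "deleted" flag removes the row; any other
    change replaces the existing row or inserts a new one.
    """
    merged = {row[key]: row for row in table}
    for change in changes:
        if change.get("deleted"):
            merged.pop(change[key], None)  # account removed
        else:
            # Update an existing row or insert a new one.
            merged[change[key]] = {k: v for k, v in change.items() if k != "deleted"}
    return list(merged.values())


# Hypothetical point-in-time state and incoming changes.
table = [
    {"account_id": "A", "balance": 100},
    {"account_id": "B", "balance": 50},
]
changes = [
    {"account_id": "A", "balance": 80},    # balance changed after a transaction
    {"account_id": "B", "deleted": True},  # account deleted
    {"account_id": "C", "balance": 10},    # new account created
]
result = apply_changes(table, changes)
```

After the merge, `result` holds the updated row for `A` and the new row for `C`, and the row for `B` is gone.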
2 changes: 2 additions & 0 deletions airflow_variables_dev.json
@@ -340,6 +340,7 @@
"asset_stats": 720,
"build_batch_stats": 840,
"build_bq_insert_job": 1080,
"build_del_ins_from_gcs_to_bq_task": 2000,
"build_delete_data_task": 1020,
"build_export_task": 840,
"build_gcs_to_bq_task": 960,
@@ -373,6 +374,7 @@
"build_bq_insert_job": 180,
"build_copy_table": 180,
"build_dbt_task": 960,
"build_del_ins_from_gcs_to_bq_task": 400,
"build_delete_data_task": 180,
"build_export_task": 420,
"build_gcs_to_bq_task": 300,
2 changes: 2 additions & 0 deletions airflow_variables_prod.json
@@ -338,6 +338,7 @@
"asset_stats": 420,
"build_batch_stats": 600,
"build_bq_insert_job": 840,
"build_del_ins_from_gcs_to_bq_task": 2000,
"build_delete_data_task": 780,
"build_export_task": 600,
"build_gcs_to_bq_task": 660,
@@ -371,6 +372,7 @@
"build_bq_insert_job": 180,
"build_copy_table": 180,
"build_dbt_task": 1800,
"build_del_ins_from_gcs_to_bq_task": 400,
"build_delete_data_task": 180,
"build_export_task": 300,
"build_gcs_to_bq_task": 300,