Skip to content

Uploading

Scott Kirkland edited this page Mar 24, 2021 · 6 revisions

Uploading Results

Once we have processed CSV files, we would like to upload the CSV data to our production database (PostgreSQL).

Assumption: Each CSV file is for a given county and year. ex: Lassen_2020.csv.

The following python3 script will extract the county & year from the given csv file and check our destination database to ensure that matching data has not already been uploaded. It will then proceed to use the psycopg2 library to call the pg COPY command to efficiently transfer data. Finally, it will copy the provided file to a folder indicating the process is complete.

import psycopg2
import sys
import shutil

if len(sys.argv) != 2:
    print('Useage: python script.py processed-file.csv')
    sys.exit()

path = sys.argv[1]

# first, let's read our file to determine which year and county we are dealing with
f = open(path, 'r')
# throw away header
f.readline()

row1 = f.readline()
cols = row1.split(',')

year = cols[2]
county = cols[13]

f.close()

#connect to the database
conn = psycopg2.connect(host='169.237.124.86',
                       dbname='cecdss',
                       user='cecdss',
                       password='pass',
                       port='5432')  
#create a cursor object 
#cursor object is used to interact with the database
cur = conn.cursor()

# now let's check that we don't already have a matching record in the db
existance_sql = "select year, county_name from treatedclustersfarm where year=%s and county_name=%s limit 1;"

cur.execute(existance_sql, (year,county))
results = cur.fetchone()

if results:
    print(f'There is already data present for {county} {year}. Exiting without making changes.')
else:
    print(f'Beginning data copy for {county} {year}.')
    copy_sql = "COPY treatedclustersfarm FROM STDIN DELIMITER ',' CSV HEADER"

    with open(path, 'r') as f:
        cur.copy_expert(copy_sql, f)
        conn.commit()
    print('Copy complete')
    
cur.close()
conn.close()

print('Data upload completed, moving file to completed directory')

shutil.move(path, f'/home/postit/pixels/uploaded/{county}-{year}.csv')

print('Moving complete, exiting')

Clone this wiki locally