Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

udpated sql for table comments #27

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

udpated sql for table comments #27

wants to merge 1 commit into from

Conversation

gregboyer
Copy link
Member

@gregboyer gregboyer commented Jul 22, 2017

I still suck at git. Gotta git add before I git commit! This has additional comments for all SBA related tables and columns

@gregboyer gregboyer requested a review from VincentLa14 July 22, 2017 22:36
"\n",
"\n",
";\n",
"--SQL Alchemy throws an error unless rows are returned. This works around the issue\n",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! Yes, using pd.read_sql is not really the right way to execute general SQL statements. The point of pd.read_sql is to return a DataFrame object, so generally is used with select SQL statements. To execute SQL in sqlalchemy.

To execute SQL in sqlalchemy you can use connection.execute. See the docs for an example: http://docs.sqlalchemy.org/en/latest/core/connections.html.

In any case it's better that you added the comment stuff in the pipeline code. That way everything touching the database is in one place (and it's easy to recreate the state of the database at any point in time.

"source": [
"Get table comments. \n",
"QUERY = \"\"\"\n",
"COMMENT ON COLUMN data_ingest.sba__foia_504_1991_present.program IS 'Indicator of whether loan was approved under SBA''s 7(a) or 504 loan program'\n",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind removing these COMMENT ON statements now that you added the code in pipeline_runner?

@VincentLa14
Copy link
Member

VincentLa14 commented Jul 23, 2017

Would you mind making pipeline/pipeline_tasks/documentation/table_comments into a sql file? You can just rename it to table_comments.sql. This way we can just execute the SQL commands inside the file.

Also do you mind adding the table_comments.sql file to run within https://github.com/sfbrigade/datasci-sba/blob/master/pipeline/pipeline_runner.py.

You'll likely just want to add it as another element in the list: https://github.com/sfbrigade/datasci-sba/blob/master/pipeline/pipeline_runner.py#L100

@VincentLa14
Copy link
Member

hmm might need to be careful with merge conflicts now that @zlatankr has renamed pipeline/pipeline_tasks/documentation/table_comments

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants