-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Archive Rails database migrations #243
base: master
Are you sure you want to change the base?
Conversation
ef505aa
to
cbf5331
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this will be merged after #225 and the related changes?
We should link to some page explaining the idea, I think stackoverflow is the source; maybe also collectiveidea.com or naturaily.com?
Nit: I think it would be nicer to use the oldest existing migration ID (20110819224757) as the ID for load_schema, rather than the newest ID. (When we archive more migrations we'll keep the migration ID, so the only consistent approach is to pick the oldest.)
(Dare we go even further and assign a brand new ID? E.g. 00000000000000_load_schema.rb. Advantage is that it stands out and makes it easier to archive migrations -- mv db/migrate/202* db/archive
. Disadvantage is that we have to use some trick to mark the migration as "up" on prod without actually running it. If we do this, we should definitely add a check that aborts/skips the migration if the tables already exist -- should be enough to just check for one well-known table.)
I think we'll also need to go over a DB dump and look for any data created by migrations (and move it to db/seeds.rb). See Discord for a relevant comment.
I saw you had def change
earlier but replaced it with def up
, I assume the only issue is the CREATE INDEX. Do we want to make it reversible?
I'm happy to commit & push my suggestions.
# These are extensions that must be enabled in order to support this database | ||
enable_extension "plpgsql" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Should we delete these lines? (Just saying this because migrations don't usually call enable_extension
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure but I'd prefer to leave it unless it's causing issues. Note that the squasher gem also includes it.
create_table "contest_relations", force: :cascade do |t| | ||
t.integer "user_id" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: It would be nice if these lines were naturally indented the same as in schema.rb (so they can be copy pasted directly without re-indenting). How about this hack:
class LoadSchema < ActiveRecord::Migration
def up
# Load the schema from a snapshot of db/schema.rb (see below)
load_schema
# Statements that aren't supported by db/schema.rb
execute "CREATE UNIQUE INDEX index_users_on_username ON users (lower(username))"
end
def down
...
end
end
# Snapshot of db/schema.rb.
# We define this instance method outside of the class, so that its body will
# have the same indentation as the original code from db/schema.rb.
LoadSchema.send(:define_method, :load_schema) do
create_table "contest_relations", force: :cascade do |t|
...
end
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's a big deal; indenting is trivial, and you can use diff -b
to ignore whitespace changes (to verify that it matches). On the other hand, I do like that execute statement is visually separated from the rest of the schema. Feel free to change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Can we please pick a different directory structure? I know "archive" is suggested on stackoverflow, but I'd prefer e.g. db/archived_migrations (from another answer on same page).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I think it would be nicer to use the oldest existing migration ID (20110819224757) as the ID for load_schema, rather than the newest ID. (When we archive more migrations we'll keep the migration ID, so the only consistent approach is to pick the oldest.)
I think the newest ID is preferable, because:
- It indicates that the migration "contains" all the other archived migrations up to that particular ID.
- Most examples I've seen seem to use the newest ID.
- If anyone has an existing db setup that isn't fully migrated up to this version, running
db:migrate
will appear to succeed, but it won't perform this migration, which could cause confusion. This would have the same issue in the future if reuse we use a separate ID. If we use the newest ID, the migration will run for anyone that isn't up to date; note that this will cause existing tables to be dropped – I think this is acceptable, otherwise we could maybe add something like this (maybe also safer for prod):
if ActiveRecord::Migrator.current_version > 0
raise StandardError, "The database schema is out of date; running this migration will cause irreversible data loss!"
end
Note that it's trivial to rename whenever we want to archive more migrations, and I think git should preserve file history as long as the schema doesn't change too much.
I think we'll also need to go over a DB dump and look for any data created by migrations (and move it to db/seeds.rb). See Discord for a relevant comment.
Good point, I've checked and the only case was the system mailer settings. I'll create a separate PR for that; would like to squash this one.
I saw you had
def change
earlier but replaced it withdef up
, I assume the only issue is the CREATE INDEX. Do we want to make it reversible?
I don't think so; rollbacking this migration would cause all tables to be dropped, which probably isn't desirable (at least on prod). It already raised an error when attempting to rollback under def change
, but I changed to be explicit, just to be safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
create_table "contest_relations", force: :cascade do |t| | ||
t.integer "user_id" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's a big deal; indenting is trivial, and you can use diff -b
to ignore whitespace changes (to verify that it matches). On the other hand, I do like that execute statement is visually separated from the rest of the schema. Feel free to change it.
# These are extensions that must be enabled in order to support this database | ||
enable_extension "plpgsql" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure but I'd prefer to leave it unless it's causing issues. Note that the squasher gem also includes it.
Two questions:
|
The main goal is to resync schema.rb with the production database; it is currently desynced due to differences in limits that were discussed in #225. Archiving the migrations provides other benefits, but also makes this easier because we can (mostly) just insert the production schema into a new migration.
I think it's more convenient to have all the archived migrations in master, so they can be searched through/referenced easily. If we remove them and then later add more migrations, and then squash/remove them again, we'll end up in a state where you can't easily view the full migration history without checking out multiple commits. |
These are currently created during migrations, but we plan to remove this behaviour in #243. The values are left blank, as they need to be set manually anyway.
I think I'd prefer to fix the outdated migrations than have a mega-migration - also though missing migrations don't cause an error as such, it is a a bit weird. And fixing the older migrations to have limits is not a big job.
I think if it's more convenient to keep the migrations around, then why bother having the archive / rollup. I'm not very committed to this particular topic, but I don't really see the point in both having a rollup migration AND keeping the legacy migrations around. Very happy to be outvoted though :) |
It's documented in the rails guide, so I don't really have an issue with it. Fixing the older migrations isn't necessarily difficult but I think it's inconvenient to have to do it every time the defaults change.
I'm not really fussed about this; would be okay with deleting them, but the way I see it, it's somewhat similar to commenting code. Sure, you could just go through the git history, but it's more convenient to have it in an easily accessible location. Would also be fine with just keeping them in a separate branch. |
TIL it's documented behavior.. FWIW, they did fix this in Rails 5 - migrations have a version from Rails 5 onwards, and they apply the defaults from their version. https://www.bigbinary.com/blog/migrations-are-versioned-in-rails-5
Haha, interesting, I'm also in camp "just delete the code, don't comment it unless you know you want existing code with work with the commented code and it's coming back real soon". I'm not really that fussed either. Branch is fine with me. Keeping them is fine-enough, I don't care very deeply about this so happy for Tom and you to make a call. Just wanted to drop some thoughts. |
Good point, I wasn't aware of that. In that case, I don't mind if we just fix the older migrations. |
Oh, I misunderstood. I don't really like using the archive process to change the schema, it feels too sneaky. I'd prefer to explicitly add So if we want the final database to have
If we want the final database to not have
For the finer points, here are my preferences and reasoning:
P.S. Re missing migrations and "NO FILE", we can get rid of those lines using a fairly simple hack (manually remove those versions from the schema_migrations table). But I think it's more "truthful" to keep those lines. |
I'm okay with it because the schema/migrations already "changed" inadvertently when we updated from Rails 4.1 to 4.2, so this is really just changing it back to what it was before, to match production. Anyway, I basically agree with the rest of your points, except that I have no particular preference on whether we should remove the limits or not. It would be nice to be follow the defaults and be consistent, but if migrations are versioned from Rails 5+ then that's basically the opposite (unless we do a similar thing every time the defaults change, which seems like a pain). In either case, I think we should explicitly add |
If we do use the archived migration to make the change, then I'd like it to be done as two separate and well-documented commits (happy to do that myself): 1. Modify db/schema.rb to match prod, then 2. Archive db/schema.rb. Re changing defaults and the future, I feel versioned migrations give us the option to either keep the old behaviour or manually switch to the new default (we can decide on a case-by-case basis depending on effort and so on). Since it looks like nobody has a strong preference re |
Sure. To clarify, I'm also fine with using your Feel free to just create a separate PR to remove the limits afterwards; I don't have any objections.
Also, I don't think there's much point in doing this; if we want to clean up the migrations we may as well just archive everything. |
So can I propose the following:
(If we do 2, then I prefer to do it before 3, because it's silly for the schema snapshot to be littered with |
Sounds good to me. |
This modifies schema.rb to match the production database, and merges all existing migrations into a single migration.
The new migration contains the contents of schema.rb, and also this execute statement which isn't supported by the current schema.rb.
See c979d38 for a comparison of SQL schema dumps (the "migrate" dumps were taken after
db:migrate:reset
, while the "schema" dumps were taken afterdb:schema:load
). Note that there are still a few differences between the production DB and this branch; those could be addressed separately.