Archive Rails database migrations #243

Holmes98 · 2024-01-05T04:53:02Z

This modifies schema.rb to match the production database, and merges all existing migrations into a single migration.
The new migration contains the contents of schema.rb, and also this execute statement which isn't supported by the current schema.rb.

See c979d38 for a comparison of SQL schema dumps (the "migrate" dumps were taken after db:migrate:reset, while the "schema" dumps were taken after db:schema:load). Note that there are still a few differences between the production DB and this branch; those could be addressed separately.

coveralls · 2024-01-05T11:55:31Z

coverage: 37.131%. remained the same
when pulling 95957e4 on archive-migrations
into 57b299d on master.

tom93

I assume this will be merged after #225 and the related changes?

We should link to some page explaining the idea, I think stackoverflow is the source; maybe also collectiveidea.com or naturaily.com?

Nit: I think it would be nicer to use the oldest existing migration ID (20110819224757) as the ID for load_schema, rather than the newest ID. (When we archive more migrations we'll keep the migration ID, so the only consistent approach is to pick the oldest.)

(Dare we go even further and assign a brand new ID? E.g. 00000000000000_load_schema.rb. Advantage is that it stands out and makes it easier to archive migrations -- mv db/migrate/202* db/archive. Disadvantage is that we have to use some trick to mark the migration as "up" on prod without actually running it. If we do this, we should definitely add a check that aborts/skips the migration if the tables already exist -- should be enough to just check for one well-known table.)
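A minimal sketch of the "skip if the tables already exist" check, written in plain Ruby so it runs without Rails. The helper names, the sentinel table `users`, and the table lists are assumptions for illustration only; inside a real migration the list of tables would presumably come from the database connection (e.g. `connection.tables`).

```ruby
# Hypothetical guard for load_schema: skip the migration when one well-known
# table is already present, so an existing (e.g. production) database is
# never re-created. The sentinel table name is an assumption.
SENTINEL_TABLE = "users"

def schema_already_loaded?(existing_tables, sentinel = SENTINEL_TABLE)
  existing_tables.include?(sentinel)
end

def run_load_schema(existing_tables)
  if schema_already_loaded?(existing_tables)
    :skipped # tables exist: the migration would only be marked as "up"
  else
    :loaded  # fresh database: create all tables from the snapshot
  end
end

puts run_load_schema([])                             # fresh database
puts run_load_schema(["users", "contest_relations"]) # existing database
```

Checking a single table keeps the guard cheap; it does not verify that the rest of the schema matches.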

I think we'll also need to go over a DB dump and look for any data created by migrations (and move it to db/seeds.rb). See Discord for a relevant comment.

I saw you had def change earlier but replaced it with def up, I assume the only issue is the CREATE INDEX. Do we want to make it reversible?

I'm happy to commit & push my suggestions.

tom93 · 2024-01-05T13:40:22Z

db/migrate/20200418113601_load_schema.rb

+    # These are extensions that must be enabled in order to support this database
+    enable_extension "plpgsql"
+


Nit: Should we delete these lines? (Just saying this because migrations don't usually call enable_extension)


tom93 · 2024-01-05T14:31:49Z

db/migrate/20200418113601_load_schema.rb

+    create_table "contest_relations", force: :cascade do |t|
+      t.integer  "user_id"


Nit: It would be nice if these lines were naturally indented the same as in schema.rb (so they can be copy pasted directly without re-indenting). How about this hack:

class LoadSchema < ActiveRecord::Migration
  def up
    # Load the schema from a snapshot of db/schema.rb (see below)
    load_schema

    # Statements that aren't supported by db/schema.rb
    execute "CREATE UNIQUE INDEX index_users_on_username ON users (lower(username))"
  end

  def down
    ...
  end
end

# Snapshot of db/schema.rb.
# We define this instance method outside of the class, so that its body will
# have the same indentation as the original code from db/schema.rb.
LoadSchema.send(:define_method, :load_schema) do
  create_table "contest_relations", force: :cascade do |t|
    ...
  end
end


tom93 · 2024-01-05T14:41:52Z

db/archive/20110819224757_devise_create_users.rb

Nit: Can we please pick a different directory structure? I know "archive" is suggested on stackoverflow, but I'd prefer e.g. db/archived_migrations (from another answer on same page).

Holmes98

Nit: I think it would be nicer to use the oldest existing migration ID (20110819224757) as the ID for load_schema, rather than the newest ID. (When we archive more migrations we'll keep the migration ID, so the only consistent approach is to pick the oldest.)

I think the newest ID is preferable, because:

It indicates that the migration "contains" all the other archived migrations up to that particular ID.
Most examples I've seen seem to use the newest ID.
If we use the oldest ID and anyone has an existing db setup that isn't fully migrated up to this version, running db:migrate will appear to succeed, but it won't perform this migration, which could cause confusion. The same issue would arise in the future if we reuse a separate ID. If we use the newest ID, the migration will run for anyone that isn't up to date; note that this will cause existing tables to be dropped. I think this is acceptable; otherwise we could maybe add something like this (maybe also safer for prod):

if ActiveRecord::Migrator.current_version > 0
  raise StandardError, "The database schema is out of date; running this migration will cause irreversible data loss!"
end

Note that it's trivial to rename whenever we want to archive more migrations, and I think git should preserve file history as long as the schema doesn't change too much.
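To make the oldest-vs-newest argument concrete, here is a simplified model in plain Ruby of how db:migrate decides what to run: a migration file is pending when its version is not yet recorded in schema_migrations. The helper name and the intermediate version `20150101000000` are made up for illustration; only the oldest and newest IDs come from this thread.

```ruby
# Simplified model of db:migrate: a migration file runs when its version is
# not yet recorded in schema_migrations. All version lists are illustrative.
OLDEST = "20110819224757"
NEWEST = "20200418113601"

def pending_versions(file_versions, applied_versions)
  (file_versions - applied_versions).sort
end

# A developer database that stopped part-way through the old migrations.
# "20150101000000" is a made-up intermediate version.
partly_migrated = [OLDEST, "20150101000000"]

# load_schema reuses the oldest ID: nothing is pending, so db:migrate
# reports success while the database stays out of date.
p pending_versions([OLDEST], partly_migrated)

# load_schema uses the newest ID: it is pending and will run.
p pending_versions([NEWEST], partly_migrated)

# A fully migrated database (e.g. prod) already has the newest ID recorded,
# so load_schema is not run there.
p pending_versions([NEWEST], partly_migrated + [NEWEST])
```

This ignores everything else db:migrate does; it only shows which versions would be considered pending.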

I think we'll also need to go over a DB dump and look for any data created by migrations (and move it to db/seeds.rb). See Discord for a relevant comment.

Good point, I've checked and the only case was the system mailer settings. I'll create a separate PR for that; would like to squash this one.

I saw you had def change earlier but replaced it with def up, I assume the only issue is the CREATE INDEX. Do we want to make it reversible?

I don't think so; rolling back this migration would cause all tables to be dropped, which probably isn't desirable (at least on prod). It already raised an error when attempting to roll back under def change, but I changed it to def up to be explicit, just to be safe.

Holmes98 · 2024-01-05T22:57:42Z

db/archive/20110819224757_devise_create_users.rb

Holmes98 · 2024-01-05T22:58:07Z

db/migrate/20200418113601_load_schema.rb

+    create_table "contest_relations", force: :cascade do |t|
+      t.integer  "user_id"


I don't think it's a big deal; indenting is trivial, and you can use diff -b to ignore whitespace changes (to verify that it matches). On the other hand, I do like that execute statement is visually separated from the rest of the schema. Feel free to change it.
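For reference, a rough plain-Ruby equivalent of the `diff -b` comparison mentioned above: any run of whitespace is treated as a single space and trailing whitespace is ignored. The helper names and the sample snippets are assumptions; in practice the inputs would be the body of db/schema.rb and the body of the load_schema migration.

```ruby
# Rough equivalent of `diff -b`: treat any run of whitespace as one space and
# ignore trailing whitespace, then compare line by line.
def normalize(line)
  line.rstrip.gsub(/\s+/, " ")
end

def same_ignoring_whitespace?(text_a, text_b)
  text_a.lines.map { |l| normalize(l) } == text_b.lines.map { |l| normalize(l) }
end

# Two-space indentation, as in db/schema.rb (sample content is illustrative).
schema_snippet = [
  '  create_table "contest_relations", force: :cascade do |t|',
  '    t.integer  "user_id"',
  '  end',
].join("\n")

# The same statements indented two levels deeper, as inside `def up`.
migration_snippet = [
  '    create_table "contest_relations", force: :cascade do |t|',
  '      t.integer  "user_id"',
  '    end',
].join("\n")

puts same_ignoring_whitespace?(schema_snippet, migration_snippet)
```

Like `diff -b`, this still reports a difference when one side has no leading whitespace at all and the other does.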

Holmes98 · 2024-01-05T22:58:28Z

db/migrate/20200418113601_load_schema.rb

+    # These are extensions that must be enabled in order to support this database
+    enable_extension "plpgsql"
+


Not sure but I'd prefer to leave it unless it's causing issues. Note that the squasher gem also includes it.

bagedevimo · 2024-01-06T00:43:53Z

Two questions:

Not that I'm against this, but what's the goal? It would be good to have a "why" in the commit message.
Why keep the archived migrations? They're in Git? Not sure we'll ever need the files?

Holmes98 · 2024-01-06T01:21:18Z

The main goal is to resync schema.rb with the production database; it is currently desynced due to differences in limits that were discussed in #225. Archiving the migrations provides other benefits, but also makes this easier because we can (mostly) just insert the production schema into a new migration.

Why keep the archived migrations? They're in Git? Not sure we'll ever need the files?

I think it's more convenient to have all the archived migrations in master, so they can be searched through/referenced easily. If we remove them and then later add more migrations, and then squash/remove them again, we'll end up in a state where you can't easily view the full migration history without checking out multiple commits.


bagedevimo · 2024-01-06T02:00:46Z

The main goal is to resync schema.rb with the production database; it is currently desynced due to differences in limits that were discussed in #225. Archiving the migrations provides other benefits, but also makes this easier because we can (mostly) just insert the production schema into a new migration.

I think I'd prefer to fix the outdated migrations than have a mega-migration. Also, though missing migrations don't cause an error as such, it is a bit weird. And fixing the older migrations to have limits is not a big job.

I think it's more convenient to have all the archived migrations in master, so they can be searched through/referenced easily. If we remove them and then later add more migrations, and then squash/remove them again, we'll end up in a state where you can't easily view the full migration history without checking out multiple commits.

I think if it's more convenient to keep the migrations around, then why bother having the archive / rollup.

I'm not very committed to this particular topic, but I don't really see the point in both having a rollup migration AND keeping the legacy migrations around. Very happy to be outvoted though :)

Holmes98 · 2024-01-06T02:26:25Z

I think I'd prefer to fix the outdated migrations than have a mega-migration. Also, though missing migrations don't cause an error as such, it is a bit weird. And fixing the older migrations to have limits is not a big job.

It's documented in the rails guide, so I don't really have an issue with it. Fixing the older migrations isn't necessarily difficult but I think it's inconvenient to have to do it every time the defaults change.

I think if it's more convenient to keep the migrations around, then why bother having the archive / rollup.

I'm not really fussed about this; would be okay with deleting them, but the way I see it, it's somewhat similar to commenting code. Sure, you could just go through the git history, but it's more convenient to have it in an easily accessible location. Would also be fine with just keeping them in a separate branch.

bagedevimo · 2024-01-06T02:30:49Z

It's documented in the rails guide, so I don't really have an issue with it. Fixing the older migrations isn't necessarily difficult but I think it's inconvenient to have to do it every time the defaults change.

TIL it's documented behavior.. FWIW, they did fix this in Rails 5 - migrations have a version from Rails 5 onwards, and they apply the defaults from their version. https://www.bigbinary.com/blog/migrations-are-versioned-in-rails-5

I'm not really fussed about this; would be okay with deleting them, but the way I see it, it's somewhat similar to commenting code. Sure, you could just go through the git history, but it's more convenient to have it in an easily accessible location. Would also be fine with just keeping them in a separate branch.

Haha, interesting, I'm also in camp "just delete the code, don't comment it unless you know you want existing code to work with the commented code and it's coming back real soon". I'm not really that fussed either. Branch is fine with me. Keeping them is fine enough; I don't care very deeply about this, so I'm happy for Tom and you to make a call. Just wanted to drop some thoughts.

Holmes98 · 2024-01-06T02:38:16Z

FWIW, they did fix this in Rails 5 - migrations have a version from Rails 5 onwards, and they apply the defaults from their version.

Good point, I wasn't aware of that. In that case, I don't mind if we just fix the older migrations.

tom93 · 2024-01-07T05:18:35Z

The main goal is to resync schema.rb with the production database

Oh, I misunderstood. I don't really like using the archive process to change the schema, it feels too sneaky. I'd prefer to explicitly add limit: 255 to the old migrations (the implementation work is done -- branch schema-limit-255).

So if we want the final database to have limit: 255, my preferred process would be:

Explicitly add limit: 255 to the old migrations
(Optional) Archive the old migrations

If we want the final database to not have limit: 255 (which is my preference), the process would be:

Make sure we are still enforcing sensible limits in the model validators (we don't want 1,000-character usernames)
(Optional) Explicitly add limit: 255 to the old migrations
Add a new migration that changes the column types to remove the limit
~~(Optional) If we want, we can later revert commits 1 and/or 2 (they are not required after prod is migrated)~~
(Optional) Archive the old migrations

For the finer points, here are my preferences and reasoning:

[weak] Remove limit: 255, for the same reasons Rails removed the limits and for consistency with new columns
[weak] Archive old migrations (using the load_schema trick), because maintaining old migrations can be a pain. Rails' versioned migrations should make this easier, but we also had other issues (e.g. because migrations used models) and the solutions are awkward.
Keep the archived migrations in a directory (don't delete them), because they may be useful (e.g. as examples to draw upon) and having to look through the git history makes it harder.

P.S. Re missing migrations and "NO FILE", we can get rid of those lines using a fairly simple hack (manually remove those versions from the schema_migrations table). But I think it's more "truthful" to keep those lines.
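As an illustration of where the "NO FILE" lines come from, here is a simplified plain-Ruby model of the db:migrate:status output: a version that is recorded in schema_migrations but has no file in db/migrate is listed as "NO FILE". The helper name, the migration name, and the sample data are assumptions; the real task reads the table and the directory itself.

```ruby
# Simplified model of db:migrate:status. A version recorded in
# schema_migrations without a matching file is listed as "NO FILE".
def migration_status(applied_versions, files)
  versions = (applied_versions + files.keys).uniq.sort
  versions.map do |version|
    status = applied_versions.include?(version) ? "up" : "down"
    name = files.fetch(version, "********** NO FILE **********")
    [status, version, name]
  end
end

# After archiving, only load_schema remains in db/migrate, while prod still
# has the archived versions recorded as applied.
files = { "20200418113601" => "Load schema" }
applied = ["20110819224757", "20200418113601"]

migration_status(applied, files).each { |row| puts row.join("  ") }

# The hack mentioned above: remove the archived versions from
# schema_migrations so that only versions with a file remain.
cleaned = applied & files.keys
migration_status(cleaned, files).each { |row| puts row.join("  ") }
```

Keeping the rows, as suggested above, preserves the record that those versions were once applied.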

Holmes98 · 2024-01-07T05:43:24Z

Oh, I misunderstood. I don't really like using the archive process to change the schema, it feels too sneaky.

I'm okay with it because the schema/migrations already "changed" inadvertently when we updated from Rails 4.1 to 4.2, so this is really just changing it back to what it was before, to match production.

Anyway, I basically agree with the rest of your points, except that I have no particular preference on whether we should remove the limits or not. It would be nice to follow the defaults and be consistent, but if migrations are versioned from Rails 5+ then that's basically the opposite (unless we do a similar thing every time the defaults change, which seems like a pain).

In either case, I think we should explicitly add limit: 255 to the current schema first, since that's what's currently on production.

tom93 · 2024-01-07T08:45:27Z

If we do use the archived migration to make the change, then I'd like it to be done as two separate and well-documented commits (happy to do that myself): 1. Modify db/schema.rb to match prod, then 2. Archive db/schema.rb.

Re changing defaults and the future, I feel versioned migrations give us the option to either keep the old behaviour or manually switch to the new default (we can decide on a case-by-case basis depending on effort and so on).

Since it looks like nobody has a strong preference re limit: 255, shall we just do a vote?
👍 - restore limit: 255 (Rails <= 4.1 behaviour)
👎 - remove limit: 255 (Rails >= 4.2 behaviour)
(Or I could first put together a draft PR to see what a migration to remove limit: 255 from prod might look like)

Holmes98 · 2024-01-07T08:57:25Z

If we do use the archived migration to make the change, then I'd like it to be done as two separate and well-documented commits (happy to do that myself): 1. Modify db/schema.rb to match prod, then 2. Archive db/schema.rb.

Sure. To clarify, I'm also fine with using your schema-limit-255 branch if you prefer that.

Feel free to just create a separate PR to remove the limits afterwards; I don't have any objections.

(Optional) If we want, we can later revert commits 1 and/or 2 (they are not required after prod is migrated)

Also, I don't think there's much point in doing this; if we want to clean up the migrations we may as well just archive everything.

tom93 · 2024-01-07T09:05:47Z

So can I propose the following:

Explicitly add limit: 255 to the old migrations (branch schema-limit-255)
[subject to vote] Add a new migration that changes the column types to remove the limit
Archive the old migrations

(If we do 2, then I prefer to do it before 3, because it's silly for the schema snapshot to be littered with limit: 255 only to have the next migration remove them.)

Holmes98 · 2024-01-07T09:07:21Z

Sounds good to me.

Archive Rails database migrations

cbf5331

Holmes98 force-pushed the archive-migrations branch from ef505aa to cbf5331 on January 5, 2024 11:52

Disable lint on migration files

cc47c61

tom93 reviewed Jan 5, 2024


Holmes98 commented Jan 5, 2024


Merge branch 'master' into archive-migrations

95957e4

Holmes98 added a commit that referenced this pull request Jan 6, 2024

Add system/mailer settings to db/seeds.rb

6d2bd1b

These are currently created during migrations, but we plan to remove this behaviour in #243. The values are left blank, as they need to be set manually anyway.

Holmes98 mentioned this pull request Jan 6, 2024

Add system/mailer settings to db/seeds.rb #254

Open
