Hacker Newsnew | past | comments | ask | show | jobs | submit | michaeldejong's commentslogin

Very cool! Congratulations to the authors on the release! I'm the author of a similar (zero-downtime migration) tool for PG called QuantumDB[0]. It was the first (to my knowledge at least) tool to support foreign keys, by creating table copies (keeping those in sync using triggers), and exposing multiple schemas through a custom database driver. I never got to production-ready version unfortunately, but I'm happy this one did. I'm seeing a lot of familiar concepts, and it looks well thought out.

[0] https://github.com/quantumdb/quantumdb


Actually both lock for many (crucial) schema operators, and often severely enough to block your application from reading from the table(s) under change. I've been researching this stuff for a while. Check out http://blog.minicom.nl/blog/2015/04/03/revisiting-profiling-... . It's slightly outdated, but still holds.


I felt the same way, so I've been working on QuantumDB for the last couple of years. Take a look at https://quantumdb.io . QuantumDB doesn't use the binlog / WAL log like gh-ost does, but it does support foreign key constraints, and it allows you to perform several schema operations in one go without having to deal with the intermediates. It's still not ready for production, but feel free to try it out. Feedback is welcome!


Researcher/author of a tool [0,1] also attempting to tackle this problem here.

Unfortunately zero-downtime schema changes are even more complex than suggested here. Although the expand-contract method as described in the post is a good approach to tackling this problem, the mere act of altering a database table that is in active use is a dangerous one. I've already found that some trivial operations such as adding a new column to an existing table can block database clients from reading from that table through full table locks for the duration of the schema operation [2].

In many cases it's safer to create a new table, copy data over from the old table to the new table, and switch clients over. However this introduces a whole new set of problems: keeping data in sync between tables, "fixing" foreign key constraints, etc.

If there are others researching/building tooling for this problem, I'd love to hear from you.

[0] http://github.com/quantumdb/quantumdb

[1] https://speakerdeck.com/michaeldejong/icse-17-zero-downtime-...

[2] http://blog.minicom.nl/blog/2015/04/03/revisiting-profiling-...


I agree that this is a very complicated problem that played a roled in the rise of NoSQL. Given it isn't the only reason for the rise of NoSQL but it contributed. This problem is worse when development teams and DBAs database do not report to the same person. Good luck convincing a DBA about your change. Disclaimer, I have the utmost respect for DBAs.


I agree with that. If your database doesn't care about a "schema", there usually are no operations available which change the schema of your data. Unfortunately that also means that if your database doesn't consider the schema its problem, when you do finally want to change the implicit schema of your data, it becomes the dev's problem.


Make the changes on a non-active replica database (assuming your database supports replication), then promote the replica to be primary. If you don’t have replicas, the concept of “zero downtime” is really at risk as your primary database becomes a single point of failure.


That's not great either. Moving the "master" role from one server to a replica isn't instantaneous (meaning you have a period of read-only state), and while your replica is performing the schema operation you still need to keep the data in sync with changes made on the master server (which to the best of my knowledge is non-trivial).

There are tools that "replay" data changes though: - FB's OSC (https://www.facebook.com/notes/mysql-at-facebook/online-sche...) - GH's gh-ost (https://github.com/github/gh-ost)


This is a more or less solved problem in MySQL with pt-online-schema-change [1] which accomplishes “create a new table ...” with the “copying data over” done for you using triggers and “switch clients over” done using a table rename, all within a tool with the longest full table lock held during the rename step. Some of this appears to be available in MySQL itself as of 5.6, but regardless all of this is fraught with edge cases.

[1] https://www.percona.com/doc/percona-toolkit/LATEST/pt-online...


Indeed. It seems there are many such tools available for MySQL: Openark kit [0] for instance, whose author also brought us gh-ost. Unfortunately most tools focus on just MySQL, and none really have an answer when you want to use (and enforce) foreign key constraints. Although it's hard to say, I suspect the users of these tools have "given up" on foreign key constraints.

[0] http://code.openark.org/forge/openark-kit [1] https://github.com/github/gh-ost


Significantly altering an active table without downtime is a problem that will not, and should not, ever be solved by a library. It's too fraught with peril and special case fun for any competent devops person to leave it up to external code.


The work that goes into "external code" is higher quality than most companies can achieve. The cost of developing a similar tool may also be outside what the company can afford to invest.

If your business is online schema changes, then inventing it yourself makes sense. Otherwise you are likely throwing money away to create an inferior product.


Agreed. But to be fair I've not yet encountered such a library/tool yet. All tools I've seen leave the active table alone, and simply create a new table, do all their work there, and finally somehow switch the two tables (either in naming or routing queries). Some even perform the work on a different database server.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: