This content originally appeared on DEV Community 👩‍💻👨‍💻 and was authored by Joe Auty
How can we sync two databases, both copying all initial data, and then finding new and updated subsequent records and delta copying these records?
The Automated (and Free) Solution
Redactics will do this running in your own infrastructure, and is free for developer usage. It is a pure SQL solution so is extremely lightweight. Delta updates are so fast you can consider this as support near realtime replication, and fast enough to run every few minutes or so for many use cases.
The Homemade Solution
If you want to do this yourself, here is an approach that works well:
- Ensure that each of your tables you wish to sync have numerical primary keys, as most SQL-based databases do (typically auto-incrementing).
- Then, considering that new primary keys will always be increasing in size, for identifying new rows you can just create SQL queries to search for records with larger primary key values than your last primary key.
- For updates, ensure that your master tables have an updated date field that is updated each time the record is updated. Then, you can use SQL to search for last updated dates newer than your most recent update. It is okay if you need to search for dates a little older than the present time because there generally is no harm in re-applying the same update to existing data.
- If you want to write new data to your target database, you'll need a strategy to prevent conflicting primary keys. An approach we like is creating a new column called something like
source_primary_key
, generating CSVs (without headers) using the above two approaches (one CSV for new rows, one CSV for updates), and then piping this CSV into awk to remove the primary key column, and move its value to your source_primary_key column. For example, if your primary key column is your first column and your source_primary_key your fifth column:cat /path/to/csv.csv | gawk -vOFS=, -F "," "{print $2,$3,$4,$5,$1}" | [your SQL client]
- For row deletions we generally recommend soft deleting records, which would be treated as an update, since setting the value of a
disabled
field to true would also update this row's last updated date.
Please let me know if you have any luck with either approach!
This content originally appeared on DEV Community 👩‍💻👨‍💻 and was authored by Joe Auty

Joe Auty | Sciencx (2022-09-21T23:56:37+00:00) How To Sync Data Between Two Databases. Retrieved from https://www.scien.cx/2022/09/21/how-to-sync-data-between-two-databases/
Please log in to upload a file.
There are no updates yet.
Click the Upload button above to add an update.