MIGRATE_TO_CLOUD

Migrating Server Data to Cloud

cloud/gcs-migrate.py copies all user and project data from a self-hosted server instance to the GCP cloud deployment.

What It Migrates

  • Database: all rows from the server PostgreSQL instance → Cloud SQL, via cloud-sql-proxy
  • Files: data/projects/ and data/embeds/ → GCS bucket ({project_id}-userdata)
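
The file phase can be pictured as a walk over the two source directories that pairs each local file with a destination object name. This is only a minimal sketch: the helper names `bucket_name` and `plan_uploads` are hypothetical, and it assumes object names mirror the path relative to the repository root, which the script may or may not do.

```python
from pathlib import Path

# Directories named in this document; everything else here is an assumption.
SOURCE_DIRS = ("data/projects", "data/embeds")


def bucket_name(project_id: str) -> str:
    """Bucket naming pattern from this document: {project_id}-userdata."""
    return f"{project_id}-userdata"


def plan_uploads(root: Path) -> list[tuple[Path, str]]:
    """Return (local_path, object_name) pairs for every file under SOURCE_DIRS.

    Assumption: object names mirror the path relative to `root`,
    written with forward slashes.
    """
    plan = []
    for rel_dir in SOURCE_DIRS:
        base = root / rel_dir
        if not base.is_dir():
            continue  # a missing directory simply contributes no files
        for path in sorted(base.rglob("*")):
            if path.is_file():
                plan.append((path, path.relative_to(root).as_posix()))
    return plan
```

In a dry run, `len(plan_uploads(Path(".")))` would give the file count. A real upload could pass each pair to google-cloud-storage, for example `bucket.blob(name).upload_from_filename(str(path))`.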

Tables are migrated in foreign-key dependency order: subscriptions → users → folders → projects → embeds → analytics → page_views. Subscription rows (seed data) are inserted with ON CONFLICT DO NOTHING.
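
To illustrate the ordering and the seed-data rule, here is a rough sketch of how per-table INSERT statements could be built. The function name `build_insert_sql` and the column names in the usage line are hypothetical; only the table order and the ON CONFLICT DO NOTHING behaviour for subscriptions come from this document.

```python
# Table order taken from this document; parents come before children.
TABLE_ORDER = [
    "subscriptions", "users", "folders", "projects",
    "embeds", "analytics", "page_views",
]

# Tables treated as seed data, whose rows may already exist on the destination.
SEED_TABLES = {"subscriptions"}


def build_insert_sql(table: str, columns: list[str]) -> str:
    """Build a parameterised INSERT using psycopg2-style %s placeholders."""
    col_list = ", ".join(columns)
    placeholders = ", ".join(["%s"] * len(columns))
    sql = f"INSERT INTO {table} ({col_list}) VALUES ({placeholders})"
    if table in SEED_TABLES:
        # Seed rows that already exist are skipped instead of raising an error.
        sql += " ON CONFLICT DO NOTHING"
    return sql


# Hypothetical column names, for illustration only.
statements = {t: build_insert_sql(t, ["id", "name"]) for t in TABLE_ORDER}
```

Table and column names are interpolated directly here because they come from a fixed list. With untrusted identifiers, `psycopg2.sql.Identifier` would be the safer choice.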

Prerequisites

Run on the server machine:

gcloud auth application-default login
gcloud components install cloud-sql-proxy
pip install psycopg2-binary google-cloud-storage google-cloud-secret-manager python-dotenv

The following config files must be populated:

  • config/server/.secrets.server: DATABASE_URL pointing to the server PostgreSQL instance
  • config/cloud/.env.cloud: GOOGLE_CLOUD_PROJECT and GCS_REGION
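
A quick preflight check can catch an unpopulated config file before any data moves. The sketch below is not part of the script; `missing_keys` is a hypothetical helper that takes already-parsed key/value mappings and reports which required keys are absent or empty.

```python
from typing import Mapping, Optional

# Keys named in this document, grouped by the file expected to define them.
REQUIRED_KEYS = {
    "config/server/.secrets.server": ["DATABASE_URL"],
    "config/cloud/.env.cloud": ["GOOGLE_CLOUD_PROJECT", "GCS_REGION"],
}


def missing_keys(loaded: Mapping[str, Mapping[str, Optional[str]]]) -> list[str]:
    """Return 'path: KEY' entries for every required key that is absent or empty.

    `loaded` maps each config file path to its parsed key/value pairs.
    """
    problems = []
    for path, keys in REQUIRED_KEYS.items():
        values = loaded.get(path, {})
        for key in keys:
            if not values.get(key):
                problems.append(f"{path}: {key}")
    return problems
```

With python-dotenv installed, the input could be built as `{p: dotenv_values(p) for p in REQUIRED_KEYS}`, and an empty result would mean both files are populated.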

The cloud deployment must already be initialised (npm run gcs:init) so that Cloud SQL and Secret Manager are set up.

Usage

Dry run (no changes made — counts rows and files):

python cloud/gcs-migrate.py --dry-run

Full migration:

python cloud/gcs-migrate.py

Flags:

Flag            Effect
--dry-run       Count rows and files; make no changes
--skip-db       Skip database migration; upload files only
--skip-files    Skip file migration; copy database only
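
For readers who want to see how the three flags interact, here is one way they might be wired up with argparse. It is an illustrative sketch, not the script's actual code: `build_parser` and `phases_to_run` are hypothetical names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Define the three flags listed above as simple on/off switches."""
    parser = argparse.ArgumentParser(prog="gcs-migrate.py")
    parser.add_argument("--dry-run", action="store_true",
                        help="count rows and files; make no changes")
    parser.add_argument("--skip-db", action="store_true",
                        help="skip database migration; upload files only")
    parser.add_argument("--skip-files", action="store_true",
                        help="skip file migration; copy database only")
    return parser


def phases_to_run(args: argparse.Namespace) -> list[str]:
    """Translate the skip flags into the list of phases that will execute."""
    phases = []
    if not args.skip_db:
        phases.append("db")
    if not args.skip_files:
        phases.append("files")
    return phases
```

Under this sketch, `phases_to_run(build_parser().parse_args(["--skip-db"]))` yields only the file phase, and passing both skip flags leaves nothing to run.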

Notes

  • The cloud database is expected to be empty. Existing rows in non-seed tables are not deduplicated — run only once, or use --skip-db / --skip-files to re-run individual phases.
  • Foreign key constraints are disabled on the destination for the duration of the DB copy, then re-enabled. This is safe, and it lets folders rows be inserted in any order, which handles the self-referential folders.parent_id column.
  • Application-layer KMS encryption is not yet implemented, so this script uploads files to the bucket directly, without an application-level encryption step.