Migrating Server Data to Cloud
cloud/gcs-migrate.py copies all user and project data from a self-hosted server instance to the GCP cloud deployment.
What It Migrates
- Database: all rows from the server PostgreSQL instance → Cloud SQL, via cloud-sql-proxy
- Files: `data/projects/` and `data/embeds/` → GCS bucket (`{project_id}-userdata`)
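
The file phase can be pictured as a walk over the two data directories followed by one upload per file. The sketch below is illustrative only: the function names `plan_uploads` and `upload_all` are hypothetical, and the object-name layout (names mirroring the path relative to `data/`) is an assumption, since this document does not state how objects are named in the bucket.

```python
from pathlib import Path
from typing import Iterator, Tuple

# Directories named in this document; the object-name layout is an assumption.
SUBDIRS = ("projects", "embeds")


def plan_uploads(data_root: Path, subdirs=SUBDIRS) -> Iterator[Tuple[Path, str]]:
    """Yield (local_path, object_name) for every file under data/<subdir>/."""
    for sub in subdirs:
        base = data_root / sub
        if not base.is_dir():
            continue  # nothing to migrate for this directory
        for path in sorted(base.rglob("*")):
            if path.is_file():
                # Assumed layout: object names mirror the path relative to data/.
                yield path, path.relative_to(data_root).as_posix()


def upload_all(data_root: Path, project_id: str, dry_run: bool = False) -> int:
    """Upload planned files to {project_id}-userdata; return the file count."""
    plan = list(plan_uploads(data_root))
    if dry_run:
        return len(plan)  # count only, make no changes
    # Imported lazily so a dry run does not need the GCS client library.
    from google.cloud import storage

    bucket = storage.Client(project=project_id).bucket(f"{project_id}-userdata")
    for local_path, object_name in plan:
        bucket.blob(object_name).upload_from_filename(str(local_path))
    return len(plan)
```

Calling `upload_all(Path("data"), project_id, dry_run=True)` only counts files, which matches the behaviour described for `--dry-run`.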
Tables are migrated in foreign-key dependency order: subscriptions → users → folders → projects → embeds → analytics → page_views. Subscription rows (seed data) are inserted with ON CONFLICT DO NOTHING.
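
As a rough illustration of the ordering rule and the seed-table handling, the sketch below checks that a table order places every referenced table before the tables that reference it, and builds an `INSERT` that adds `ON CONFLICT DO NOTHING` only for seed tables. The foreign-key map `ASSUMED_FKS` is hypothetical (the real schema may differ), and `respects_dependencies` and `build_insert` are illustrative names, not functions from the script.

```python
# Order stated in this document.
MIGRATION_ORDER = [
    "subscriptions", "users", "folders", "projects",
    "embeds", "analytics", "page_views",
]

# Hypothetical foreign-key map (table -> tables it references).
# The self-reference folders.parent_id is left out on purpose.
ASSUMED_FKS = {
    "users": {"subscriptions"},
    "folders": {"users"},
    "projects": {"users", "folders"},
    "embeds": {"projects"},
    "analytics": {"projects"},
    "page_views": {"projects"},
}

SEED_TABLES = {"subscriptions"}


def respects_dependencies(order, fks):
    """True if every referenced table appears before the table that references it."""
    position = {table: i for i, table in enumerate(order)}
    return all(
        position[dep] < position[table]
        for table, deps in fks.items()
        for dep in deps
    )


def build_insert(table, columns):
    """Build a parameterised INSERT using psycopg2-style %s placeholders.

    Table and column names are interpolated directly, so they must come
    from a trusted list such as MIGRATION_ORDER, never from user input.
    """
    cols = ", ".join(columns)
    placeholders = ", ".join(["%s"] * len(columns))
    sql = f"INSERT INTO {table} ({cols}) VALUES ({placeholders})"
    if table in SEED_TABLES:
        sql += " ON CONFLICT DO NOTHING"  # seed rows may already exist
    return sql
```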
Prerequisites
Run on the server machine:
gcloud auth application-default login
gcloud components install cloud-sql-proxy
pip install psycopg2-binary google-cloud-storage google-cloud-secret-manager python-dotenv
The following config files must be populated:
- `config/server/.secrets.server`: `DATABASE_URL` pointing to the server PostgreSQL instance
- `config/cloud/.env.cloud`: `GOOGLE_CLOUD_PROJECT` and `GCS_REGION`
The cloud deployment must already be initialised (npm run gcs:init) so that Cloud SQL and Secret Manager are set up.
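
A quick pre-flight check for the two config files might look like the sketch below. It only validates dictionaries of already-loaded values; `missing_keys` is a hypothetical helper, not part of the script.

```python
# Required keys per config file, as listed in this document.
REQUIRED = {
    "config/server/.secrets.server": ["DATABASE_URL"],
    "config/cloud/.env.cloud": ["GOOGLE_CLOUD_PROJECT", "GCS_REGION"],
}


def missing_keys(loaded: dict) -> list:
    """Return 'path: KEY' entries for required keys that are absent or empty.

    `loaded` maps each config path to a dict of the values read from it.
    """
    problems = []
    for path, keys in REQUIRED.items():
        values = loaded.get(path) or {}
        for key in keys:
            if not values.get(key):
                problems.append(f"{path}: {key}")
    return problems
```

With python-dotenv installed, the input can be built as `{path: dotenv_values(path) for path in REQUIRED}` using `from dotenv import dotenv_values`; an empty result means both files are populated.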
Usage
Dry run (no changes made — counts rows and files):
python cloud/gcs-migrate.py --dry-run
Full migration:
python cloud/gcs-migrate.py
Flags:
| Flag | Effect |
|---|---|
| `--dry-run` | Count rows and files; make no changes |
| `--skip-db` | Skip database migration; upload files only |
| `--skip-files` | Skip file migration; copy database only |
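
The three flags above map naturally onto `argparse` boolean switches. The following is a sketch of how such a command line could be declared, not the script's actual parser; `build_parser` and `phases` are illustrative names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the three flags documented in the table above."""
    parser = argparse.ArgumentParser(
        prog="gcs-migrate.py",
        description="Copy server data to the cloud deployment.",
    )
    parser.add_argument("--dry-run", action="store_true",
                        help="count rows and files; make no changes")
    parser.add_argument("--skip-db", action="store_true",
                        help="skip database migration; upload files only")
    parser.add_argument("--skip-files", action="store_true",
                        help="skip file migration; copy database only")
    return parser


def phases(args: argparse.Namespace) -> list:
    """Return the phases left to run after applying the skip flags."""
    selected = []
    if not args.skip_db:
        selected.append("db")
    if not args.skip_files:
        selected.append("files")
    return selected
```

For example, `phases(build_parser().parse_args(["--skip-db"]))` leaves only the file phase.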
Notes
- The cloud database is expected to be empty. Existing rows in non-seed tables are not deduplicated, so run the full migration only once, or use `--skip-db` / `--skip-files` to re-run individual phases.
- Foreign key constraints are disabled on the destination for the duration of the DB copy, then re-enabled. This is safe and handles the self-referential `folders.parent_id` column.
- Application-layer KMS encryption is not yet implemented, so files can be uploaded directly via this script.
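
This document does not say how the script disables foreign key checks. One common PostgreSQL approach, shown here purely as an assumption, is to set `session_replication_role` to `replica` for the duration of the copy transaction, which stops the internal triggers that enforce foreign keys from firing. The helper name `copy_with_fks_suspended` is hypothetical, and the setting requires sufficient privileges on the destination database.

```python
def copy_with_fks_suspended(conn, copy_fn):
    """Run copy_fn(cursor) in one transaction with FK triggers suspended.

    Assumes a psycopg2-style connection: used as a context manager it
    commits on success and rolls back on error. SET LOCAL lasts only
    until the end of that transaction, so no explicit reset is needed.
    """
    with conn:
        with conn.cursor() as cur:
            cur.execute("SET LOCAL session_replication_role = 'replica'")
            copy_fn(cur)
```

With checks suspended, rows in `folders` can be inserted in any order even when `parent_id` points at a row that has not been copied yet. The copied data is not re-validated afterwards, so this relies on the source rows already being consistent.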