v0.5
Version 0.5 focuses on cloud deployment and sharing. It also improves the user experience with mobile optimization, interface customization, and many smaller UX tweaks, and it includes a large number of security improvements and fixes.
Highlights
Workspace Panel Adjustment
- Workspace panels (Waveform, Speakers, Transcript) can now be collapsed and expanded
- The focus/unfocus button will collapse/expand all the other panels
- Panels expand to fill space vacated by collapsed panels
- Double-clicking on a panel header will also focus/unfocus
Google Cloud Integration
- Projects can now be deployed to Google Cloud Run with audio files stored in Google Cloud Storage
- A new storage abstraction layer means local and cloud deployments share the same codebase — no code changes needed to switch backends
- Secrets (API keys, tokens) are pulled automatically from GCP Secret Manager when running in cloud mode
- Scheduled background jobs are handled by Cloud Scheduler via secured internal HTTP endpoints
- Config files are now organised into local/, server/, and cloud/ subfolders matching the three supported deploy modes
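As a rough illustration of how a storage abstraction layer can let one codebase serve both backends, here is a minimal Python sketch. Only the names LocalStorage, GCSStorage, STORAGE_BACKEND, and GCS_BUCKET_NAME come from these notes. The StorageBackend interface, the read_bytes/write_bytes methods, the make_storage factory, and the in-memory FakeGCSStorage stand-in are assumptions made so the example runs offline; they are not the project's actual API.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class StorageBackend(ABC):
    """Hypothetical interface; the method names are illustrative only."""

    @abstractmethod
    def write_bytes(self, project_id: str, name: str, data: bytes) -> None: ...

    @abstractmethod
    def read_bytes(self, project_id: str, name: str) -> bytes: ...


class LocalStorage(StorageBackend):
    """Keeps project files on the local filesystem under a root directory."""

    def __init__(self, root: Path) -> None:
        self.root = Path(root)

    def _path(self, project_id: str, name: str) -> Path:
        return self.root / project_id / name

    def write_bytes(self, project_id: str, name: str, data: bytes) -> None:
        path = self._path(project_id, name)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def read_bytes(self, project_id: str, name: str) -> bytes:
        return self._path(project_id, name).read_bytes()


class FakeGCSStorage(StorageBackend):
    """In-memory stand-in for a bucket-backed class (assumption, no network)."""

    def __init__(self, bucket_name: str) -> None:
        self.bucket_name = bucket_name
        self._blobs: dict[str, bytes] = {}

    def _key(self, project_id: str, name: str) -> str:
        return f"projects/{project_id}/{name}"

    def write_bytes(self, project_id: str, name: str, data: bytes) -> None:
        self._blobs[self._key(project_id, name)] = data

    def read_bytes(self, project_id: str, name: str) -> bytes:
        return self._blobs[self._key(project_id, name)]


def make_storage(env: dict, local_root: Path) -> StorageBackend:
    """Pick a backend from a STORAGE_BACKEND-style setting ('local' or 'gcs')."""
    backend = env.get("STORAGE_BACKEND", "local")
    if backend == "local":
        return LocalStorage(local_root)
    if backend == "gcs":
        return FakeGCSStorage(env["GCS_BUCKET_NAME"])
    raise ValueError(f"unknown STORAGE_BACKEND: {backend!r}")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        storage = make_storage({"STORAGE_BACKEND": "local"}, Path(tmp))
        storage.write_bytes("demo-project", "audio.wav", b"\x00\x01")
        print(len(storage.read_bytes("demo-project", "audio.wav")))
```

Calling code only ever sees the StorageBackend interface, which is the property that makes switching backends a configuration change rather than a code change.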
Sharing Overhaul
- Projects and folders can now be shared with specific users by name or email
- Shared users are assigned a role: Owner, Editor, or Viewer
- Owners can transfer ownership to another user
- Newly added users can optionally be notified by email
- Avatar stacks in the sidebar show who a project or folder is shared with at a glance
- "Anyone with link" toggle now works for both projects and folders
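The role model and ownership transfer described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the owners/editors/viewers key names follow the permissions object mentioned in the changelog, and the demotion of the previous owner to editor is taken from the changelog entry on ownership transfer. The function names role_of and transfer_ownership are hypothetical.

```python
from typing import Optional

# Role lists in priority order; key names follow the changelog's permissions object.
ROLE_KEYS = (("owners", "owner"), ("editors", "editor"), ("viewers", "viewer"))


def role_of(perms: dict, user_id: str) -> Optional[str]:
    """Return the highest role a user holds, or None if the item is not shared with them."""
    for key, role in ROLE_KEYS:
        if user_id in perms.get(key, []):
            return role
    return None


def transfer_ownership(perms: dict, old_owner: str, new_owner: str) -> dict:
    """Return a new permissions dict with new_owner as owner and old_owner demoted to editor."""
    if old_owner not in perms.get("owners", []):
        raise PermissionError("only a current owner can transfer ownership")
    updated = dict(perms)  # keep unrelated keys such as any_with_link
    for key, _ in ROLE_KEYS:
        # Drop both users from every role list before re-adding them below.
        updated[key] = [u for u in perms.get(key, []) if u not in (old_owner, new_owner)]
    updated["owners"].append(new_owner)
    updated["editors"].append(old_owner)
    return updated


if __name__ == "__main__":
    before = {"owners": ["alice"], "editors": [], "viewers": ["bob"], "any_with_link": False}
    after = transfer_ownership(before, "alice", "bob")
    print(role_of(after, "alice"), role_of(after, "bob"))
```

Returning a new dict instead of mutating the input keeps the original permissions available for comparison or rollback, which is a design choice of this sketch rather than something the notes specify.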
Mobile Optimization
- All frontend pages and the workspace are now optimized for mobile screen sizes
- Media transforms ensure layout, panels, and controls adapt responsively to small viewports
Playlist Presentations
- Folders can now be shared as a playlist presentation — a single link plays all projects in the folder in sequence
- Each project in the playlist has its own waveform player and transcript
- Playlist presentations respect the folder's "Anyone with link" setting, so no sign-in is required for public folders
Changelog
Added
- Storage abstraction layer (application/storage/) with LocalStorage (filesystem, unchanged behaviour) and GCSStorage (Google Cloud Storage) implementations
- STORAGE_BACKEND config var (local or gcs)
- GCS_BUCKET_NAME for GCS mode
- GCP Secret Manager integration in config.py - secrets pulled automatically when GOOGLE_CLOUD_PROJECT is set
- ManagedPath class for safe lifecycle management of temp-downloaded files
- audio_workspace() context manager on storage backends for ffmpeg/pedalboard processing
- Internal job HTTP endpoints for Cloud Scheduler, secured with INTERNAL_TOKEN bearer auth
- cloud/ directory with GCP deployment config files
- Dockerfile for Cloud Run deployment
- database/postgres/base_schema.sql - consolidated full schema
- database/postgres/0.5_migration.sql
- scripts/add_admin_user.py - helper to bootstrap admin accounts
- requirements-gcp.txt with GCP dependencies
- Config directory restructure - env/secret files now organised under config/local/, config/server/, and config/cloud/ subfolders matching the three deploy modes
- Modal token support (MODAL_TOKEN_ID / MODAL_TOKEN_SECRET) pulled from Secret Manager and injected into os.environ
- HF_TOKEN likewise injected
- Added links to the /donate page throughout the website pages
- Created .drawio diagrams of the three architectures and added them to the README.md
- User UI states are now stored locally in the browser; they will be reverted to default if page cache is deleted
- Workspace panels (Waveform, Speakers, Transcript) can now be collapsed and expanded
- The focus/unfocus button will collapse/expand all the other panels
- Panels expand to fill space vacated by collapsed panels
- Double-clicking on a panel header will also focus/unfocus
- Added ability to share presentations of entire folders as playlists
- Added per-user role-based sharing for projects and folders (owner / editor / viewer)
- Added user search endpoint (GET /api/users/search) for sharing dialogs
- Added GET /api/folders/<path>/permissions endpoint
- Added ownership transfer for projects and folders (demotes old owner to editor)
- Added optional email notifications when sharing a project or folder with a new user
- Added avatar stacks in the sidebar showing who a project or folder is shared with
- Added "Share…" and "Open Playlist" options to the folder context menu
- Added shared_with field to project and folder list API responses
- Added presentation_folder.js / presentation_folder.html for folder playlist presentation pages
- Added "No audio in this project" placeholder in presentation waveform panel
- Added project title heading to presentation transcript panel
- Added static/js/utils/avatars.js utility for rendering avatar stacks
- Added dark/light/auto mode selector to presentation pages
- Website pages are now affected by UI theme choices
- Added injection protection tests:
- test_injection_protection.py (30 tests) covering XSS escaping in embed HTML transcript spans, word-level text, and SEGS JSON <script> block; plus transcript JSON storage round-trip integrity
- TestNotifyTranscriptionComplete (8 tests) to test_notify.py verifying HTML escaping of project name and display name in notification emails
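To make the kind of check these tests perform concrete, here is a minimal sketch of escaping user-controlled values before they are placed in an HTML email body. html.escape() is the standard-library call named elsewhere in these notes; the build_notification_html function and the email wording are hypothetical and are not taken from notify.py.

```python
import html


def build_notification_html(project_name: str, display_name: str) -> str:
    """Build a small HTML email body with both user-supplied values escaped."""
    # html.escape() converts &, <, > and, by default, both quote characters.
    safe_project = html.escape(project_name)
    safe_name = html.escape(display_name)
    return (
        f"<p>Hi {safe_name},</p>"
        f"<p>Transcription of <strong>{safe_project}</strong> is complete.</p>"
    )


if __name__ == "__main__":
    body = build_notification_html("<script>alert(1)</script>", 'Ann & "Bob"')
    # A test in the spirit of the suite above would assert that no raw tag survives.
    assert "<script>" not in body
    print(body)
```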
Changed
- CLOUD_URL added to config/local/.env.local.example and config/server/.env.server.example
- Bug reports in local/server modes now proxied to the cloud instance via CLOUD_URL instead of requiring GITLAB_TOKEN / GITLAB_PROJECT_ID secrets locally
- All project file I/O refactored from direct Path operations to storage() backend calls
- Embed serving reads HTML via storage().read_embed() instead of send_from_directory
- LOCAL_MODE boolean replaced by DEPLOY_MODE string (local | server | cloud)
- config.py selects config files based on DEPLOY_MODE
- database/postgres/0.3_migration.sql renamed from 0.2-0.3_migration.sql
- Audio routing queries DB for audio_format and delegates directly to backend.stream_response instead of using get_audio_path / stream_file
- LocalStorage.put_from_local now copies the file (was a no-op)
- GCSStorage.stream_response calls blob.reload() before reading size to get fresh metadata
- 0.4_migration.sql column renames now idempotent (skips rename if target column already exists)
- Audio ffmpeg thread hardening:
- Error handling hardened
- src.release() guarded against assignment failures
- Renamed many instances of the old WaveformStudio name to SourceQuote
- Fixed sidebar collapse button drawn inside the panel on page load
- New projects are now always opened on creation
- Standardized UI tooltips
- Fixed button condensation on waveform and transcript panels
- Fixed audio not playing immediately after uploading
- Fixed waveform minimap width not rendering correctly on page load
- Fixed being unable to load new audio immediately after deleting audio from a project
- Reworked sharing pipeline to make it more streamlined and consistent
- ShareDialog now supports projects and folders via a type parameter
- Permission GET endpoints now resolve user IDs to {id, display_name, email} objects
- list_projects and list_directory now batch-resolve shared user display names
- PresentationWaveform, PresentationTranscript, PresentationController, PresentationSectionNav, and buildProject exported from presentation.js for reuse by folder playlist page
- presentation.js init guarded so it does not auto-run when loaded as a library by the folder playlist page
- Replaced all ascii/unicode iconographic characters with svg assets
- Mobile-optimized media transforms added for all frontend pages and the workspace
- Speaker colors now have adjustable saturation and lightness values in the theme settings
- Added theme selector to all pages
- Updated developer guides to match current project state
- Added ENVIRONMENT_SETUP.md as a detailed walkthrough on setting up the development environment
- Added DEPLOY_LOCAL.md, DEPLOY_SERVER.md, and DEPLOY_CLOUD.md as detailed walkthroughs on deploying each version of the project
- Fixed XSS in transcription-complete notification emails (notify.py):
- project_name and display_name were injected raw into HTML
- Now escaped with html.escape()
- Fixed XSS in embed <script> block (embed_generator.py):
- json.dumps(segs) did not escape <, >, or &, so transcript text containing </script> could break out of the script tag
- Replaced with _safe_json() which unicode-escapes those characters
- Enabled Stripe coupon codes
- Fixed unlikely injection flaw in users.py
- Fixed privilege escalation on PUT /api/users/me/subscription:
- Now requires @admin_required
- Subscription tier can only be set by admins or via Stripe webhooks
- Removed in-process token blacklist (auth.py)
- Logout now relies on Firebase's 1-hour token expiry, which is consistent across all workers and restarts
- Fixed arbitrary file upload in save_audio():
- Now validates file extension against an allowlist and checks the MIME type
- Unsupported formats rejected with HTTP 400
- Added startup warning in app.py if FLASK_DEBUG=1 is set with DEPLOY_MODE=server or cloud
- Added explicit FLASK_DEBUG=0 to config/cloud/.env.cloud
- Fixed XSS via speaker names in speakers_panel.js:
- Segment picker header and delete-speaker dialog title were interpolated directly into innerHTML
- Now built with DOM methods and textContent
- ConfirmDialog updated to accept string | HTMLElement for its title, eliminating the unsafe HTML path
- Fixed DNS rebinding bypass in SSRF guard (files.py):
- Hostname was resolved once for the private-IP check, then re-resolved by the HTTP client; an attacker with a short-TTL record could return a public IP for the check and an internal IP (e.g. GCP metadata 169.254.169.254) for the actual request
- Validated IP now pinned via a thread-local getaddrinfo override so all socket connections within the request reuse the same address
- Fixed unauthenticated POST /api/stripe/create-donation-intent with no rate limit:
- An attacker could create unlimited Stripe PaymentIntents, exhausting API quota
- Route now limited to 20 requests per hour per IP
- Hardened GET /api/users/search against user enumeration:
- Minimum query length raised to 2 characters (blocks single-character sweeps)
- Result cap lowered from 10 to 5
- Fixed rate limiter using in-memory storage (memory://) in multi-worker deployments:
- Each gunicorn worker maintained its own counter, making the effective limit N × the configured limit
- flask-limiter[redis] added to requirements
- Startup warning now fires when RATELIMIT_STORAGE_URI is unset in server/cloud mode
- Added runtime warnings for missing or malformed required environment variables
- Fixed host header injection in notification emails and Stripe callback URLs:
- folder.py, project.py, admin.py, and stripe_routes.py all constructed absolute URLs from request.host_url, which an attacker could spoof via the HTTP Host header
- URLs now built from a new APP_BASE_URL config variable, with fallback to request.host_url only when unset (local dev)
- Startup warning fires when APP_BASE_URL is unset in server/cloud mode
- Fixed SQL injection risk in SQLite RETURNING clause handling (db_sqlite.py):
- ret_cols was extracted from user-supplied SQL via regex and interpolated directly into follow-up SELECT queries
- Now validated against a safe-identifier allowlist (alphanumeric/underscore or *) before use; raises ValueError on invalid input
- Fixed XSS in pricing modal and version history dialog (_landing_modals.html, version_history_dialog.js):
- Subscription tier name was interpolated raw into innerHTML; now escaped via esc() helper
- Error message in version history dialog replaced innerHTML template literal with textContent assignment
- Fixed subscription period_end computed locally instead of from Stripe (stripe_routes.py):
- checkout.session.completed calculated now + 30/365 days from webhook metadata, drifting from Stripe's actual current_period_end (trials, prorations, etc.)
- Now retrieves the stripe.Subscription object and reads current_period_end directly; local fallback only used if no subscription ID is present
- Fixed hardcoded SECRET_KEY fallback in production (config.py):
- If neither GCP Secret Manager nor the env var provided a value, Flask would silently use 'dev-secret-change-me', allowing session cookie forgery on misconfigured deployments
- Non-local deploy modes now raise RuntimeError if SECRET_KEY is unset; fallback only permitted in local mode
- Fixed HTTP header injection via Content-Disposition filename (files.py):
- download_name was interpolated raw into the header value; a filename containing \r\n or " could inject arbitrary headers
- Filename now RFC 5987-encoded with urllib.parse.quote, using filename*=UTF-8'' syntax
- Added pip-audit dependency vulnerability scan to CI pipeline (.gitlab-ci.yml):
- New dependency-audit stage runs pip-audit against all three requirements files (requirements-gcp.txt, requirements-cpu.txt, requirements-gpu.txt)
- Pipeline fails on any known CVE in pinned dependencies
- Fixed auth token exposure in audio and speaker sample URLs (auth.py, server.js, presentation.js, waveform_panel.js, audio.js):
- login_required accepted tokens via ?token= query parameter, causing them to appear in server access logs, browser history, and referrer headers
- Query-parameter token fallback removed; only the X-Auth-Token request header is now accepted
- Frontend audio and sample URLs no longer embed the token; WaveSurfer is initialised with fetchParams.headers carrying the auth header instead
- Range-based peak extraction and duration detection (extractPeaksFromUrl, _getDurationFromUrl) updated to forward auth headers to all internal fetch calls
- Audio file downloads switched from bare URL navigation (which leaks the token) to fetch() + blob URL
- Added rate limit to GET /api/fetch-page-title:
- Endpoint had no rate limiting, allowing authenticated users to use the server as a DoS amplifier or internal network scanner
- Now limited to 30 requests per minute per user
- Fixed verbose error messages leaking internal details to API clients (files.py, project.py, folder.py, user.py):
- Exception messages, internal paths, and system info were returned directly in JSON error responses throughout the codebase
- All HTTP 500 responses now return a generic "An internal error occurred" message; full exception details are logged server-side only
- 400/404 responses retain descriptive messages where they are intentional and user-facing
- Fixed raw ffmpeg stderr exposed to clients in the retranscribe endpoint (files.py):
- CalledProcessError.stderr was decoded and returned directly in the JSON response, leaking internal file paths, codec details, and server directory structure
- Now returns "Audio processing failed"; stderr is logged internally
- Fixed permission JSON written to DB without schema validation (project.py, folder.py):
- PUT /api/projects/<id>/permissions and PUT /api/folders/<path>/permissions accepted arbitrary JSON and wrote it directly to the database
- Incoming permissions object is now validated: must be a dict, only the known keys (owners, editors, viewers, any_with_link, public) are accepted, and each role list must contain only strings
- Fixed login dialog Enter key not working after tabbing between fields:
- Sign-in panel converted to a <form> element so the browser fires a native submit event on Enter regardless of which field has focus
- Enter in the email field now submits directly if the password field is already filled
- focus() call on dialog open wrapped in requestAnimationFrame to prevent silent failure when the dialog is opened from an async auth callback
- Added ALLOW_INDEXING config variable to control search engine indexing:
- robots.txt is now served dynamically and blocks all crawlers unless ALLOW_INDEXING=true
- When indexing is enabled, /robots.txt includes a Sitemap: directive pointing to /sitemap.xml
- /sitemap.xml lists all public static pages; returns 404 when indexing is disabled
- Open Graph, Twitter Card, description, and canonical meta tags added to the landing page when indexing is enabled; noindex meta tag added otherwise
- ALLOW_INDEXING defaults to false in all example configs; set to true only in config/cloud/.env.cloud.prod
- Added email verification for password-based signups under open registration (ALLOW_OPEN_REGISTRATION=true):
- New users receive a Firebase-generated verification email on signup; account activates automatically on first login after verifying
- Password-auth users with unverified email are blocked at login with a dedicated "Verify Email" panel in the login dialog
- "Verify Email" panel shows the reason and a "Resend Verification Email" button that calls the new POST /api/auth/resend-verification endpoint
- Signing in with an unverified account automatically signs the user out of Firebase to prevent an auth loop
- Signup modal and FAQ pages update their wording based on ALLOW_OPEN_REGISTRATION: invite-only language is replaced with verification flow language when open registration is enabled
- ALLOW_OPEN_REGISTRATION now accepts 1 and yes in addition to true
- Closed beta flow (open registration disabled) is unchanged; an admin must still manually activate accounts
- Fixed GCS storage backend using wrong blob path prefixes (gcs.py):
- _blob constructed paths as {project_id}/{file} but the bucket stores files under projects/{project_id}/{file}
- _embed_blob used __embeds__/ prefix instead of embeds/
- Both corrected to match the bucket layout written by the migration script
- Fixed GCS NotFound exception returning HTTP 500 instead of 404 (gcs.py):
- blob.reload() in stream_response raised google.api_core.exceptions.NotFound when a file was missing, which was unhandled and surfaced as a 500
- Now caught and re-raised as FileNotFoundError so callers return a proper 404
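Two of the escaping fixes listed in this section, the embed <script> block and the Content-Disposition filename, rely on techniques that are easy to show in isolation. The sketch below is an approximation under stated assumptions, not the code in embed_generator.py or files.py: the function names safe_json and content_disposition are hypothetical, and only json.dumps, urllib.parse.quote, and the filename*=UTF-8'' syntax are taken from the notes.

```python
import json
from urllib.parse import quote


def safe_json(value) -> str:
    """Serialize to JSON that is safe to inline inside an HTML <script> block.

    <, > and & only ever occur inside JSON strings, so swapping them for
    \\uXXXX escapes keeps the document valid JSON while removing any chance
    of a literal </script> closing the tag early.
    """
    return (
        json.dumps(value)
        .replace("<", "\\u003c")
        .replace(">", "\\u003e")
        .replace("&", "\\u0026")
    )


def content_disposition(download_name: str) -> str:
    """Build an attachment header value using RFC 5987 extended notation.

    quote(..., safe="") percent-encodes CR, LF, quotes and non-ASCII bytes,
    so the filename cannot terminate the header or start a new one.
    """
    return "attachment; filename*=UTF-8''" + quote(download_name, safe="")


if __name__ == "__main__":
    segs = [{"text": "</script><b>&"}]
    encoded = safe_json(segs)
    assert json.loads(encoded) == segs  # round-trips to the original data
    print(encoded)
    print(content_disposition('talk "final"\r\nX-Evil: 1.mp3'))
```

Both helpers escape at the point of output rather than sanitising stored data, so the original transcript text and filenames are left untouched.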
Removed
- Individual per-table SQL files replaced by base_schema.sql, which acts as the base for migration scripts
Misc
- Unit test conftest files updated for new config/storage structure