Skip to content

Setup GWells data migration for local dev test

garywong-bc edited this page Feb 13, 2018 · 7 revisions

Overview

The architecture for the migration strategy depends on two databases: the legacy wells database, and the novel gwells database. The gwells database is configured with a foreign data wrapper to access the legacy wells database. A series of scripts are run on the novel gwells database to continually migrate the structure of the database over the progression of development. The data is reloaded periodically depending on development or testing needs. It is important to note that the gwells database consists of two distinct sets of objects which share the public schema - the application native tables which are prefixed by gwells_ and the framework tables which have different prefixes including django_ and auth_.

Setup Process

For the setup process, it may be beneficial to use virtual environments and environment variables. The instructions below are written with the assumption that environment variables exists in the current context.

1. Create novel and legacy POSTGRES users and databases

//connect to the postgres database as a superuser
psql -U${POSTGRES_SUPERUSER} -dpostgres

--> create the novel user and the novel database
CREATE USER ${DATABASE_USER} WITH createdb;
CREATE DATABASE ${DATABASE_NAME} WITH owner=${DATABASE_USER};

--> create the legacy user and legacy database
CREATE USER ${LEGACY_DATABASE_USER} WITH createdb;
CREATE DATABASE ${LEGACY_DATABASE_NAME} WITH owner=${LEGACY_DATABASE_USER};

2. Configure template1 with inheritable extensions.

The pgcrypto extension is required to create uuids. The postgres_fdw extension is required to create the foreign data wrapper extension that allows the novel database to extract data from the legacy database. The postgres_fdw could be installed in the novel database only rather than in template1, but this approach simplifies scripting.

//connect to the postgres database as a superuser
psql -U${POSTGRES_SUPERUSER} -dpostgres

--> create extensions 
CREATE EXTENSION pgcrypto
CREATE EXTENSION postgres_fdw

3. Restore the legacy database from a database dump

//restore the legacy database
pg_restore --dbname postgresql://${LEGACY_DATABASE_USER}:${LEGACY_DATABASE_USER_PASSWORD}@127.0.0.1:5432/${LEGACY_DATABASE_NAME} --no-owner --no-privileges ${LEGACY_DATABASE_DUMP_LOCATION}

Note that there is now a script to export, and download to local, the Oracle legacy tables (via the Foreign Data Wrapper on moe-gwells-test). It will only export what is replicated (DB_REPLICATE=None/Subset/Full) and currently imports into the local DB as 'wells' schema; a future version of this script will support any accessible schema such as 'public'.

4. Add the foreign data wrapper to the novel database

//connect to psql
psql -U${DATABASE_USER} -d${DATABASE_NAME}
-->create the foreign data wrapper
DROP SERVER IF EXISTS ${LEGACY_DATABASE_NAME} CASCADE;
CREATE SERVER ${LEGACY_DATABASE_NAME} FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'localhost', dbname 'wells');
DROP USER MAPPING IF EXISTS FOR public SERVER ${LEGACY_DATABASE_NAME};
CREATE USER MAPPING FOR PUBLIC SERVER ${LEGACY_DATABASE_NAME} OPTIONS (user '${LEGACY_DATABASE_USER}', password '${LEGACY_DATABASE_USER_PASSWORD}');
IMPORT FOREIGN SCHEMA ${LEGACY_DATABASE_NAME} FROM SERVER ${LEGACY_DATABASE_NAME} INTO ${LEGACY_SCHEMA};
GRANT USAGE ON SCHEMA ${LEGACY_SCHEMA_NAME} TO ${DATABASE_USER};
GRANT SELECT ON ALL TABLES IN SCHEMA ${LEGACY_SCHEMA} TO ${DATABASE_USER};
GRANT USAGE ON FOREIGN SERVER ${LEGACY_DATABASE_NAME} TO ${DATABASE_USER};

To validate the the foreign data wrapper is functioning, use \d to list all artefacts in the gwells database. Because the foreign data tables are not treated in precisely the same way as native tables they are not listed when applying a \dt command.

5. Perform the python migration of the novel database

steps 5 is included in the gwells/database/scripts/load_db.sh script

//make migrations to adapt the current schema to the current model -- this is for developers only
python manage.py makemigrations

//migrate from the prior database schema to the current database schema
python manage.py migrate

6. Migrate the data from the legacy database to the novel database

steps 6 is included in the gwells/database/scripts/load_db.sh script

//truncate the old data while maintaining the new structure --> gwells/database/scripts/clear-tables.sql

//load static data from code tables. this is stored in csv files in gwells/database/code-tables --> \include data-load-static-codes.sql

//create the transform table for the wells_wells to gwells_well data migration --> gwells/database/scripts/create-xform-gwells-well-ETL-table.sql
//the transform table is used as part of a pipeline which performs some code translations

//populate the transform table from the legacy wells_wells table --> gwells/database/scripts/populate-xform-gwells-well.sql

//complete the wells data migration from the transform table --> gwells/database/scripts/populate-gwells-well-from-xform.sql

//migrate the screens data from the legacy database to the novel database --> gwells/database/scripts/migrate_screens.sql

psql --dbname postgresql://${DATABASE_USER}:${DATABASE_USER_PASSWORD}@127.0.0.1:5432/${DATABASE_NAME} <<EOF
\include clear-tables.sql
\include data-load-static-codes.sql
\include create-xform-gwells-well-ETL-table.sql
\include populate-xform-gwells-well.sql
\include populate-gwells-well-from-xform.sql 
\include migrate_screens.sql
EOF

Note that the post-deploy script will refresh all lookup codes, as well as replicate from the foreign tables. The db-replicate script will assume the lookup code tables are up-to-date and just replicate from the foreign tables.

7. Log into psql to make sure that everything worked (i.e. \d wells.*)

Clone this wiki locally