Production data should never leave a production environment, yet developers need access to coherent data sets to develop and test properly.
scrubfu is a .NET CLI tool that makes creating usable development data from production data a predictable, auditable and repeatable process.
It is compatible with Postgres databases and uses the COMMENTs on table columns to indicate the type of scrubbing and obfuscation required.
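Because the scrubbing instructions live in column comments, and a plain-format pg_dump writes each comment out as a COMMENT ON COLUMN statement, one quick way to see which columns carry a comment is to search the dump for those statements. The following is a minimal sketch and not part of scrubfu; the function name list_column_comments is hypothetical, and it only shows the first line of a comment that spans several lines.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of scrubfu): print every COMMENT ON COLUMN
# statement found in a plain-format pg_dump file, prefixed with its line number.
list_column_comments() {
  local dump_file="$1"
  grep -n '^COMMENT ON COLUMN ' "$dump_file"
}
```

For example, list_column_comments /tmp/scrubfu_sample.sql would list the commented columns in the sample dump used further down.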
Usage: scrubfu [OPTIONS] [INFILE] [OUTFILE]
[INFILE] is the input file obtained by a pg_dump.
This can also be the standard input stream.
[OUTFILE] is the scrubfu'ed file, ready to be imported with psql.
This can also be the standard out stream.
Options:
-h, --help Show this message and exit.
-v, --version Show the version and exit.
--log TEXT Optional LOGFILE, defaults to standard out.
--log_level [error|info|debug] Used with [--log=LOGFILE].
git clone git@github.com:GrindrodBank/scrubfu.git
cd scrubfu
Log in to Docker Hub and pull scrubfu:
docker login
docker pull grindrodbank/scrubfu
Get the PostgreSQL client tools:
apt-get install postgresql-client
# Record PSQL Version
PSQLVERSION=$(psql --version | awk '{print $3}')
Get the scrubfu_sample.sql file:
cp doc/scrubfu_sample.sql /tmp/.
Start a PostgreSQL Docker container.
DBPS=$(docker run -e POSTGRES_PASSWORD=pgpass -d -p 15432:5432 postgres:$PSQLVERSION)
# Confirm the container ID
echo $DBPS
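The container is started in detached mode, so the docker run command returns before the PostgreSQL server inside it is necessarily ready to accept connections. A small retry helper can bridge that gap. This is a sketch only; the name wait_for and the one-second interval are assumptions, not part of scrubfu.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run the given command once per second until it
# succeeds or the given number of attempts has been used up.
wait_for() {
  local attempts="$1"
  shift
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

Combined with pg_isready from the PostgreSQL client programs, a call such as wait_for 30 pg_isready -h localhost -p 15432 -q could be placed before the createdb step.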
Create the scrubfu_sample database and import the sample data:
PGPASSWORD=pgpass createdb -h localhost -p 15432 -U postgres scrubfu_sample
PGPASSWORD=pgpass psql -h localhost -p 15432 -U postgres scrubfu_sample < /tmp/scrubfu_sample.sql
# Confirm the tables have been imported
PGPASSWORD=pgpass psql -h localhost -p 15432 -U postgres scrubfu_sample -c "\dt"
# Dump the database out to an SQL file again for later comparison
PGPASSWORD=pgpass pg_dump -h localhost -p 15432 -U postgres scrubfu_sample > /tmp/scrubfu_sample.sql
This option is ideal for scrubbing a pg_dump process output in-line.
Pipe the pg_dump and the scrubfu commands.
PGPASSWORD=pgpass pg_dump -h localhost -p 15432 -U postgres scrubfu_sample | docker run --rm -a stdin -a stdout -i grindrodbank/scrubfu > /tmp/scrubfu_sample_scrubbed.sql
# Confirm output
cat /tmp/scrubfu_sample_scrubbed.sql | less
This option scrubs an already dumped file into a new file.
PGPASSWORD=pgpass pg_dump -h localhost -p 15432 -U postgres --file /tmp/scrubfu_sample.sql scrubfu_sample
docker run --rm -v /tmp:/tmp -i grindrodbank/scrubfu /tmp/scrubfu_sample.sql /tmp/scrubfu_sample_scrubbed.sql
# Confirm output
cat /tmp/scrubfu_sample_scrubbed.sql | less
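With both the original and the scrubbed dump on disk, the comparison mentioned earlier can be done with diff. The sketch below counts how many lines of the first file are changed or missing in the second; the function name count_changed_lines is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the lines of the first file that are changed
# or missing in the second file. diff marks those lines with a leading "<".
# grep -c exits non-zero when the count is 0, so that status is ignored.
count_changed_lines() {
  diff "$1" "$2" | grep -c '^<' || true
}
```

Running count_changed_lines /tmp/scrubfu_sample.sql /tmp/scrubfu_sample_scrubbed.sql gives a rough measure of how much was scrubbed; diff -u on the same two files, piped to less, shows the individual changes.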
The following commands show how to perform other operations.
Import a scrubbed file:
export PGPASSWORD="pgpass"
dropdb -h localhost -p 15432 -U postgres --if-exists scrubfu_sample_scrubbed
createdb -h localhost -p 15432 -U postgres scrubfu_sample_scrubbed
psql -h localhost -p 15432 -U postgres scrubfu_sample_scrubbed --file /tmp/scrubfu_sample_scrubbed.sql
# Confirm the tables have been imported
psql -h localhost -p 15432 -U postgres scrubfu_sample_scrubbed -c "\dt"
Map a log file to local storage, and set log level to debug:
touch /tmp/scrubfu.log
docker run --rm -v /tmp:/tmp -v /tmp/scrubfu.log:/tmp/scrubfu.log -i grindrodbank/scrubfu --log /tmp/scrubfu.log --log_level debug /tmp/scrubfu_sample.sql /tmp/scrubfu_sample_scrubbed.sql
# Confirm log file output
cat /tmp/scrubfu.log | less
Perform an end-to-end export -> scrub -> import without writing any files:
export PGPASSWORD="pgpass"
dropdb -h localhost -p 15432 -U postgres --if-exists scrubfu_sample_scrubbed
createdb -h localhost -p 15432 -U postgres scrubfu_sample_scrubbed
pg_dump -h localhost -p 15432 -U postgres scrubfu_sample | docker run --rm -a stdin -a stdout -i grindrodbank/scrubfu | psql -h localhost -p 15432 -U postgres scrubfu_sample_scrubbed --file -
# Confirm the tables have been imported
PGPASSWORD=pgpass psql -h localhost -p 15432 -U postgres scrubfu_sample_scrubbed -c "select * from array_test;"
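In a multi-stage pipeline like the one above, the shell by default reports only the exit status of the last command, so a failure in the pg_dump or scrubfu stage could go unnoticed. In bash, the pipefail option changes that. The snippet below is a sketch that uses stand-in commands (false and cat) in place of the real stages, and it assumes bash rather than a plain POSIX sh.

```shell
#!/usr/bin/env bash
# Assumes bash: make a pipeline fail if any stage fails, not only the last.
set -o pipefail

# Stand-in pipeline: the first stage fails, the last stage succeeds.
# With pipefail set, the pipeline as a whole is treated as failed.
if false | cat; then
  pipeline_ok=yes
else
  pipeline_ok=no
fi
echo "pipeline_ok=$pipeline_ok"
```

Placing set -o pipefail before the export -> scrub -> import pipeline lets a script stop when any stage fails. Note that psql itself keeps going after an SQL error unless it is started with -v ON_ERROR_STOP=1.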
Stop and remove the Docker container:
docker stop $DBPS
docker rm $DBPS
All project documentation is currently available within the /doc folder.
© Copyright 2019, Grindrod Bank Limited, and distributed under the MIT License.