This repository was archived by the owner on Jun 27, 2019. It is now read-only.
* s3sb has now been superseded by a Python rewrite - http://github.com/BigglesZX/pys3sb *
//////////////////////////////////////////////////////////////////////////
//
//  s3sb has been superseded by a Python rewrite known helpfully as
//  pys3sb - http://github.com/BigglesZX/pys3sb
//
//  no further updates to s3sb will be committed - please check out
//  pys3sb
//
//////////////////////////////////////////////////////////////////////////

                               _.-~^^~-._
                           .-~'   s3sb   '~-.
                          {  S3 Site Backup  }
                           \       by       /
                            :  BigglesZX   :

// 0x00 . Intro /////////////////////////////////////////////////////////

Welcome to the s3sb README. This file should, if you're lucky (and if
I've finished writing it), contain everything you need to know to get
s3sb running in your environment.

This file will change along with s3sb itself, as different features are
improved and old behaviours retired. The time I have available to work on
projects like s3sb varies immensely, so I'll try to make sure this file
stays up to date - even in describing behaviours that will soon be
retired - so that you can use s3sb whatever state it's in.

// 0x01 . What is s3sb? /////////////////////////////////////////////////

In one sentence, s3sb - S3 Site Backup - is a Bash shell script designed
to back up whole websites - files and database contents - on to Amazon
S3.

It allows you to specify a directory where files are located, as well as
connection details for the database used by that site, and when executed
it will pack up those files into a .tar.gz file, dump the database into a
.sql.gz file, then upload both those files to an Amazon S3 account,
filing them neatly into a bucket named after the site. Timestamps are
included in the filenames to help future retrieval. How useful!

s3sb was designed to fulfil a need I had in a shared hosting environment.
Your environment may differ, so I'm trying to ensure that s3sb can cope
with a variety of hosting scenarios so that it can be as helpful to you
as it is to me. It probably won't be perfect yet, but I'm working on it.

s3sb makes use of the s3-bash library created by Raphael James Cohn
(http://code.google.com/p/s3-bash/). The library hasn't been updated
since version 0.02 in late 2007, so I have made a few tweaks to the code
to remedy a few issues. Accordingly, s3sb includes the s3-bash files
required to get the whole thing running. As s3-bash is licensed under
Apache License 2.0 (http://www.apache.org/licenses/LICENSE-2.0), so is
s3sb. See the LICENSE file for more details.
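To make that concrete, a full backup run boils down to roughly the
following shell steps. This is a simplified sketch only: the
<angle-bracket> values and TIMESTAMP are placeholders, it assumes a MySQL
database as in the examples later in this file, and the exact commands
the script runs may differ.

  # dump the database and compress the dump (placeholders, not the real script)
  mysqldump -h <dbhost> -u <dbuser> -p<dbpass> <dbname> | gzip > <sitename>-TIMESTAMP.sql.gz

  # pack up the site files
  tar -czf <sitename>-TIMESTAMP.tar.gz <dirname>/

  # both archives are then uploaded to S3 via the bundled s3-bash library,
  # filed under the site's name, and the local copies are cleaned up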
// 0x02 . Prerequisites /////////////////////////////////////////////////

s3sb is designed to be run from the shell or crontab, so you'll need a
UNIX-like environment in which to do that. s3sb should run without issue
on Mac OS X, but this hasn't been tested as much as Linux. Windows
(Cygwin) has not been tested at all. You're welcome to submit patches to
address any OS-specific deficiency.

The heart of s3sb's functionality is of course its interfacing with
Amazon S3, so you'll need an Amazon Web Services (AWS) account, first and
foremost. Once you've initialised S3 on your AWS account, you'll need
your "key" and "secret key" strings in order to allow s3sb access to your
S3 account. Instructions for setting up your key and secret key in the
s3sb config files are provided in the Installation and Configuration
section.

Since s3sb includes the s3-bash library, you will also need to satisfy
the dependencies of that library: GNU coreutils, curl and openssl. Please
see the documentation for your system for instructions on getting those
packages installed (you've probably got them installed already if you're
using any recent version of a Linux distribution).

As previously mentioned, I've tweaked the bundled version of s3-bash to
fix a few issues that were present in the stock code. Please see the
s3-bash Issues page on Google Code
(http://code.google.com/p/s3-bash/issues/list) for full details on the
bugs. The fixes are:

  > Issues #6 and #11
    A bug in the way headers were signed, fixed thanks to the patch
    provided by batlogg on Issue #11.

  > Issue #7
    A URL-forming bug preventing the use of European S3 buckets, fixed
    thanks to the patch provided by jon.keys

  > Issue #17
    URL-encoded entities (e.g. %20) in filenames would break put
    operations. Fixed thanks to the suggestion in the Issue comments
    from "admin".

Feel free to use the code from my updated s3-bash files to patch your own
s3-bash installation. The Google Code project for s3-bash has seen very
little activity since the 0.02 release in 2007, so it's unlikely that
official updates will be forthcoming any time soon. If there are any
updates in the future I'll try and merge them into s3sb where
appropriate.

// 0x03 . Installation and Configuration ////////////////////////////////

Installation of s3sb is pretty simple - just clone the git repo or grab
the files by more manual means. Stick them in a sensibly named directory
on the machine from which you wish to make your backups. You could call
it "s3sb", "s3sitebackup", or even "awesomebackuptool". The choice is
yours.

Once the files are in place, you'll probably need to make the main s3sb
script executable. In most UNIX-like environments, this can be
accomplished with the following commands (substitute the name of your
s3sb directory from the previous step):

  $ cd awesomebackuptool/
  $ chmod u+x s3sb

Note that (as with all command-line examples in this README) the $
indicates your shell prompt, and should not be typed :).

The next step is to configure your AWS key and secret key. At the moment
these are stored in two text files in the lib/ directory. Note: for
reasons I won't go into here, the files containing your key and secret
key must be exactly the right length - the key file must be 20 bytes and
the secret key file 40 bytes. What I've found usually happens is that
most text editors append a newline to the end of your file when editing,
creating a horrid undesirable extra byte. Oh noes!

Thankfully, avoiding this is easy. Browse to the lib/ directory in your
s3sb installation, copy your AWS key (not secret key) to the clipboard,
and do the following:

  $ echo -n "<paste your key>" > key.txt

Ignore the <angle brackets>, but keep the double quotes! This will create
key.txt, containing your AWS key and no trailing newline, phew.

Creating your secret key file is a very similar process - copy your AWS
secret key to the clipboard, then do:

  $ echo -n "<paste your secret key>" > skey.txt

Once again, ignore the angle brackets, but keep the double quotes.
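If you want to reassure yourself that no stray newline crept into either
file, the byte counts are easy to check while you're still in lib/ (wc is
part of coreutils, which s3sb needs anyway). You should see exactly 20
and 40 bytes:

  $ wc -c key.txt skey.txt
  20 key.txt
  40 skey.txt
  60 total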
Now, MOST IMPORTANTLY, if you're installing s3sb in a shared environment
(e.g. a server on which there are users besides yourself), you MUST
change the permissions on the key and secret key files so that they're
readable only by your user. In fact, it's good practice to do this even
if there aren't any other users on your server, as it's possible there
will be in the future and you can bet you won't remember that your secret
key file is world readable.

So, back in the lib/ directory, run the following command:

  $ chmod 600 key.txt skey.txt

This will reset the permissions on your key files so they are only
readable by your user. Nobody else will be able to view their contents.
You can check the permissions were applied correctly by running (in
lib/):

  $ ls -lash *key.txt

You should see something like this:

  0 -rw------- 1 youruser yourgroup 20 2009-05-31 05:02 key.txt
  0 -rw------- 1 youruser yourgroup 40 2009-05-31 03:14 skey.txt

The permission settings appear in the second column. Yours should look
exactly like this. If you're in any doubt here, please consult your
operating system documentation.

Incidentally, the reason s3sb uses files to store your keys and doesn't
just read them off the command line is because anything passed as an
argument to a command can be viewed by other users on the system (in
"ps", for example). And we don't want anyone getting hold of your S3 keys
for obvious reasons.

Finally, create a new bucket on your S3 account named "s3sb" (this will
be configurable later). Your backups will be placed in this bucket.
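At this point everything s3sb needs is in place, so it can be worth a
quick sanity check that your keys and bucket actually work before the
first real backup. One way is to push a small test file up with the
bundled s3-bash put script. I'm assuming here that the s3-bash scripts
sit in lib/ alongside your key files, are executable, and take the
-k/-s/-T options - check the usage text at the top of the script if your
copy differs:

  $ cd lib/
  $ echo "s3sb test" > test.txt
  # assumes the bundled s3-put script and its -k (key), -s (secret key
  # file) and -T (file to upload) options
  $ ./s3-put -k "$(cat key.txt)" -s skey.txt -T test.txt /s3sb/test.txt
  $ rm test.txt

If that completes without complaint and test.txt shows up in your s3sb
bucket, you're good to go (feel free to delete it from the bucket again).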
// 0x04 . Usage /////////////////////////////////////////////////////////

Currently, s3sb takes a bunch of parameters on the command line in order
to configure the files and data that will be involved in your backup. One
of the first things on my TODO list is to rejig s3sb so that sites are
set up with individual "profiles" in a central config location. That
hasn't quite happened yet however, so for the time being s3sb must be
given all the necessary parameters on the command line.

s3sb runs in one of three "modes" - "files", "db" or "all". "Files" mode
will back up only the files in your site (not the database); "db" mode
only backs up the database (not the files); while the "all" mode will
back up everything. These modes exist so that you can back up files and
db on different schedules - for example I have a bunch of sites where the
files change far less often than the data, so I might back up the files
every week and the data every day.

Regardless of what mode you run s3sb in, the parameters it takes are the
same. You'll need the following info:

  i.   A 'friendly name' for your site (e.g. "acmewidgets") - this is
       used to name the directory and archive files into which your
       backups are packed and stored. No spaces or non-alphanumeric
       characters.

  ii.  The hostname for your database server where the site's database
       is hosted (it should also be accessible from the machine where
       the backup is being run - chances are it will be) - you can also
       use an IP address.

  iii. The username you want to use to connect to the database.

  iv.  The password for the above user.

  v.   The name of the database you want to connect to.

  vi.  The directory on your server where the website files that you
       want to back up reside. I usually set this to include the entire
       website, but you may wish to only back up a specific
       subdirectory, for whatever reason.

Running s3sb with no arguments/parameters will give you the usage
message, which describes where the info above needs to go:

  $ ./s3sb
  Usage: ./s3sb [all | files | db] sitename dbhost dbuser dbpassfile dbname dirname/

The only cryptic argument here should be "dbpassfile". As mentioned in
the previous section, we don't want to enter passwords as arguments on
the command line, as these are visible to other users. So, put your db
password (for our example "acme" site) into its own text file, as we did
with the AWS keys above:

  $ echo -n "<acmedbpass>" > acme.txt
  $ chmod 600 acme.txt

This will save your password into acme.txt and set its permissions so
only your user can read it. If this seems awkward it's because it is - in
a near-future version we'll be moving this config into a dedicated file
as previously mentioned.

So, to back up our example "acmewidgets" site, you'd run something like
the following:

  $ ./s3sb all acmewidgets mysql.acmewidgets.com acmedb
    /home/youruser/acme.txt acmewidgets /var/www/acme-public/

Note that I've split the above command onto multiple lines for easier
readability; when you type it out in your shell it should all be on the
same line.

s3sb will fire up, and you should see output similar to the following:

  s3sb: starting backup of acmewidgets (mode: all)...
  dumping database...
  archiving site files...
  tar: Removing leading `../' from member names
  uploading database dump...
    % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                   Dload  Upload   Total   Spent    Left  Speed
  100  6439    0     0  100  6439      0  29870 --:--:-- --:--:-- --:--:--  101k
  uploading site archive...
    % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                   Dload  Upload   Total   Spent    Left  Speed
  100 10.9M    0     0  100 10.9M      0   936k  0:00:11  0:00:11 --:--:-- 1137k
  cleaning up...
  done

Start your S3 browser of choice (I can recommend Cyberduck for OS X or
the Amazon S3 Firefox Organizer -
https://addons.mozilla.org/en-US/firefox/addon/3247) and connect to your
S3 account. Browse to the bucket you created earlier (probably called
s3sb). Within it you should see a directory named after your site
(acmewidgets), and within that should be two files, one .tar.gz
containing your site files, and the other a .sql.gz containing your
database dump. Download them, unpack them somewhere and check that all is
in order - make sure they include everything you're expecting.

You may now want to set up a cron job to run s3sb on a schedule. See the
documentation for cron on your system for more info on this. You'll want
to keep email notifications turned on the first few times to make sure
your backup is working properly. Also bear in mind the point in the Known
Issues section below about concurrency.
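For illustration, a crontab implementing the "files weekly, database
daily" idea mentioned above might look something like this for the
acmewidgets example. The install path is made up for the example, the cd
is there on the assumption that s3sb should be run from its own directory
so the lib/ files are found, and the two jobs are spaced a few minutes
apart per the concurrency note below - adjust to taste:

  # m  h  dom mon dow  command
  # database backup every day at 02:00
  0  2  *   *   *  cd /home/youruser/awesomebackuptool && ./s3sb db acmewidgets mysql.acmewidgets.com acmedb /home/youruser/acme.txt acmewidgets /var/www/acme-public/
  # file backup every Sunday at 02:10
  10 2  *   *   0  cd /home/youruser/awesomebackuptool && ./s3sb files acmewidgets mysql.acmewidgets.com acmedb /home/youruser/acme.txt acmewidgets /var/www/acme-public/

Remember that s3sb takes the full set of parameters regardless of mode,
so both lines carry the database details even though only one of them
dumps the database.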
// 0x05 . Known Issues //////////////////////////////////////////////////

s3sb is a work in progress - I've mostly been focussing on preparing my
original icky code for a github release (and writing this long-ass file),
so needless to say there are issues in the current s3sb code which you
should be aware of. These are being worked on - they're just not fixed
yet :).

  i.   Awkward config: As previously mentioned, having to specify all
       the options for a backup right there on the command line every
       time s3sb is run isn't the most convenient arrangement. Soon(TM),
       s3sb will use a new central config file that'll allow you to set
       up individual site "profiles" that contain all the config info
       s3sb needs, allowing you to start a backup run with a much
       shorter command. It'll also save you from having to deal with
       separate password files for each site. Phew.

  ii.  Error handling: Right now s3sb just runs through all the steps in
       the dump/package/upload cycle without doing much error checking
       to properly deal with failures in one or more steps. Additionally
       s3sb itself doesn't return a meaningful error code, making it
       tricky to use in conjunction with other scripts. This is in the
       TODO list and will eventually be written in.

  iii. S3 security: Make sure you trust anyone else who has access to
       your S3 account, as the backup files that s3sb produces are not
       encrypted. This is being considered as a future feature.

  iv.  Concurrency: I don't have any reason to suspect that this will be
       a problem, but I run all my s3sb cron jobs with a few minutes'
       space between them, to ward off any possible concurrency issues.
       I do this only because I haven't tested it. Future versions will
       be tested in this area :).

If you discover any other issues while using s3sb (and it's likely there
will be others), feel free to report them on github and I'll do my best
to fix 'em.

// 0x06 . Roadmap ///////////////////////////////////////////////////////

The following things are on the list for near-future development (in no
particular order).

  i.    Centralised config/site profiles: Mentioned a bunch of times
        above, this is what I want to get written into s3sb ASAP.

  ii.   Error handling: As mentioned in Known Issues, s3sb needs to
        handle errors properly.

  iii.  Concurrency testing: Need to ensure as much as possible that
        s3sb can run multiple instances concurrently without creating
        horrible issues in space/time.

  iv.   Backup pruning: I'd like to add the ability for s3sb to
        automagically prune backups older than a specified age from the
        S3 bucket. This strikes me as being a tiny bit tricky but it's
        on the wishlist.

  v.    Backup encryption: As mentioned above, the backups that s3sb
        puts on S3 aren't encrypted. It would be nice to have this as a
        feature in the future, perhaps using something simple like
        passworded AES.

  vi.   Granular DB backup: I'd like to provide more control over the
        database side of the backup process, in as much as drilling down
        to include/exclude specific tables.

  vii.  Granular file backup: As with the previous point, more granular
        control over the files included in the backup would be good,
        e.g. the ability to exclude certain files/dirs.

  viii. Size reporting: Add a size report for generated archives.

  ix.   Sanity checks: Check for the existence of the key files, the
        s3-bash libs, etc.

  x.    Permission checks: Check permissions on the key files when run.

That's it for now. Feel free to suggest other features you think would be
useful. I can't promise to be able to add them in a timely fashion (you
could always submit your own patch), but I'll definitely consider all
suggestions.

If you made it this far, congratulations! Your dedication to the README
is much appreciated. Hope you enjoy s3sb - let me know how you get on
using it :). This is my first proper OSS release and it would give me
great pleasure to know that someone else is using and enjoying something
I wrote :).

Happy landings,
BigglesZX
http://github.com/BigglesZX/
http://github.com/BigglesZX/s3sb

// EOF //////////////////////////////////////////////////////////////////
//////////////////////////////////////////////////////////////////////////