Posts tagged with "S3"

AWS security

Aka DARK WEB HACKER COST ME $1600 SHOCK HORROR!

After I set up my Jekyll site and uploaded the content to Amazon S3 using s3_website, I remember thinking that I really must re-read the section of the documentation about securing the configuration file, which stores the AWS credentials in plain text.

If the source code of your website is publicly available, ensure that the s3_website.yml file is in the list of ignored files. For git users this means that the file .gitignore should mention the s3_website.yml file.

So, I duly added 's3_website.yml' to .gitignore and pushed to GitHub. I wasn't sure whether this exclusion only applied from now on (.gitignore doesn't remove a file that has already been committed), so I checked whether the file was still in the repository. Unsurprisingly, it was, but GitHub provides detailed instructions on how to purge sensitive data from a repository's history.
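
For the record, GitHub's guidance at the time boiled down to rewriting the repository history so the file never existed and then force-pushing. A rough sketch (the exact incantation may differ from whatever GitHub recommends today):

$ git filter-branch --force --index-filter \
    'git rm --cached --ignore-unmatch s3_website.yml' \
    --prune-empty --tag-name-filter cat -- --all
$ git push origin --force --all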

So, job done, and as my AWS credentials had only been exposed for 57 minutes, no harm done.

I went for lunch and returned to a voicemail from Amazon customer services asking me to contact them urgently about a 'security issue'. I also had an email and an AWS support case titled 'Your AWS account is compromised' describing, in detail, what corrective action I should take to promptly resolve the situation.

My heart sank a little as I followed the instructions and examined the list of running EC2 instances. 'Hmm, that's strange, I don't remember setting up 10 instances called "Ghost" in every region...'

I quickly terminated each instance and went to check my billing information. Phew. Usage for today was $0.00. Then I remembered a possible explanation: in the dim and distant past, I had experimented with a pre-built EC2 instance running Ghost. Maybe that was the reason but, deep down, I knew it wasn't, as they had all been started today and I don't think 'Ghost' was referring to the blogging platform.

Next, I had to lock down my AWS setup. First, although I already had a user account, I deleted the access keys associated with the 'root' account and changed my Amazon password. I also deleted the existing user and group, re-created them with new keys and followed the guidelines and best practice recommendations in the Identity and Access Management user guide.
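
For anyone doing the same clean-up from the command line, the key-rotation part can be sketched with the AWS CLI; the 'blog-deploy' user name and key id below are just placeholders, and the console works equally well:

# list the access keys attached to the (placeholder) deployment user
$ aws iam list-access-keys --user-name blog-deploy
# revoke the exposed key
$ aws iam delete-access-key --user-name blog-deploy --access-key-id AKIAXXXXXXXXEXAMPLE
# issue a fresh key pair for the user
$ aws iam create-access-key --user-name blog-deploy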

Then I enabled multi-factor authentication (MFA) for the AWS root account. This means that access now also requires a six-digit code generated by Google Authenticator on my mobile phone.

The following morning, I logged onto AWS and checked my bill. In a short period of time, the impostor had clocked up $1600 worth of charges, despite Amazon locking down the account once they detected the compromise. I contacted Amazon customer support, who said they would refund the excess charges due to this 'mishap' and would notify me once this was 'approved'. A little ambiguous, but hopefully I will get reimbursed, although strictly speaking this 'mishap' was down to my own stupidity.

Finally, I did what I should have done in the first place and moved the s3_website configuration file completely outside of the project directory, specifying its location when syncing the site:

$ s3_website push --config-dir ~/.s3_website

Otherwise, I can anticipate that if I change themes or platforms, I will repeat this idiotic error and Amazon may not be as understanding next time.

Now that it looks like the episode might be over, I am struck by how quickly Amazon detected the appearance of my AWS keys on GitHub. I presume they have an automated bot looking for this type of data, so maybe it's not uncommon. Secondly, what benefit did the hacker gain?

He ran 40 EC2 instances for a while before being detected and shut down. Why? Just because he could? In a way, I wish I'd had more time to investigate precisely what was running on those instances.

from GitHub Pages to Amazon S3

Although hosting on GitHub Pages is an excellent option, I decided to move this blog to Amazon S3, mainly because I have used S3 before. Also, GitHub Pages only supports a limited set of Jekyll plugins, and I wanted the flexibility to add any plugin and (potentially) run a different version of Jekyll.

I also took the opportunity to switch the theme to the rather minimalistic but stylish Poole and installed the useful s3_website utility to automatically synchronise the static site to Amazon S3. As you are charged for upload/download traffic, an intelligent sync process (rather than re-uploading everything on each deployment) is important.
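
For reference, the configuration that s3_website needs is tiny; a minimal s3_website.yml looks something like this (placeholder values - and this is exactly the file that must never end up in a public repository):

$ cat s3_website.yml
s3_id: AKIAXXXXXXXXEXAMPLE
s3_secret: your-secret-access-key
s3_bucket: www.example.com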

Amazon customer service

I am currently hosting this site on Amazon's Simple Storage Service (S3). For the first 12 months I am eligible for the Free Usage Tier pricing.

The Free Tier doesn't cover everything, but it does include '5 GB of Amazon S3 standard storage, 20,000 Get Requests, and 2,000 Put Requests'.

Initially, I had to test, review and deploy the entire site a few times before I got things right, and Google's crawler was busy re-indexing the site, so I wasn't wholly surprised when September's bill came to a measly 15 cents.

The breakdown was as follows:

  • S3 storage $0.01
  • GET requests $0.03
  • PUT requests $0.08
  • Tax $0.03

The only element that puzzled me was the S3 storage charge, given that storage is free for up to 5 GB. I checked the size of the site, which is just 21 MB (all images are outsourced to Picasa).

$ du -sh public
21M    public

I sent an email to Amazon customer service asking for clarification - not because I can't afford a penny - but because I would like to understand the pricing structure ready for when the 12 month Free Tier period expires.

In the interim, I found the answer in the AWS FAQ - the Free Tier assumes standard S3 storage will be used, whereas I was using the following 's3cmd' command to deploy my site:

s3cmd sync --acl-public --reduced-redundancy public/* s3://#{s3_bucket}/

The choice of the Reduced Redundancy Storage option made sense as it normally costs less ($0.093 per GB) than standard storage ($0.125 per GB), this is a low-traffic website and I have multiple backups anyway.

However, this caveat is actually covered in the last section of the FAQ:

Does the AWS free tier include Amazon S3 Reduced Redundancy Storage (RRS)?

No, the AWS free tier does not include Amazon S3 RRS storage. The AWS free tier includes 5 GB of Amazon S3 standard storage, which offers the highest Amazon S3 durability.
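
So the fix, if I want the storage to count against the Free Tier, is simply to drop the flag and let objects land in the standard storage class:

$ s3cmd sync --acl-public public/* s3://#{s3_bucket}/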

A couple of days later I received a response from an Amazon Customer Service rep who confirmed that Reduced Redundancy Storage wasn't covered by the free tier, apologised for the misunderstanding and applied a $5 credit to my AWS account for the 'inconvenience caused'. For me, this will probably equate to 3 years of 'free' hosting.

Once again, fantastic customer service from Amazon. I was originally thinking of investigating alternative hosting options when the 12 month period expires but, on reflection, I don't think I will bother.

migration complete

The last ever migration of this blog is now complete. It is now powered by Octopress and served as a statically generated site hosted on Amazon S3.

All posts have been migrated from HTML to Markdown and every single permalink (all 954 of them) has been painstakingly checked, rationalised and consolidated.

To achieve this, I simply generated a sitemap of the Drupal site and compared it with the sitemap of a test Octopress site built after the data migration.
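
The comparison itself boils down to pulling the <loc> entries out of each sitemap and diffing the sorted lists; something along these lines (file names are illustrative):

$ grep -o '<loc>[^<]*</loc>' drupal-sitemap.xml | sed 's/<[^>]*>//g' | sort > drupal-urls.txt
$ grep -o '<loc>[^<]*</loc>' octopress-sitemap.xml | sed 's/<[^>]*>//g' | sort > octopress-urls.txt
$ diff drupal-urls.txt octopress-urls.txt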

This comparison unveiled a few issues that needed to be fixed:

  • Posts with identical slugs had a numeric suffix, which was often incorrect or inconsistent after being mangled by various blogging platforms.
  • Some posts had an incorrect publication date (due to a timezone shift), so they were typically a day out.
  • Some posts were simply missing after the 'exitwp' script was used to migrate from WordPress to Hyde a year ago.
  • Hyde uses a slightly different header format from Jekyll, but 'sed' was able to fix this (see the sketch after this list).
  • Jekyll uses a trailing slash for each post URL whereas Drupal doesn't.
  • Amazon S3 requires the canonical URL to be www.site.com, with an automatic redirect pointing site.com at the correct www-prefixed URL. Previously, I favoured the naked URL 'site.com'.
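
As an illustration of the 'sed' fix mentioned above - the actual field names in my front matter may have been different - renaming a header key across every post is a one-liner:

# hypothetical example: rename a Hyde-era 'created' key to the 'date' key Jekyll expects
$ sed -i 's/^created:/date:/' source/_posts/*.markdown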

The permalink structure is now 'site.com/yyyy/mm/dd/hello-world/' (with a trailing slash) and will never change. Ever. Again.
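
In Jekyll/Octopress terms, that structure corresponds to a permalink setting along these lines in _config.yml:

permalink: /:year/:month/:day/:title/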

I also resurrected some orphaned Disqus comments by using the URL mapping tool, which works brilliantly and helped identify comments associated with non-existent URLs.
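
The mapping tool just takes a CSV of old and new URLs, one pair per line, roughly like this (the paths are illustrative):

http://nbrightside.com/node/123, http://www.nbrightside.com/2011/05/10/hello-world/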

I am generally delighted with Octopress as it bundles so many features I need for a blog (Disqus, Google Analytics, etc.) and is much easier than using raw Jekyll.

The only vague disappointment is that the entire site is regenerated even when a single post is added. On my Aspire One netbook, a 'rake generate' takes 8 minutes. I might try the same process on my work laptop (a faster, newer Lenovo ThinkPad) for comparison.

Inevitably, there is a Jekyll fork that supports incremental deployment but the Octopress author is (understandably) reluctant to base Octopress on a fork that could quickly become stale.

Publishing the site to Amazon S3 is slightly better but, as Atom feeds get regenerated for categories, this still takes around 4 minutes.

Still, maybe this lengthy publishing process will encourage me to properly preview and get my posts perfect before publishing.

I am not sure about having all 954 posts stored in a single directory; I would rather have a sub-directory for each year but then again, being able to quickly search all posts for a keyword using 'grep' is useful.
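
For what it's worth, that keyword search is a one-liner against the Octopress source tree (default layout assumed):

$ grep -ril 'disqus' source/_posts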

I decided to keep the Feedburner integration for now (to avoid losing my two readers).

The use of a statically generated site also killed one of my favourite features - my legendary and award winning rotating tagline. Oh well.

Blogging like a hacker but publishing like a snail with a heavy weight strapped to his back.

migration plan

Loose thoughts on the plan of attack for the blog migration:

  1. Install Octopress locally
  2. Configure S3 and install a dummy Web site.
  3. Use 's3cmd' to upload the test site to Amazon S3.
  4. Test incremental uploads. This is a firm requirement.
  5. Full database backup of existing Drupal blog
  6. Take backup of Drupal installation (additional modules, scripts).
  7. Install vanilla Drupal 7 locally.
  8. Install a copy of the existing Drupal blog in the local version (overwrite the database?).
  9. Use the Drupal to Octopress migration script. This extracts nodes from the database and creates a Markdown file for each post. The script is probably for Drupal 6, so some tweaks (or a major rewrite) may be needed for bleeding-edge Drupal 7. URL aliasing is supposedly supported.
  10. Test the various elements in the checklist. Disqus comments need the correct domain name so will have to come last.
  11. Configure a redirect from 'nbrightside.com' to the Amazon URL. I can see trouble and lots of Googling here (see the sketch after this list).
  12. Place source code (Markdown posts) into GitHub repository.
  13. Put kettle on.
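
On item 11, the usual S3 approach is a second, empty bucket named after the naked domain that redirects every request to the www host; a sketch using the AWS CLI (not something I have tried yet):

$ aws s3api put-bucket-website --bucket nbrightside.com \
    --website-configuration '{"RedirectAllRequestsTo": {"HostName": "www.nbrightside.com"}}'

DNS for the naked domain then has to point at that bucket's S3 website endpoint (a Route 53 alias record, for example).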

Autumn migration

My Web hosting package (provided by Bluehost) expires in October. As this blog is essentially dead (the last post was a one-liner 8 months ago), the sensible and logical thing to do would be to kill the blog and save £5 a month.

Originally I purchased the domain name 'nbrightside.com' and the Web hosting for a few reasons:

  • I wanted to use self hosted WordPress without some of the restrictions imposed by WordPress.com
  • I wanted to play with some of the packaged applications offered by Bluehost.
  • I wanted access to a Linux environment, mainly to build, install and experiment with various open source tools and packages that needed a LAMP stack.

It's really questionable whether I need to maintain this Web presence but, on balance, I'd like to keep the site alive for a little longer.

WordPress, Drupal, Habari et al are all fantastic blogging platforms but rather overkill for this simple, single-user blog. For a while, I have been fascinated by the simplicity and power of static Web site generators like Jekyll and Hyde, and have been trying to resist the temptation.

Last year, I even ported the complete contents of this Drupal 7 blog to a locally installed version of Hyde and laboriously fixed up lots of hyperlinks just so the Markdown looked neater.

The completely logical and sensible decision would be to simply resurrect this Hyde environment, re-sync the last couple of one-liner blog posts, configure an automatic redirect and use rsync to upload the site to some alternative, cheaper (or free) Web hosting.

So, I have decided to use Octopress and Amazon S3 to host this humble, annually updated blog in the future. I may be able to reuse some of the Hyde content with judicious use of sed to convert the metadata in the header sections, or I may just start afresh.

No - I am not mad.