Scripting EC2

October 30th, 2006 by morgan

I wrote a shell script that will automate the creation of EC2 instances. Really, all it does is glue together the existing command line tools that Amazon provides in a fairly crude manner. However, it works and it makes my life easier, so I like it. Hopefully it can do the same for you ….

Things You MUST Understand

  • Every time you run this script you will be charged for at least one hour’s worth of time by Amazon, even if you shut things down immediately. These aren’t my rules; I have complained about them previously.
  • Make sure you use the ec2-terminate-instances script after running this script. If you don’t, you will be charged by Amazon until you shut the instance down.
  • You have to have the EC2 API Tools installed for this script to work.
  • You have to have an active EC2 account for this script to work.
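
Since forgetting to terminate an instance costs real money, it can help to make the cleanup automatic. The following is a minimal Python sketch of that idea, not part of the script described here. The names running_instance, launch, and terminate are hypothetical: launch and terminate are callables you would supply yourself, for example thin wrappers around the ec2-run-instances and ec2-terminate-instances tools.

```python
from contextlib import contextmanager


@contextmanager
def running_instance(launch, terminate):
    """Launch an instance and guarantee that it is terminated afterwards.

    `launch` and `terminate` are hypothetical callables supplied by the
    caller; in practice they might wrap the ec2-run-instances and
    ec2-terminate-instances command line tools.
    """
    instance_id = launch()
    try:
        yield instance_id
    finally:
        # Runs even if the body raises, so the instance is not left billing.
        terminate(instance_id)


if __name__ == "__main__":
    terminated = []
    # Stand-in callables so the sketch runs without touching EC2.
    with running_instance(lambda: "i-example", terminated.append) as instance_id:
        print("working with", instance_id)
    print("terminated:", terminated)
```

The try/finally is the whole point: the terminate step runs whether the work in the middle succeeds or blows up.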

Caveats

  • This script hasn’t been tested by anyone other than me. It works just fine for me, and I am able to use it on a regular basis. However, if you are looking for a polished, documented, or commercial product then this is the wrong place for you (unless you are willing to pay, of course).
  • The script is for UNIX, and was developed under Mac OS X. It may run under Windows with Cygwin, although I haven’t tried (and don’t really want to).
  • The script is written in Korn shell (ksh). It doesn’t use any particularly odd syntax, so my guess is that it would run under other shells. To be honest I haven’t tried, simply because I use ksh so much in my current work that I am far more efficient with it than with any other shell. If anyone wants to test it and/or clean it up to run more universally, it would be most appreciated.

Licensing

This script is provided under the MIT license. To summarize, it is provided as-is and can be used free of charge. I am not liable for your screw-ups.

OK, with all that being said, if you still want to use it you can download the script here.

Amazon EC2 Billing Discrepancy

October 13th, 2006 by morgan

I just found out the hard way that EC2 does not charge by the number of instance minutes used, but instead by the number of hours, or portions of an hour, that each individual instance uses.  Sound the same?  Well, it depends on how you are using it :-(

I wrote a simple shell script that automates the process of creating an EC2 instance.  It is not particularly complicated, but I had to run it a number of times until I got it to behave exactly the way I wanted it to.  During this time, I was creating instances and shutting them down a few minutes later, so that I could test the program execution and flow.  I tested quite a bit with flat files, although that isn’t quite the same as a live run.

Unfortunately, this means that for each of these attempts I was charged for an hour’s worth of time, even though each time I used less than 5 minutes.  This is a bit bewildering to me, as in my day job I work in Data Warehousing and I know that it is a relatively simple procedure to summarize the number of instance minutes used before billing the credit card.  So, I feel that I was overcharged by a significant amount, due to a lack of effort.  There have been billing disputes with other internet companies over similar practices, and they ended up in litigation.

To be fair:

  1. My entire bill so far has been $4.20, which is not incredibly high.  Still, I hardly think this is fair, considering I probably used less than $0.25 worth of actual time.
  2. This policy is spelled out in the AWS terms of service, once you dig in and look for it.  However, billing this way is not intuitive, and I think it is hidden on purpose.

Regardless of legalities, I don’t think this is the right way to do things.  I hope that the folks at AWS will fix this before the beta ends, as it isn’t going to keep customers happy ::sigh::
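
To make the difference concrete, here is a small Python sketch of the two ways of adding up the bill. Everything numeric in it is an assumption for illustration: the rate of 10 cents per instance-hour, and a hypothetical 42 test runs of 3 minutes each, chosen only because they land on the $4.20 total mentioned above. The function names are made up as well.

```python
import math

RATE_CENTS = 10  # assumed price in cents per instance-hour


def billed_by_instance_hour(run_minutes, rate_cents=RATE_CENTS):
    """Round each instance's runtime up to whole hours, then price it."""
    hours = sum(math.ceil(minutes / 60) for minutes in run_minutes)
    return hours * rate_cents


def billed_by_minute(run_minutes, rate_cents=RATE_CENTS):
    """Add up all the minutes first, then price the total."""
    return sum(run_minutes) * rate_cents / 60


if __name__ == "__main__":
    runs = [3] * 42  # hypothetical: 42 short test runs of 3 minutes each
    print("per instance-hour: %d cents" % billed_by_instance_hour(runs))
    print("per minute:        %.1f cents" % billed_by_minute(runs))
```

With many short runs, the rounding happens once per instance rather than once per bill, which is where the gap comes from.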

links for 2006-10-04

October 4th, 2006 by morgan

Getting Started With EC2

October 1st, 2006 by morgan

There is a lot of information out there about EC2, including comprehensive documentation from the AWS Team. If you need to know something in depth, go there for the best answers. This post is more about how to get started quickly with EC2 than anything else.

A Quick Introduction

According to the documentation, EC2

presents a true virtual computing environment, allowing you to use web service interfaces to requisition machines for use, load them with your custom application environment, manage your network’s access permissions, and run your image using as many or few systems as you desire.

This is a bit overtechnical for my tastes, but I am just a lowly customer. In simple terms, this means that you can

  • Create an image of what you would like a fully configured Linux box to look like.
  • Load this image to one or more machines quickly, easily, and on demand.
  • Pay a small charge for each machine that you use, billed per instance-hour.

This sounds revolutionary, and to some extent it is. While other companies have tried it, the concept hasn’t really taken off. Between the economics and the complexity, it is a difficult nut to crack. Amazon’s real innovations are in pricing, licensing, and in leveraging an already existing infrastructure. Essentially, users of AWS will be paying for the infrastructure that Amazon.com is going to be using during peak usage (around Christmas, I would guess).

The system is very cool, but it does have some caveats, especially for organizational use:

  • While EC2 abstracts the hardware, it still leaves the difficulty of application and information logistics. As many have said, “Amateurs talk about strategy, dilettantes talk about tactics, and professionals talk about logistics.”
  • Data persistence is a genuine issue with EC2. An unexpected crash can be devastating if you are going to lose all your information without recourse.
  • Licensing becomes difficult under this scenario for a lot of enterprise software. Free software and open source make this all possible.

With this in mind, let’s get rolling.

Super Quick Start

If you are really impatient and want to get rolling, all you have to do is:

  1. Follow my instructions for setting up AWS.
  2. If you are going to use S3 (and it is likely that you are), follow my instructions for using S3.
  3. Follow the EC2 instructions from overstimulate that show the basics of getting up and running with EC2.

Because someone already has pretty good instructions on how to get started with EC2, I am not going to repeat them. Follow these steps and you will get a basic understanding of what is going on. You can be up and running with a generic Linux box that will do your bidding.

Some Customization

To Amazon’s credit, it is relatively easy to make an off-the-shelf Linux server from one of the ready-made images. However, this isn’t what I am interested in. I am much more interested in the ability to make lots of identical machines that have my own customized specifications. While this isn’t quite as easy, it isn’t much harder, provided you know what you are doing.

What I really want to do is have my own customized:

  • Users
  • Directories
  • Permissions
  • Software (Python 2.5 with SQLite, key-based access with SSH, access to S3 and SQS, etc.)

The Long Way Home 

It turns out that the amount of work that is required to get started with EC2 would make this post unbearably long.  So, if you are interested in more details, check out my later posting on establishing an EC2 presence.

Getting Started with S3

September 29th, 2006 by morgan

OK, once you have done the steps mentioned in the article “Getting Started with AWS” you should be able to actually do something. Probably the easiest service to start with is S3 (short for Simple Storage Service).

What is S3?

Good Question. The AWS folks describe S3 as:

… a simple web services interface that can be used to store and retrieve any amount of data, at any time, from anywhere on the web. It gives any developer access to the same highly scalable, reliable, fast, inexpensive data storage infrastructure that Amazon uses to run its own global network of web sites. The service aims to maximize benefits of scale and to pass those benefits on to developers.

OK, so that is about half marketing-speak and about a third gobbledygook. Imagine S3 as a very large, very simple place to store files. It isn’t a hierarchical file system. Instead, everything is stored in buckets, which are containers for objects stored on S3. Each object has a key that uniquely identifies it within its bucket. You use a bucket like a directory, an object like a file, and its key like the file name.

The documentation says that:

Every object in Amazon S3 can be uniquely addressed through the combination of the Service endpoint, bucket name, and key, as in http://s3.amazonaws.com/doc/2006-03-01/AmazonS3.wsdl, where “doc” is the name of the bucket, and “2006-03-01/AmazonS3.wsdl” is the key.
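
As a rough illustration of that addressing scheme, the Python sketch below builds such an address from its three parts and takes it apart again. The function names object_url and split_object_url are made up for this example; only the endpoint, bucket, and key come from the documentation quoted above.

```python
from urllib.parse import quote, unquote, urlsplit

ENDPOINT = "s3.amazonaws.com"


def object_url(bucket, key, endpoint=ENDPOINT):
    """Build an address from the endpoint, then the bucket, then the key."""
    # quote() leaves '/' alone by default, so keys containing slashes survive.
    return "http://%s/%s/%s" % (endpoint, bucket, quote(key))


def split_object_url(url):
    """Split an address back into (endpoint, bucket, key)."""
    parts = urlsplit(url)
    bucket, _, key = parts.path.lstrip("/").partition("/")
    return parts.netloc, bucket, unquote(key)


if __name__ == "__main__":
    url = object_url("doc", "2006-03-01/AmazonS3.wsdl")
    print(url)
    print(split_object_url(url))
```

Note that only the first path segment is the bucket; everything after it, slashes included, is the key.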

Actually Doing Something

This all sounds great, but in reality we want to actually do something with this giant file system. Now, here is the tricky part. S3 is really set up as a developers system, and the tools are in a bit of a rough form.

At this point, you have three choices:

  1. Buy a tool that has S3 support in it.
  2. Pick a language and code it yourself.
  3. Use ready made tools.

There are already lists of tools that support S3, and examples of #2 and #3 in the S3 Code Samples, and they are worth checking out. Each of them requires a different amount of configuration, and what you use is really up to you. I would recommend #1 if you are using Windows, or if you are neither an experienced developer nor willing to learn the hard way.

I am fortunate enough to use Mac OS X, so I really have my run of things I can use. I picked s3curl because I am lazy and don’t want to have to make, configure, or compile anything, and I am comfortable on the command line. This didn’t work (missing Perl packages, methinks), so I switched to s3-shell, although I did have to install ant to do it. It worked fine, but it is missing some pretty basic things, like the ability to set access to the buckets and objects you have created. This won’t do!

The best solution I could find was jSh3ll. It is a ready-made Java tool that comes compiled and has a lot of nice command line options, including the ability to run scripted commands from an input file. So, for now, that is what I am going to use in the descriptions below.

The Nitty Gritty

Now, start jSh3ll and enter the following commands (substituting your own access_key_id and secret_access_id):

$ java -jar dist/jSh3ll.jar

Welcome to jSh3ll (Amazon S3 command shell for Java) (c) 2006 SilvaSoft, Inc.
Type 'help' for command list.

jSh3ll> host s3.amazonaws.com
jSh3ll> user access_key_id
jSh3ll> pass secret_access_id
jSh3ll>

At this point, you should be logged in. To see what commands are available, you can use help:

jSh3ll> help
bucket [bucketname]
count [prefix]
createbucket
delete
deleteall [prefix]
deletebucket
exit
get
getacl ['bucket'|'item']
getfile
getfilez
gettorrent
head ['bucket'|'item']
host [hostname]
list [prefix] [max]
listatom [prefix] [max]
listrss [prefix] [max]
listbuckets
pass [password]
put
putfile
putfilez
putfilewacl ['private'|'public-read'|'public-read-write'|'authenticated-read']
putfilezwacl ['private'|'public-read'|'public-read-write'|'authenticated-read']
quit
setacl ['bucket'|'item'] ['private'|'public-read'|'public-read-write'|'authenticated-read']
time ['none'|'long'|'all']
threads [num]
user [username]
jSh3ll>

Next, create a bucket (substitute its name for bucket-name below) and upload a file to that location.

jSh3ll> bucket bucket-name
Bucket set to 'bucket-name'
jSh3ll> createbucket
Created bucket 'bucket-name'
[runtime: 2.614s]
jSh3ll> list
Item list for bucket 'bucket-name'
jSh3ll>

This shouldn’t return anything, as the bucket should be empty. Now, let’s try uploading something (substitute your file’s name for file-name below):

jSh3ll> putfile key-name file-name
Stored item 'bucket-name/key-name'
jSh3ll> list
Item list for bucket 'bucket-name'
key=key-name, owner=user-name, size=XXXX bytes, last modified=Fri Sep 29 23:45:36 CDT 2006
[runtime: 2.077s]
jSh3ll>

At this point, you should see key-name, as this is the name for the file on the remote system. Also, at this point this exercise is costing you money! Just to really prove to yourself that it is there, you can look at it through your web browser. First, we have to open up the Access Control List (or ACL) so that the bucket and object can be seen:

jSh3ll> setacl bucket bucket-name public-read
Set ACL for bucket 'bucket-name' to public-read
jSh3ll> setacl item key-name public-read
Set ACL for item 'bucket-name/key-name' to public-read

At this point, your file should be available at http://bucket-name.s3.amazonaws.com/key-name. If you wanted to be really clever and give your file a key-name like docs/critical/README then it would appear like a truly hierarchical file system.
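
To see how keys with slashes can stand in for directories, here is a minimal Python sketch that groups a flat list of keys the way a directory listing would. It works on a plain list in memory and does not talk to S3. The function name list_directory and all of the sample keys other than docs/critical/README are hypothetical.

```python
def list_directory(keys, prefix=""):
    """Return the immediate 'children' of prefix, treating '/' as a separator."""
    children = set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if not rest:
            continue  # the key is the prefix itself, nothing below it
        head, sep, _ = rest.partition("/")
        # A trailing '/' marks a pseudo-directory rather than an object.
        children.add(head + sep)
    return sorted(children)


if __name__ == "__main__":
    keys = [
        "docs/critical/README",
        "docs/critical/TODO",
        "docs/index.html",
        "logo.png",
    ]
    print(list_directory(keys))
    print(list_directory(keys, "docs/"))
    print(list_directory(keys, "docs/critical/"))
```

The storage itself stays flat; the hierarchy exists only in how the key names are read.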

Cleaning Up

The last thing you need to do is to delete the file and bucket to make sure you don’t incur any extra cost.

jSh3ll> delete key-name
Deleted 'bucket-name/key-name'
jSh3ll> deletebucket
Deleted bucket 'bucket-name'

To verify this you can check the URL again and make sure that you get back an error message.
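
If you would rather check from a script than from a browser, something like the following Python sketch would do. The function name is_gone is made up, and the fetch parameter exists only so the function can be tried out without a network connection. The sketch assumes that the service answers with an HTTP error status once the object and bucket are deleted.

```python
import urllib.error
import urllib.request


def is_gone(url, fetch=urllib.request.urlopen):
    """Return True if fetching url fails with an HTTP error status."""
    try:
        response = fetch(url)
    except urllib.error.HTTPError:
        # Any HTTP error status counts as "no longer available" here.
        return True
    response.close()
    return False
```

Calling is_gone with the address from the previous step (http://bucket-name.s3.amazonaws.com/key-name, with your own names substituted) should return True after the cleanup.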

about


Architected.info is a web site dedicated to information architecture, focusing on transformation and understanding. We focus on these categories through the lens of organizational dynamics, looking at people, practices, and relationships.

Morgan Goeller is the author and maintainer of this website. He has worked as an architect and engineer, specializing in software development, web applications, database engineering, ETL, and information quality.
