<?xml version="1.0" encoding="UTF-8"?>
<!-- generator="wordpress/2.0.6" -->
<rss version="2.0" 
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	>

<channel>
	<title>Architected Information</title>
	<link>http://www.architected.info/blog</link>
	<description>How people, practices, and information are transformed into relationships and understanding.</description>
	<pubDate>Mon, 21 Jul 2008 15:06:45 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.0.6</generator>
	<language>en</language>
			<item>
		<title>Improving EC2 Addressing</title>
		<link>http://www.architected.info/blog/improving-ec2-addressing</link>
		<comments>http://www.architected.info/blog/improving-ec2-addressing#comments</comments>
		<pubDate>Tue, 13 Mar 2007 14:41:32 +0000</pubDate>
		<dc:creator>morgan</dc:creator>
		
		<category>Uncategorized</category>

		<category>Over the Horizon</category>

		<category>Enterprise Web</category>

		<category>AWS</category>

		<guid isPermaLink="false">http://www.architected.info/blog/improving-ec2-addressing</guid>
		<description><![CDATA[Under The Radar has an interesting post about overcoming EC2&#8217;s weaknesses, dynamic IP addressing and 24&#215;7 operations.  The folks at WeoGeo have designed an application called WeoCEO that supposedly addresses these issues.  I think this is a very exciting development, especially when combined with the ability to use S3 as a file system.
The hype about [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://blogs.weogeo.com/pbissett" onclick="javascript:urchinTracker ('/outbound/article/blogs.weogeo.com');">Under The Radar</a> has an interesting post about <a href="http://blogs.weogeo.com/pbissett/index.php/2007/03/09/weoceo-%E2%80%93-how-to-use-the-true-power-of-amazon-web-services/" onclick="javascript:urchinTracker ('/outbound/article/blogs.weogeo.com');">overcoming EC2&#8217;s weaknesses</a>, dynamic IP addressing and 24&#215;7 operations.  The folks at <a href="http://www.weogeo.com" onclick="javascript:urchinTracker ('/outbound/article/www.weogeo.com');">WeoGeo</a> have designed an application called <a href="http://blogs.weogeo.com/pbissett/index.php/2007/03/09/weoceo-%e2%80%93-how-to-use-the-true-power-of-amazon-web-services/#comment-2" onclick="javascript:urchinTracker ('/outbound/article/blogs.weogeo.com');">WeoCEO</a> that supposedly addresses these issues.  I think this is a <em><strong>very</strong></em> exciting development, especially when combined with <a href="http://www.architected.info/blog/using-s3-as-a-file-system" >the ability to use S3 as a file system</a>.</p>
<p>The hype about EC2 is that it enables people to &#8216;rent-a-cloud&#8217;, paying for and using as many or as few servers as they wish at any given time.  The problem is that currently these servers are limited in ways that seem minor until you start working with them on practical matters.  This is really visible when <a href="http://blog.apokalyptik.com/?p=128" onclick="javascript:urchinTracker ('/outbound/article/blog.apokalyptik.com');">working with data and databases on EC2</a>: essentially, you have to change either your tools or your paradigm.  The problem then becomes one of economics, where you try to balance the savings from renting a server and storage against the cost of solving a problem with tools that require a lot of customization.<br />
I would love to be a part of the WeoCEO beta, and see how things work along with <a href="http://www.openfount.com/blog/s3infidisk-for-ec2" onclick="javascript:urchinTracker ('/outbound/article/www.openfount.com');">S3Infinidisk</a>.  If the problems of addressing and persistence are solved, then we truly have the ability to scale easily without large-scale customization.  Taking care of these is a giant leap forward, and will help EC2 truly live up to the hype that it has generated in the developer community.
</p>
]]></content:encoded>
			<wfw:commentRss>http://www.architected.info/blog/improving-ec2-addressing/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Using S3 as a File System</title>
		<link>http://www.architected.info/blog/using-s3-as-a-file-system</link>
		<comments>http://www.architected.info/blog/using-s3-as-a-file-system#comments</comments>
		<pubDate>Wed, 17 Jan 2007 22:41:05 +0000</pubDate>
		<dc:creator>morgan</dc:creator>
		
		<category>Information Architecture</category>

		<category>Systems Integration</category>

		<category>Over the Horizon</category>

		<category>Enterprise Web</category>

		<category>AWS</category>

		<guid isPermaLink="false">http://www.architected.info/blog/using-s3-as-a-file-system</guid>
		<description><![CDATA[Openfount has released S3Infinidisk for EC2, a product that answers a lot of wishes in the EC2 community.  One of the biggest issues with EC2 is the lack of persistent storage on the server instances, and this tool is a first attempt at a solution.
Basically, when you are using EC2, everything on your server [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://openfount.com/blog/" onclick="javascript:urchinTracker ('/outbound/article/openfount.com');">Openfount</a> has released <a href="http://www.openfount.com/blog/s3infidisk-for-ec2" onclick="javascript:urchinTracker ('/outbound/article/www.openfount.com');">S3Infinidisk for EC2</a>, a product that answers a lot of wishes in the <a href="http://www.amazon.com/gp/browse.html?node=201590011" onclick="javascript:urchinTracker ('/outbound/article/www.amazon.com');">EC2</a> community.  One of the biggest issues with EC2 is the lack of persistent storage on the server instances, and this tool is a first attempt at a solution.</p>
<p>Basically, when you are using EC2, everything on your server that isn&#8217;t statically defined before the machine runs goes bye-bye as soon as the machine reboots.  Not a real problem for software, but for data this is a major bummer.  Amazon makes <a href="http://www.amazon.com/gp/browse.html?node=16427261" onclick="javascript:urchinTracker ('/outbound/article/www.amazon.com');">S3</a> storage available without data transfer fees, which is wonderful.  However, it takes real effort to transfer data back and forth between the systems (with <a href="http://www.architected.info/blog/getting-started-with-s3" >something like jsh3ll</a>), and most data-centric tools (especially database servers) expect real-time access to a working file system.</p>
<p>S3Infinidisk bridges this gap, allowing an EC2 instance to use S3 like a real Linux filesystem.  While it is a bit of a hack, it allows data tools to work the way they need to and it allows an EC2 instance to take full advantage of the <a href="http://www.amazon.com/gp/browse.html?node=3435361" onclick="javascript:urchinTracker ('/outbound/article/www.amazon.com');">AWS</a> environment.  <em><strong>This is a huge step forward in making EC2 a more usable environment for utility computing!</strong></em></p>
<p>I haven&#8217;t had a chance to try the product out yet, but am excited to do so.  I appreciate the licensing structure (free single-user version + commercial high-performance version), although I would prefer seeing open source.  Also, since the tool is based on the <a href="http://fuse.sourceforge.net/" onclick="javascript:urchinTracker ('/outbound/article/fuse.sourceforge.net');">FUSE</a> subsystem, I could easily see this spreading like wildfire.
</p>
]]></content:encoded>
			<wfw:commentRss>http://www.architected.info/blog/using-s3-as-a-file-system/feed/</wfw:commentRss>
		</item>
		<item>
		<title>EC2, Licensing, and Competitive Advantage</title>
		<link>http://www.architected.info/blog/ec2-licensing-and-competitive-advantage</link>
		<comments>http://www.architected.info/blog/ec2-licensing-and-competitive-advantage#comments</comments>
		<pubDate>Fri, 08 Dec 2006 18:29:59 +0000</pubDate>
		<dc:creator>morgan</dc:creator>
		
		<category>ETL</category>

		<category>Information Architecture</category>

		<category>Systems Integration</category>

		<category>Over the Horizon</category>

		<category>Appliances</category>

		<category>AWS</category>

		<guid isPermaLink="false">http://www.architected.info/blog/ec2-licensing-and-competitive-advantage</guid>
		<description><![CDATA[I have been working with Amazon Web Services (and EC2) a lot lately, and have made some observations that really fly in the face of conventional wisdom.
I work in ETL, which means I need to get a hold of big iron to crunch on big data.   Machines are expensive, licenses are expensive, storage [...]]]></description>
			<content:encoded><![CDATA[<p>I have been working with <a href="http://aws.amazon.com/" onclick="javascript:urchinTracker ('/outbound/article/aws.amazon.com');">Amazon Web Services</a> (and <a href="http://aws.amazon.com/ec2" onclick="javascript:urchinTracker ('/outbound/article/aws.amazon.com');">EC2</a>) a lot lately, and have made some observations that really fly in the face of conventional wisdom.</p>
<p>I work in <a href="http://www.architected.info/blog/category/transformation/etl/" >ETL</a>, which means I need to get a hold of big iron to crunch on big data.   Machines are expensive, licenses are expensive, storage and networks are cheap.  Scalability is important, but measured. AWS would be perfect for sourcing ETL jobs that are one-offs or are particularly large or complex. However, the major vendors are very particular about making sure that their products are only installed on authorized machines.  They make it pretty difficult for you to cluster easily, especially if you are a little guy just starting out.</p>
<p>This is antithetical to the AWS approach to problems.  Here, machines and storage are dirt cheap, networks are pretty cheap, and scalability is paramount.  The most difficult thing is arranging a problem so that it can be worked on by your <a href="http://en.wikipedia.org/wiki/Infinite_monkey_theorem" onclick="javascript:urchinTracker ('/outbound/article/en.wikipedia.org');">infinite monkeys</a>, in the form of EC2 instances.  The biggest problem then becomes licensing.</p>
<p>In a highly scalable environment, it becomes incredibly compelling to use easily licensed software.  Compelling to the point where it becomes worth it to build your own tools instead of purchasing off the shelf.  For example, for a web server I could use <a href="http://www.apache.org" onclick="javascript:urchinTracker ('/outbound/article/www.apache.org');">Apache</a> or <a href="http://www.ibm.com/software/websphere" onclick="javascript:urchinTracker ('/outbound/article/www.ibm.com');">Websphere</a>.  Apache is free, and I can install it on my instance with absolutely no problems (as a matter of fact, it is pre-installed).  With Websphere I am going to have to purchase a license (or more), then monkey with the fact that it will be installed on a new machine with a new hostname each time.  You can make the same argument for MySQL vs. Oracle, or Python vs. .NET.</p>
<p>Now this isn&#8217;t an anti-corporate rant, not by a long shot.  But, I think it is a valid way to look at how licensing will be a competitive advantage in the future.  Software vendors should start looking at their products in terms of AWS and other compute farms, especially at the enterprise level.  Those who don&#8217;t get out in front of this are going to find their lunches eaten, and quickly.  There is quite a bit of hype around Web 2.0 companies these days; this could be a great way for someone to get their foot in the door of the Fortune 1000.</p>
<p>Perhaps Richard Stallman should send a Christmas card from <a href="http://www.firstmonday.org/issues/issue3_3/raymond/" onclick="javascript:urchinTracker ('/outbound/article/www.firstmonday.org');">the bazaar</a> to Jeff Bezos over at the cathedral this year &#8230;
</p>
]]></content:encoded>
			<wfw:commentRss>http://www.architected.info/blog/ec2-licensing-and-competitive-advantage/feed/</wfw:commentRss>
		</item>
		<item>
		<title>EC2 &#8212; Dynamic or Static (or Both)?</title>
		<link>http://www.architected.info/blog/ec2-dynamic-or-static-or-both</link>
		<comments>http://www.architected.info/blog/ec2-dynamic-or-static-or-both#comments</comments>
		<pubDate>Fri, 01 Dec 2006 21:57:00 +0000</pubDate>
		<dc:creator>morgan</dc:creator>
		
		<category>Information Architecture</category>

		<category>Systems Integration</category>

		<category>Automation</category>

		<category>Over the Horizon</category>

		<category>AWS</category>

		<guid isPermaLink="false">http://www.architected.info/blog/ec2-dynamic-or-static-or-both</guid>
		<description><![CDATA[A Problem
I have been working more and more with AWS and EC2 and one of the challenges in working with EC2 is dealing with the fact that each instance gets a dynamic IP address upon creation.  This makes it easy to crank out a large number of instances, which is a key feature [...]]]></description>
			<content:encoded><![CDATA[<p><em><strong>A Problem</strong></em></p>
<p>I have been working more and more with <a href="http://www.architected.info/blog/category/over-the-horizon/aws/" >AWS</a> and <a href="http://aws.amazon.com/ec2" onclick="javascript:urchinTracker ('/outbound/article/aws.amazon.com');">EC2</a>, and one of the challenges in <a href="http://www.architected.info/blog/?s=ec2-&#038;submit=Go" >working with EC2</a> is dealing with the fact that each instance gets a dynamic IP address upon creation.  This makes it easy to crank out a large number of instances, which is a key feature of the system.  At the same time, it makes it difficult to find and manage those instances in an automated, systematic way.  So, there is a disconnect here.</p>
<p><em><strong>A Solution</strong></em></p>
<p>A decent solution is <a href="http://en.wikipedia.org/wiki/Dynamic_dns" onclick="javascript:urchinTracker ('/outbound/article/en.wikipedia.org');">Dynamic DNS</a>.  That is, have your instance be assigned an easily recognized hostname as it is being started, while still keeping its dynamic IP address and creation.  To me, this seems like the best solution, as it lets the AWS folks be good at what they are good at (providing cool technology infrastructures) and allows their users to be good at what they are good at (making cool applications that use the infrastructure).</p>
<p>How can this be done?  Well, it takes a couple of steps:</p>
<ol>
<li>Establish an account with one of the myriad <a href="http://developer.amazonwebservices.com/connect/thread.jspa?threadID=11489&#038;start=15&#038;tstart=0" onclick="javascript:urchinTracker ('/outbound/article/developer.amazonwebservices.com');">DDNS providers</a>.</li>
<li>Configure your instance to use the DDNS software upon boot-up.</li>
<li>Profit!</li>
</ol>
<p>I have gotten some feedback about possibly using <a href="http://www.amazon.com/gp/browse.html/ref=sc_fe_c_1_3435361_6/104-0379937-1687105?%5Fencoding=UTF8&#038;node=13584001&#038;no=3435361&#038;me=A36L942TSJ2AJA" onclick="javascript:urchinTracker ('/outbound/article/www.amazon.com');">SQS</a> to do something similar to this.  I actually thought of this, but there are issues here around cost (cost per message) and configuration (<a href="http://developer.amazonwebservices.com/connect/thread.jspa?messageID=42365&#42365" onclick="javascript:urchinTracker ('/outbound/article/developer.amazonwebservices.com');">duplicates and ordering</a>).  Because I would like to maximize the system&#8217;s reliability and scalability, I would probably rule these out.</p>
<p><em><strong>Which Brings Another Problem &#8230;</strong></em></p>
<p>There is a caveat here, and it isn&#8217;t a small one if you don&#8217;t want to write code.  The big problem is that every instance you create is going to behave identically.  So, if you open up multiple instances each thinking they are &#8216;dynamic-name-1..com&#8217; then you will not get the results you are looking for.   Instead, most likely that name will be assigned to the last instance that started and the others will run around headless.</p>
<p><em><strong>Which Requires a Hack &#8230;</strong></em></p>
<p>This leaves a couple of options. First, you can write a script to go through your pool of potential host names, pick one that isn&#8217;t being used, and then request that name from the DDNS provider. Better yet is to pass the Dynamic DNS value you want to the instance through the keypair.</p>
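<p>The first option can be sketched in a few lines of Python. This is only an illustration under some assumptions: the pool of names (<code>POOL</code>, using <code>example.com</code> placeholders) is hypothetical, a name is treated as &#8216;in use&#8217; simply because it currently resolves in DNS, and the actual update request to the DDNS provider is left out because it depends entirely on the provider.</p>

```python
import socket


def resolves(hostname):
    """Return True if the name currently resolves in DNS (treated as 'in use')."""
    try:
        socket.gethostbyname(hostname)
        return True
    except socket.gaierror:
        return False


def pick_free_hostname(pool, in_use=resolves):
    """Return the first name in the pool that is not in use, or None if all are taken."""
    for name in pool:
        if not in_use(name):
            return name
    return None


# Hypothetical pool of names set aside with the DDNS provider.
POOL = ["node-1.example.com", "node-2.example.com", "node-3.example.com"]
```

<p>An instance would call <code>pick_free_hostname(POOL)</code> at boot and then ask the provider for whatever name comes back. Note that two instances starting at the same moment could still pick the same name, and cached DNS records can make a released name look taken, so this narrows the problem rather than eliminating it.</p>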
<p>Of course, the best solution would be for someone to write a script that would make each node configure itself to get a dynamic hostname.  I would think this would be something that would be attractive to most of the <a href="http://developer.amazonwebservices.com/connect/thread.jspa?threadID=11489&#038;start=15&#038;tstart=0" onclick="javascript:urchinTracker ('/outbound/article/developer.amazonwebservices.com');">DDNS providers</a>; perhaps one of them will read this and get cracking.  If I don&#8217;t spot anything in the near future I will probably write a simple KSH script to do this.</p>
<p><em><strong>Conclusion</strong></em></p>
<p>I think that at some point the folks at AWS are going to allow for some type of host identification, either through passing parameters to instances or by renting out static IP addresses or subdomain ranges. However, for now it will take some DIY in order to make this happen.</p>
<p>To be honest, probably the best solution here is to approach the problem as if you will never have a static IP address or host name and go from there. It will probably force you to think about your solution differently and challenge you to come up with a more flexible, scalable solution. It isn&#8217;t going to fit for every type of problem, but I think it works for the types of things that AWS is inherently good at.
</p>
]]></content:encoded>
			<wfw:commentRss>http://www.architected.info/blog/ec2-dynamic-or-static-or-both/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Establishing an EC2 Presence</title>
		<link>http://www.architected.info/blog/establishing-an-ec2-presence</link>
		<comments>http://www.architected.info/blog/establishing-an-ec2-presence#comments</comments>
		<pubDate>Wed, 01 Nov 2006 20:48:48 +0000</pubDate>
		<dc:creator>morgan</dc:creator>
		
		<category>Over the Horizon</category>

		<category>AWS</category>

		<guid isPermaLink="false">http://www.architected.info/blog/establishing-an-ec2-presence</guid>
		<description><![CDATA[A while back I thought I would sit down and spend a bit of time writing about how easy it was to roll your own cluster. I would do it myself, take a few hours to get things up and running, then record my thoughts for posterity. Well, it took a lot longer than I [...]]]></description>
			<content:encoded><![CDATA[<p>A while back I thought I would sit down and spend a bit of time writing about how easy it was to roll your own cluster. I would do it myself, take a few hours to get things up and running, then record my thoughts for posterity. Well, it took a lot longer than I thought it would, but here we are &#8230;<br />
Now, don&#8217;t get me wrong, it isn&#8217;t that anything about EC2 is particularly difficult, if you are a techie and are comfortable with the UNIX side of things. And, once you get things up and running it is very easy to use the infrastructure reliably and repeatably, just like the marketing hype says. However, getting to that point is harder than you might think.</p>
<p><strong>How Do I Get Started?</strong></p>
<p>If you are thinking about establishing a presence on EC2, the first thing you need to understand is persistence. Or, more accurately, the lack of it. An EC2 server forgets (or at least <a href="http://sa.muel.org/go/index.php/2006/10/02/24-whose-turn-is-it-to-clean-up-the-bits" onclick="javascript:urchinTracker ('/outbound/article/sa.muel.org');">should forget</a>) everything about itself every time it is rebooted, and has to be told what to load as it starts. This is very different from the standard desktop computing paradigm, where after a reboot you might lose any unsaved information. Not so with EC2.</p>
<p>Personally, I like to think of an EC2 server as the main character from <a href="http://www.imdb.com/title/tt0209144/" onclick="javascript:urchinTracker ('/outbound/article/www.imdb.com');">Memento</a>, who forgets everything that happened the day before whenever he goes to sleep. If you haven&#8217;t seen the movie, a good analogy would be if you carried around a CD-ROM with you and plugged it into any PC you saw and did your work from there.</p>
<p>This means that:</p>
<ol type="1" start="1">
<li class="MsoNormal">You have to figure out what you need before the machine is rebooted and make an &#8220;image&#8221; of what your machine looks like ahead of time.</li>
<li class="MsoNormal">You can&#8217;t change your configuration very easily, as you will have to update your &#8220;image&#8221; in order to keep the changes for the future.</li>
<li class="MsoNormal">You can&#8217;t really store recent data in your &#8220;image&#8221;, as it will all go away the next time you restart.</li>
</ol>
<p>Most importantly, your entire system (OS, programs, and all) needs to be stored in a way that it can be loaded quickly and easily. In the world of EC2, this is called an <strong><em>image</em></strong>.</p>
<p><strong>Image Is Everything</strong></p>
<p>Because the memory of a machine is wiped each time it is rebooted, it can be configured any way it needs to be, but it needs to be told exactly how to configure itself. The actual snapshot of the system that is loaded when the machine starts is called an <strong><em>image</em></strong>. It consists of all of the software and configuration instructions needed to run a Linux server in the AWS environment.</p>
<p>So, if you are planning to use EC2 to do real work, the thing you should be most concerned with to get started is creating an image that has all the software that you need to get things done. Amazon makes a number of images readily available, including ones with MySQL and Apache. However, if you need more than this (and the most basic UNIX tools) you need to create your own image, often called an <strong><em>AMI</em></strong>.</p>
<p><strong>Provisioning</strong></p>
<p>I consider provisioning to encompass everything that needs to be done in order to create a re-usable AMI for use with EC2. The easiest way to create an AMI is to take one that already works and modify it. The cheapest way to do this is to create an instance on an EC2 machine, modify it, then save the results. This article will focus on provisioning in the cheapest and easiest ways that I can find.</p>
<p>This takes several steps:</p>
<ul type="disc">
<li class="MsoNormal">Creating an instance</li>
<li class="MsoNormal">Adding and configuring users</li>
<li class="MsoNormal">Installing and configuring software</li>
<li class="MsoNormal">Setting up the environment</li>
<li class="MsoNormal">Bundling the volume</li>
</ul>
<p>Before we get started, realize that this can be a long and time-consuming process. Because you will have an EC2 instance running while you go through these steps and will be transferring data to that instance, it will cost you money to do this. Caveat emptor!<br />
Also, I would most strongly recommend that as you go through the steps you have your terminal application record your keystrokes to a file, to ensure that you can go back through and repeat yourself easily in case you have to start over.</p>
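<p>As a minimal sketch of that recording step, assuming the standard <em>script</em> utility is installed on your local machine (the log file name here is just an example):</p>
<p># start recording everything shown in this terminal session<br />
script ~/ec2-provisioning.log<br />
# ... log in to the instance and work through the steps ...<br />
# stop recording when you are finished<br />
exit</p>
<p>Everything typed or printed between those two commands should end up in the log file, which you can read back later if you need to repeat a step.</p>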
<p><strong><em>Creating an Instance</em></strong></p>
<p class="MsoNormal">You can use <a href="http://www.architected.info/blog/scripting-ec2" >my shell script</a> or follow <a href="http://woss.name/2006/09/19/setting-up-an-amazon-ec2-server-with-fedora-core-5/" onclick="javascript:urchinTracker ('/outbound/article/woss.name');">some other instructions</a> to create your instance. Once it is created you can use Telnet or SSH to log in as the root user.</p>
<p><strong><em>Adding and Configuring Users and Groups </em></strong></p>
<p>When an instance is first created, the first thing that needs to be done is to create UNIX users to do the work itself, as running as the root user is a security risk. I would suggest that you create a superuser and give them the rights to do whatever you need done for administration.</p>
<p>You would do this by connecting as the root user and running the commands:</p>
<p>adduser <em>superuser-name</em><br />
passwd <em>superuser-name</em></p>
<p>At this point the user exists. We now need to give them the rights to manipulate the system as a superuser.</p>
<p>To do this we need to give them the ability to use the sudo command. Run the command <a href="http://www.courtesan.com/sudo/man/visudo.html" onclick="javascript:urchinTracker ('/outbound/article/www.courtesan.com');">visudo</a> (this allows you to use <a href="http://www.vim.org" onclick="javascript:urchinTracker ('/outbound/article/www.vim.org');">vi</a> to edit the /etc/sudoers file safely). Once visudo is running, search for the line for the root user, which looks like:</p>
<p>root ALL = (ALL) ALL</p>
<p>Then, add a line that looks like:</p>
<p><em>superuser-name</em> ALL = (ALL) ALL</p>
<p>At this point you may wish to configure this user, or any other users and groups that you know in advance you will want, including details such as the default shell and user profiles. This will depend entirely on what you are doing with the machine, so think carefully.</p>
<p>I would recommend that you make as few users with as little access as possible, for security&#8217;s sake.</p>
<p><strong><em>Installing and Configuring Software</em></strong></p>
<p>An EC2 volume is based on Fedora Core, for better or for worse. This means that we have some pretty standard tools available to us. However, some software will have to be downloaded and compiled from source, and those file transfers will actually cost a bit of money. Not a ton, mind you, but enough to be concerned about.</p>
<p>I followed <a href="http://woss.name/2006/09/19/setting-up-an-amazon-ec2-server-with-fedora-core-5/" onclick="javascript:urchinTracker ('/outbound/article/woss.name');">these instructions</a> and upgraded my box to Fedora Core 5 first thing. This gives a pretty broad swath of tools to use, and is a good start. Not everything is cutting edge, but it is stable and very standard, which is what I am looking for in an image. This probably involves several hundred MB of file transfer, which will be charged at the going rate ($0.20/GB as of this writing), so 500 MB, for example, would come to about $0.10.<br />
I also installed <a href="http://gcc.gnu.org/" onclick="javascript:urchinTracker ('/outbound/article/gcc.gnu.org');">GCC</a>, as I knew I might need it later. <a href="http://linux.duke.edu/projects/yum/" onclick="javascript:urchinTracker ('/outbound/article/linux.duke.edu');">Yum</a> is available, so I will use that when I can. The command:</p>
<p>yum install gcc*</p>
<p>will get the ball rolling.</p>
<p>The real things that I needed to get on the image were <a href="http://www.python.org" onclick="javascript:urchinTracker ('/outbound/article/www.python.org');">Python</a> <a href="http://www.python.org/download/releases/2.5/" onclick="javascript:urchinTracker ('/outbound/article/www.python.org');">2.5</a>, the <a href="http://www.scipy.org/Installing_SciPy/Linux" onclick="javascript:urchinTracker ('/outbound/article/www.scipy.org');">SciPy</a> package, and <a href="http://www.sqlite.org" onclick="javascript:urchinTracker ('/outbound/article/www.sqlite.org');">SQLite</a> 3. I went ahead and downloaded and compiled each of these from source because they weren&#8217;t available in yum and also to make sure they didn&#8217;t interfere with anything else. As soon as all three are available using yum I will get them that way.</p>
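<p>For what it&#8217;s worth, a source build of this kind usually follows the familiar configure and make pattern. The sketch below uses Python as the example and assumes the source tarball has already been downloaded to the current directory; the tarball name and the /usr/local prefix are assumptions for illustration, not necessarily what was used for this image:</p>
<p># unpack the source and move into it<br />
tar xzf Python-2.5.tgz<br />
cd Python-2.5<br />
# install under /usr/local so the system Python is left alone<br />
./configure --prefix=/usr/local<br />
make<br />
# altinstall installs a python2.5 executable without replacing the existing python command<br />
make altinstall</p>
<p>Keeping the build under its own prefix is one way to make sure it doesn&#8217;t interfere with anything else on the system.</p>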
<p>In order to run the tools that we are going to use to make our volume, we need to install <a href="http://java.sun.com" onclick="javascript:urchinTracker ('/outbound/article/java.sun.com');">Java</a>. There are <a href="http://java.sun.com/j2se/1.5.0/jre/install-linux.html,%20http://java.sun.com/javase/downloads/index.jsp" onclick="javascript:urchinTracker ('/outbound/article/java.sun.com');">detailed instructions</a> on how to do this, and it is a relatively painless install. When you do this, you will need to make sure that the JAVA_HOME variable is set up correctly.</p>
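<p>A minimal sketch of the JAVA_HOME setup, assuming a bash login shell; the install path below is a placeholder for wherever the installer actually put Java on your instance:</p>
<p># add these two lines to ~/.bash_profile for the superuser<br />
export JAVA_HOME=<em>/path/to/java</em><br />
export PATH=$JAVA_HOME/bin:$PATH<br />
<br />
# after logging in again, check that the variable is picked up<br />
$JAVA_HOME/bin/java -version</p>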
<p>Lastly, I would recommend <a href="http://quark.phy.bnl.gov/www/sshsetup.html" onclick="javascript:urchinTracker ('/outbound/article/quark.phy.bnl.gov');">setting up SSH</a> for the superuser at a minimum. If you have other users set up, I would do the same for them, as you will then have secure, passwordless access to any of your instances in the future.<br />
My guess is that this entire process will entail less than 1 GB of file transfers, and it could be a lot less if you don&#8217;t want to use Fedora Core. Because you are downloading to your image, it will only need to be done once.</p>
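<p>The SSH setup recommended above generally comes down to generating a key pair and installing the public half on the instance. A rough sketch, assuming OpenSSH on both ends, the default RSA key file name, and a placeholder host name; the scp step also assumes the superuser can already log in with a password, which may not be true of every image:</p>
<p># on your local machine: generate a key pair (accept the default file name)<br />
ssh-keygen -t rsa<br />
# copy the public key to the home directory of the superuser on the instance<br />
scp ~/.ssh/id_rsa.pub <em>superuser-name</em>@<em>instance-hostname</em>:<br />
<br />
# then, on the instance as the superuser: install the key with tight permissions<br />
mkdir -p ~/.ssh<br />
chmod 700 ~/.ssh<br />
cat ~/id_rsa.pub &gt;&gt; ~/.ssh/authorized_keys<br />
chmod 600 ~/.ssh/authorized_keys</p>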
<p><strong><em>Setting Up the Environment</em></strong></p>
<p>First, you will need to set up the <a href="http://developer.amazonwebservices.com/connect/entry.jspa?externalID=351&#038;categoryID=88" onclick="javascript:urchinTracker ('/outbound/article/developer.amazonwebservices.com');">EC2 tools</a>, as specified in the <a href="http://docs.amazonwebservices.com/AmazonEC2/gsg/2006-06-26/setting-up-your-tools.html" onclick="javascript:urchinTracker ('/outbound/article/docs.amazonwebservices.com');">documentation</a>. You should have done this once already in order to log into an EC2 instance, so I won&#8217;t dwell on it. I found it easiest to create a ~/.ec2 directory on my local machine, then transfer it to the remote instance with scp. I put this in my superuser account and make it readable only by the superuser and its group. This makes setup a lot easier.</p>
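<p>Roughly, the ~/.ec2 setup described above might look like the following. EC2_HOME, EC2_PRIVATE_KEY and EC2_CERT are the variables that the EC2 command line tools documentation describes; the host name, tools path and key file names are placeholders:</p>
<p># on your local machine: copy the directory to the superuser account<br />
scp -r ~/.ec2 <em>superuser-name</em>@<em>instance-hostname</em>:<br />
<br />
# on the instance: owner can read and write, group can only read, everyone else gets nothing<br />
chmod -R u=rwX,g=rX,o= ~/.ec2<br />
<br />
# in ~/.bash_profile on the instance<br />
export EC2_HOME=<em>/path/to/ec2-tools</em><br />
export EC2_PRIVATE_KEY=$HOME/.ec2/<em>pk-file</em>.pem<br />
export EC2_CERT=$HOME/.ec2/<em>cert-file</em>.pem<br />
export PATH=$PATH:$EC2_HOME/bin</p>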
<p>The next thing you need to do is set up the system so that the programs you need are running once it boots. If you aren&#8217;t familiar with how to do this, you can check out the documentation for <a href="http://www.redhat.com/docs/manuals/linux/RHL-9-Manual/ref-guide/s1-boot-init-shutdown-process.html" onclick="javascript:urchinTracker ('/outbound/article/www.redhat.com');">the boot process</a> and <a href="http://www.redhat.com/docs/manuals/linux/RHL-9-Manual/ref-guide/s1-boot-init-shutdown-run-boot.html" onclick="javascript:urchinTracker ('/outbound/article/www.redhat.com');">running programs at boot time</a>. Beginners might have to modify the rc.local, but normally most configuration is taken care of during the application installation process.</p>
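<p>If you do end up editing rc.local, the change is usually just a line or two appended to the file. A sketch, where the script path is purely hypothetical and stands in for whatever starts your own program:</p>
<p># appended to /etc/rc.d/rc.local, which is run as root after the other init scripts<br />
<em>/usr/local/bin/start-my-service.sh</em></p>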
<p>Last, you will need to set up the file and directory permissions on the machine. Again, this is beyond the scope of this post, but there is plenty of <a href="http://www.dartmouth.edu/~rc/help/faq/permissions.html" onclick="javascript:urchinTracker ('/outbound/article/www.dartmouth.edu');">documentation</a> available. Sometimes this can be a bit more art than science, so consider your decisions here carefully.</p>
<p><strong><em>Bundling the Volume</em></strong></p>
<p>Once you have gotten through each of these steps, you should be ready to bundle a volume. Remember, what we are doing is taking a snapshot of the machine as it is in its current state and storing it for later use. The next time we run, we are going to have a machine that looks just the way it does now, so make sure everything is taken care of.</p>
<p>Now, I had planned on writing something simple on how to actually do the bundling, but AWS has already made <a href="http://docs.amazonwebservices.com/AmazonEC2/gsg/2006-06-26/creating-an-image.html" onclick="javascript:urchinTracker ('/outbound/article/docs.amazonwebservices.com');">very good instructions</a> available. It takes a little while to finish, but overall it works pretty well.</p>
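<p>For orientation only, the overall shape of the process in those instructions is bundle, upload, register. The commands below are a sketch of how the EC2 tools of this era are typically invoked, with placeholder key files, account number, bucket and credentials; treat the linked instructions as authoritative for the exact options:</p>
<p># on the instance: bundle the running volume into /mnt<br />
ec2-bundle-vol -d /mnt -k <em>pk-file</em>.pem -c <em>cert-file</em>.pem -u <em>aws-account-number</em><br />
# on the instance: upload the bundle to an S3 bucket<br />
ec2-upload-bundle -b <em>bucket-name</em> -m /mnt/image.manifest.xml -a <em>access-key-id</em> -s <em>secret-access-key</em><br />
# from a machine with the EC2 tools set up: register the image<br />
ec2-register <em>bucket-name</em>/image.manifest.xml</p>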
<p>If you are having trouble, it is probably that:</p>
<ul type="disc">
<li class="MsoNormal">The EC2 software isn&#8217;t installed.</li>
<li class="MsoNormal">The environment variables aren&#8217;t set up properly.</li>
<li class="MsoNormal">You are providing incorrect AWS account information.</li>
</ul>
<p>Understand that once the image is uploaded to S3 you will have to pay for monthly storage for it, although it should not be particularly expensive.</p>
<p><strong>Conclusion</strong></p>
<p>Once you have created an AMI you should see it the next time you run the ec2-describe-images command. It is now available for your private use. Enjoy!
</p>
]]></content:encoded>
			<wfw:commentRss>http://www.architected.info/blog/establishing-an-ec2-presence/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>
