A Very Useful Engine

March 12th, 2007 by morgan

A while back, I wrote about how useful it would be to be able to combine collaborative information and structured data. While Wikis and other collaborative information sources are great, I would argue that they aren’t useful until they can be used in aggregated or statistical form for strategic decision making or automation. Until then, they are too “abstract” to be useful (at least in the mechanical sense of the word).

Recently, I ran across DBPedia, an organization that is turning Wikipedia entries into RDF, the language used for the semantic web. DBPedia actually has downloadable datasets based on Wikipedia that are available today. These are datasets that can be queried with existing tools and linked to other datasets. Wow!
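To make "aggregated or statistical form" a bit more concrete, here is a minimal sketch of what querying RDF-style data could look like. It assumes the data is available as simple N-Triples-like lines; the sample triples, the example.org URIs, and the function names are all made up for illustration and are not taken from an actual DBPedia dataset or tool.

```python
import re
from collections import defaultdict

# Hypothetical sample in an N-Triples-like form. The URIs and values are
# invented for illustration and do not come from a real DBPedia dump.
SAMPLE = """\
<http://example.org/resource/Alpha> <http://example.org/property/country> "Freedonia" .
<http://example.org/resource/Alpha> <http://example.org/property/population> "1200" .
<http://example.org/resource/Beta> <http://example.org/property/country> "Freedonia" .
<http://example.org/resource/Beta> <http://example.org/property/population> "800" .
<http://example.org/resource/Gamma> <http://example.org/property/country> "Sylvania" .
<http://example.org/resource/Gamma> <http://example.org/property/population> "500" .
"""

# Matches only the simplest case: <subject> <predicate> "plain literal" .
TRIPLE = re.compile(r'^<([^>]+)>\s+<([^>]+)>\s+"([^"]*)"\s*\.$')


def parse_triples(text):
    """Yield (subject, predicate, literal) for each simple literal triple."""
    for line in text.splitlines():
        match = TRIPLE.match(line.strip())
        if match:
            yield match.groups()


def population_by_country(text):
    """Sum the population literal per country across all subjects."""
    facts = defaultdict(dict)
    for subject, predicate, value in parse_triples(text):
        # Use the last path segment of the predicate URI as a short name.
        facts[subject][predicate.rsplit("/", 1)[-1]] = value

    totals = defaultdict(int)
    for props in facts.values():
        if "country" in props and "population" in props:
            totals[props["country"]] += int(props["population"])
    return dict(totals)


if __name__ == "__main__":
    print(population_by_country(SAMPLE))
```

A real dataset would call for a proper RDF parser and a query language such as SPARQL. The point here is only that once wiki content is expressed as triples, it can be rolled up like any other structured data.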

Even if you aren’t a Wikipedia fan, this is really a big step forward for the enterprise. Think about the amount of knowledge that exists in your organization that isn’t captured, but is critical to your operations. It has always been a big pain to try and sit down and do formal knowledge engineering. However, most people are comfortable enough with a Wiki to sit down and start typing. For a small organization this might not be such a big deal, but for a larger enterprise this could provide some very useful information.

The first time your Director or CXO can make a financial decision based exclusively on the information from your company wiki, it will have proved its worth. Until then, it is just another trendy tool. The work that DBPedia is doing is an important step in making this a reality.

Drowning In Data

March 6th, 2007 by morgan

OK, when USA Today has a story about information management you can be sure that the phenomenon is big enough that it will impact non-techies on a large scale.

When tech analyst John Gantz at researcher IDC began tallying up all the digital information generated annually, he first looked in the obvious places …

Gantz ultimately calculated that 161 exabytes of digital data — or about 161 billion GB — were generated in 2006. And the amount is expected to rise fast.
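As a quick sanity check on the units in that quote, the conversion can be sketched as below. It assumes decimal (SI) units, where a gigabyte is 10^9 bytes and an exabyte is 10^18 bytes; the function name is just for illustration.

```python
# Decimal (SI) unit sizes in bytes; this is an assumption about how the
# figures in the quote are meant to be read.
GIGABYTE = 10**9
EXABYTE = 10**18


def exabytes_to_gigabytes(exabytes):
    """Convert a whole number of exabytes to gigabytes."""
    return exabytes * EXABYTE // GIGABYTE


if __name__ == "__main__":
    print(exabytes_to_gigabytes(161))
```

Under that assumption, 161 exabytes does come out to 161 billion GB.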

It is worth a quick read, although if you are a data geek a lot of this is pretty elementary, so you might just want to forward the URL on to your favorite business leader or project manager.

Where the article is a bit light is on the downstream ramifications of the data deluge. Obviously, there will be a lot of SAN units sold, but that opportunity is long gone unless you are able to take advantage of the innovator's dilemma. Looking over the horizon, there is a huge opportunity for people who can make this data not just searchable, but accessible, usable, and trustworthy for decision-making.

Next Generation Web Hosting — Media Temple

February 12th, 2007 by morgan

Media Temple is a very cool evolution in remote hosting. Part web host, part application server, part grid, it is an interesting look into the parallel computing world we are rapidly moving into.

Most interesting to me was their grid service, which provides an on-demand capability for web hosting that allows a site to handle the slashdot effect without having to blink an eye. Sites (and their corresponding media and applications) are running on multiple servers, which allows traffic to be spread out seamlessly, allowing for spikes in service and usage. This is all done without significant additional configuration, which makes it all the more sweet.
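The general idea of spreading traffic across several servers can be sketched with a simple round-robin dispatcher. This is only an illustration of the concept, not a description of how Media Temple's grid actually works; the node names, request strings, and function names are all hypothetical.

```python
from collections import Counter
from itertools import cycle


def spread_requests(requests, servers):
    """Assign each request to the next server in round-robin order."""
    if not servers:
        raise ValueError("at least one server is required")
    rotation = cycle(servers)
    return [(request, next(rotation)) for request in requests]


def load_per_server(assignments):
    """Count how many requests each server received."""
    return Counter(server for _, server in assignments)


if __name__ == "__main__":
    # A hypothetical burst of traffic hitting three hypothetical nodes.
    burst = [f"GET /page/{i}" for i in range(9)]
    nodes = ["node-a", "node-b", "node-c"]
    print(load_per_server(spread_requests(burst, nodes)))
```

With three nodes, a burst of nine requests lands as three per node, so no single machine absorbs the whole spike. A production grid would presumably also weigh server health and current load, which this sketch ignores.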

Now, they have had some problems, especially with non-grid oriented applications. However, I think that these are pretty minor compared to the utility that high-performance sites will get from using a grid environment.

Absolutely worth a look …

Using S3 as a File System

January 17th, 2007 by morgan

Openfount has released S3Infinidisk for EC2, a product that answers a lot of wishes in the EC2 community. One of the biggest issues with EC2 is the lack of persistent storage on the server instances, and this tool is a first attempt at a solution.

Basically, when you are using EC2, everything on your server that isn’t statically defined before the machine runs goes bye-bye as soon as the machine reboots. Not a real problem for software, but for data this is a major bummer. Amazon makes S3 storage available without data transfer fees, which is wonderful. However, it takes real effort to transfer data back and forth between the systems (with something like jsh3ll), and most data-centric tools (especially database servers) expect real-time access to a working file system.

S3Infinidisk bridges this gap, allowing an EC2 instance to use S3 like a real Linux filesystem. While it is a bit of a hack, it allows data tools to work the way they need to and it allows an EC2 instance to take full advantage of the AWS environment. This is a huge step forward in making EC2 a more usable environment for utility computing!
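To illustrate what "using an object store like a filesystem" means in principle, here is a small sketch that exposes flat object keys as paths and directories. It uses an in-memory dictionary as a stand-in for a bucket, and the class and method names are hypothetical. It says nothing about how S3Infinidisk is actually implemented, since I haven't seen its internals.

```python
class ObjectStoreFS:
    """Toy file-style view over a flat key/value object store."""

    def __init__(self):
        # key -> bytes; an in-memory stand-in for a remote bucket.
        self._objects = {}

    @staticmethod
    def _key(path):
        # Object keys have no leading or trailing slash.
        return path.strip("/")

    def write(self, path, data):
        """Store the whole file contents under the key for this path."""
        self._objects[self._key(path)] = bytes(data)

    def read(self, path):
        """Return the contents stored for this path."""
        key = self._key(path)
        if key not in self._objects:
            raise FileNotFoundError(path)
        return self._objects[key]

    def listdir(self, path):
        """List immediate children by treating key prefixes as directories."""
        prefix = self._key(path)
        prefix = prefix + "/" if prefix else ""
        names = set()
        for key in self._objects:
            if key.startswith(prefix):
                names.add(key[len(prefix):].split("/", 1)[0])
        return sorted(names)


if __name__ == "__main__":
    fs = ObjectStoreFS()
    fs.write("/var/db/data.bin", b"abc")
    fs.write("/var/log/app.log", b"started")
    print(fs.listdir("/var"))
    print(fs.read("/var/db/data.bin"))
```

Notice that every write replaces a whole object. That is a big part of why tools expecting real-time, in-place file access (database servers in particular) need a smarter bridging layer than this.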

I haven’t had a chance to try the product out yet, but I am excited to do so. I appreciate the licensing structure (free single-user version + commercial high-performance version), although I would prefer seeing open source. Also, since the tool is based on the FUSE subsystem, I could easily see this spreading like wildfire.

Interesting Features of Google Docs & Sheets

December 18th, 2006 by morgan

I have been working with Google Docs and Sheets lately, in order to avoid the portability problem when working at different machines and locations. While it isn’t as fully featured as Excel, it does just about everything I need it to do, and then some. Plus, it adds in the collaboration features that are almost more useful to an internet-oriented business.

It would be incredibly boring for Google simply to replicate Excel and Word in a web format, unless you are an HTML groupie. However, there are some very, very interesting features that I think really turn the traditional office application on its ear. The first thing that caught my eye was the Google Lookup function, which allows one to incorporate search information dynamically into documents. The second thing was the Google Finance function, which allows financial information to be leveraged as well. The third thing was the ability to embed portions of spreadsheets, or entire spreadsheets, into a blog or web page.

Very cool stuff, and very interesting results. One could imagine this type of thing being leveraged with Froogle, Maps, or other services, within a document, presentation, or spreadsheet. Low cost, high reward stuff. However, there are some ramifications with using this type of information. For example, the spiffy new spreadsheet you put together for your boss could be modified by outside influences (like a Google Bomb).

Worth a look, at least.

about


Architected.info is a web site dedicated to information architecture, focusing on transformation and understanding. We focus on these categories through the lens of organizational dynamics, looking at people, practices, and relationships.

Morgan Goeller is the author and maintainer of this website. He has worked as an architect and engineer, specializing in software development, web applications, database engineering, ETL, and information quality.
