<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
	xmlns:media="http://search.yahoo.com/mrss/"
>

<channel>
	<title>Architected Information &#187; Uncategorized</title>
	<atom:link href="http://www.architected.info/blog/category/uncategorized/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.architected.info/blog</link>
	<description>Just another WordPress site</description>
	<lastBuildDate>Tue, 21 Feb 2012 22:31:19 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
	<copyright>Copyright &#xA9; Architected Information 2011 </copyright>
	<managingEditor>morgango@gmail.com (Architected Information)</managingEditor>
	<webMaster>morgango@gmail.com (Architected Information)</webMaster>
	<image>
		<url>http://www.architected.info/blog/wp-content/plugins/podpress/images/powered_by_podpress.jpg</url>
		<title>Architected Information</title>
		<link>http://www.architected.info/blog</link>
		<width>144</width>
		<height>144</height>
	</image>
	<itunes:subtitle></itunes:subtitle>
	<itunes:summary>Just another WordPress site</itunes:summary>
	<itunes:keywords></itunes:keywords>
	<itunes:category text="Society &#38; Culture" />
	<itunes:author>Architected Information</itunes:author>
	<itunes:owner>
		<itunes:name>Architected Information</itunes:name>
		<itunes:email>morgango@gmail.com</itunes:email>
	</itunes:owner>
	<itunes:block>no</itunes:block>
	<itunes:explicit>no</itunes:explicit>
	<itunes:image href="http://www.architected.info/blog/wp-content/plugins/podpress/images/powered_by_podpress_large.jpg" />
		<item>
		<title>Data As A Utility</title>
		<link>http://www.architected.info/blog/data-as-a-utility-2/</link>
		<comments>http://www.architected.info/blog/data-as-a-utility-2/#comments</comments>
		<pubDate>Thu, 12 Apr 2007 02:29:23 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.architected.info/blog/?p=81</guid>
		<description><![CDATA[GigaOM has an interesting article about the impact of Web 2.0 on network engineers. Namely, the maturation of the internet has made the skills of a good network person a lot less important: I see the current state of the Internet as the ultimate success … You can deploy a wildly successful Web 2.0 application that serves millions [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://web.archive.org/web/20081007122737/http://gigaom.com/">GigaOM</a> has <a href="http://web.archive.org/web/20081007122737/http://gigaom.com/2007/04/10/web-20-death-of-the-network-engineer/">an interesting article</a> about <a href="http://web.archive.org/web/20081007122737/http://gigaom.com/2007/04/10/web-20-death-of-the-network-engineer/">the impact of Web 2.0 on network engineers</a>. Namely, the maturation of the internet has made the skills of a good network person a lot less important:</p>
<blockquote><p>I see the current state of the Internet as the ultimate success … You can deploy a wildly successful Web 2.0 application that serves millions of users and never know how a router, switch or load-balancer works. Even network security and firewalls that were making headline news not more than a few years ago are considered perfunctory. The success of these networking devices and technologies has enabled them to become part of the technology landscape that exists for all to use as they see fit, similar to the microprocessor or electricity.</p></blockquote>
<p>It is always odd to see the once-glamorous jobs of your youth thrown onto the scrap heap of history (think about the difference in perception between the masons of the Middle Ages and your local bricklayer). Network engineers were once the masters of a difficult and arcane field, bringing information out of chaos. For now, the wizards have been trapped in tiny control panels, until they can be embedded in silicon for all time.</p>
<p>This has really got me thinking about my own field and its future. What specialties are going to disappear if data becomes as reliable as electricity? For one thing, I think we would see ETL and Business Analysis become a single career path that is much more abstract and tools-based. With the advent of good BPM, I could see a lot of the scheduling and other mechanics pushed off towards the DBAs and Systems Administrators. Also, I think that a lot of the hardware could be appliance-based, or outsourced completely. Of course, this leaves a great opportunity for <a href="http://web.archive.org/web/20081007122737/http://www.pentaho.org/">open source BI</a> and for nimble players to attack the market and take advantage of <a href="http://web.archive.org/web/20081007122737/http://www.businessweek.com/chapter/christensen.htm">the innovator’s dilemma</a>.</p>
<p>A brave (and infinitely more useful) new world!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.architected.info/blog/data-as-a-utility-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Two Methods for Defining Information Quality</title>
		<link>http://www.architected.info/blog/two-methods-for-defining-information-quality/</link>
		<comments>http://www.architected.info/blog/two-methods-for-defining-information-quality/#comments</comments>
		<pubDate>Mon, 31 Jul 2006 02:22:59 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.architected.info/blog/?p=63</guid>
		<description><![CDATA[In Information Science today there are two competing methods for indexing information: semantics and statistics. While this may not seem to have a lot to do with information quality, bear with me and I promise I will link them up (eventually). Both methods do approximately the same job, that is, to allow information to be read and manipulated by machines on a grand [...]]]></description>
			<content:encoded><![CDATA[<p>In <a href="http://web.archive.org/web/20081012150441/http://en.wikipedia.org/wiki/Information_science">Information Science</a> today there are two competing methods for indexing information: <strong>semantics</strong> and <strong>statistics</strong>. While this may not seem to have a lot to do with information quality, bear with me and I promise I will link them up (eventually). Both methods do approximately the same job, that is, to allow information to be read and manipulated by machines on a grand scale. The difference is in how this is done.</p>
<ul>
<li>A <a href="http://web.archive.org/web/20081012150441/http://www.architected.info/blog/getting-started-with-information-quality-1-of-2"><strong>semantic approach</strong></a> would have the author define concepts and relationships ahead of time. Examples are long and would be difficult to reproduce here, but you can see some in <a href="http://web.archive.org/web/20081012150441/http://infomesh.net/2001/swintro/">this tutorial</a>. The <a href="http://web.archive.org/web/20081012150441/http://en.wikipedia.org/wiki/Semantic_Web">Semantic Web</a> would be a good example of this methodology.</li>
<li>A <a href="http://web.archive.org/web/20081012150441/http://www.architected.info/blog/case-study-statistical-information-quality"><strong>statistical approach</strong></a> would simply look at the text that was available and try to determine what is there and how it relates to other things through textual analysis and aggregation. <a href="http://web.archive.org/web/20081012150441/http://www.google.com/">Google</a> is a good example of the use of this approach.</li>
</ul>
<p>The semantic way of looking at things is very abstract and much more rigorous. It says that there is a truth to be represented, it designs a way of doing it, and expects everyone to follow along. The statistical way of looking at things is much more flexible. It says that there are things to be gleaned regardless of form, and that we should accept this fact and try to make the best of things. Not surprisingly, the semantic approach is the favorite of academia and has been under development for many years, while the statistical approach is already in real-world use.</p>
<p>What got me thinking about this in the first place was the latest issue of <a href="http://web.archive.org/web/20081012150441/http://www.baselinemag.com/">Baseline</a>. Specifically, it was <a href="http://web.archive.org/web/20081012150441/http://www.baselinemag.com/article2/0,1397,1985493,00.asp">an article</a> from <a href="http://web.archive.org/web/20081012150441/http://www.strassmann.com/bio.php">Paul A. Strassmann</a> titled “How Clean Data Can Transform Your Business”. Normally Strassmann’s stuff is pretty good, but it is helpful to note that Strassmann is a senior consultant to the Department of Defense and has been in the business for a long, long, long time.</p>
<p>The crux of his argument was that:</p>
<blockquote><p>The first step in business transformation: enterprisewide standardization of data. That calls for the declaration of a metadata directory as the template for defining data that can circulate within a firm’s information systems. The policy and implementation of an enforceable metadata directory likely will be resisted by bureaucrats, who see this as a threat to their indispensability. It will not be welcomed by systems developers, contractors and vendors, who prefer to concentrate on upgrading software as a technologically more interesting—and profitable—task.</p></blockquote>
<p>A classic argument for a semantic model of truth: we just need to get everything defined, and then it will be smooth sailing from there. For most vendors and consultants, the semantic view is the accepted one, probably because it is so structured and logical, although at least partially because all those hours spent defining concepts are billable. Even Strassmann acknowledges this reality …</p>
<blockquote><p>To reach agreement on the representation, semantics and taxonomy of data, you will likely go through a painful political process that must be adjudicated by line management. This can get messy because it will reveal that a large percentage of installed software perpetuates incompatible, unreliable, insufficiently secure and delayed information.</p></blockquote>
<p>With this in mind, is <a href="http://web.archive.org/web/20081012150441/http://www.architected.info/blog/getting-started-with-information-quality-1-of-2">semantic definition</a> the most efficient way to improve information quality? Is a <a href="http://web.archive.org/web/20081012150441/http://www.architected.info/blog/case-study-statistical-information-quality">statistical definition</a> the most descriptive way to understand information quality? We will explore the basis for both of these methods in the <a href="http://web.archive.org/web/20081012150441/http://www.architected.info/blog/testing-information-quality">next part of this series</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.architected.info/blog/two-methods-for-defining-information-quality/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Challenges of Real-Time</title>
		<link>http://www.architected.info/blog/the-challenges-of-real-time/</link>
		<comments>http://www.architected.info/blog/the-challenges-of-real-time/#comments</comments>
		<pubDate>Fri, 09 Jun 2006 02:05:41 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.architected.info/blog/?p=44</guid>
		<description><![CDATA[A lot of my recent work has been in real-time (actually near real-time) data warehousing. There are some real challenges for ETL and information quality when moving towards a real-time environment. Everything seems to become more difficult, and at times the constraints become almost unbearable to work with. You really, really, really need a real-time system in order to [...]]]></description>
			<content:encoded><![CDATA[<p>A lot of my recent work has been in real-time (actually near real-time) data warehousing. There are some real challenges for <a href="http://web.archive.org/web/20081121132821/http://en.wikipedia.org/wiki/ETL">ETL</a> and <a href="http://web.archive.org/web/20081121132821/http://en.wikipedia.org/wiki/Data_quality">information quality</a> when moving towards a real-time environment. Everything seems to become more difficult, and at times the constraints become almost unbearable to work with. You really, really, really need a real-time system in order to justify building one, especially from a data-centric point of view.</p>
<p>What got me writing about this was reading that “<a href="http://web.archive.org/web/20081121132821/http://chron.com/disp/story.mpl/business/3945008.html">some Cingular subscribers endure 4-hour-plus outage</a>” (and the fact that <a href="http://web.archive.org/web/20081121132821/http://www.forbes.com/2006/03/07/cingular-att-bellsouth-cx_de_0307autofacescan11.html">this isn’t the first time this has happened</a>). I knew exactly what the Cingular representative was talking about when I read this quote …</p>
<blockquote cite="http://chron.com/disp/story.mpl/business/3945008.html"><p>“There’s a database that has all the customer numbers and somehow, we don’t know why at this point, about 10 percent (of customers in the area) were prohibited from making or receiving calls,” Merriman said.</p></blockquote>
<p>The big issues around real-time systems are in dealing with <a href="http://web.archive.org/web/20081121132821/http://en.wikipedia.org/wiki/Emergence">emergence</a> within the system. Things get into an unexpected state, and it is very difficult to figure out why, especially after the fact. This is because when you are running in real time:</p>
<ol>
<li>Resources are at a premium, and often this means that only enough data is kept in order to process what is available <strong><em>right now</em>.</strong></li>
<li>Data handling is set up to ensure that the system doesn’t break, not to ensure optimal quality.</li>
<li>Downtime usually means there is no information coming in. It is very hard to know what you don’t know.</li>
<li>Breakage is normally catastrophic and the priority is on getting things going again, not performing detailed analysis on what happened.</li>
</ol>
<p>Because you have a lot less information than in a batch-processing system, it is a lot harder to figure out what is going on. Good luck to the <a href="http://web.archive.org/web/20081121132821/http://www.cingular.com/">Cingular</a> engineers in preventing this type of thing in the future <img src="http://web.archive.org/web/20081121132821im_/http://www.architected.info/blog/wp-includes/images/smilies/icon_wink.gif" alt=";-)" /></p>
]]></content:encoded>
			<wfw:commentRss>http://www.architected.info/blog/the-challenges-of-real-time/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

