Architecture and Speed

July 27th, 2006 by morgan

BitWorking has an interesting post that talks about data from a document-centric point of view. He makes an eloquent defense of the people who build and use shadow systems when he says:

“I am not attacking relational databases, they have their role to play and are very useful tools, nor am I attacking people that use databases. What I am doing is taking people to task that berate users for putting too much data into spreadsheets and not databases. We, as software developers, have let them down and not provided them with tools that work with how they’ve been trained to work with the data, the failing is on our side, not theirs.”

Sometimes we forget that the process of conceptualizing, defining, and implementing our information architecture has to move as fast as the organizations that we are supporting. If it doesn’t, then we force people to do an end around to get things done. This is a form of innovation, although it is one that can cause serious woes around information quality and systems integration, especially in a corporate environment.

Keep your architecture and practices lean, mean, and practical and be a partner to your entire organization. Don’t let your work be the thing that holds someone back from productivity!

Doing Some Work on the Site

July 27th, 2006 by morgan

Folks,

I have been doing some work on the site, both the blog and the wiki. Mostly I have added some new sub-categories as a part of the understanding section, and have started trying to find a better home for more static content. Also, I have had to secure the wiki to prevent wiki spam. So, if you are interested in adding something, please let me know and I will get you an account.

Morgan

The Information Quality Pyramid

July 24th, 2006 by morgan

I have been working on some more detailed articles for the wiki to help illustrate some ideas about information quality. While I don’t want to just duplicate that article here, I thought I would post some things on the blog and get some feedback.

I am currently working on the Information Quality Pyramid, which discusses the various components that go into improving information quality across an organization:

[Figure: IQ Pyramid]

The pyramid is made up of several parts, each of which is important in its own right. However, the base components (in blue and green) have the interesting combination of being very important, terribly inexpensive, and totally unglamorous.

Understanding Your Organization – The single most important thing that you can do to ensure success in any information quality effort. Without a solid understanding of how your organization works it is virtually guaranteed that you will not be able to deliver the solution your customers need. This (coupled with the need for extreme customization) is one of the reasons that it is very difficult to outsource this type of work.

Architecture and Design Practices – To put it bluntly, if you build your information architecture in an inconsistent manner then you have to expect inconsistencies in its output. These inconsistencies become quality-related issues very quickly. If you can proactively address (or at least mitigate issues around) consistency through your architecture then you can dramatically improve the quality of information that you produce.

Automation – The key to high-value, high-quality information architecture is automating everything possible. This is because:

  1. Moore’s Law suggests that computing capacity roughly doubles every two years, so computerized processes keep getting faster. It is pretty tough for humans to keep up.
  2. Humans make mistakes.
  3. In order to automate a process, it has to be understood by more than just the designer or the developer.

Sanity Checks – The easiest and most cost-effective way to catch issues before they become problems.
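As a rough illustration of how cheap a sanity check can be, here is a minimal sketch in Python. The function name `sanity_check`, the `orders` data, and the field names are all made up for the example; a real check would use whatever rules matter to your own process.

```python
def sanity_check(rows, required_fields, min_rows=1):
    """Return a list of human-readable problems found in rows (a list of dicts)."""
    problems = []
    # An empty or suspiciously small load is often the first sign of trouble.
    if len(rows) < min_rows:
        problems.append(f"expected at least {min_rows} rows, got {len(rows)}")
    # Every required field should be present and non-empty in every row.
    for i, row in enumerate(rows):
        for field in required_fields:
            value = row.get(field)
            if value is None or value == "":
                problems.append(f"row {i}: missing {field}")
    return problems


# Hypothetical sample data: the second order has no customer.
orders = [
    {"order_id": 1, "customer": "Acme", "amount": 120.0},
    {"order_id": 2, "customer": "", "amount": 75.5},
]

print(sanity_check(orders, ["order_id", "customer", "amount"]))
# prints ['row 1: missing customer']
```

A check like this takes minutes to write and can run on every load, which is what keeps it at the inexpensive base of the pyramid.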

Data Profiling – The only way to understand your information is to know your data. Intimately. Regularly. Historically. Profiling takes a generic look at an arbitrary dataset and discovers important statistical information about it. Profiling is by far the cheapest and most reliable way of examining data (think of it as an expanded sanity check).
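To make the idea of a "generic look at an arbitrary dataset" a little more concrete, here is a small sketch in Python. The `profile` function and the sample `sales` rows are hypothetical, and the statistics chosen (counts, nulls, distinct values, min, max, most common value) are just one reasonable starting set, not a definitive list.

```python
from collections import Counter


def profile(rows):
    """Compute simple per-column statistics for a list of dicts."""
    # Group values by column name without knowing the columns in advance.
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)

    report = {}
    for name, values in columns.items():
        # Treat None and the empty string as missing values.
        present = [v for v in values if v is not None and v != ""]
        numeric = [
            v for v in present
            if isinstance(v, (int, float)) and not isinstance(v, bool)
        ]
        report[name] = {
            "count": len(values),
            "nulls": len(values) - len(present),
            "distinct": len(set(present)),
            "min": min(numeric) if numeric else None,
            "max": max(numeric) if numeric else None,
            "most_common": Counter(present).most_common(1)[0][0] if present else None,
        }
    return report


# Hypothetical sample data with one missing region.
sales = [
    {"region": "west", "sales": 10},
    {"region": "west", "sales": 30},
    {"region": None, "sales": 20},
]

for column, stats in profile(sales).items():
    print(column, stats)
```

Because the function knows nothing about the dataset, the same code can be pointed at any table. Saving each report with a date would give you the historical view as well.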

Process Testing – Instead of looking at a dataset in a generic way, process testing looks at things in a very specific way. These should be customized, automated tests that deliver results unique to the process. Because of the level of customization and effort, this is significantly more expensive than profiling or sanity checks.
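For contrast with a generic check, here is a sketch of what a process-specific test might look like in Python. The business rule (each order total must equal the sum of its line items), the function name `check_order_totals`, and the sample data are all assumptions invented for this example.

```python
def check_order_totals(orders, line_items, tolerance=0.01):
    """Return (order_id, recorded_total, computed_total) for each mismatched order."""
    # Sum the line items for each order.
    computed = {}
    for item in line_items:
        order_id = item["order_id"]
        computed[order_id] = computed.get(order_id, 0.0) + item["amount"]

    # Compare each recorded total against the computed one.
    failures = []
    for order in orders:
        expected = computed.get(order["order_id"], 0.0)
        if abs(order["total"] - expected) > tolerance:
            failures.append((order["order_id"], order["total"], expected))
    return failures


# Hypothetical sample data: order 2 is off by one unit.
orders = [
    {"order_id": 1, "total": 15.5},
    {"order_id": 2, "total": 20.0},
]
line_items = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": 1, "amount": 5.5},
    {"order_id": 2, "amount": 19.0},
]

print(check_order_totals(orders, line_items))
# prints [(2, 20.0, 19.0)]
```

Unlike the generic checks, this only makes sense for one process, and someone has to know the business rule to write it. That is where the extra cost comes from.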

Human Intervention – Anything that involves humans, from adjusting processes already in production to performing manual analysis to resolve concerns to creating new code. Think of it as if all of information quality were outsourced to a third-party company and all personnel costs came directly out of your budget. This is the true cost of IQ; it is just that people see it in a more abstract sense.

The one category that I can see people might think is missing here is metadata. I think metadata is an incredibly important part of information quality, but I tend to value it in its most concrete form instead of in the abstract. I will get into this more in the wiki article.

Any feedback would be most appreciated!


The Value of Collaboration

July 22nd, 2006 by morgan

The folks at semoz.org had a great post about the value of collaboration. The pearl of wisdom that really got me chuckling was:

Group intelligence is multiplicative when idiots are involved - combining a half-wit with another half-wit does not result in a full-witted person, it results in a quarter-witted person (1/2 x 1/2 = 1/4). Combining a full-witted individual with a half-wit still only yields a half-wit. The more of these “wrong kinds of people” you have involved in the process, the worse things get.

I am all for collaboration, but there comes a point (and it comes surprisingly quickly) when there is no need for additional opinions, especially from the wrong people.

The Revenge of the Sys

July 20th, 2006 by morgan

Tim O’Reilly wrote an excellent piece about the newfound interest in well-run operations for Microsoft and other companies hoping to be a part of the Web 2.0 revolution. I think this is a rich reward for systems administrators and operators, two professions of vital importance that have seemed very underappreciated (some might say marginalized) of late.

One of the more interesting parts of the article discussed the value of integration:

Internet-scale applications are pushing the envelope on operational competence, but enterprise-class applications will follow. And here, Microsoft has a key advantage over open source, because the Windows Live team and the Windows Server and tools team work far more closely together than open source projects work with companies like Yahoo!, Amazon, or Google.

If this vision is true then I don’t think it bodes well for companies that have outsourced almost all of their technical operations.

about


Architected.info is a web site dedicated to information architecture, focusing on transformation and understanding. We focus on these categories through the lens of organizational dynamics, looking at people, practices, and relationships.

Morgan Goeller is the author and maintainer of this website. He has worked as an architect and engineer, specializing in software development, web applications, database engineering, ETL, and information quality.
