April 12th, 2007 by morgan
GigaOM has an interesting article about the impact of Web 2.0 on network engineers, namely that the maturation of the internet has made the skills of a good network person a lot less important:
I see the current state of the Internet as the ultimate success … You can deploy a wildly successful Web 2.0 application that serves millions of users and never know how a router, switch or load-balancer works. Even network security and firewalls that were making headline news not more than a few years ago are considered perfunctory. The success of these networking devices and technologies has enabled them to become part of the technology landscape that exists for all to use as they see fit, similar to the microprocessor or electricity.
It is always odd to see the once-glamorous jobs of your youth thrown onto the scrap heap of history (think about the difference in perception between the masons of the Middle Ages and your local bricklayer). Network engineers were once the masters of a difficult and arcane field, literally bringing information from chaos. For now, the wizards have been trapped in tiny control panels, until they can be embedded in silicon for all time.
This has really got me thinking about my own field and its future. What specialties are going to disappear if data becomes as reliable as electricity? For one thing, I think we would see ETL and Business Analysis become a single career path that is much more abstract and tools-based. With the advent of good BPM, I could see a lot of the scheduling and other mechanics pushed off towards the DBAs and systems administrators. Also, I think that a lot of the hardware could be appliance-based, or outsourced completely. Of course, this leaves a great opportunity for open source BI and for nimble players to attack the market and take advantage of the innovator’s dilemma.
A brave (and infinitely more useful) new world!
Posted in Databases, ETL, Transformation, Business Intelligence, Enterprise Web, Appliances | No Comments »
January 3rd, 2007 by morgan
Steve Tuck from Datanomic has a post about data quality on dq:view, where he discusses (and tries to dismantle) the use of a government-produced master data file for mailing addresses in the UK. While the post is very specific to a single application, it speaks to a situation that drives a lot of data management issues.
He writes:
Authorative sources of data are indeed useful - just don’t count on them to tell the truth, the whole truth and nothing but the truth.
I believe that one of the biggest problems that we have in dealing with data is the false belief that for every organization and situation, there is a single view of information that can satisfy everyone’s needs. Now, this isn’t a technology problem and it isn’t a data problem, it’s an organizational problem.
The Myth of the Single View
In any organization, we end up with different groups with different needs, normally based around:
- Speed
- Reliability
- Accuracy
- Cost
Each group has specific needs based on its own situation. For example, when looking at customer data, the people in HQ might not care if every customer account has the most up-to-date address available, but the people in the warehouse certainly do. At the same time, the people in the warehouse don’t care about how much it costs, while the people in HQ are much more focused on the bottom line.
Get these folks together in a room and you will have a terrific argument about what the organization needs and how it is going to be done (BTW, there is a related post on the wonderful Creating Passionate Users).
While this sounds like a problem for human resources or general management, this phenomenon is usually expressed as a function of IT, because that is where the rubber hits the road. Since IT is often a shared resource and has a vested interest in interoperability, the issues of culture and organization come out as a function of architecture development.
An Honest Assessment
The honest truth is that there isn’t a single view of the business, its data, or its processes that is going to meet the needs of the entire organization. A lot of vendors and consultants for CRM and MDM solutions are going to try to tell you otherwise; realize that they are selling something when they do. The answer is that this is a complicated world, and things aren’t getting any easier.
If your IT is going to represent the entire organization, you must embrace complexity and accept that there is going to be a cacophony of voices and a host of diverse world views that all exist simultaneously and are all using and competing for the same resources.
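As a rough illustration of several views coexisting over the same records, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `Customer` fields, the `hq_view` and `warehouse_view` names, and the 90-day freshness threshold are hypothetical and do not come from any particular CRM or MDM product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Customer:
    # Hypothetical fields, chosen only for illustration.
    customer_id: str
    address: str
    address_verified_on: date
    lifetime_revenue: float


def hq_view(customers):
    """HQ watches the bottom line; address freshness is ignored here."""
    return {c.customer_id: c.lifetime_revenue for c in customers}


def warehouse_view(customers, today, max_age_days=90):
    """The warehouse needs shippable addresses; stale ones are flagged."""
    view = {}
    for c in customers:
        age_days = (today - c.address_verified_on).days
        view[c.customer_id] = {
            "address": c.address,
            "needs_check": age_days > max_age_days,  # assumed threshold
        }
    return view


customers = [
    Customer("c1", "1 High St", date(2006, 6, 1), 1200.0),
    Customer("c2", "2 Low Rd", date(2006, 12, 20), 300.0),
]
hq = hq_view(customers)
wh = warehouse_view(customers, today=date(2007, 1, 3))
```

Neither view is the "true" one. Each trades accuracy against cost in the way its own group needs, and both draw on the same underlying data.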
Posted in Databases, ETL, Information Architecture, Information Quality, Relationships, Understanding, Culture | No Comments »
December 8th, 2006 by morgan
I have been working with Amazon Web Services (and EC2) a lot lately, and have made some observations that really fly in the face of conventional wisdom.
I work in ETL, which means I need to get hold of big iron to crunch on big data. Machines are expensive, licenses are expensive, storage and networks are cheap. Scalability is important, but measured. AWS would be perfect for sourcing ETL jobs that are one-offs or are particularly large or complex. However, the major vendors are very particular about making sure that their products are only installed on authorized machines. They make it pretty difficult for you to cluster easily, especially if you are a little guy just starting out.
This is antithetical to the AWS approach to problems. Here, machines and storage are dirt cheap, networks are pretty cheap, and scalability is paramount. The most difficult thing is arranging a problem so that it can be worked on by your infinite monkeys, in the form of EC2 instances. The biggest problem then becomes licensing.
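To make "arranging a problem" a little more concrete, here is a minimal Python sketch of one common approach: split the input by key so that each piece can be processed independently, then merge the partial results. It is only a sketch under stated assumptions. The record layout (`customer_id`, `amount`), the function names, and the per-worker aggregation are hypothetical, and local threads stand in for EC2 instances; no AWS API is used.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition(records, num_workers, key):
    """Assign each record to a bucket by hashing its key.

    All records with the same key land in the same bucket, so every
    bucket can be processed without talking to the others.
    """
    buckets = [[] for _ in range(num_workers)]
    for rec in records:
        idx = zlib.crc32(str(rec[key]).encode("utf-8")) % num_workers
        buckets[idx].append(rec)
    return buckets


def transform(bucket):
    """Hypothetical per-worker ETL step: total the amounts per customer."""
    totals = defaultdict(float)
    for rec in bucket:
        totals[rec["customer_id"]] += rec["amount"]
    return dict(totals)


def run_job(records, num_workers=4):
    buckets = partition(records, num_workers, key="customer_id")
    # Threads stand in for EC2 instances in this sketch.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(transform, buckets))
    merged = {}
    for partial in partials:
        # Safe to merge blindly: each customer_id lives in exactly one bucket.
        merged.update(partial)
    return merged


records = [
    {"customer_id": "a", "amount": 10.0},
    {"customer_id": "b", "amount": 5.0},
    {"customer_id": "a", "amount": 2.5},
]
result = run_job(records)
```

The design choice that matters is the partitioning key: if the work cannot be split so that workers never need each other's data, adding more instances does not help much.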
In a highly scalable environment, it becomes incredibly compelling to use easily licensed software. Compelling to the point where it becomes worth it to build your own tools instead of purchasing off the shelf. For example, for a web server I could use Apache or WebSphere. Apache is free, and I can install it on my instance with absolutely no problems (as a matter of fact, it is pre-installed). With WebSphere I am going to have to purchase a license (or more), then monkey with the fact that it will be installed on a new machine with a new hostname each time. You can make the same argument for MySQL vs. Oracle, or Python vs. .NET.
Now this isn’t an anti-corporate rant, not by a long shot. But I think it is a valid way to look at how licensing will be a competitive advantage in the future. Software vendors should start looking at their products in terms of AWS and other compute farms, especially at the enterprise level. Those who don’t get out in front of this are going to find their lunches eaten, and quickly. There is quite a bit of hype around Web 2.0 companies these days; this could be a great way for someone to get their foot in the door of the Fortune 1000.
Perhaps Richard Stallman should send a Christmas card from the bazaar to Jeff Bezos over at the cathedral this year …
Posted in ETL, Information Architecture, Systems Integration, Over the Horizon, Appliances, AWS | No Comments »
September 18th, 2006 by morgan
James Taylor (no, not that James Taylor, the other one) had an interesting article about SOAs, agility, and architecture. While the article is a riff on another article (which makes this a meta-riff, I suppose), it got me thinking about the development lifecycle.
I think it is very ironic that in ETL and data-oriented programming we run into the same contradictions all the time:
- Development time is the smallest cost in the entire process in terms of time, resources, and money.
- Software development is scrutinized to death.
- On-time delivery is significantly more important than long-term cost savings, even if it impacts long-term functionality.
Now, I don’t think this is done out of malice or spite for IT. A lot of it may simply be because development is the one part of the development lifecycle that can be influenced by the project sponsor. However, as practitioners we need to make sure that information architecture is focused on consistently delivering tangible value to our organization. This means effectively communicating the true overall cost for systems development and making sure that the organization as a whole understands what we are doing.
Posted in ETL, Information Architecture, People, Practices | No Comments »
September 15th, 2006 by morgan
Classifying ETL
It will help to take a bit of time to discuss how software development is classified. Historically, classification of software development was done around methodology and/or representations. Some common ways to look at development are:
Looking at things through the lens of methodology is a more academic view of things, and more prevalent in the early days of computing.
Another Way to Look at Things
Practitioners often look at things a bit differently, often through the functionality of what is being created. Some ways to look at development this way are:
- Web Programming (like PHP, AJAX, DHTML, etc.)
- Glue Programming (Perl, Python, Tcl, and too many scripting languages to list)
- UI Development (Tk, XUL, UIML)
- Mathematics (MATLAB, SAS, R, many others)
This is a more practical view of things, more prevalent today, especially in the IT world.
Where We Fall
ETL is a function, so it is most easily classified in a functional way. However, Data Oriented Programming is more of a methodology (although more of a hybrid than anything else). So, it is tough to encompass this in just one category. It probably makes most sense to say that ETL should be viewed from the functional point of view, while the things that are used to build ETL processes should be viewed from a methodological point of view.
Next in the “Focus on ETL” series we will be looking at what goes into an ETL process.
Posted in ETL, Information Architecture, Transformation | 1 Comment »