Encouragement for B.I. geeks

Published Tue, Sep 18 2007 7:34 | William

For the past month, I've been bouncing back and forth between the east and west coasts. The side effects of spending excessive amounts of time in an airplane are that I'm really starting to hate 'modern parents' and I'm reading even more than I normally do. I can't discuss modern parents who think their kids should never hear the word "NO" without ranting and raving like a nut, so I'll dispense with that part.

I really don't dig reading fiction, so I typically go back and forth between programming books and business books. The latest one I picked up is titled Super Crunchers: Why Thinking-By-Numbers Is the New Way to Be Smart. The book is basically a collection of stories about people who obsessively use quantitative methods to understand the world. In just about every case, the number crunchers end up at odds with folks who think true understanding is a function of experience. Most of the cases the author presents draw a stark distinction between the two sides, even though the two aren't necessarily mutually exclusive. One of the cooler stories is about a guy who can predict the quality of wines and how much they'll cost using purely mathematical methods, without ever tasting the wine. Much of the book also defends quantitative analysis even though it has led to some huge failures in the past. In each case, the author shows that the problem wasn't the use of numbers as such but the use of inadequate models or the misapplication of models.
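To make the wine idea a little more concrete, here is a minimal sketch of the simplest possible "purely mathematical" predictor: a straight line fit by ordinary least squares. The numbers are made up for illustration and the single predictor (growing-season temperature) is my assumption; this is not the model or the data from the book.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y move together
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations: temperature (C) vs. a made-up price index
temps = [15.0, 16.0, 17.0, 18.0, 19.0]
prices = [11.0, 13.0, 19.0, 21.0, 26.0]

slope, intercept = fit_line(temps, prices)

# Predict the price index for a vintage nobody has tasted yet
predicted = slope * 17.5 + intercept
print(f"slope={slope:.2f} intercept={intercept:.2f} predicted={predicted:.2f}")
```

A real model would use more predictors and far more observations, and it would need to be checked against vintages it wasn't fit on. The point is only that the prediction comes from the data rather than from anyone's palate.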

Basically, it works something like this. If an activity goes on for any length of time, patterns will start to develop. Even where things appear to be random, if you look long enough and closely enough, you'll find that they start behaving in not-so-random ways. This applies to pretty much everything, so given enough data, you can often find predictors that are surprisingly accurate.

In the past, this sort of analysis often failed for a few reasons, most of which technology is now fixing:

1- Because getting an adequate and representative set of data was often difficult or impossible, people used data that was less than optimal. Technology, however, has made data collection a lot cheaper, so there are more and more workable data sets to analyze.

2- Technology is also allowing data to be collected with much more accuracy and precision. This in turn increases the predictive value of models.

3- Increasingly, there is data available about just about anything, and because of the internet and search engines, acquiring this data is often relatively inexpensive.

In many cases though, it's hard to get a good enough data set with just a few thousand observation points (although sampling mitigates this). Because it's getting cheaper and cheaper to acquire and store data, more data can be analyzed, so standard data storage isn't good enough in many cases. Data warehousing is becoming more common, and the techniques used to analyze warehoused data are becoming mainstream enough for true 'best practices' to start being developed. And now, even individuals with virtually no budget and little more than a computer and an internet connection can do things analytically that were solely the domain of large research institutions 20 years ago. That means more people will be doing things on their own that died in the past for lack of interest, funding, or resources. It's the Army of Davids or the Open Source model, but with data.
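On the sampling point above, here is a minimal sketch, using nothing but Python's standard library, of how a modest random sample can stand in for a much larger data set. The distribution, the sizes, and the seed are all my assumptions, picked purely for illustration.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

# Hypothetical "full" data set: 100,000 made-up measurements
population = [random.gauss(100, 15) for _ in range(100_000)]

# Look at only 1,000 randomly chosen observations
sample = random.sample(population, 1_000)

pop_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)
gap = abs(pop_mean - sample_mean)

print(f"population mean: {pop_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
print(f"difference:      {gap:.2f}")
```

The sample is 1% of the data, yet its mean should land close to the full data set's mean. That only holds when the sample is drawn at random, which is exactly the "representative" problem from reason 1 above.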

My guess is that by the time you hit Chapter 2 in this book, the author will have hit you with enough evidence to make you start thinking about data warehousing and Business Intelligence a lot more seriously. The really intriguing part, though, is that the data warehousing aspect is the easy part. The more difficult part is the mathematical analysis: knowing what to run, how to run it, the weaknesses of different approaches, etc.

I remember being in junior high in Algebra 3, talking to my buddies while we bitched about all the homework. "How many times are we going to use matrix algebra in our day-to-day lives once we get out of school?" was almost a cliche and very indicative of where our heads were at. I got through it because I had to, but my heart wasn't in it and I didn't do anything more than I had to. Wisdom sounds foolish to fools, and 20 years ago, I knew at least one world-class fool.

Comments

# Theo said on September 18, 2007 6:22 PM:

Aw, shucks... I'd love to hear your comments on parents! :-)

# CC Watcher said on September 19, 2007 1:44 PM:

I know you've gone soft on the subject of Charles Mark Carroll (although someone pretending to be him sure leaves lots of comments) but I came across this and thought you'd be interested. www.charlescarroll.blogspot.com

It's from someone who went to his training courses and was pretty close to him - but who he's turned on.   It's like http://inlineasp.blogspot.com but takes a 'just the facts' approach that only a woman can provide.

# Todd Leloup said on September 20, 2007 7:30 AM:

Hi Bill,

Just read this and it reminded me of the Personal Software Process (PSP). It is a method developed by SEI and the DOD to apply this kind of data gathering and analysis to writing software. It is mainly meant to improve development time estimates and, as a side effect, quality. It does require a large amount of effort to gather the data, but if done correctly it provides some amazing results.

If you are interested: www.sei.cmu.edu/.../psp.html
