Are recommendation systems based on theory or pure data mining?

One interesting thing about economic research was the importance of theorizing about human behavior before analyzing the data (although in econ models humans are always maximizing profit and benefit and minimizing cost). It was actually “verboten” to use data mining techniques to identify a pattern and then tell the story afterward, because 1) econ has serious p-envy (physics envy) and likes to follow the scientific method of theory first, test with data next, and 2) it’s easy to come up with some convoluted “theory” AFTER finding data to support it, so the approach was not considered “rigorous.”

Well, that was ivory tower economics. But in the real world, recommendation systems, for one, have shown that using data mining to identify patterns without any preconceived theory of human behavior can be very powerful. Amazon is the company most identified with the “if you liked this, maybe you’ll like this” feature, and it’s hard for me to believe that there are robust behavioral theories that can be applied to something as multi-dimensional as book purchases.

My personal theory (at 4am) is that the reason recommendation systems were the first applications of “the community” on the web is that a lot of online stores found themselves with treasure troves of user purchasing and site browsing data, something that offline marketers would never see. So they just threw computing power at the massive data set and saw what they got.
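
To make the "no theory, just data" idea concrete, here is a minimal sketch of a recommender that only counts which items turn up in the same purchase and suggests the most frequent companions. This is my own toy illustration and not a description of how Amazon or any other store actually does it; the function names, the basket format, and the book titles are all made up for the example.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_co_purchase_counts(baskets):
    """Count how often each pair of items appears in the same basket.

    `baskets` is a hypothetical input format: a list of purchases,
    each purchase being a list of item names.
    """
    co_counts = defaultdict(Counter)
    for basket in baskets:
        # De-duplicate so buying two copies does not inflate the count.
        for a, b in combinations(sorted(set(basket)), 2):
            co_counts[a][b] += 1
            co_counts[b][a] += 1
    return co_counts


def recommend(co_counts, item, top_n=3):
    """Return up to `top_n` items most often bought together with `item`."""
    companions = co_counts.get(item, Counter())
    return [other for other, _ in companions.most_common(top_n)]


# Made-up purchase history, purely for illustration.
baskets = [
    ["dune", "foundation", "neuromancer"],
    ["dune", "foundation"],
    ["dune", "cookbook"],
    ["foundation", "neuromancer"],
]

co_counts = build_co_purchase_counts(baskets)
print(recommend(co_counts, "dune", top_n=1))
```

Note that nothing in the sketch says why people who buy one book also buy another. It has no model of taste at all, only counts, which is exactly the kind of pattern-first approach that would have been frowned upon in an econ paper.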

And I think early recommendation systems (and Walmart’s latest fiasco, reported by the NYT) show that by and large these are pure data-mining applications, as opposed to something built on a preconceived theory of human behavior or people’s tastes.

Since Greg Linden is writing about Amazon’s early days, I thought I would ask him if my half-assed conjectures are correct. I will update this post if he responds.
