Monday, August 06, 2007

Can we discover buzz patterns from Blogs?

The huge volume of Consumer Generated Media (CGM) or User Generated Content (UGC) available in the form of blogs, social networks, public wikis and other Internet-based content stores has always inspired me to look for the answer to the following question:

Can it be used to determine a trend or buzz for a specific business entity (e.g. a product like the Apple iPhone or a TV soap like Prison Break)?

A typical example is the following blog post on a restaurant named “Mainland China”:
http://bangalore.metblogs.com/archives/2006/12/dining_out_mainland_china.phtml
together with the comments that many blogosphere users left on that post.
Now the question is whether this small fragment of user generated content can act as a piece of "gyaan" that can be effectively searched and therefore seamlessly discovered on the Web.

When I searched Google for "Mainland China Bangalore" or "Dining Out: Mainland China", the above-mentioned page came up as one of the top results. Internet users run searches like this regularly, though only a few of them come up with search phrases good enough to ensure contextual results. The popular search engines cannot take into account the meta context information, which is possibly best defined by the intent with which the author wrote a specific blog post. The title of the post could possibly be taken as one important context for any keywords that we index from a blog.
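To make the idea of using the title as context a little more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the post structure (url, title, body), the tiny stop-word list and the title_weight value are all hypothetical, and real search engines rank pages in far more sophisticated ways.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "at", "is", "was", "to", "for"}

def tokenize(text):
    """Lowercase the text and return its alphanumeric words, minus stop words."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS]

def build_index(posts, title_weight=3):
    """Map each keyword to {url: score}.

    A keyword found in the title adds title_weight to the score, while a
    keyword found in the body adds 1, so the title acts as extra context.
    """
    index = defaultdict(lambda: defaultdict(int))
    for post in posts:
        for word in tokenize(post["title"]):
            index[word][post["url"]] += title_weight
        for word in tokenize(post["body"]):
            index[word][post["url"]] += 1
    return index

def search(index, query):
    """Return URLs ranked by the summed score of the query keywords."""
    scores = defaultdict(int)
    for word in tokenize(query):
        for url, score in index.get(word, {}).items():
            scores[url] += score
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda url: (-scores[url], url))
```

With this weighting, a post titled "Dining Out: Mainland China" would rank above a post that only mentions the restaurant once in passing, which is roughly the behaviour I would want from a context-aware blog index.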

All the important information posted by bloggers and other CGM providers will effectively be lost if we don't bundle it with a specific context (as in tagging) or make it searchable (indexing).

Some of us have already seen the Blog Buzz implementations. These are typically user-driven classifications and categorizations of blogs and other UGC, aimed at presenting search results better. But we can definitely improve context-driven classification of content and search up to a point where we can provide users with structured information like www.wikipedia.com provides.

I looked into existing blog search engines like

1) Technorati
2) Google Blog Search
3) BlogPulse and
4) Nielsen BuzzMetrics (www.nielsenbuzzmetrics.com)

All these sites index blogs and provide search interfaces on them, while some of them have gone one step further by providing structured trend information out of the blog content. But it looks like they have a long way to go.

Basically, my idea is to come up with a very basic implementation that does the following:

1) Crawl a set of blogs belonging to a very specific domain (e.g. restaurants or movies).
2) Index them by the business entities they primarily talk about.
3) Present the information in a review-oriented format on a brief, user-friendly UI.
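
The three steps above could be sketched roughly as follows in Python. This is not a finished design, just an outline under stated assumptions: the entity list is supplied by hand and must not be empty, a post is assigned to whichever entity it mentions most often (plain substring counting), the page text is used as downloaded without any HTML parsing, and all function names are hypothetical.

```python
from collections import defaultdict
from urllib.request import urlopen

def fetch_text(url):
    """Download one page as text (no HTML parsing, no crawl politeness rules)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")

def crawl(urls, fetch=fetch_text):
    """Step 1: fetch every URL in a hand-picked list of domain-specific blogs."""
    return {url: fetch(url) for url in urls}

def index_by_entity(pages, entities):
    """Step 2: file each page under the entity it mentions most often."""
    index = defaultdict(list)
    for url, text in pages.items():
        lowered = text.lower()
        counts = {entity: lowered.count(entity.lower()) for entity in entities}
        best = max(counts, key=counts.get)
        if counts[best] > 0:  # skip pages that mention no known entity
            index[best].append((counts[best], url))
    return index

def render(index):
    """Step 3: a plain-text listing per entity, most-mentioning posts first."""
    lines = []
    for entity in sorted(index):
        lines.append(entity)
        for count, url in sorted(index[entity], key=lambda item: (-item[0], item[1])):
            lines.append(f"  {url} (mentions: {count})")
    return "\n".join(lines)
```

A run would look like render(index_by_entity(crawl(blog_urls), ["Mainland China"])), where blog_urls is whatever list of restaurant blogs we decide to start with. Passing the fetch function as a parameter keeps the crawling step replaceable, so a proper crawler can be swapped in later without touching the indexing or presentation steps.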