Mining Online Sentiment: Can Algorithms Alone Really Tag Blog Posts Accurately?

by Steve Broback on March 14, 2008

I’ve been spending a lot of time over the past few months researching companies in the “sentiment analysis” space. When we began developing our own process for categorizing/tagging blog posts with product and/or company affinity, we discovered that most monitoring systems take one of two approaches. They either take an algorithmic approach to text mining, or use a human tagging methodology.

Bottom line — have a computer “read” the text, or have humans do it.
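To make the "computer reads the text" option concrete, here is a toy lexicon-based tagger of the kind the simplest algorithmic systems build on. This is purely an illustrative sketch, not any vendor's actual method; the word lists are invented, and real systems use far richer models.

```python
# Toy lexicon-based sentiment tagger (illustrative only).
# Counts positive vs. negative cue words and picks the larger side.
POSITIVE = {"love", "great", "excellent", "fast", "reliable"}
NEGATIVE = {"hate", "terrible", "slow", "broken", "buggy"}

def tag_sentiment(text: str) -> str:
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(tag_sentiment("I love this phone, the camera is great"))    # positive
print(tag_sentiment("Support was terrible and the app is buggy")) # negative
```

Sarcasm, negation ("not great"), and context are exactly what trips up approaches like this, which is part of why accuracy plateaus on unstructured blog text.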

I’m hearing conflicting reports about the pure algorithmic approach and its accuracy. Academic research largely attests that you can’t get much better than 80% accuracy when analyzing “unstructured” content. Others claim that the right algorithms can practically tell you a blogger’s shoe size.

Our foray into this space started when a founder of one of the more prominent (and well-funded) brand monitoring companies confided to me that their year-long initiative pursuing algorithmic sentiment detection was deemed a failure because it achieved at best 80 percent accuracy.

Technical gurus at another well-funded and well-known firm in this space confirmed in discussion the 80 percent figure for their algorithmic process.

Given their experiences, I wonder if most of these claims of highly accurate sentiment tagging using only algorithms are just PR spin.

Seth Grimes recently wrote an article on the subject that implies 80 percent is high on the scale:

“Text analytics/content management vendor Nstein reports that their Nsentiment annotator, ‘when trained with appropriate corpus, can achieve a precision and recall score between 60% to 70%.’ These are good numbers when it comes to attitudinal information. Michelle DeHaaff, marketing VP at Attensity, says that ‘getting beyond sentiment to actionable information, to cause, is what our customers want. But first, you’ve got to get sentiment right.’”
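For readers unfamiliar with the precision and recall jargon in that quote, here is how the two scores are computed. The counts below are hypothetical, chosen only to land in the 60-70% range Nstein cites.

```python
# Precision: of the posts the annotator tagged "positive", what fraction
# really were? Recall: of the genuinely positive posts, what fraction
# did it find? tp/fp/fn = true positives, false positives, false negatives.
def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    return tp / (tp + fn)

# Hypothetical evaluation: 100 posts tagged "positive", of which 65 were
# correct (tp=65, fp=35), while 30 truly positive posts were missed (fn=30).
p = precision(65, 35)  # 0.65
r = recall(65, 30)     # ~0.68
print(f"precision={p:.2f} recall={r:.2f}")
```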

We have developed a hybrid platform that provides human-level accuracy with the benefits of an automated environment. We’re doing exhaustive testing now, but we’re seeing accuracy way beyond 80 percent. Check it out here.

One company touting the algorithmic approach is SPSS. They work closely with Anderson Analytics who provides services in this space. It appears surveys are one of the main content sources they process — which seems like rather “structured” content to me. No doubt that boosts accuracy. Tom Anderson’s blog is here, and he discusses an upcoming webinar on the subject.

Relevant contributions on this subject can be found from bloggers Matthew Hurst, Stephen E. Arnold, Nathan Gilliatt, and Seth Grimes.

2 comments

1 Tom H C Anderson 03.15.08 at 8:52 am

If you are working with longitudinal data (comparing month to month, for instance, or comparing different products and brands), then extremely accurate sentiment reading isn’t necessary, as you are really looking for differences between groups. Additionally, positive and negative sentiment in trended data tend to be positively correlated, so when that correlation changes (for example, in one month for one brand negative sentiment increases while positive decreases), this signals that a possible ‘event’ is occurring which needs to be drilled down into for further investigation.
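The trend idea above can be sketched in a few lines: positive and negative mention counts usually rise and fall together with overall volume, so a month where they diverge is worth flagging. The monthly counts below are invented for illustration; this is not Anderson Analytics' actual methodology.

```python
# Invented monthly mention counts for one hypothetical brand.
pos = [120, 135, 150, 140, 110]  # positive mentions per month
neg = [ 40,  45,  50,  48,  95]  # negative mentions per month

def pearson(x, y):
    """Pearson correlation between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Flag months where positive fell while negative rose: a possible "event".
events = [i for i in range(1, len(pos))
          if pos[i] < pos[i - 1] and neg[i] > neg[i - 1]]
print("pos/neg correlation:", round(pearson(pos, neg), 2))
print("possible event months (0-indexed):", events)  # [4]
```

Here the final month drags the correlation negative and gets flagged, which is the kind of divergence the comment says should trigger a drill-down.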

However, for some of our clients in the past (such as Unilever), an extremely accurate level of sentiment was desired. Our methodology (AA-TextSM) relies on triangulation for validation, and we achieve sentiment accuracy in the high nineties in most cases when applying this technique. Because most of our projects are ad hoc in nature, the human factor is very important, so Anderson Analytics, more so than those companies focusing solely on a large volume of blog posts, usually invests the time in perfecting custom dictionaries and understanding the special relationships between words in each project.

As you mention, many survey open ends are rather structured. On the other hand, many are not. For instance, if you ask a hotel guest to rate their overall satisfaction on a 10-point scale, then ask in an open-ended question why they gave this rating, you will get anything but structured answers. Our methodology has been used on other types of data as well, though (call center logs, emails, etc.).


2 Blog Business Summit » Automated Sentiment Detection Round 2: 80% Accuracy Confirmed for Blogs and Unstructured Content 03.15.08 at 5:29 pm

[...] have more data points relevant to yesterday’s post. Bottom line: Yes, you need non-trivial human involvement to go beyond 80 percent accuracy with [...]
