5th Annual Text Analytics Summit
Over the last year, we have explored text analytics within the ForwaR&D lab at InSites Consulting. Text analytics can be defined as a set of linguistic and/or statistical techniques for extracting concepts and patterns from textual documents. In essence, it can be applied to all types of text. Within InSites Consulting, we have so far applied it to social media, open questions and research communities.
Except for a couple of ESOMAR papers and academic publications, I have not come across text analytics much at market research gatherings. I was therefore eager to share experiences with other users: How do they use text analytics to get new insights? What software do they use? How do they approach the research process? How do they sample? What visualizations are new? What about sentiment analysis? So I decided to register for the 5th Annual Text Analytics Summit.
So did I find what I was looking for? Here is a summary of my main take-home messages:
- Text analytics has many faces
It is often said that 80% of data is unstructured. It is therefore not surprising that text analytics appeals to many companies. Text mining is applicable to fraud detection and e-discovery (finding evidence for criminal cases). The technique is also often used in the development of search engines. A nice example of this is Wolfram Alpha. In this search engine you do not need to enter keywords; you can directly ask complete questions. The engine at the back end analyses the sentence with the aid of text analytics and tries to give an answer. Another application is online advertising: based on the keywords you enter in search engines, or on your profile on social networks for example, you get customized online advertising. Other uses of text analytics can be found in knowledge management and business intelligence.
- The voice of the customer
Attempts to integrate text analytics in market research mainly focus on two areas. First, like InSites Consulting, research agencies try to capture the buzz on social media in order to take spontaneous consumer feedback into account. Capturing the online buzz is, however, only the tip of the iceberg. Second, customer service departments run call centers where they collect complaints, questions, etc. from their clients. Many companies hope that text analytics can help them automate the processing of those texts and draw additional insights from them.
- Part of speech: looking for the Holy Grail?
Text analytics is much more than merely counting the words in a text. It also needs to take linguistics into account: the software needs to handle stemming (e.g. spoke is an inflected form of speak), causality (e.g. Tom hits Jan is different from Jan hits Tom), co-references (e.g. the boys are playing sports, they are very tired: they refers to the boys) and intensifiers (e.g. I am so dirty implies an amplification of dirty). Even once all of this is solved, we still need to make the computer intelligent enough to grasp sarcasm and irony.
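To make these hurdles a bit more concrete, here is a minimal Python sketch. It is my own toy illustration, not the method of any tool presented at the summit: the lookup tables and the function names (normalize, count_concepts, intensified_words) are hypothetical and hand-made, and the suffix stripping is deliberately crude. It shows that reducing words to a base form lets spoke and speaks count as one concept, that an intensifier can be spotted just before the word it amplifies, and that plain word counting cannot tell who hits whom.

```python
import re
from collections import Counter

# Hypothetical, hand-made lookup tables for illustration only.
IRREGULAR_FORMS = {"spoke": "speak", "spoken": "speak"}
INTENSIFIERS = {"so", "very", "extremely"}


def tokenize(text):
    """Split a text into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def normalize(word):
    """Map a word to a rough base form (crude suffix stripping)."""
    if word in IRREGULAR_FORMS:
        return IRREGULAR_FORMS[word]
    for suffix in ("ing", "s"):
        # Only strip when at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def count_concepts(text):
    """Count words after reducing them to their rough base form."""
    return Counter(normalize(token) for token in tokenize(text))


def intensified_words(text):
    """Return the words that directly follow an intensifier."""
    tokens = tokenize(text)
    return [nxt for cur, nxt in zip(tokens, tokens[1:]) if cur in INTENSIFIERS]


if __name__ == "__main__":
    print(count_concepts("She spoke while he speaks")["speak"])
    print(intensified_words("I am so dirty"))
    # Word counts alone cannot separate these two sentences.
    print(count_concepts("Tom hits Jan") == count_concepts("Jan hits Tom"))
```

The last comparison is the interesting one: both sentences yield identical counts, which is exactly why the software needs more linguistic knowledge than a word counter has.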
Another challenge is sentiment analysis, where we derive the attitude of customers towards a certain brand or product indirectly from the emotions people mention in their texts. First of all, the extracted emotions are often limited to positive and negative. Secondly, as soon as several brands are mentioned in a text, the computer needs to work out which emotion relates to which brand. Different algorithms try to provide an answer to all those challenges. A hybrid approach that combines different algorithms also delivers the best results.
It seems that human involvement will remain important to do the final check in case of doubt.
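As a rough illustration of the brand attribution problem, the sketch below links emotion words to the brand they stand closest to and hands the doubtful cases to a human. It is a simple word-list approach under my own assumptions, not the hybrid approach mentioned above: the word lists, the brand names Alpha and Beta, the function name brand_sentiment and the window size are all hypothetical.

```python
import re

# Hypothetical mini word lists; a real project would need far larger ones.
POSITIVE = {"love", "great", "good"}
NEGATIVE = {"hate", "bad", "awful"}


def brand_sentiment(text, brands, window=2):
    """Label each mentioned brand using emotion words within `window` tokens.

    Assumes single-word brand names. A brand whose nearby emotion words
    cancel out, or are absent, is flagged for a human check.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    results = {}
    for brand in brands:
        positions = [i for i, tok in enumerate(tokens) if tok == brand.lower()]
        if not positions:
            continue  # brand not mentioned in this text
        score = 0
        for i in positions:
            nearby = tokens[max(0, i - window): i + window + 1]
            score += sum(word in POSITIVE for word in nearby)
            score -= sum(word in NEGATIVE for word in nearby)
        if score > 0:
            results[brand] = "positive"
        elif score < 0:
            results[brand] = "negative"
        else:
            results[brand] = "needs human review"
    return results


if __name__ == "__main__":
    print(brand_sentiment("I love Alpha but Beta is awful", ["Alpha", "Beta"]))
    print(brand_sentiment("Alpha is okay", ["Alpha"]))
```

The window size matters more than you might expect: widen it to three tokens and the word love also falls within reach of Beta in the first sentence, so Beta ends up flagged for review instead of negative.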
- You are being watched
Text analytics is only the beginning. The next wave of innovation in this area will involve other types of data. Speech mining is already on its way: telephone calls to customer services are recorded and, with the aid of speech technology, transcribed into text. The next big thing is video analytics. The first cases come from retail, where the flow of customers through the shop is analyzed based on security camera images.
Back at the headquarters in Ghent, I look back on what I learnt at the conference over the last couple of days: whereas intelligent computers used to be something from science fiction movies, this summit has convinced me that they are becoming a reality. Although the applications of this new analysis technique are widespread, my ‘research heart’ beats faster when I think about the infinite possibilities it can bring to market research. We are, however, still in the early days, and many questions remain to be solved: How do you take a good sample of social media sources? What is the profile of people posting online comments? How can we present these huge amounts of information in a way that works for managers? What are appropriate sample sizes to draw valid conclusions? Within the InSites ForwaR&D lab, we are trying to pin the methodology down. When I left for Boston, I hoped to discuss the methodology in more depth with other users in the field. And although I saw interesting things in this field, the conference was a bit like a text analytics client project: you often find answers to questions you did not really ask. It was certainly an inspiring conference! I am sure the market research industry will address best practices in text analytics in the future. If anybody would like to exchange ideas already, I would be happy to join the discussion.
For more information you can contact Annelies Verhaeghe, R&D Consultant (Annelies.Verhaeghe@insites.eu)