Our Secret Sauce with Intent Analysis

Interpreting the intent behind spoken or written language is a complex skill that humans learn through experience in different social settings. This is one reason why literature is studied in universities around the world, and why academics earn doctoral degrees distilling the nuances of the works of great literary figures.

3Pillar recently teamed up with a customer that aims to disrupt the way businesses understand and respond to competitor strategies. The customer has invested years of research and thought leadership in dissecting discourse from a wide variety of industries. The outcome of this research is a model that classifies a piece of discourse as a particular strategy, whether it is one you are battling or one you are playing. The model then suggests counter strategies or assistive strategies.

The key challenge in building an automated decision system based on this model is that the computational linguistics behind intent analysis is not yet well developed or understood. The customer asked 3Pillar to build a beta version of the product to explore the possibilities and the challenges. When the 3Pillar team analyzed the model, it found that many of the model's components could be implemented with Natural Language Processing (NLP). NLP in and of itself is not the solution to intent analysis, but starting from an ontology or a model of human interaction, NLP should be the next stop.

We buzzed around armed with our Python REPLs (this is not a word yet, but it should be!) and promptly imported the venerable NLTK library and its corpora. The most useful components of the library were as follows:

  • Tokenization – Tokenizing text found on the web is notoriously difficult, and an entire ecosystem exists around scraping web pages with automated and manual processes. NLTK let us tokenize content in two steps, first into sentences and then into words, without having to handle the various sentence and word delimiters ourselves.
  • Pronunciation – The CMU Pronouncing Dictionary contains over 134,000 North American English words and their pronunciations. Phonetic analysis of the words in a sentence allowed us to find patterns such as alliteration and rhyme.
  • Part of Speech Tagging – We used the Averaged Perceptron tagger with the Penn Treebank tagset to understand the structure of sentences. Knowing whether a sentence began with a verb, contained cardinal numbers, or contained superlatives was essential for the decision system.
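
The three components above might fit together roughly as follows. This is a minimal sketch under stated assumptions, not the code we shipped: the function names `pos_features`, `alliterations`, and `analyze` are hypothetical, alliteration is approximated as two adjacent words sharing the same first phoneme, and `analyze` assumes NLTK is installed along with its tokenizer, tagger, and cmudict data packages (the exact package names passed to `nltk.download` vary between NLTK releases).

```python
from typing import Dict, List, Tuple


def pos_features(tagged: List[Tuple[str, str]]) -> Dict[str, bool]:
    """Derive simple structural flags from Penn Treebank (word, tag) pairs."""
    tags = [tag for _, tag in tagged]
    return {
        # VB, VBD, VBG, VBN, VBP and VBZ are the verb tags.
        "starts_with_verb": bool(tags) and tags[0].startswith("VB"),
        # CD marks cardinal numbers.
        "has_cardinal": "CD" in tags,
        # JJS and RBS mark superlative adjectives and adverbs.
        "has_superlative": "JJS" in tags or "RBS" in tags,
    }


def _first_sound(pronunciations: List[List[str]]) -> str:
    """First phoneme of the first listed pronunciation, stress digit removed."""
    return pronunciations[0][0].rstrip("012")


def alliterations(
    words: List[str], pron_dict: Dict[str, List[List[str]]]
) -> List[Tuple[str, str]]:
    """Return adjacent word pairs that begin with the same phoneme.

    pron_dict has the shape returned by nltk.corpus.cmudict.dict():
    lowercase word -> list of pronunciations, each a list of phonemes.
    Words missing from the dictionary are skipped.
    """
    pairs = []
    for first, second in zip(words, words[1:]):
        a = pron_dict.get(first.lower())
        b = pron_dict.get(second.lower())
        if a and b and _first_sound(a) == _first_sound(b):
            pairs.append((first, second))
    return pairs


def analyze(text: str) -> List[dict]:
    """Tokenize, tag, and extract features with NLTK (needs NLTK data)."""
    import nltk  # imported lazily so the helpers above work without NLTK
    from nltk.corpus import cmudict

    pron_dict = cmudict.dict()
    results = []
    for sentence in nltk.sent_tokenize(text):    # step 1: sentences
        words = nltk.word_tokenize(sentence)     # step 2: words
        tagged = nltk.pos_tag(words)             # Penn Treebank tags
        results.append({
            "sentence": sentence,
            "features": pos_features(tagged),
            "alliterations": alliterations(words, pron_dict),
        })
    return results
```

Keeping the feature functions free of NLTK calls means they can be exercised with hand-written tags and pronunciations, while `analyze` is the only piece that touches the library and its corpora.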

Our choice of Python ultimately enabled us to deliver the decision system prototype complete with an API built with the Flask microframework and an automated installation procedure.
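
To give a sense of how little Flask code an API of this kind needs, here is a hedged sketch. The `/analyze` route, the JSON field `text`, and the `analyze_text` function are all hypothetical; `analyze_text` is a trivial stand-in that only counts words, since the customer's decision model is not reproduced here.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def analyze_text(text: str) -> dict:
    """Hypothetical stand-in for the decision system: counts words only."""
    return {"word_count": len(text.split())}


@app.route("/analyze", methods=["POST"])
def analyze_endpoint():
    # silent=True returns None instead of raising on a missing or bad JSON body.
    payload = request.get_json(silent=True)
    text = payload.get("text") if isinstance(payload, dict) else None
    if not isinstance(text, str) or not text.strip():
        return jsonify(error="field 'text' must be a non-empty string"), 400
    return jsonify(analyze_text(text))
```

A client would POST a JSON body such as {"text": "Beat the best"} to /analyze and receive the analysis back as JSON. In a real service, the stand-in would be replaced by the NLP pipeline.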

One more successful product and a happy customer. Interested in finding out more? Contact us to hear the rest of the story!

Sayantam Dey

Senior Director Engineering

Sayantam Dey is the Senior Director of Engineering at 3Pillar Global, working out of our office in Noida, India. He has been with 3Pillar for ten years, delivering enterprise products and building frameworks for accelerated software development and testing in various technologies. His current areas of interest are data analytics, messaging systems and cloud services. He has authored the ‘Spring Integration AWS’ open source project and contributes to other open source projects such as SocialAuth and SocialAuth Android.
