Built for Scale: Dovetail

Dovetail is Pypestream’s proprietary natural language understanding (NLU) engine.

Dovetail encompasses Pypestream’s artificial intelligence capability, bringing together our machine learning and natural language understanding capabilities. In essence, it translates human input into machine language that our platform can understand, then translates that back into an appropriate response to users.

Dovetail brings together multiple AI technologies to help conversationalize a company’s customer service from a simple question, to an action, to a deep FAQ question.

We break Dovetail into a couple of different components: there’s a classifier model up front to understand a user’s utterance (their intent and the entities within their statement), and then there’s also an intelligent search on the back end.

If you ask me a question, I’m able to understand what that question is and relate it to a particular category from my own experience; a classifier does the same thing.

Dovetail is a three-layered system. Layer one is focused on transactions: basic question-and-answer exchanges like “I want to pay my bill” or “I need to do XYZ.” Layer two is focused on answering broad, generic questions, like “Can I get service on my RV?” or “What packages offer HBO?” Layer three brings those two together in a conversation. Say you’re picking a new package: you tell the system “I want to buy a new cable service,” it shows you your options and takes you through that transactional flow, and then you ask a random question like “Do you guys have a voice remote?” Dovetail is there to answer that question and then bring you back into the conversational flow, similar to a conversation with an agent.

What’s unique about Dovetail is that we’re using all these technologies together to help augment customer service. If you look at a lot of our competitors out there, they don’t do this; they use a single technology, and when it fails you have to talk to an agent. What we try to do is answer as many questions as we can with the AI to reduce the number of questions that go to agents.

Dovetail is unique in the marketplace. When you look at what other vendors and companies in the NLU space are doing to understand a user’s utterance, they’re focusing heavily on classifiers: random forests, SVMs, and other common ways of classifying an utterance. We’ve taken a step beyond that and incorporated predictive, NLU-based search.

What makes Dovetail unique is that it separates those layers in a way that is accessible to the person designing the solution.

Combining those two aspects in Dovetail allows for a much more powerful, much richer response, and a much more interactive back-and-forth with the user any time they type in a question or a statement, and it has performed very well.

The data we use to train our Dovetail framework comes from a couple of different sources. In the best-case scenario, we work with our customers to use real data to train both the classifier and the search NLU components of Dovetail. Typically that means historical chat transcripts: content that users typed in the past when interacting with a live agent. Emails can be a good source as well; anything we can refer to where users have asked the types of questions we want Dovetail to be able to answer is the kind of data we want. If our customers don’t have this data, which is often the case, we use open-source services and crowdsource information to gather high-quality utterances, things that tie directly back to the use cases that are in scope and the areas we’re looking to automate with Dovetail.

Some sources are Wikipedia articles, which help us understand basic English and the way people actually write, versus just using scholarly articles. We also use industry-standard publications, like Bloomberg for a financial services use case, which helps us understand the jargon of that industry. We take the historical data we procure, or source from the client, and use it to train models within our platform, our support vector machines or our logistic regression models, so that in the future, when a customer or user comes in and asks a question, we can intelligently apply those intents, apply that confidence, predict exactly what the user is asking, and respond intelligently.
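As a rough illustration of this kind of training, here is a minimal sketch using scikit-learn: TF-IDF features feed a logistic regression intent classifier, with an SVM swappable in the same way. The utterances and intent labels are invented examples, not real Dovetail training data, and the actual Dovetail pipeline is proprietary.

```python
# Minimal sketch: training an intent classifier on historical utterances.
# Utterances and intent labels below are invented examples.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

utterances = [
    "I want to pay my bill",
    "how do I pay my invoice",
    "can I get HBO with my package",
    "which packages include HBO",
    "my remote is not working",
    "the voice remote stopped responding",
]
intents = ["pay_bill", "pay_bill", "package_info", "package_info",
           "troubleshoot", "troubleshoot"]

# TF-IDF features feeding a logistic regression classifier; an SVM
# (sklearn.svm.LinearSVC) could be swapped in the same way.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(utterances, intents)

# Predict an intent plus a confidence score for a new utterance.
probs = model.predict_proba(["I need to pay what I owe"])[0]
best_intent = model.classes_[probs.argmax()]
confidence = probs.max()
```

The confidence score from `predict_proba` is what a downstream router can compare against other classifiers or a threshold.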

The training process is quite straightforward. We take the utterances that we source, either from our customers or from external data sources, and classify them. We use topic modeling, open-source technology that looks for key concepts and key terms within the utterances, to help our teams understand which key concepts and areas those utterances comprise. We go through that process until we have a robust, trained NLU model that performs well, achieving high accuracy and precision scores when users type in questions, so it can provide the right response back.
Over time we optimize: we look at usage, do some supervised machine learning, go back and look at where we can improve chat accuracy, and examine specific responses that failed. That’s how we retrain the model over time.

On the classifier side of things, we use multiple classifiers and actually prioritize which classifier we use based on the confidence score achieved by each one. When a user types in an utterance, we test it against six different classifiers and then pick the one we believe provides the best answer based on the confidence score achieved.
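The arbitration idea can be sketched generically: run the utterance through every classifier and keep the answer with the highest confidence. The classifier names and scores below are invented stand-ins; the six real classifiers aren’t named in the source.

```python
# Generic sketch of confidence-based arbitration across classifiers.
# Each classifier returns (intent, confidence); we keep the most
# confident answer. Names and scores here are invented stand-ins.
from typing import Callable, List, Tuple

Classifier = Callable[[str], Tuple[str, float]]

def best_prediction(utterance: str,
                    classifiers: List[Classifier]) -> Tuple[str, float]:
    """Run every classifier and return the highest-confidence result."""
    results = [clf(utterance) for clf in classifiers]
    return max(results, key=lambda r: r[1])

# Stand-in classifiers with fixed outputs, for illustration only.
svm = lambda u: ("pay_bill", 0.72)
random_forest = lambda u: ("pay_bill", 0.65)
faq_matcher = lambda u: ("package_info", 0.41)

intent, confidence = best_prediction("I want to pay my bill",
                                     [svm, random_forest, faq_matcher])
```

The same `best_prediction` shape extends to six classifiers: only the list grows.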

Now, when it comes to answering frequently asked questions, things get a little more complex, because you end up with multiple questions that are very similar to each other and may have similar meanings. For that we’ve employed more advanced techniques like deep learning and word embeddings, which allow the frequently-asked-question classifier to really dig into the meaning of the question and handle things like synonyms, dealing with those ambiguous matches much more effectively.
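One common way word embeddings help here is to average each question’s word vectors and compare by cosine similarity, so near-synonyms like “bill” and “invoice” land close together. The tiny 3-d vectors below are hand-made toys; a production system would use learned embeddings (word2vec, GloVe, or a deep model).

```python
# Sketch: matching a user question to the nearest FAQ via averaged
# word embeddings and cosine similarity. The 3-d vectors are hand-made
# toys standing in for learned embeddings.
import numpy as np

embeddings = {
    "pay":     np.array([0.9, 0.1, 0.0]),
    "bill":    np.array([0.8, 0.2, 0.0]),
    "invoice": np.array([0.85, 0.15, 0.1]),  # near-synonym of "bill"
    "remote":  np.array([0.0, 0.9, 0.3]),
    "voice":   np.array([0.1, 0.8, 0.4]),
}

def embed(text: str) -> np.ndarray:
    """Average the vectors of known words in the text."""
    vecs = [embeddings[w] for w in text.lower().split() if w in embeddings]
    return np.mean(vecs, axis=0)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

faqs = ["how do I pay my bill", "do you have a voice remote"]
query = "where can I settle my invoice"

best_faq = max(faqs, key=lambda f: cosine(embed(query), embed(f)))
```

Even though the query shares no surface words with the billing FAQ, the embedding for “invoice” pulls it toward “pay my bill” rather than the remote question.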

When we’re able to bucket a question, or a piece of data, into a class, we can then dig a little deeper and understand what entities, keywords, or key concepts are being asked about within that data. We marry our intents, or classes, with any keywords to understand truly what the user is asking and apply an answer to it.
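That marrying of intents with entities can be sketched as pairing a predicted intent with keywords pulled from the utterance to pick a specific answer. The intent names, entity lexicon, and answer table below are invented for illustration.

```python
# Sketch: combining a predicted intent with extracted entities to pick
# a specific answer. The intents, entity lexicon, and answers are
# invented for illustration.
KNOWN_ENTITIES = {"hbo", "showtime", "internet", "cable"}

def extract_entities(utterance: str) -> set:
    """Pick out known keywords (a stand-in for real entity extraction)."""
    words = {w.strip("?.!,") for w in utterance.lower().split()}
    return words & KNOWN_ENTITIES

ANSWERS = {
    ("package_info", "hbo"): "These packages include HBO: ...",
    ("package_info", "internet"): "Our internet packages are: ...",
}

def answer(intent: str, utterance: str) -> str:
    """Marry the intent with extracted entities to select an answer."""
    for entity in extract_entities(utterance):
        if (intent, entity) in ANSWERS:
            return ANSWERS[(intent, entity)]
    return "Could you tell me more about what you need?"

reply = answer("package_info", "Which packages offer HBO?")
```

The intent narrows the question down to a class; the entity narrows it down to the specific answer within that class.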

Dovetail uses classification to answer the simple questions; search is meant to take a user’s non-repeatable questions, questions that may come up very few times, and go get an answer for them. Think of search as finding a needle in a haystack, a kind of Hail Mary pass to go find the answer based on what we know. Classification is used for the things that come in every day, hundreds or thousands of times a minute. That’s where classification excels: those questions are highly repeatable and drive your business value, while search is used to augment that value.
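The division of labor described above can be sketched as a simple routing rule: try the classifier first, and fall back to search when its confidence is low. The threshold and the stand-in components below are invented; the source doesn’t state how the real handoff is tuned.

```python
# Sketch of classifier-first, search-as-fallback routing. The
# confidence threshold and stand-in components are invented.
CONFIDENCE_THRESHOLD = 0.6

def classify(utterance: str):
    """Stand-in classifier: confident only on the common question."""
    if "pay" in utterance and "bill" in utterance:
        return "pay_bill", 0.92
    return "unknown", 0.15

def search(utterance: str) -> str:
    """Stand-in intelligent search for rare, non-repeatable questions."""
    return f"search result for: {utterance!r}"

def route(utterance: str) -> str:
    intent, confidence = classify(utterance)
    if confidence >= CONFIDENCE_THRESHOLD:
        return f"handled by classifier as {intent}"
    return search(utterance)  # the "needle in a haystack" path

common = route("I want to pay my bill")
rare = route("can I get service on my RV")
```

The everyday, repeatable question stays on the cheap classifier path, while the rare one falls through to search instead of failing over to an agent.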

This is the ability to use relationships between words, a more intelligent, different way of looking at the content being asked, so we can still target the 20% of questions that fall outside the common 80% and still answer them intelligently.

The benefit of Dovetail’s approach is that it drives more responses: it can answer more questions and can carry a user through to closing a transaction without having to transfer them to an agent.

Not having to provide labeled training data for every frequently asked question is also a really big benefit, because when you’re dealing with 25, 50, or 100 different FAQs, that can turn into an exponentially larger amount of training data you’d have to provide to train a conventional classifier.

This is especially powerful when we look at transactional flows, like paying a bill, changing settings on a device, or troubleshooting a device. For all of these, the NLU side combined with the guided user flow, which together form the customer experience we build at Pypestream, is a very powerful way to build out conversational AI solutions.

Dovetail has multiple patents behind it: one covers the way we orchestrate our classification, one covers the way our search works, and another covers the method of tying those two together.

At its core Dovetail is the heart of all of the solutions that we deploy for our customers. It brings in the natural language understanding capabilities, the artificial intelligence capabilities, that allow us to really drive value within the solution.

The biggest value with Dovetail, again, is the ability to ask a question or make a statement, immediately get an answer, and then be routed into that transactional flow.

Dovetail takes in inputs from our customers’ end users, allows our customers to understand what issues their customers are asking about, and intelligently answers those customers without their ever reaching an agent.

So, NLU combined with a guided user journey, and the power of Dovetail to get the user to the right place very quickly once they start interacting with the conversational AI solution: that’s where we’ve seen the most value.
