NLP – short for Natural Language Processing – is a form of Artificial Intelligence (AI) that enables computers to understand and process natural human language. NLP is at the core of many digital solutions we use every day. In this article, I’m going to tell you more about how it works and where it can be useful from a business perspective.
In the previous NLP entry, we already explained the basics of Natural Language Processing and talked about how it works in popular customer-facing solutions. Read it to get an understanding of NLP 101.
How NLP Is Used – Business Scenarios
Even if you’re not an NLP expert, you probably know that Natural Language Processing is at the heart of many software tools we use regularly – from search engines and spam filters to translation software, chatbots, or grammar correction software. But apart from that, NLP also delivers strong benefits to internal business operations. In a sense, Natural Language Processing can be both at the frontline, directly influencing customer experiences, and operating in the background, without the client ever noticing.
Let’s look at a few examples of what the latter may look like.
Analyzing Product Reviews Automatically
In the age of rising digitalization, user reviews have become a key currency in many industries. Let’s take eCommerce, for instance. It’s no mystery that internet opinions can make or break retail businesses. That’s why user reviews have to be taken seriously.
However, for big enterprises, reading and analyzing all the relevant internet opinions may be an impossible challenge. At the same time, if a business ignores its customers’ feedback, clients may feel neglected or view the store as untrustworthy. Not to mention that the e-shop won’t even be able to measure overall customer satisfaction.
What’s more, not every internet opinion is relevant – so it’s not even worth reading. This is perfectly depicted by reviews whose ratings and comments clearly don’t match.
Natural Language Processing is a handy tool for solving these issues. Since NLP can analyze huge volumes of text, it can process user reviews and deliver actionable insights. As a result, eCommerce executives can make data-driven decisions swiftly, minimizing customer dissatisfaction and making clients feel respected.
Extracting Information from Business-Oriented Content
All business executives have to rely on textual information – emails, market analysis reports, or even press news. Sometimes the sheer volume of this written content can be overwhelming. After all, the day only has 24 hours, and emails can pile up indefinitely.
In this case, NLP has the potential to serve as an effective mechanism to extract useful information. With Natural Language Processing, business executives can get a summarized version of relevant texts, cutting the time needed to go through the raw versions. As a result, NLP can free up their time for more meaningful tasks and immensely improve their everyday operations.
Analyzing Market Trends
Depending on the business specifics, companies can end up receiving loads of data from sales departments, consultants, support centres, or even directly from the customers. Such data is mostly textual – as a result, it’s also a great NLP automation candidate.
However, unlike in the previous examples, Natural Language Processing doesn’t have to be limited to producing text summaries and insights. In this theoretical business scenario, a useful option would be to classify the textual information into meaningful topic clusters – for example into marketing mixes (4Ps), or simple internal classes.
All NLP Business Scenarios Have One Thing in Common
The above scenarios are only exemplary cases of automatic information extraction in business. Depending on the business need, the final automation goals may be completely different. But their main focus will always be identical: identifying and highlighting crucial information in an ocean of raw text.
And now I’m going to tell you more about the technical aspects behind this process.
NLP Techniques – How to Extract Information from Text
There are numerous techniques that empower NLP. All of them fall within the field of Natural Language Understanding and are widely used in Natural Language Processing.
To give you a broad overview, I’m going to list the most popular ones, presenting how they might be used in specific situations.
Topic modelling provides information about a text’s topic (if that is unknown). Topic modelling is also sometimes called “text categorization”. The technique can be powered by several algorithms; of them, LDA (Latent Dirichlet Allocation) is among the most widely used.
LDA is an unsupervised algorithm. In other words, it detects topics without prior learning on examples. However, this also means the text gets divided into abstract topics. As a result, LDA is not always useful: the topics get numbered, not named (naming is impossible in unsupervised methods). This means the results need to be reviewed and the topics identified manually. This can be tricky, especially if you’re analyzing short texts; in that case, short text topic modelling techniques (like the Dirichlet Mixture Model) bring better results. Unfortunately, some manual analysis is still required.
After topics are clustered, the defined topics need to be assigned to real groups. For instance, you can end up with 20 topics and have 4 categories to accommodate them; you need to decide manually where each topic belongs. After building such a model, you can pass any new text through it and automatically assign the text to one (or more) topics.
It’s also possible to use semi-supervised learning processes – where you usually anchor the model initially. This is possible if you know which words are most significant for a given topic (for instance, if your topic is “price” then the words “price”, “USD”, “lower”, “increase” might be significant).
Finally, you can use supervised learning processes. However, they usually require a lot of labelled examples. Several different ML algorithm types can help – for instance, neural networks.
Named Entity Recognition (NER)
NER makes it possible to identify important entities mentioned in the text. It’s usually used to find company names, brands, country names, people’s names, or other important phrases. Typically, NER algorithms are pretrained and show results specific to the dataset they were trained on. As a result, some named entities will not be detected; the entity might not have been known or identified during training.
So, to make the algorithm work properly, you should train the existing model further. As a result, you will enable it to recognize and categorize entities properly – for instance, to differentiate between actors’ and singers’ names. Pretrained models usually return some predefined categories; training on top of them lets you adjust the categories if you need to. What’s more, you can add your own named entities – the ones important to your business – so the model can find them as well.
You should also be ready for text containing misspellings. Such errors tend to be challenging for statistical models. When training a model, you can implement methods to detect these misspellings using metrics like Levenshtein distance. If you expect your texts to contain a lot of mistakes (user reviews?), such an implementation is essential.
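As an illustration of the misspelling problem, here is a small, standard-library-only sketch that matches misspelled brand mentions against a known entity list using Levenshtein distance. The brand names and the distance threshold are made-up examples; a real system would layer this on top of a trained NER model rather than replace one.

```python
# Illustrative sketch: fuzzy entity matching with Levenshtein distance.

def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution
            ))
        prev = curr
    return prev[-1]

def fuzzy_find(token, entities, max_dist=2):
    """Return the closest known entity within max_dist edits, or None."""
    best = min(entities, key=lambda e: levenshtein(token.lower(), e.lower()))
    return best if levenshtein(token.lower(), best.lower()) <= max_dist else None

print(fuzzy_find("Samsnug", ["Samsung", "Sony", "Apple"]))  # → Samsung
```

With `max_dist=2`, a transposed pair of letters (“Samsnug”) still resolves to the intended entity, while unrelated words return `None`.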
Part-of-Speech Tagging (POS)
POS tagging detects which part of speech a given word represents in a sentence. This is important for the word’s meaning: for instance, if a word is an adjective, it may describe a named entity detected earlier, and this description may matter in the context of the sentence. Of course, not every part of speech will be interesting for information extraction; that’s exactly why POS tagging is important – it allows us to focus on the relevant parts of textual data.
You can, for instance, extract all nouns from a given text, defining the subjects and objects. As a result, you can quickly evaluate what is mentioned in the text without reading it fully. You can also pair adjectives with nouns, defining how the objects are perceived.
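The adjective–noun pairing above can be sketched in a few lines. This assumes the (word, tag) pairs have already been produced by some POS tagger (spaCy, NLTK, etc.); the hand-labelled sentence and the “pair each adjective with the next noun” heuristic are simplifying assumptions for illustration.

```python
# Sketch: pairing adjectives with the nouns they precede,
# given (word, tag) pairs from any POS tagger.

tagged = [
    ("The", "DET"), ("new", "ADJ"), ("phone", "NOUN"),
    ("has", "VERB"), ("a", "DET"), ("terrible", "ADJ"),
    ("battery", "NOUN"), (".", "PUNCT"),
]

def adjective_noun_pairs(tagged_words):
    """Pair each adjective with the next noun (a naive heuristic)."""
    pairs, pending = [], []
    for word, tag in tagged_words:
        if tag == "ADJ":
            pending.append(word)
        elif tag == "NOUN":
            pairs.extend((adj, word) for adj in pending)
            pending = []
    return pairs

print(adjective_noun_pairs(tagged))
# → [('new', 'phone'), ('terrible', 'battery')]
```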
Syntactic parsing is connected with the ability to define how different words in the sentence are connected together. Let’s consider a simple sentence:
“Mark has bought beautiful apples”.
In this example, “Mark” is the subject, while the word “beautiful” describes the apples, and not Mark.
By only running POS tagging, we could end up with the impression that “beautiful” is a description of Mark. Syntactic parsing saves us from making such a mistake.
Moreover, some sentences are not clear enough for POS tagging. For example:
“Mark eats apples” or “Apples eat Mike” have the same POSs, but the sentences have completely different meanings, with the second one being absurd. Luckily, syntactic parsing is able to tell the real dependencies between words.
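Here is a sketch of how a dependency parse disambiguates the two sentences. The parses are hand-written triples in the (token, head, relation) form that parsers such as spaCy produce; in practice the parser supplies them.

```python
# Sketch: reading subject and object out of a dependency parse.

def subject_object(parse):
    """Return (subject, object) from (token, head, relation) triples."""
    subj = next(tok for tok, head, rel in parse if rel == "nsubj")
    obj = next(tok for tok, head, rel in parse if rel == "obj")
    return subj, obj

mark_eats = [("Mark", "eats", "nsubj"), ("eats", None, "root"),
             ("apples", "eats", "obj")]
apples_eat = [("Apples", "eat", "nsubj"), ("eat", None, "root"),
              ("Mike", "eat", "obj")]

# Identical POS sequences, but the dependencies reveal who eats whom.
print(subject_object(mark_eats))   # → ('Mark', 'apples')
print(subject_object(apples_eat))  # → ('Apples', 'Mike')
```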
Semantic parsing identifies the real meaning of a phrase or sentence. Semantic parsing is usually connected with finding similarities between words (from different sentences). For example, semantic parsing is able to tell that the word “pizza” is close in meaning to “fast food” or “pasta”. This is usually possible by using word embeddings – i.e. methods of converting words/phrases into vectors and measuring the distance between them.
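The embedding idea can be shown with cosine similarity over toy vectors. The 3-dimensional vectors below are made up for illustration; real embeddings have hundreds of dimensions and come from trained models (word2vec, GloVe, and the like).

```python
# Sketch: words as vectors, with cosine similarity as closeness in meaning.
import math

toy_vectors = {
    "pizza":     [0.9, 0.8, 0.1],
    "fast food": [0.8, 0.9, 0.2],
    "invoice":   [0.1, 0.2, 0.9],
}

def cosine(u, v):
    """Cosine similarity: 1.0 means identical direction."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

print(cosine(toy_vectors["pizza"], toy_vectors["fast food"]))  # close to 1
print(cosine(toy_vectors["pizza"], toy_vectors["invoice"]))    # much lower
```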
Coreference resolution is a method of identifying different words that reference the same objects. For instance:
“Angela lives in Boston. She is happy there.”
“She” means “Angela”, while “there” translates to “Boston”.
Coreference resolution is usually based on pretrained neural networks. It’s able to return valuable information when it comes to long texts. Imagine that you are looking for sentences that describe your new brand. If you search for sentences that directly include your brand name (using Named Entity Recognition), you can easily omit sentences where it’s referenced by using a pronoun. Thus, coreference resolution extends your capability of finding useful information.
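As the article notes, production coreference systems use pretrained neural networks; the deliberately naive sketch below only illustrates the idea, linking each pronoun to the most recently seen entity of a matching type. The pronoun-to-type table and entity labels are simplifying assumptions.

```python
# A toy coreference resolver: each pronoun is linked to the
# most recent entity of the matching type (naive heuristic).

PRONOUNS = {"she": "PERSON", "he": "PERSON", "there": "LOCATION"}

def resolve(tokens, entities):
    """entities: {token: type}. Returns {pronoun_index: entity}."""
    resolved, last_seen = {}, {}
    for i, tok in enumerate(tokens):
        if tok in entities:
            last_seen[entities[tok]] = tok
        elif tok.lower() in PRONOUNS:
            wanted = PRONOUNS[tok.lower()]
            if wanted in last_seen:
                resolved[i] = last_seen[wanted]
    return resolved

tokens = "Angela lives in Boston . She is happy there .".split()
entities = {"Angela": "PERSON", "Boston": "LOCATION"}
print(resolve(tokens, entities))
# → {5: 'Angela', 8: 'Boston'}
```

Real models handle far harder cases (multiple candidates, nested mentions), but the output shape – pronoun positions mapped to entities – is the same kind of information you would feed into the brand-mention search described above.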
Relationship extraction is built on top of semantic parsing and enables identifying the relationships in a given text. For instance, we can have a text about someone’s marriage; relationship extraction algorithms allow us to get the information about who is married to whom. Relationship extraction is usually a complex algorithm operating on a large dataset.
Sentiment analysis is a powerful tool to detect the sentiment of a given sentence. You can obtain the information in many forms, but pure sentiment (negative, neutral, positive) or polarity (usually from -1 to 1, continuous range) are the most popular ones. Polarity provides more depth – for example, the polarities 0.65 and 0.98 both mean “positive sentiment”, but they’re clearly not identical.
However, analyzing sentiments tends to be more complex than it appears.
The same information can be positive or negative, depending on which entity it applies to. It all lies in the eye of the beholder. For example, if a sentiment is positive for your direct competition, it’s rather negative information from your perspective.
What’s more, sentences can have mixed meanings. The sentence “Company A provides their products to the customer, and they are much worse than Company B” mentions two entities and has an overall negative sentiment. But clearly, this sentiment is not aimed at Company B. And you can’t simply split the sentence in two, because the overall meaning would be lost.
But even with these issues, sentiment analysis provides valuable insights into textual information. You can, for instance, validate the ratings in your e-store with the textual content of the comment section.
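To make polarity concrete, here is a minimal lexicon-based scorer returning values in roughly the -1 to 1 range discussed above. The tiny word lists are made-up assumptions; real sentiment analysis uses trained models or rich lexicons (e.g. VADER) and handles negation, which this sketch deliberately ignores.

```python
# Minimal lexicon-based polarity sketch (roughly -1 to 1).

POSITIVE = {"great", "good", "excellent", "happy", "love"}
NEGATIVE = {"bad", "worse", "terrible", "awful", "hate"}

def polarity(text: str) -> float:
    """Average the +1/-1 votes of known sentiment words."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    score = sum((w in POSITIVE) - (w in NEGATIVE) for w in words)
    hits = sum(w in POSITIVE or w in NEGATIVE for w in words)
    return score / hits if hits else 0.0

print(polarity("Great product, I love it!"))       # → 1.0
print(polarity("Terrible support, worse than B"))  # → -1.0
print(polarity("It arrived on Monday"))            # → 0.0 (neutral)
```

A continuous score like this is what lets you compare, say, a 0.65 review against a 0.98 one, or cross-check star ratings against comment text as suggested above.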
Text can be summarized automatically with NLP methods. Multiple techniques (e.g. TF-IDF) can achieve this with relatively good results. Yet, they require quite large datasets and continuous text rather than simple, short comments.
There are plenty of techniques providing text summarization, including very sophisticated ones. Moreover, these methods can be used for both long and short texts, using different approaches. The two most widely used ways of summarizing text are called abstractive and extractive summarization. Extractive summarization tries to identify the most important parts of the text and provides a summary based on identified sentences. Abstractive summarization tries to interpret the text in a new way and delivers completely new content – assuming that this new content summarizes the original one.
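Extractive summarization can be sketched with plain word-frequency scoring: score each sentence by how frequent its words are across the document, and keep the top sentences in their original order. The stopword list and example text are illustrative; TF-IDF weighting and proper tokenization would refine this considerably.

```python
# Sketch of extractive summarization via word-frequency sentence scoring.
from collections import Counter

STOPWORDS = {"the", "a", "is", "and", "of", "to", "in", "it", "was"}

def summarize(text: str, n_sentences: int = 1) -> str:
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    words = [w.lower() for s in sentences for w in s.split()
             if w.lower() not in STOPWORDS]
    freq = Counter(words)

    # Score each sentence by the summed frequency of its content words.
    def score(s):
        return sum(freq[w.lower()] for w in s.split()
                   if w.lower() not in STOPWORDS)

    top = sorted(sentences, key=score, reverse=True)[:n_sentences]
    # Keep the selected sentences in their original order.
    return ". ".join(s for s in sentences if s in top) + "."

text = ("The new phone has a great battery. The battery lasts two days. "
        "Shipping was slow. Overall the phone and battery impress.")
print(summarize(text, n_sentences=2))
```

Frequent content words (“phone”, “battery”) pull their sentences to the top, while the off-topic shipping remark is dropped; abstractive summarization would instead generate new sentences, typically with large pretrained models.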
NLP Techniques – Extraction Methods in Action
Most of the mentioned techniques can be combined to deliver the most precise insights. As an example, imagine the following set-up, which could aim to extract information from comments:
- Use NER to identify entities.
- Use POS tagging to find nouns, proper names, and adjectives (other entities may fall into this category as well).
- Use syntactic parsing to obtain relations between the named entities and their descriptors.
- Use sentiment analysis to identify the comment’s sentiment and associate it with the named entities.
- Use topic modelling to associate the given comments to the predefined topics.
- Use text summarization, either in the form of a text summarization algorithm or by extracting the most important phrases and their associations.
NLP – Benefits
The information that could be obtained through our exemplary setup can then be used in many different ways:
- Present the named entities found in a given set of sentences.
- Show the sentiment analysis for sentences containing the given named entities (or the overall sentiment distribution).
- Present the associations for the given phrase/word/named entity.
- Allow saving undetected phrases as phrases to be detected later (in other words, mark something that should be found in a later analysis).
- Show the comments connected with a given named entity and sentiment.
- Group the found named entities and allow analyzing their associations.
- Analyze trends – for instance, an entity’s sentiment over time or a sudden drop in sentiment for a given entity (like your brand). This can be user-defined.
- Analyze incoming comments and detect the previously found correlations (e.g. your brand + a negative sentiment) and notify the interested parties. The correlations can be user-defined.
- Visualize the topics to present the most frequently used words inside a given topic, the sentiment distribution in the topic, or the sentiment assigned to a given named entity in the topic. You can of course think of many other visualizations obtainable from the extracted information.
And many more.
NLP – Tools
As described in our first article of the series, there are multiple tools able to perform the above analyses. You can use Cloud services (like Amazon Comprehend) or the available NLP libraries (like spaCy). Comprehend provides an easy-to-use API – but due to its closed form, it’s not fully customizable; however, it delivers a lot of information and enables retraining.
SpaCy, on the other hand, supports almost unlimited customization and provides a lot of tools and pipelines on top of itself, enabling additional analyses (this is also a way to avoid multiple cloud API requests for a given set of data). SpaCy provides a set of pre-trained models of great quality and enables large scale calculations.
Finally, it’s important to remember that the specific tools themselves are not the key component. There are a lot of them out there; the priority is to identify the processing needed to reach the business goals.
Natural Language Processing – Final Thoughts
Remember that from the business perspective it’s crucial to plan how the data analysis should be conducted and how you’re planning to use the output. This means answering the following questions:
- How could the future machinery be fed with data, what data sources should be planned, and what constraints should the data processing pipelines meet?
- What business role would consume the outputs, and in what form (predicted values or business insights that support decision making)?
- What could the end-to-end business process of using the data look like?
- What are the current expectations regarding the output data?
- How could the data be presented to deliver maximum business value?
Working on business scenarios is an essential part of the whole process – together with the actual data processing stream.
And if you'd like to take full advantage of tailored NLP solutions, the Innovation Lab is at your service.