Conversational AI Glossary

The conversational AI world is full of highly technical jargon that can be confusing even for seasoned IT professionals. To help you navigate it, we have put together this conversational AI glossary of the most relevant terms.

A

Agent Assist

Agent assist is a strategy that uses an artificial intelligence bot to help human agents efficiently resolve customer questions and concerns. Agent assist is easy to integrate with an existing customer service support system; when properly utilized, agent assist can result in significant cost savings, increased agent productivity, and increased customer satisfaction. 

Agent assist, also known as agent support, provides agents with the information they need to resolve customer requests quickly and consistently. When a customer begins a live chat with an agent, the agent assist bot can monitor the conversation, recognize customer questions, and suggest answers to common questions from a specified template or information base. These features enable many common service requests such as account management and order tracking to be almost completely automated, which allows agents to process requests more quickly and focus on more complex issues that require personalized assistance.  

Agent assist is also invaluable for training agents. The tool helps agents get familiar with new products and services quickly, and it ensures that routine questions are accurately answered. Agent assist helps businesses seamlessly transition between agents and ensures that customer satisfaction is not disrupted in the process. Streamlined agent training, efficient use of resources, and increased customer satisfaction make agent assist a powerful tool to increase business profitability and enable scalability. 

Agent Handover

Agent handover is the process by which an agent-assist tool hands off a conversation from a bot to a human agent. Typically, the agent handover process is designed to ensure that conversations are handed off in certain scenarios related to user preference, user feedback, and issue complexity or criticality.

User preference and feedback are crucial variables to consider in order to maintain customer satisfaction. If a user asks for a human agent or expresses frustration, the agent handover process should be initiated. Similarly, if the bot is unable to resolve an issue or is faced with a high-stakes issue, the issue should be handed off.

For the agent handover process to be effective, the bot must be able to recognize its limitations and be intelligent enough to identify situations that require handoff. One way of achieving this is to train the bot to recognize keywords, phrases, or patterns that should trigger a handoff. In addition to training the bot, it is also common to include a user-driven handoff option after each message (e.g. a “chat with agent” button).
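The handoff triggers described above (an explicit request, expressed frustration, an unresolved issue, or a high-stakes issue) can be sketched as a simple rule check. This is a minimal illustration, not how any particular product implements handover: the phrase lists, the `ConversationState` fields, and the `max_failed_attempts` threshold are all hypothetical, and a production bot would more likely use trained intent and sentiment models than substring matching.

```python
from dataclasses import dataclass

# Hypothetical trigger phrases; a real deployment would tune or learn these.
HANDOVER_PHRASES = ("real person", "speak to an agent", "talk to an agent", "human agent")
FRUSTRATION_PHRASES = ("this is useless", "not helping", "frustrated")


@dataclass
class ConversationState:
    failed_attempts: int = 0   # consecutive turns the bot could not resolve
    high_stakes: bool = False  # assumed to be set by an upstream classifier


def should_hand_over(message: str, state: ConversationState,
                     max_failed_attempts: int = 2) -> bool:
    """Return True when the conversation should move to a human agent."""
    text = message.lower()
    if any(phrase in text for phrase in HANDOVER_PHRASES):
        return True  # user preference: explicit request for a human
    if any(phrase in text for phrase in FRUSTRATION_PHRASES):
        return True  # user feedback: expressed frustration
    if state.high_stakes:
        return True  # issue criticality
    # issue complexity: the bot has repeatedly failed to resolve the request
    return state.failed_attempts >= max_failed_attempts
```

For example, `should_hand_over("Can I talk to an agent please?", ConversationState())` returns `True`, while a routine question with no failed attempts does not trigger a handoff.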

Amazon Connect

Amazon Connect is a cloud-based contact center service that was launched in 2017 as an Amazon Web Services (AWS) product. Amazon Connect is designed for omnichannel use cases and is based on the same technologies that Amazon uses for its customer service. The service uses a graphical user interface, which enables non-technical users to easily set up and manage it.

Amazon Connect works for organizations of any size and can easily scale up or down to meet short-term business demands. Amazon Connect can support tens of thousands of agents and uses a pay-as-you-go model. It enables businesses to set up contact centers with just a few clicks and allows third-party applications such as Cognigy.AI to be seamlessly integrated via its comprehensive API features.

AudioCodes

AudioCodes is a global company that specializes in voice technologies. AudioCodes operates in over 100 countries and is a vendor for 50 of the Fortune 100 companies. Products and services offered by AudioCodes include IP phones, media gateways, routing applications, session border controllers, and more. AudioCodes technologies are used in many popular applications such as Microsoft Teams and Skype for Business.

In 2018, AudioCodes released Voice.AI Gateway, which utilizes the company’s speech recognition technology, call recording, and artificial intelligence. Its cognitive voice-based applications can integrate with private and/or public voice networks and services.

Cognigy and AudioCodes have partnered to offer Voice.AI Gateway as a conversation management solution to facilitate intelligent voice conversations, handle service transactions, and provide detailed analytics to streamline business processes.

Automated Speech Recognition (ASR)

Automated speech recognition (ASR) is the process by which machines recognize spoken human language. The process involves using algorithms to translate human speech into a sequence of text that the machine can understand. High-performing ASR is a key feature for any technology that aims to enable voice-based communication between humans and machines.

Automated speech recognition has a wide range of applications across various industries, and many people use ASR daily. Voice-prompted customer support lines, voice command systems in cars, and voice-activated smart home devices are among the most familiar technologies that rely on ASR. However, ASR also has many lesser-known applications, including automatic language translation and automatic subtitle generation for the hearing impaired.

Avaya

Avaya is a global company that specializes in communication technologies, specifically contact centers, unified communications, and related services. Avaya is the global leader for these services; more than 90% of the largest US companies are Avaya customers. Avaya strives to take business communications to the next level through technologies that are built to connect organizations to their employees, customers, and communities.

One of Avaya’s well-known products is the Avaya Oceana Solution. Oceana is a contact center solution that enables organizations to interact with customers across all types of channels, including but not limited to email, mobile, web, social media, voice, and video. Oceana includes an analytics framework, a browser-based desktop client, and features that enable users to build specialized clients and visual process workflows.

Cognigy.AI seamlessly integrates with the Avaya technology stack and enables contact center automation through deploying powerful virtual agents based on conversational AI.

Average Handle Time (AHT)

Average handle time (AHT) is a metric that service centers use to measure the average amount of time agents spend on each transaction. AHT is calculated for a given time period by adding the total talk time and total post-call tasks and dividing the sum by the total number of calls. AHT may also be used as a metric for other service activities, such as emails or chat support (but may be calculated slightly differently depending on the task).
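As a minimal sketch of the calculation just described, the function below adds total talk time and total post-call task time and divides by the number of calls. The function and parameter names are illustrative, and the choice of seconds as the unit is an assumption.

```python
def average_handle_time(total_talk_seconds: float,
                        total_post_call_seconds: float,
                        total_calls: int) -> float:
    """Average handle time in seconds per call for a given period."""
    if total_calls <= 0:
        raise ValueError("total_calls must be a positive integer")
    # (total talk time + total post-call tasks) / total number of calls
    return (total_talk_seconds + total_post_call_seconds) / total_calls


# Example: 100 calls with 30,000 s of talk time and 6,000 s of post-call work
print(average_handle_time(30_000, 6_000, 100))  # 360.0 seconds, i.e. 6 minutes per call
```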

AHT is one of the most important performance indicators for a service center. While a low AHT is desirable, it is important for businesses to focus on the right variables to lower AHT. If a goal is set to minimize AHT in general, it often results in agent behavior that causes decreases in customer satisfaction, such as rushing callers or providing mediocre solutions that result in repeat calls. Instead, more specific goals should be set around improving agent knowledge and performance, which organically results in decreased AHT. For example, organizations should prioritize agent training, creation of shared knowledge bases, and investment in tools that can streamline support. Conversational AI can be a key component to reduce AHT without sacrificing customer satisfaction.

Auto dialer

An auto dialer is an automation system whose primary function is to automatically dial phone numbers. When the call is answered, the auto dialer can connect it to a live agent, play a pre-recorded message, or hand it to a Conversational AI for outbound communication handling.

An auto dialer is often referred to as a power dialer or a predictive dialer, depending on the features it includes. While a power dialer simply dials a pre-configured number of lines once an agent finishes a call, a predictive dialer uses real-time analysis to determine the optimum time to dial more numbers.

Automatic call distributor

The primary functions of an Automatic Call Distributor (ACD) are receiving incoming calls and distributing them to a particular group of agents, a particular team, or an IVR menu based on preset conditions. ACDs are the very essence of call centers. IVR (Interactive Voice Response) is a subset of an ACD system.

ACDs help in situations where call traffic is high, agents are offline, or inbound calls arrive after business hours, and they can automate responses to FAQs.

The newest ACDs also use natural language IVR and Conversational AI functionality to handle incoming calls automatically and try to resolve them with an automated voice conversation.
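To make "distributing calls based on preset conditions" concrete, here is a small, hypothetical routing sketch. The business hours, the topic-to-team mapping, and the `ivr-menu` fallback are invented for illustration and do not describe any specific ACD product.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class IncomingCall:
    topic: str          # e.g. collected from an earlier IVR selection
    received_at: time   # local time at which the call arrived


# Hypothetical preset conditions
BUSINESS_HOURS = (time(9, 0), time(17, 0))
TOPIC_TO_GROUP = {"billing": "billing-team", "support": "support-team"}


def route_call(call: IncomingCall, available_agents: dict[str, int]) -> str:
    """Return the destination for a call: an agent group or the IVR menu."""
    opens, closes = BUSINESS_HOURS
    if not (opens <= call.received_at < closes):
        return "ivr-menu"  # outside business hours
    group = TOPIC_TO_GROUP.get(call.topic)
    if group is None or available_agents.get(group, 0) == 0:
        return "ivr-menu"  # unknown topic, or no agent online in that group
    return group
```

A billing call at 10:30 with three billing agents online is routed to `billing-team`; the same call at 20:00 falls back to `ivr-menu`.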

B

Business Process Management (BPM)

Business process management (BPM) is a method by which organizations create, maintain, and update their processes. The goal of BPM is to output efficient processes that can evolve to meet business needs and market demands.

BPM consists of several cyclical phases. First, a process must be designed and modeled; the process should be broken into discrete tasks and put into a visual framework that identifies required data and how the tasks relate to each other (e.g. a flowchart). The process should then be implemented, preferably on a small scale at first to work out any process issues. Once a process has been fully rolled out, it should be monitored for performance by using metrics to measure quality, efficiency, bottlenecks, etc. Gathered metrics can then be used to further optimize the process. Optimization may involve incorporating tools or process automation, often powered by conversational AI.

Benefits of BPM include cost optimization, process efficiency and scalability, and increased productivity. It is an ideal management strategy for agile companies who want to constantly improve their processes and products.

Blended Agent

There are two types of calls at a contact center, inbound and outbound, and separate agents are assigned to handle each. Sometimes, when the workload is tilted to one side, agents who have been trained to handle both types of calls are switched over to handle the extra workload on the other side. These agents are known as blended agents. They help reduce queues and increase both customer and agent satisfaction.

Bot Framework

Bots enhance customer interaction by giving a near-human or an intelligent-machine experience to customers. A bot framework is a system within which bots are developed with defined behaviors. It is a set of assets that can help developers code more efficiently and create bots faster. A conversational platform expands on the abilities that a bot framework provides. A framework only offers the tools to configure or program a bot, but a fully featured conversational AI platform offers a holistic feature set to manage, run and deploy chat and voice bots at scale.

Barge-in

Barge-in is a feature of Cognigy Voice Gateway which is activated when a user ‘barges in’ or interrupts a bot’s message by speaking. Once this happens, the bot’s response is terminated and it starts listening to the user’s speech. The minimum word count required for a user’s speech to be detected as a barge-in can be configured. Below this word count, the user’s speech cannot activate the barge-in feature.

By default, the barge-in feature is disabled, and the Cognigy Voice Gateway ignores a user’s speech input during playback and considers only speech uttered after the bot’s playback is over. When this feature is enabled, user speech detection is constantly turned on. Once the user starts to speak and interrupts the bot, the bot immediately stops the playback, detects the user’s speech, and discards its queue of additional text messages. This feature is administrator-controlled, but the bot can also control it dynamically in the course of the conversation.

The option to enable or disable barge-in gives users the flexibility to use the bot's playback in the manner best suited to each use case.
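The behavior described in this entry can be sketched in a few lines. This is an illustration of the logic only, not the Cognigy Voice Gateway API: the function names, the `min_words` parameter, and the representation of the message queue as a list are all assumptions.

```python
def is_barge_in(partial_transcript: str, barge_in_enabled: bool,
                min_words: int = 1) -> bool:
    """Decide whether the user's speech counts as a barge-in."""
    if not barge_in_enabled:
        return False  # disabled: speech during playback is ignored
    # Speech below the configured minimum word count does not trigger barge-in.
    return len(partial_transcript.split()) >= min_words


def remaining_messages(partial_transcript: str, queued_messages: list[str],
                       barge_in_enabled: bool, min_words: int = 1) -> list[str]:
    """Return the bot messages still queued after the user's speech."""
    if is_barge_in(partial_transcript, barge_in_enabled, min_words):
        return []  # playback stops and the queue of further messages is discarded
    return list(queued_messages)
```

With `min_words=2`, a single word such as "wait" leaves the queue untouched, while "wait a moment" interrupts playback and clears it.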

C

Chatbot

A chatbot is a software application that enables machines to communicate with humans in written natural language. A well-designed chatbot "understands" human communication and can respond appropriately. Machine learning can be used to make bots handle more complex applications that require the chatbot to understand the nuances of human conversation.

Studies have shown that consumers increasingly prefer to communicate via messaging applications, and many expect to be able to communicate with businesses on a messaging platform.

Businesses have much to gain from using chatbots. Chatbots allow businesses to engage with multiple customers simultaneously without requiring valuable human resources, which results in cost savings, increased efficiency, and scalability. Chatbots also have the potential to improve customer experience and satisfaction by quickly resolving issues and streamlining communication with the business.

Chatbot Platform

A chatbot platform is a software tool to create, publish and maintain Conversational AIs. It provides a central place to power and orchestrate a workforce of chat or voice bots.

Many enterprise organizations opt for a chatbot platform strategy to avoid siloed Conversational AI initiatives across departments. This enables more efficient development and maintenance, better governance, synergies between use cases, better scaling, better compliance and data protection, and more.

Cloud-Native

Cloud-native is a broadly used term describing applications optimized for cloud environments and the software development approach by which those applications are designed. The defining feature of cloud-native applications is how they are created and deployed. Cloud-native applications are typically created using a microservices approach and deployed in containers using open source software stacks. The microservices approach results in applications that are composed of small, independent, loosely coupled services.

Cloud-native applications have a significant edge over traditional applications because they are flexible, scalable, and designed to work within an agile framework. Developers can easily update cloud-native applications based on changing business needs and market demands. System downtime is minimized, and product time-to-market is optimized, resulting in an improved user experience.

Software that is designed to be cloud-native is not necessarily a cloud or SaaS offering. Cloud-native applications can also be operated on-premises or in private cloud environments, providing similar advantages in uptime, scalability, and other metrics.

Contact Center

A contact center is an office designed to receive and transmit various types of communication such as calls, emails, social media, letters, live web-based chat, etc. Contact center operations may be inbound, outbound, or both; inbound contact centers typically handle customer service issues, while outbound contact centers typically handle marketing and data collection.

A contact center is a crucial piece of infrastructure for any large company that routinely handles customer service requests. Having a centralized, designated office to manage customer interactions streamlines customer service efforts and often results in improved customer outreach and quicker resolution of customer concerns. Technology for contact center automation and the deployment of voice bots can increase contact center efficiency and help provide customers with a frictionless service experience.

In recent years, technology has allowed the creation of virtual, cloud-based contact centers. In this model, a business opts to pay a vendor to host the equipment instead of having a centralized office; agents connect to the equipment remotely. Virtual contact centers allow employees to work remotely, which can result in cost savings for the business and greater staffing flexibility.

Conversational AI

Conversational AI is a branch of artificial intelligence that utilizes software and technologies such as natural language processing, machine learning, and automatic speech recognition to facilitate communication between a human and a machine. The goal of conversational AI is to mimic human conversation; to effectively do this, the AI must sound natural and be capable of responding rapidly and intelligently. A high-quality conversational AI should be able to offer responses that are indistinguishable from human responses.

Conversational AIs come in many different forms such as chatbots, messaging apps, and digital assistants. Most people interact with conversational AIs in their daily lives; Google Home, Amazon Alexa, and customer service applications are all types of conversational AI.

Many businesses have recognized the potential for conversational AI to revolutionize the way they interact with their customers. A well-designed conversational AI can provide a personalized user experience and result in significant cost savings for a business over time. Airline carriers, retailers, healthcare providers, and financial institutions are just a few examples of sectors that use conversational AI to help resolve consumer problems and automate customer support.

Many studies predict that conversational AI will become increasingly important in upcoming years. Conversational AI platforms are often seen as easier and faster than in-person communication and phone calls. Younger generations seem to favor conversational AI, and many consumers now expect to be able to communicate with businesses via chat platforms and their preferred messaging apps such as WhatsApp or Facebook Messenger.

Conversational IVR

A conversational IVR (Interactive Voice Response) is an advanced, more personalized version of a traditional IVR. A traditional IVR prompts the customer to push buttons and select various service options on helplines. Conversational IVR, on the other hand, is a speech technology-based automated assistant that can speak with customers and help them choose options. It can collect data, learn customer behaviour, and deliver insights, bridging the gap between humans and AI. It uses the following technologies:

  • Speech-to-text technology
  • Cognitive services (a set of problem-solving Machine Learning algorithms)
  • Machine Learning (an AI application that enables machines to learn and improve automatically)
  • Natural Language Processing (an AI-powered process that helps computers process human language)
  • Natural Language Understanding (a subset of Natural Language Processing which deals with machine reading comprehension)

Cognitive Services / Cognitive computing

Cognitive computing (CC) enhances human-machine interaction through a key factor that divides the two: context. It can comprehend the ambiguity and fluidity of human problems and help people find solutions by synthesizing information, influences, and insights. CC includes a vast range of services such as language understanding, real-time translation, text analysis, text summarization, and speech services (STT, TTS, speech insights, speaker recognition, sentiment analysis, recommendation engines, spell checker).

Reference - https://en.wikipedia.org/wiki/Cognitive_computing

Custom Speech Models

Custom speech models address the issues that speech-to-text (STT) services face with regard to industry-specific vocabulary, background noise, varying speech styles, etc. In a custom speech model, users can upload training data, such as related speech samples, along with the input speech that one would provide to an STT service. This improves overall speech recognition quality, enhances accuracy, and improves performance evaluation mechanisms.

Custom Voices

Custom voices can become part of a brand's identity. A custom voice uses Speech Synthesis Markup Language (SSML) and similar technologies, and also lets users adjust nuances such as pitch, rate, pauses, pronunciation, and intonation to build a unique voice with near-human qualities. It can help brands respond better to customers and build an emotional relationship with them.
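As a rough illustration of how pitch, rate, and pauses are expressed in SSML, the sketch below builds a small SSML fragment with Python's standard library. The `speak`, `prosody`, and `break` elements come from the W3C SSML standard; the helper name, the default values, and the omission of the `version` and namespace attributes are simplifications, and the values a given text-to-speech engine accepts may differ.

```python
import xml.etree.ElementTree as ET


def build_ssml(text: str, pitch: str = "+5%", rate: str = "95%",
               pause_ms: int = 300) -> str:
    """Wrap text in SSML that adjusts pitch and rate after a short pause."""
    speak = ET.Element("speak")
    # <break> inserts a pause before the spoken text
    ET.SubElement(speak, "break", {"time": f"{pause_ms}ms"})
    # <prosody> controls pitch and speaking rate for the enclosed text
    prosody = ET.SubElement(speak, "prosody", {"pitch": pitch, "rate": rate})
    prosody.text = text
    return ET.tostring(speak, encoding="unicode")


print(build_ssml("Welcome back. How can I help you today?"))
```

Building the markup through an XML library rather than string concatenation means special characters in the text (such as `&` or `<`) are escaped automatically.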

Call Classification

Call classification is the process through which incoming calls at a contact center are classified in order to further match them to the right agent. In standard contact centers, this is usually done in three ways.

  • IVR selection
  • Providing callers with specific numbers/extensions for specific request-related departments
  • Capturing caller data through IVR and then classifying the call through caller records

If call classification is too narrow, then there will be too many queues with a few agents in each queue. If the call classification is too broad, specialization becomes difficult on the end of the agent and calls might take longer to resolve. Thus, it is important to determine the right level of classification.

Call Transcripts

Call transcription is the conversion of a voice or voice component in a video call into plain text. It can be deployed for both live calls and recorded calls which are stored as Call Detail Records (CDRs) in the local database.

Cognitive Services Component

The Cognitive Services Component is the meeting point of the STT engine, the TTS engine, and the bot framework. It brings together multiple third-party, cloud-based chatbots on a single platform, with bot frameworks and STT (speech-to-text) and TTS (text-to-speech) engines. All of these chatbot services can be used on the dialled number, with integration and support through HTTP-based APIs (Application Programming Interfaces). These APIs convert the bot framework API into a SIP event and vice versa. Streamlining is the essence of the Cognitive Services Component.

Call Transfer

Sometimes incoming calls at contact centers with chatbots need to be transferred to a human agent, mostly when the bot needs extra help or when the caller specifically requests a human agent. With the Call Transfer feature, bot developers can easily transfer such calls to a different bot or a live agent. The developers carry out transfer activities through the APIs (Application Programming Interfaces) of the bot framework. Transfer activities of this nature have attributes such as an administrator-configured transfer target URI (which defines the new call destination) and a textual description of the reason for the transfer (for Call Detail Records and logging).
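The two attributes mentioned above, a transfer target URI and a textual reason, could be modeled as a small data structure. The sketch below is purely hypothetical: the field names, the accepted `sip:` and `tel:` URI schemes, and the dictionary payload are assumptions and do not correspond to any specific bot framework's transfer API.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TransferActivity:
    transfer_target: str  # URI of the new call destination
    reason: str           # textual description kept for CDRs and logging


def build_transfer(target_uri: str, reason: str) -> dict:
    """Validate the target and return a transfer payload as a dictionary."""
    # Assumption for this sketch: only SIP and telephone URIs are valid targets.
    if not target_uri.startswith(("sip:", "tel:")):
        raise ValueError(f"unsupported transfer target: {target_uri!r}")
    return asdict(TransferActivity(transfer_target=target_uri, reason=reason))
```

For instance, `build_transfer("sip:agents@example.com", "caller asked for a human")` yields a dictionary with both attributes, ready to be logged or sent on.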

Contact Center Agent

A contact center agent (also called a call center agent or customer service agent) is someone who takes care of inbound and outbound calls at a call center. Their responsibilities include answering inbound calls and resolving queries, making outbound calls for surveys and appointments, conducting research required for problem resolution, escalating queries when required, reporting to team leaders and higher management staff, and following up on calls where required.

Continuous ASR

ASR (automated speech recognition) is a technology that translates speech into written text. It is used in automated phone calls and Conversational IVR applications, where its output helps Conversational AI to comprehend and automate conversations. Continuous ASR is an extension of this that transcribes voice input into text in real time, with the Conversational AI working continuously in the background to transcribe the user's voice input. Users can stop the Conversational AI at any point during the speech.

Conversation Transcription

Conversation transcription is a combination of speech recognition, speaker identification, and diarization (attribution of each sentence to its speaker, determining who said what and when). It is a speech-to-text (STT) solution that provides real-time and asynchronous transcription of a conversation. It has major applications in in-person meetings because of its capability to distinguish speakers. With the help of conversation transcription, developers can add STT with multi-speaker diarization to their applications.

When applied to Conversational AI, a transcript of the bot-to-human conversation can be stored together with additional meta information and sentence structure. This is particularly important when applying Conversational AI in regulated industries whose regulations require dedicated notes to be kept on a consultation.

Contact Center AI (CCAI)

Contact Center AI (CCAI) describes a new category of AI software for automated conversations in contact centers. Instead of a human agent, the AI tries to handle incoming requests (chat or voice) with its own communication and problem-solving capabilities and hands a conversation over to a human agent wherever needed. AI also has other applications at call centers.

AI-powered chatbots initiate conversations with customers, collect background information, and attempt to resolve their problems. If the issue gets more complicated, the AI bot passes the information on to a human agent who can then take it forward seamlessly.

Intelligent call routing enables call centers to route calls based on customer personality, history, and the nature of the problem, matching each caller with the agent best equipped to handle that type of call. AI can also help organizations analyze customer behavior, enabling sales and marketing teams to make customized promotional offers.

Contact Center AI has great potential to decrease the time spent on interactions, improve agent efficiency, and help contact centers streamline and improve their services.

D

Deep Learning

Deep learning is a form of machine learning that utilizes artificial neural networks. Deep learning algorithms have one or more intermediate layers of neurons inspired by signal processing patterns in biological brains. For example, a well-known application of deep learning is image recognition. Here, a typical deep neural network would learn to recognize basic patterns such as edges, shapes, or shades in the lower layers of the network from unstructured raw image data. Higher layers subsequently capture increasingly complex patterns, allowing the network to successfully label complex features such as a human face or physical objects in an image. By contrast, a traditional machine learning model would typically rely on features hand-engineered by humans.

Deep learning has many applications, including but not limited to self-driving technology, identification of security threats, medical research, object detection, damage prediction in oil and gas operations, and industrial automation. The potential uses of deep learning are endless, and as such it has become a hot topic in recent years.

Digital Wait Treatment

Digital wait treatment is the use of Conversational AI to engage callers while they wait to be transferred to a live agent. It can be used to keep a caller entertained, attempt upselling or cross-selling, or attempt to solve the caller's query during the wait itself. In this process, the Conversational AI gauges, learns, and applies entertainment, selling, and resolution techniques.

DTMF Event

In a DTMF (Dual-Tone Multi-Frequency) system, each keypad button is represented by a pair of audible tones. DTMF can signal the digits 0-9 as well as the letters A to D, along with the symbols * and #. Computers use DTMF for dialing a modem or sending teleconferencing commands to a menu system.

In Conversational IVR applications, the ability to detect DTMF tones can be combined with natural language Conversational AI features to enrich communication with the caller, who can be prompted either to enter a number or to state their request by voice.
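As a small illustration of how the 16 DTMF symbols map to tones, the sketch below looks up the standard low-group and high-group frequencies for a key. The function name dtmf_frequencies and the table layout are assumptions made for this example; only the frequency values come from the standard DTMF keypad grid.

```python
# Standard DTMF grid: each key combines one low-group (row) tone
# with one high-group (column) tone, both in hertz.
ROW_FREQS_HZ = (697, 770, 852, 941)
COL_FREQS_HZ = (1209, 1336, 1477, 1633)
KEYPAD = (
    "123A",
    "456B",
    "789C",
    "*0#D",
)


def dtmf_frequencies(key):
    """Return the (low, high) tone pair in Hz for a single keypad symbol."""
    if len(key) == 1:
        for row_index, row in enumerate(KEYPAD):
            col_index = row.find(key.upper())
            if col_index != -1:
                return ROW_FREQS_HZ[row_index], COL_FREQS_HZ[col_index]
    raise ValueError(f"not a DTMF symbol: {key!r}")


print(dtmf_frequencies("5"))  # (770, 1336)
print(dtmf_frequencies("#"))  # (941, 1477)
```

An IVR that detects these tone pairs on the audio channel can translate them back into the digits the caller pressed.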

E

Enterprise-Grade

Enterprise-grade (sometimes referred to as enterprise-readiness) is an umbrella term that describes a set of features and qualities for software products that are typically required when commissioned in the context of large corporations.

An important aspect of enterprise-grade Conversational AI platforms is scalability, i.e. the capability to sustain high performance as load grows. For cost-efficient scaling, hardware and software costs are not the only decisive factor: intuitive use for business users contributes to low training effort and easier roll-out across departments and subsidiaries. A developer-friendly architecture supports agile, incremental evolution of Conversational AI initiatives and the transition from a project-based approach to a more sustainable product-based model.

Other enterprise-grade requirements include advanced reporting and monitoring capabilities, meeting the highest standards for security, compliance and data protection, supporting complex user management, handling multiple languages in the backend and frontend, and many more.

Escalate Voice Interaction

Escalation of a query to voice interaction is one of the most difficult problems for contact centers. When a customer query isn't satisfactorily resolved via chat, the interaction is escalated to a voice call with a live agent. It then becomes the goal of the live agent to resolve the query in a single call and maintain a good First Call Resolution (FCR) rate for the center. For example, consider a scenario in which a user interacts with a chatbot and reaches a point where a live agent needs to be involved. In addition to traditional live chat, the conversation can be escalated by transferring it to a live phone call with an agent. While a bot can handle several chat conversations in parallel, an agent can only handle one live voice call at a time. This requires smart routing strategies, which is why chat-to-voice escalation is a hard industry problem to solve.

Enterprise Unified Communications (UC)

Enterprise Unified Communications brings together all communication strategies of an organization. With UC, a company can access multiple communication methods at the same time through a consistent user interface and experience across media types and devices. Employees can get complete access to the company’s tools and security through any device.

With Artificial Intelligence (AI) in the mix, UC can help people focus on their jobs by leaving data interpretation and analysis to the machines. With AI, a simple voice instruction or SMS can be enough for the system to change settings and take the necessary next steps on its own. In terms of customer interaction, AI can help unify the various means of engagement such as call centers, online chats, social media, etc.

F

First Contact Resolution (FCR)

First contact resolution (FCR) is a metric used by customer service centers that tracks how well agents can resolve customer queries in a single interaction. FCR can be measured across all channels of communication; examples of FCR include an email ticket that is resolved with a single reply, a call that is resolved with a single conversation, and a web chat that is resolved in a single session. Resolution may be provided by a human agent or applications that utilize artificial intelligence.

The FCR metric is calculated by dividing the number of queries resolved in a single interaction by the total number of queries. To ensure that the metric accurately reflects FCR, it is also important to follow up with customers a few days after processing their issue to confirm that it was resolved.

A high FCR is desirable because it indicates business efficiency and customer satisfaction. Research has shown that increases in FCR result in increased customer satisfaction, decreased operating costs, and increased employee satisfaction. Strategies to achieve a high FCR include agent training, incentive programs, and managing customer expectations.

First Call Resolution Rate / First Contact Resolution

The First Call Resolution Rate or First Contact Resolution (FCR) is a numerical measure of the success rate of a customer service team. The higher the FCR for a team, the more successful it is at resolving problems on the first contact. The formula for calculating it is: FCR = Number of Cases Resolved on First Contact / Total Number of Cases. For example, if a company receives 1,327 calls in a month, of which 714 are resolved on the first contact, the company's FCR for that month is 714 / 1,327 ≈ 53.8%. A generally accepted good benchmark for FCR is 70-75%. FCR can be one of the factors that indicate customer satisfaction.
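The calculation above can be reproduced with a short helper. The function name fcr_rate and its argument names are chosen for this sketch only; the numbers are the ones from the example.

```python
def fcr_rate(resolved_on_first_contact, total_cases):
    """Return the first contact resolution rate as a percentage of all cases."""
    if total_cases <= 0:
        raise ValueError("total_cases must be positive")
    if not 0 <= resolved_on_first_contact <= total_cases:
        raise ValueError("resolved cases must be between 0 and total_cases")
    return 100.0 * resolved_on_first_contact / total_cases


# The example from the text: 714 of 1,327 calls resolved on first contact.
print(f"{fcr_rate(714, 1327):.1f}%")  # 53.8%
```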

G

GDPR

The General Data Protection Regulation (GDPR) is a legal framework that sets guidelines for data protection and privacy in the EU. The GDPR took effect in May of 2018 and applies across the union; it replaced the Data Protection Directive as the main law outlining how companies must protect the personal data of individuals in the EU.

The GDPR is far more comprehensive and stricter than data protection laws in many other countries, such as the US. The primary goal of the GDPR is to standardize privacy law and provide greater data protection and privacy rights to individuals. The GDPR regulates all aspects of data use, from data collection to data transfer and data destruction. Many consider the GDPR to be the epitome of data protection and privacy guidance; as such, it has become a model for data laws in many other countries such as Japan, Argentina, and South Korea.

Conversational AI applications such as chatbots need to comply with GDPR regulations as they often handle personal end user data. Failure to follow GDPR regulations can result in hefty fines and costs for legal proceedings.

Genesys

Genesys is a global company that specializes in customer experience and call center technologies both on-premises and in the cloud. Genesys serves over 11,000 companies in over 100 countries and implements solutions that impact marketing, sales, and customer service.

One of Genesys’ most-used products is PureEngage; according to Genesys, it is the only omnichannel and multi-cloud customer experience solution for large businesses. PureEngage facilitates customer and employee engagement across all communication channels using artificial intelligence, real-time contextual journeys, intelligent routing, and machine learning. PureEngage is also highly customizable; it is a powerful, flexible tool for large businesses seeking to optimize their operations.

Cognigy.AI seamlessly integrates with the Genesys technology stack and enables contact center automation through deploying powerful virtual agents based on conversational AI.

Graphical Conversation Editor

A Graphical Conversation Editor (also called a Graphical Conversation Designer) is the centerpiece of a low-code Conversational AI user interface and allows managing the flow of all conversations in one place. The individual steps are designed in a flow editor which includes easy-to-use design concepts that allow conversation designers to create complex, integrated conversations that are still easy to read for business users.

As a result, conversations can be configured and deployed flexibly and quickly directly within the editor, making business users agile and self-sufficient without any previous knowledge of coding.

Advanced concepts combine flexible intent recognition with integrated information gathering processes to enable handling complex, non-linear interactions that move between topics while at the same time generating highly structured data that can be automatically processed by the Conversational AI platform or any 3rd party system.

H

Hyperautomation

Hyperautomation synthesizes multiple technologies such as machine learning (ML), artificial intelligence (AI), and Robotic Process Automation (RPA) to deliver cutting edge automation solutions that rival, exceed, or enhance human abilities. Hyperautomation may also be referred to as “digital process automation” or “intelligent process automation”.

Hyperautomation has the potential to drastically increase business efficiency, reduce business costs, and increase product development rates. Businesses can use hyperautomation to create intelligent digital workers who can learn over time and execute repetitive task work. As a result, an organization can run lean, human resources can be utilized for more complex tasks, and repetitive tasks can be more consistently and quickly executed.

Gartner, a globally recognized research company, named hyperautomation as a top technology trend for 2020. Since then, hyperautomation has been generating a lot of attention. In upcoming years, hyperautomation is likely to become a key component of industry-leading companies.

I

Interactive Voice Response (IVR)

Interactive voice response (IVR) is a technology that enables machines to interact with humans via voice recognition and/or keypad inputs. IVR systems prompt a user to take a specific action or provide a specific piece of information, such as “how can we help you today?” or “state your date of birth”. The IVR system is typically menu-based and may take a user through multiple steps.

IVR is most notable for the value it brings to customer service. A well designed IVR system can effectively collect information from customers, automate support, prioritize calls, and handle large call volumes. This results in increased business efficiency and cost savings. Additionally, IVR systems enable a business to immediately respond to customer questions and needs, which has a significant positive impact on customer satisfaction. IVR is the ideal technology for businesses seeking to rapidly scale up their customer service operations.

Intent Based Routing

Intent based routing, or intent-based call routing, is a call-assignment strategy used in AI driven call centres to assign incoming calls to the most suitable agent, based on the user's intent. It is a fusion of business administration and IT, powered by artificial intelligence (AI), machine learning (ML), and network orchestration. Instead of simply choosing the next available agent, intent based routing uses the user's intent, e.g. from a previous bot conversation, to select a suitable live agent and automate call distribution in the contact center. This makes it an enhancement to the automatic call distributor (ACD) systems found in most call centres.
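A minimal sketch of this idea follows. It assumes the intent has already been recognized (for example by a bot) and that each agent is tagged with the intents they handle well; the Agent class, the route_by_intent function and the fallback to the first available agent are illustrative assumptions, not a description of any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    intents: set = field(default_factory=set)  # intents this agent handles well
    available: bool = True


def route_by_intent(intent, agents):
    """Pick an available agent who handles the intent, else any available agent."""
    available = [agent for agent in agents if agent.available]
    for agent in available:
        if intent in agent.intents:
            return agent
    # Fallback: behave like plain next-available routing.
    return available[0] if available else None


agents = [
    Agent("Ana", {"billing"}),
    Agent("Ben", {"returns", "shipping"}),
]
print(route_by_intent("returns", agents).name)  # Ben
```

If no available agent matches the intent, the sketch degrades to choosing any free agent, and it returns None when nobody is free so that the caller can be queued.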

IP-PBXs

An Internet Protocol Private Branch Exchange (IP PBX), also known as a Unified Communications System or a business phone system, is the central switching system for calls within an organization. It handles internal traffic while also acting as a monitor in connections with external networks. An IP PBX can be a physical device or a software platform. It connects phone extensions to the PSTN (Public Switched Telephone Network) and can also provide additional video, audio, or instant messaging services over the TCP/IP protocol stack. It is mainly made up of lines and stations. Lines, or trunks, are connections to global PSTNs through a telephone company. Stations are phones or other endpoint devices such as modems, fax machines, etc.

A voice gateway solution can be used to bring together AI with the IP-based PBX.

K

Kofax

Kofax is a software company that specializes in intelligent, robotic process automation. Kofax serves over 25,000 customers and has over 850 partners. Kofax provides an Intelligent Automation software platform that utilizes robotic process automation and other smart technologies such as cognitive capture, process orchestration, digital messenger, e-signature, and analytics to help take businesses to the next level. Kofax strives to optimize organizations through products that automate repetitive manual tasks, streamline business processes, and improve engagement. Incorporating Kofax software into a business model can reduce process errors and cost, improve customer satisfaction, and help facilitate business growth.

Cognigy.AI seamlessly integrates with the Kofax technology stack and enables simplifying processes through conversational automation and deployment of powerful virtual agents.

L

Language Detection

Language detection describes the capability of a chat or voice bot to flexibly respond based on the language in which the user chooses to communicate.
Advanced bots allow changing the language of interaction in the middle of the discourse; users can stay in the same conversation flow, and if they are having a hard time communicating, can switch to a more convenient language, allowing for a better user experience.
Language detection is an important differentiator in multilingual countries or regions such as Singapore, Switzerland, India, and many more.

Low Code

Low-code is a software development approach that utilizes graphical interfaces to produce and configure applications. The low-code approach does not require extensive hand-coding or computer programming knowledge. It empowers non-technical business users and domain experts to handle complex tasks that traditionally require a programmer.

Low-code is a valuable approach for organizations because it enables faster software development. It frees up valuable resources and allows users to easily iterate on software within an agile framework. This decreases product time-to-market, enables product scalability, and increases business flexibility.

Low Code IVR

A low-code IVR solution is a type of software which uses intuitive systems to make the creation of the core application easier, while also giving the opportunity for developers to hand-code specific features for further customization. It's the sweet spot between a no-code solution, which only allows customization within the boundaries of the tool's base capabilities, and a complete hand-coding one with VXML (Voice Extensible Markup Language), which requires more time, money and manual effort.

M

Machine Learning (ML)

Machine Learning is a branch of artificial intelligence that enables machines to process data and improve without explicit programming. Via machine learning algorithms, machines learn how to recognize data patterns and make decisions based upon the data they receive.

Machine learning algorithms are typically divided into three categories: supervised learning, unsupervised learning, and reinforcement learning. In supervised learning, labeled data (i.e. example inputs and outputs) are used to train the machine. In unsupervised learning, unlabeled data are used to train the machine to find and generate structure within the data; instead of using examples to map inputs to outputs, the machine is free to learn about patterns in the data based on predefined criteria and objectives such as finding the most important topics within a book. In reinforcement learning, the machine is provided with a goal and receives feedback (i.e. rewards and errors) from the system that help it learn how to maximize performance.

Machine learning has revolutionized many industries in recent years and has become an integral technology in day-to-day life. Search engines, recommendation platforms, and social media all rely on machine learning algorithms. In the context of conversational AI, supervised learning is used to continuously improve conversation quality and reduce friction. By monitoring user inputs and mapping them to predefined intents, virtual agents learn to deal with a broader variety of utterances and paraphrases that occur in human language.
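To make the idea of mapping utterances to predefined intents concrete, here is a deliberately tiny supervised-learning sketch. It is a toy word-count classifier, not how production NLU engines work; the training utterances, the intent names and the functions train and predict are all made up for this example.

```python
from collections import Counter

# Labeled examples: (utterance, intent). These are invented for illustration.
TRAINING_DATA = [
    ("where is my order", "track_order"),
    ("track my package", "track_order"),
    ("i want my money back", "refund"),
    ("please refund my purchase", "refund"),
]


def train(examples):
    """Count how often each word appears under each intent label."""
    word_counts = {}
    for utterance, intent in examples:
        counts = word_counts.setdefault(intent, Counter())
        counts.update(utterance.lower().split())
    return word_counts


def predict(model, utterance):
    """Return the intent whose training words overlap most with the utterance."""
    words = utterance.lower().split()
    scores = {
        intent: sum(counts[word] for word in words)
        for intent, counts in model.items()
    }
    return max(scores, key=scores.get)


model = train(TRAINING_DATA)
print(predict(model, "can you track the order"))  # track_order
```

Adding more labeled utterances to TRAINING_DATA changes the counts and therefore the predictions, which is the essence of learning from examples rather than from hand-written rules.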

Machine learning will be increasingly relevant in upcoming years due to our increasingly data-based culture. Big data is more prevalent than ever, and organizations need a way to effectively process it. Machine learning enables organizations to quickly analyze large and complex data sets to make better decisions.

Microsoft LUIS

Microsoft launched the Language Understanding Intelligent Service (LUIS) in 2017. LUIS is a cloud service that enables developers to build applications that process human language and recognize user intents. It can understand nuances of natural communication in more than 10 languages and respond appropriately. LUIS has pre-built models for natural language understanding, but it is also highly customizable.

LUIS can be used with any application that communicates with a user to execute a task (chat bots, voice-based applications etc.). LUIS can also be used as a stand-alone NLU to be plugged into any conversational AI platform offering a third party NLU adaptor such as Cognigy.AI.

Multi-channel call center

A multichannel call center is a centrally managed and administered platform that brings together various interaction touchpoints into a single solution. Contact centers use SMS, emails, online chat, social platforms, as well as phone outreach. Centralizing these channels in a multichannel contact center can help reduce costs, increase customer satisfaction, and boost revenues in the long run.

Some multichannel call center solutions also integrate CRM (Customer Relationship Management) systems. In these systems, agents can access a single customer record while communicating with the user on their preferred communication channel. This spares customers from repeating their complaint or query when they are transferred from one agent to another.

Conversational AI can be deployed to interact with customers on phone, websites, digital assistants, and other channels all at once to enhance multichannel call center operations with automated AI driven conversations.

MRCP protocol

Media Resource Control Protocol (MRCP) is a protocol for controlling media processing resources over a network. Speech servers utilize MRCP for speech recognition, speech synthesis, and other services. It depends on other protocols such as Session Initiation Protocol (SIP) and Real Time Streaming Protocol (RTSP) to establish control sessions and audio streams between the server and the client.

MRCP enables the implementation of distributed IVR (Interactive Voice Response) platforms such as VoiceXML interpreters. MRCP defines the requests, responses, and events required to control media processing resources, the state machine for every resource, and the state transitions for every request and server-generated event.

N

Natural Language Processing (NLP)

Natural language processing (NLP) is a branch of technology concerned with interaction between human natural languages and machines. NLP utilizes computer science, artificial intelligence, and linguistics to help machines recognize speech and text and respond in a meaningful way. NLP is considered a challenging technology due to the nuances and subtleties of human language, such as sarcasm.

NLP has been around since the 1950s, but with limited ability; it historically relied on extensive hand coding and was far less effective than it is today. With advances in machine learning and increases in computing power and data availability, NLP has become widely used in recent years.

Most people benefit from NLP every day; it is used to filter junk email, convert voicemail to text, and power voice-based assistants. NLP also has uses across many industries such as healthcare, finance, and retail. NLP technology continues to develop quickly, and it will likely be a key component in many complex future applications.

Natural Language Understanding (NLU)

Natural language understanding (NLU) is a subfield of natural language processing that enables machines to understand human language and intent. NLU goes a step beyond speech recognition technology and uses machine learning to understand nuances such as context, sentiment, and syntax. NLU is designed to be able to understand untrained users; it can understand the intent behind speech including mispronunciations, slang, and colloquialisms.

NLU is a component of many business applications such as chatbots, virtual assistants, and voice bots. NLU helps businesses quickly and easily capture user data and intent and route them to appropriate resources.

Natural Language IVR

A Natural Language IVR (Interactive Voice Response) is a productive, AI-powered collaboration between speech recognition and IVR. It has the ability to ask customers open-ended questions such as "How can I help you today?", adding a personal touch to the interaction. It is different from Speech Recognition IVR, which is only designed to pick up keywords. NL IVR can interpret whole phrases and take the interaction a notch higher, with the potential to reduce internal transfers and costs.

Next Available Agent Routing

Many businesses use an Automated Call Distributor (ACD) that performs next-available-agent routing for their call centers. With this system, incoming calls from customers are automatically routed to the next available agent or the agent who has been waiting for the longest time. If there are no agents available, calls are held in waiting queues and an agent is assigned as soon as one is free. This system is giving way to skill-based routing, in which calls are routed to an agent who is specifically skilled in the type of call and available at the time. But the actual improvement upon next-available-agent routing can be brought about with Conversational AI, which can employ Machine Learning to understand which agent, bot, or information can best solve the caller's problem, and even help a human agent with quick information based on the type of query.
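The basic next-available-agent behavior described above can be sketched with two first-in, first-out queues: one for idle agents and one for waiting calls. The class NextAvailableRouter and its method names are hypothetical; a real ACD also handles priorities, timeouts and abandoned calls.

```python
from collections import deque


class NextAvailableRouter:
    """Pair each call with the agent who has been idle the longest."""

    def __init__(self):
        self.idle_agents = deque()    # longest-idle agent sits at the left
        self.waiting_calls = deque()  # oldest waiting call sits at the left

    def agent_free(self, agent):
        """An agent becomes free: serve the oldest waiting call, or go idle."""
        if self.waiting_calls:
            return self.waiting_calls.popleft(), agent
        self.idle_agents.append(agent)
        return None

    def incoming_call(self, call):
        """A call arrives: assign the longest-idle agent, or hold the call."""
        if self.idle_agents:
            return call, self.idle_agents.popleft()
        self.waiting_calls.append(call)
        return None


router = NextAvailableRouter()
router.agent_free("Ana")
router.agent_free("Ben")
print(router.incoming_call("call-1"))  # ('call-1', 'Ana')
```

Each method returns a (call, agent) pair when an assignment is made and None when the call or the agent has to wait.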

Natural Language Generation (NLG)

The attempt to establish two-way communication with computers has led to quasi-natural speech processes such as Natural Language Generation (NLG). NLG is a software process that converts structured data into natural language. Essentially, it automatically generates text that describes, summarizes, and explains structured input data in a more comprehensible manner at a speed of thousands of pages per second.

It can be applied to long-form content generation such as custom reports for companies, custom content for web or mobile applications, blurb texts for chatbots, e-commerce product descriptions, business data reports, etc. In the future, NLG could have wide-ranging applications such as writing descriptions of hotels and locations for tourism companies, creating landing pages for corporate ventures, and publishing news articles. The Associated Press has been using automated news articles for a while now.

Next Best Action

Contact center agents usually carry out calls according to the organization's prevailing schemes and sales plans. Sometimes, these don’t turn out to be optimal from a user's point of view. Here, the NBA (Next Best Action/Activity) strategy can provide perspective and tailor-made customer management strategies to agents.

NBA enables prioritization and consequent selection of the best possible action out of all potential actions for a particular customer, and this action is then placed at the point of interaction in the communication channel. NBA is a shift from a campaign-centric strategy to a customer-centric strategy. It involves five pillars:
Activity definition
Selection
Prioritization
Placement
Response

NBA routing strategies can be implemented in Conversational AI using context-aware decision making and agent handover functionalities. The Conversational AI can then make context and situation-aware decisions on how to make progress in conversations. In agent handover scenarios, the Conversational AI can forward context information to the agent desktop as an input for NBA recommendation algorithms.
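The selection and prioritization pillars can be illustrated with a very small sketch. Everything in it is an assumption made for the example: the Action fields, the idea of matching on a customer segment, and the numeric score used for ranking. Real NBA engines typically derive such scores from models and business rules.

```python
from dataclasses import dataclass


@dataclass
class Action:
    name: str
    segments: frozenset  # customer segments the action applies to (hypothetical)
    score: float         # expected value of the action (hypothetical)


def next_best_action(actions, customer_segment):
    """Select the eligible actions, then prioritize them by score."""
    eligible = [a for a in actions if customer_segment in a.segments]
    return max(eligible, key=lambda a: a.score, default=None)


# Activity definition: the catalog of actions an agent or bot could take.
ACTIONS = [
    Action("upgrade offer", frozenset({"premium"}), 0.4),
    Action("retention call", frozenset({"at_risk", "premium"}), 0.7),
    Action("welcome newsletter", frozenset({"new"}), 0.2),
]

best = next_best_action(ACTIONS, "premium")
print(best.name)  # retention call
```

Placement and response would then correspond to showing the chosen action at the point of interaction and recording how the customer reacted, which this sketch leaves out.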

O

OData Analytics

Open Data Protocol (OData) is a protocol for data queries and updates. OData analytics is a category of services that use OData to create reports and queries for data of interest. Some of the most popular OData analytics services are Azure DevOps Analytics (including Power BI), Google Analytics, and Adobe Analytics.

Analytics services automatically populate with available data; for example, if using Azure DevOps Analytics, all available DevOps data will be populated, and the service will self-update when data changes occur. Analytics services can be used in conjunction with OData queries, which allows users to directly generate queries across an entire organization or multiple projects of interest.
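As a rough illustration of what an OData query looks like, the sketch below builds a query URL from the standard system query options $select, $filter and $top. The service root https://example.com/odata, the WorkItems entity set and the field names are placeholders invented for this example, and build_odata_url is a hypothetical helper, not part of any analytics product.

```python
from urllib.parse import quote, urlencode


def build_odata_url(service_root, entity_set, select=None, filter_expr=None, top=None):
    """Compose an OData query URL from a few common system query options."""
    options = {}
    if select:
        options["$select"] = ",".join(select)
    if filter_expr:
        options["$filter"] = filter_expr
    if top is not None:
        options["$top"] = str(top)

    url = f"{service_root.rstrip('/')}/{entity_set}"
    if options:
        # Keep "$", "," and "'" readable; percent-encode spaces and the rest.
        url += "?" + urlencode(options, quote_via=quote, safe="$,'")
    return url


# Placeholder service root and field names, for illustration only.
print(
    build_odata_url(
        "https://example.com/odata",
        "WorkItems",
        select=["WorkItemId", "Title"],
        filter_expr="State eq 'Active'",
        top=5,
    )
)
```

The resulting URL can be requested with any HTTP client; the service decides which entity sets and fields actually exist.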

Omni-channel customer experience

An omnichannel customer experience consists of individual customer touchpoints across channels that are seamlessly connected, allowing customers to pick up on one channel where they left off on another. An omnichannel experience treats customer interactions across multiple channels such as chatbots, social media, and web as a single unified customer journey. As a result, it helps to connect sales, marketing, and customer service over shared goals.

Conversational AI expands on the ability to converse with users on their channel of choice.

Outbound call center

An outbound call center serves as an enhancement to the customer experience by helping the enterprise engage with customers automatically via customer-preferred channels. It offers methods to stand apart from the competition and run revenue increasing campaigns that are lower on cost and higher on effectiveness. Outbound calls made from this center are aimed towards sales pitches, debt collection, proactive customer service, and other objectives. They help sales teams in lead qualification and generation, with the opportunity for prospect identification, data collection, lead analysis based on timing, budget, etc. through cold outreach.

P

Phone Bot

A phone bot is an AI application that is applied to a phone line to hold automated phone conversations with callers. Phone bots are also known as voice concierges, telephone bots, or Conversational IVR. They can have meaningful conversations with callers and can be applied to a variety of use cases, from filtering and evaluating clients to actually making and receiving calls. A phone bot can capture keypad entries, voice, and transcripts and can reply with recorded messages, transfer calls, filter calls, play recordings, book appointments, and make reservations. It can also make routine calls to customers, such as appointment reminders and invoice notifications.

PSTN

PSTN (Public Switched Telephone Network), also called Plain Old Telephone Service (POTS), comprises all the circuit-switched, interconnected telephone networks in the world. Originally an analog system that has since turned digital, PSTN establishes a dedicated circuit for the duration of each phone call. This differs from packet-switched networks such as the internet, in which a message is divided into packets that are sent individually to the destination device.

Predictive Routing

After next-available-agent routing and skill-based routing came a more evolved method of routing customer calls: predictive routing. This method applies AI and ML (Artificial Intelligence and Machine Learning) on top of existing CRM and IVR systems to evaluate which agent a call should be routed to. Conversational AI uses meta information to predict the best outcome for the caller. Sentiment data, algorithms, and contact history are used to pair a customer call with an agent suited to the personality and current state of the caller. This kind of intelligent system can be useful for better handling upset customers, repeat callers, urgent matters, and other high-level problems.
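The pairing idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Agent` fields, the sentiment scale from -1 (upset) to 1 (happy), and the hand-picked weights, which a real predictive routing system would learn from CRM and contact history data rather than hard-code.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    name: str
    skill_match: float      # 0..1, fit between agent skills and the call topic
    deescalation: float     # 0..1, hypothetical rating for calming upset callers
    handled_before: bool    # True if this agent spoke with the caller earlier


def predicted_outcome(agent: Agent, caller_sentiment: float) -> float:
    """Score an agent for a caller; caller_sentiment runs from -1 (upset) to 1 (happy)."""
    upset = max(0.0, -caller_sentiment)          # only negative sentiment counts as upset
    score = 0.6 * agent.skill_match              # illustrative weights, not trained values
    score += 0.3 * upset * agent.deescalation    # de-escalation matters more for upset callers
    score += 0.1 if agent.handled_before else 0.0
    return score


def route(agents: list[Agent], caller_sentiment: float) -> Agent:
    """Return the agent with the highest predicted outcome."""
    return max(agents, key=lambda a: predicted_outcome(a, caller_sentiment))
```

With these weights, a happy caller goes to the agent with the best skill match, while an upset repeat caller can be paired with a less specialized agent who is better at de-escalation and already knows the caller.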

Punctuation of STT Transcriptions

Speech-to-text (STT) [link to glossary entry] systems do not include punctuation in their speech recognition output by default. When the punctuation feature is enabled, STT detects and inserts punctuation in the transcription output, including commas, periods, and question marks, and capitalizes the first letter after every period or question mark.
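The capitalization rule described above can be sketched as a small post-processing step. This is only an illustration of that one rule: deciding where commas, periods, and question marks belong is done by the STT engine's own models and is not shown here. The function name is hypothetical, and capitalizing the very first letter of the text is an added assumption.

```python
import re


def capitalize_sentences(text: str) -> str:
    """Capitalize the first letter of the text and the first letter after '.' or '?'."""
    # Match a lowercase letter at the very start, or one that follows '.' or '?' plus whitespace.
    pattern = re.compile(r"(^|[.?]\s+)([a-z])")
    return pattern.sub(lambda m: m.group(1) + m.group(2).upper(), text)
```

For example, `capitalize_sentences("how are you? i am fine. thanks")` returns `"How are you? I am fine. Thanks"`.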

R

Robotic Process Automation (RPA)

Robotic process automation (RPA) is a technology that utilizes robots to automatically execute business processes. Robot w...

Robotic Process Automation (RPA)

Robotic process automation (RPA) is a technology that utilizes robots to automatically execute business processes. Robot workers are configured using a low-code approach which makes RPA an easy, low technical barrier solution for many businesses. RPA can mimic most human-computer interactions and is most often used to automate repetitive, labor-intensive tasks. RPA is used across most business sectors for tasks including but not limited to inventory management, data migration, invoicing, and updating CRM data.

RPA has many benefits. Unlike traditional automation, RPA does not require integration across existing applications and does not change the underlying system, which eliminates the need for complex development efforts. RPA also enables repetitive, high-volume tasks to be completed 24/7 with higher accuracy than a human worker could achieve. It frees up valuable human resources to focus on more complex and engaging tasks, resulting in increased employee satisfaction. Investing in RPA typically results in a high ROI because it maximizes an organization’s ability to complete routine work and leverage employee talent.

By combining Conversational AI and RPA, organizations can offer services through channels such as voice and chat, along with various social platforms (FB, Slack, etc.), and make automation more intuitive and accessible to their employees and customers.

S

Sentiment Analysis

Sentiment analysis, also referred to as opinion mining, is a method that uses natural language processing and data analyti...

Skill Based Routing

Skills-based routing (SBR) is a system used in call centers that assigns calls to the agent who is most skilled in the con...

Speech Recognition (speech-to-text / STT)

Speech recognition, as the name suggests, is the ability of a computer to recognize the human speech and meta information ...

Speech Synthesis (Text-to-speech / TTS)

Speech synthesis or text-to-speech (TTS) is often used in conjunction with a speech-to-text (STT) system. While STT converts ...

Speech Translation

Speech translation is an advanced form of text service. It can perform speech-to-speech translation i.e. receive input in ...

SIP Trunk

Session Initiation Protocol (SIP) technology that uses Voice over IP (VoIP). It can initiate, modify, and end sessions wit...

Session Border Controller

A session border controller is a software component or hardware appliance that empowers SIP-based VoIP (voice over intern...

SIP protocol

SIP or Session Initiation Protocol is one of the most common protocols in VoIP (Voice over Internet Protocol) technology....

SSML for TTS

SSML (Speech Synthesis Markup Language) is an XML-based markup language that is used in speech synthesis applications. It...

Speaker verification & identification/Speaker recognition

Speaker recognition is a set of algorithms that verify and identify speakers through distinct voice characteristics, esse...

Speech Recognition Output

Speech recognition, also known as automatic speech recognition (ASR), speech-to-text (STT), and computer speech recogniti...

Speech Recognition Metadata

Speech-to-text (STT) software uses multiple machine learning models to convert spoken audio into text. Each model is trai...

Speech Adaptation (STT)

In the context of speech recognition [link to glossary entry], speaker adaptation refers to a speech recognition system a...

Session Border Controllers (SBC) as load balancer

Every SBC has a built-in routing engine that makes decisions related to sending calls to various destinations. It has vari...

Sentiment Analysis

Sentiment analysis is also known as emotion AI. It is a technique that uses Natural Language Processing, computational li...

Sentiment Analysis

Sentiment analysis, also referred to as opinion mining, is a method that uses natural language processing and data analytics algorithms to extract subjective information from text, such as satisfaction and emotion. Sentiment analysis is often used on customer reviews, social media posts, and other online feedback to measure the public opinion of a product, company, or issue.

Sentiment analysis categorizes text into buckets, commonly “positive”, “neutral”, and “negative”. These buckets can be customized depending on how granular a result is desired. Buckets can also represent emotional states, such as “happy”, “frustrated”, or “angry”.

Sentiment analysis techniques range from simple and rule-based to complex and driven by machine learning. Advanced techniques are capable of real-time sentiment analysis and more nuanced interpretation of text.
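At the simple, rule-based end of that range, a classifier can be little more than a word list. The sketch below assumes two tiny hand-written lexicons (`POSITIVE` and `NEGATIVE` are made up for illustration) and sorts text into the three common buckets by counting matches. It ignores negation, sarcasm, and context, which is where machine learning approaches come in.

```python
POSITIVE = {"great", "good", "love", "thanks", "helpful", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "angry", "broken", "frustrated"}


def classify_sentiment(text: str) -> str:
    """Bucket text as 'positive', 'neutral', or 'negative' by counting lexicon hits."""
    # Lowercase each word and strip surrounding punctuation before the lookup.
    words = [w.strip(".,!?;:").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

More granular buckets such as “frustrated” or “angry” could be added by keeping one word list per emotional state instead of two.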

Sentiment analysis has a wide range of applications, including but not limited to tracking trends, monitoring competition, and determining urgency. In conversational AI applications, sentiment analysis can help to optimize interaction between humans and virtual agents to provide better services and retain customers.

Skill Based Routing

Skills-based routing (SBR) is a system used in call centers that assigns each call to the agent who is most skilled in the concern of the call, instead of simply passing it on to the next available agent. SBR is an upgrade of the ACD (Automatic Call Distributor) system, which assigns only one type of call to an agent unless the agent manually decides to be reassigned to a different type. SBR makes query allocation easier, helps agents hone their specific skills, and leads to much better resolution quality and times.
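A minimal sketch of the selection step, assuming each agent is described by a dictionary with hypothetical `name`, `available`, and `skills` keys, where `skills` maps a topic to a numeric proficiency level:

```python
from typing import Optional


def pick_agent(call_skill: str, agents: list[dict]) -> Optional[str]:
    """Return the available agent with the highest level in the required skill."""
    candidates = [
        a for a in agents
        if a["available"] and a["skills"].get(call_skill, 0) > 0
    ]
    if not candidates:
        return None  # no skilled agent is free; a real system would queue the call
    best = max(candidates, key=lambda a: a["skills"][call_skill])
    return best["name"]
```

Compared with next-available-agent routing, the only change is the extra filter and ranking on the skill the call requires.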

Speech Recognition (speech-to-text / STT)

Speech recognition, as the name suggests, is the ability of a computer to recognize human speech and meta information such as sentiment and speaker information, and convert it into a text format. Known by various other names such as automatic speech recognition (ASR), speech-to-text (STT), and computer speech recognition, it is an amalgamation of linguistics, computer science, and AI. It can be trained to understand a variety of languages through language models.

Speech recognition software breaks speech audio down into individual sounds or elements and analyzes each of them with techniques such as Viterbi search, PLP features, and deep neural networks. This is how it finds the best-suited word in the language and transcribes the audio into text.

Speech Synthesis (Text-to-speech / TTS)

Speech synthesis, or text-to-speech (TTS), is often used in conjunction with a speech-to-text (STT) system. While STT converts speech inputs to text, TTS converts text inputs into output in a human-like voice (e.g., Siri, Alexa, ebook narrators). It uses a speech computer or speech synthesizer for this purpose. Intelligible speech synthesizers can help people with visual impairments and learning disabilities to read texts through voice narration. Like other computer-human interaction systems, TTS systems also use specific languages. There are various markup languages, old and new, such as Java Speech Markup Language and Speech Synthesis Markup Language [Link to SSML]. These markup languages allow the computer system to control the behaviour of the speech output and may support functionality such as emphasizing specific words or specifying distinct phonemes (units of sound) for a specific pronunciation.

Speech Translation

Speech translation is an advanced form of text service. It can perform speech-to-speech translation, i.e., receive input in the form of a human voice in one language and produce output in voice format in another language. It employs automatic speech recognition (ASR), machine translation (MT), and voice synthesis (TTS) in order to provide output involving meta information. It can also perform speech-to-text translation, making it highly useful for interactions with people from different countries and regions during customer support sessions. Speech translation can be combined with speech recognition so that speech is translated into a target language and then converted into text and even transcribed. In this way, multilingual conversations can be enabled. With the newest technological developments, real-time translation has improved. This enables speech translation services to be applied even in near-real-time applications such as on-the-fly transcription.

SIP Trunk

A SIP trunk is a Session Initiation Protocol (SIP) technology that uses Voice over IP (VoIP). It can initiate, modify, and end sessions with multiple parties in an IP network involved in a two-way call or conference call. Basically, a SIP trunk is similar to an analog phone connection. By connecting multiple channels to the office PBX (Private Branch Exchange), one can make local, long-distance, and international calls. These internet calls can be concurrent and unlimited in number. With SIP trunking, scalability is high and controllable by users, and redirecting calls in times of emergency is seamless. There is flexibility to call regional and non-regional contacts, and the high costs of ISDN line rentals are also saved.

Session Border Controller

A session border controller is a software component or hardware appliance that empowers SIP-based VoIP (voice over internet protocol) networks, IP video, text chats, and other forms of communication. In the context of Conversational IVR, the Session Border Controller (SBC) is an important IT architecture component that builds the foundation for secure, high-performance VoIP traffic and routing of connections with the Conversational AI, and in particular with a voice gateway for automatic speech transcription.

In most cases, an SBC session refers to an incoming call. Such a call involves signaling message exchanges and media streams (audio, video, data, call statistics). All streams put together make up a session. A session border controller influences the data flow of sessions.

A border is a point that separates one network from another. A corporate network firewall that separates the local network from the rest of the web is an example of a border. At a higher level, large corporations use filtering routers and other elements to control the data flow between various departments with separate security needs based on location and type of data. A session border controller is in charge of the flow of data in sessions and provides access control, measurement, and data conversion. It also provides security, session routing, and quality of service.

Session border controllers exert control over measurement, data conversion, and access control of the calls they secure. They have applications in SIP trunking, IP contact centers, cloud-based IP communication services, mobile workers, and service provider border security.

In the context of voice AI, the SBC is also seen as an essential component in creating high-performance SIP message termination and call routing. It is sometimes deployed in redundant configurations or as a load balancer to handle high numbers of parallel conversations. Infrastructure enhancements in container orchestration technology such as Kubernetes have enabled the building of highly scalable SBC setups that adapt based on demand.

https://en.wikipedia.org/wiki/Session_border_controller

SIP protocol

SIP, or Session Initiation Protocol, is one of the most common protocols in VoIP (Voice over Internet Protocol) technology. It is a signaling protocol for controlling internet-based multimedia communication sessions. It works with other application-layer protocols such as the Session Description Protocol (SDP) and runs over transport protocols such as the User Datagram Protocol (UDP), the Transmission Control Protocol (TCP), and the Stream Control Transmission Protocol (SCTP).

A Conversational AI with support of the SIP protocol can initiate, maintain, and terminate real-time voice, video, and messaging sessions. It signals and controls multimedia communications such as internet telephony-based video and voice calls, private IP telephone systems, IP-based instant messaging, and VoLTE.

A Conversational IVR or voice AI system can make use of the SIP protocol for signalling and message handling. SIP endpoints also make it easier to build cognitive voice agents.

https://en.wikipedia.org/wiki/Session_Initiation_Protocol

SSML for TTS

SSML (Speech Synthesis Markup Language) is an XML-based markup language that is used in speech synthesis applications. It is often embedded in VoiceXML scripts for driving interactive telephony systems. TTS (Text To Speech) is a speech synthesis technology that converts written text into sound.

SSML can be applied in TTS requests to customize the audio output with details such as pauses, formatting for dates, acronyms, abbreviations, or censored text. SSML can be applied while sending text messages to the Cognigy Voice Gateway, which the Gateway will then forward to the TTS engine.
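As a rough illustration of what such markup looks like, the sketch below builds a small SSML document with Python's standard `xml.etree.ElementTree` module. The `<speak>`, `<break>`, `<say-as>`, and `<emphasis>` elements come from the W3C SSML specification; the helper name `build_ssml` and the sample values are made up, and which `interpret-as` and `format` values are honored depends on the TTS engine in use.

```python
import xml.etree.ElementTree as ET


def build_ssml(before_pause: str, pause_ms: int, date_text: str, emphasized: str) -> str:
    """Build a small SSML document with a pause, a date, and an emphasized word."""
    speak = ET.Element("speak")
    speak.text = before_pause + " "
    # <break> inserts a pause of the given length.
    pause = ET.SubElement(speak, "break", {"time": f"{pause_ms}ms"})
    pause.tail = " "
    # <say-as> tells the engine how to interpret the enclosed text, here as a date.
    date = ET.SubElement(speak, "say-as", {"interpret-as": "date", "format": "mdy"})
    date.text = date_text
    date.tail = " "
    # <emphasis> asks the engine to stress the enclosed text.
    strong = ET.SubElement(speak, "emphasis", {"level": "strong"})
    strong.text = emphasized
    return ET.tostring(speak, encoding="unicode")
```

Calling `build_ssml("Your appointment is on", 500, "10/12/2025", "tomorrow")` yields a single `<speak>` element that could be sent wherever plain text would otherwise be passed to the TTS engine.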

Speaker verification & identification/Speaker recognition

Speaker recognition is a set of algorithms that verify and identify speakers through distinct voice characteristics, essentially answering the question, “Who is speaking?” Speaker identification determines which of a set of known speakers produced a speech input, while speaker verification determines whether a speech input belongs to a claimed identity. Both of these processes involve feature extraction, speaker modeling, and speaker matching.

Future enhancements of this technology can lead to personalized experiences for virtual assistants. For example, Amazon Alexa has already started rolling out personalization functionality using speaker recognition. If speaker identification and speaker biometrics functionalities are further developed, they can be used to identify and even authorize a speaker to perform certain actions or enable features.
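The difference between the two tasks shows up clearly in the matching step. The sketch below assumes that feature extraction has already turned each voice sample into a fixed-length list of numbers and that each enrolled speaker is modeled by one such vector; the cosine similarity measure and the 0.8 threshold are illustrative choices, not a description of any particular product.

```python
import math
from typing import Optional


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length, non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def verify(sample: list[float], claimed_model: list[float], threshold: float = 0.8) -> bool:
    """Verification: does the sample match the claimed speaker's model?"""
    return cosine(sample, claimed_model) >= threshold


def identify(sample: list[float], models: dict[str, list[float]],
             threshold: float = 0.8) -> Optional[str]:
    """Identification: which enrolled speaker matches best, if any clears the threshold?"""
    best = max(models, key=lambda name: cosine(sample, models[name]))
    return best if cosine(sample, models[best]) >= threshold else None
```

Verification is a one-to-one comparison against a claimed identity, while identification is a one-to-many search over all enrolled speakers (`identify` expects at least one enrolled model).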

Speech Recognition Output

Speech recognition, also known as automatic speech recognition (ASR), speech-to-text (STT), and computer speech recognition, is software that is trained to understand multiple languages, recognize human speech and meta information, and then convert it into text. The input of speech recognition software is speech audio and metadata, while the output is in text format.

The output of a speech recognition engine can contain the transcribed text along with the additional meta information that has been extracted during the speech recognition process.

Speech Recognition Metadata

Speech-to-text (STT) software uses multiple machine learning models to convert spoken audio into text. Each model is trained to recognize a specific characteristic of the audio input, such as the type of file, recording device, number of speakers in the audio, etc. If these details, called recognition metadata, are included in the transcription request, STT can transcribe audio data with greater accuracy. STT can also gain pronunciation assessment features with additional reference text inputs. The customization provided by recognition metadata can be helpful for recognizing ambient noise and industry-specific vocabulary.

Recognition metadata can consist of the following:

The audio’s use case
A 6-digit NAICS code identifying the industry vertical of the file
The speaker’s distance from the microphone
The original audio media
The audio-capturing device (smartphones, PC, vehicles, etc.)
The audio recording device, for example ‘Pixel XL’, ‘Cardioid Microphone’, ‘VoIP’, or other values
The original audio’s MIME file type (audio/m4a, audio/mp3, audio/x-alaw-basic, etc.)
A subject matter summary of the audio file, for example job interview, press conference, or public lecture
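The items above could be grouped into a single structure that travels with a transcription request. The sketch below is only an assumption about how that might look: every field name is hypothetical, since each STT vendor defines its own request schema. The one rule it enforces, a 6-digit NAICS code, is taken from the list.

```python
from dataclasses import dataclass, asdict


@dataclass
class RecognitionMetadata:
    # All field names here are hypothetical; real STT APIs define their own.
    use_case: str
    naics_code: str               # 6-digit NAICS industry code
    microphone_distance: str
    original_media: str
    capture_device_type: str
    recording_device_name: str
    mime_type: str
    topic: str

    def __post_init__(self) -> None:
        # The list above calls for a 6-digit code, so reject anything else early.
        if not (len(self.naics_code) == 6 and self.naics_code.isdigit()):
            raise ValueError("NAICS code must be exactly 6 digits")
```

`asdict(metadata)` turns an instance into a plain dictionary that can be serialized alongside the audio in a request.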

Speech Adaptation (STT)

In the context of speech recognition [link to glossary entry], speaker adaptation refers to a speech recognition system adapting to the acoustic features of a particular user based on a small set of utterances from that user. With speech adaptation, a speech-to-text engine can be taught to recognize a specific word or phrase more often than others. For example, if the audio input is about bees and frequently mentions the word ‘bee’, the STT engine has to be instructed, with the help of metadata, to be biased towards recognizing ‘bee’ more often than ‘be’. This is where speech adaptation comes in.

In Conversational AI [link to glossary term], speech adaptation can be applied throughout a conversation. The Conversational AI can pass meta information to the STT engine in a context-aware conversation. For example, if the Conversational AI asks a user for a specific product name, the STT can be directed to search for specific product names, which increases the transcription accuracy.
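The biasing effect can be sketched as a rescoring step over candidate transcripts. This is a simplified illustration, not how any specific STT engine implements it: the candidate scores, the `hints` set, and the flat `boost` value are all assumptions.

```python
def pick_transcript(hypotheses: dict[str, float], hints: set[str], boost: float = 0.3) -> str:
    """Return the best hypothesis after adding a bonus for each hinted word it contains."""
    def biased_score(item: tuple[str, float]) -> float:
        text, score = item
        hits = sum(word in hints for word in text.lower().split())
        return score + boost * hits

    best_text, _ = max(hypotheses.items(), key=biased_score)
    return best_text
```

Given the candidates "the be is on the flower" (score 0.55) and "the bee is on the flower" (score 0.45), the first wins without hints, while passing `{"bee"}` as a hint lets the second win. In a conversation, the hint set could be swapped per turn, for example to a list of product names right after the bot asks for one.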

Session Border Controllers (SBC) as load balancer

Every SBC has a built-in routing engine that makes decisions related to sending calls to various destinations. It offers various methods for determining how to route the calls received, such as priority-based routing, availability-based routing, tables and scripts, and external engines. Which method comes into play depends on the dialed number, called number, time of the day, and other algorithms necessary for routing calls. SBCs can perform load balancing for both VoIP wholesale providers and hosted IP PBX providers.

Sentiment Analysis

Sentiment analysis is also known as emotion AI. It is a technique that uses Natural Language Processing, computational linguistics, text analysis, and biometrics for identifying, extracting, quantifying, and studying subjective information. In contact centers, sentiment analysis is used to analyze customer feedback and accordingly tailor products and services to customers’ needs.

In Conversational AI applications, the sentiment of each user's input can be determined in the reasoning engine and considered for decision making. Using sentiment analysis, an angry customer can be treated differently than a customer who is open to small talk.

T

Twilio

Twilio is a cloud-based platform that allows developers to add communication capabilities such as video, voice, and messag...

Transfer Target Information

During interactions between customers and AI chatbots, there might be instances where the issue requires some human int...

TTS Caching

TTS (text-to-speech) caching, often misspelled as TTS cashing, is an administrator controlled feature which enables Conver...

Twilio

Twilio is a cloud-based platform that allows developers to add communication capabilities such as video, voice, and messaging to applications. Twilio can support worldwide communications via a software layer that connects global communication networks.

Twilio is used by over one million developers and can be used with almost any software application. In addition to enabling communication in apps, Twilio can be used for tasks such as user authentication and call routing. Twilio enables companies across all industries to revolutionize the way they connect with their customers.

Cognigy and Twilio have partnered to provide powerful conversational AI solutions that cover a broad range of channels and touchpoints.

Transfer Target Information

During interactions between customers and AI chatbots, there might be instances where the issue requires some human intervention, or where the customer specifically asks to be transferred to a human agent. At such times, the call can be transferred to a live agent or to a different bot with the help of the Transfer Target Information mechanism. After the call is transferred, the bot is disconnected.

The Conversational AI’s transfer activity can forward various attributes such as:

  • Transfer target URI, which defines where the call is going to be transferred (SIP-based live agent or another bot)
  • Text description of the transfer request, used for logging and Call Detail Records

There are two ways in which this can take place:

  • A SIP INVITE message is sent to the transfer target URI (the Cognigy Voice Gateway remains a part of the transferred call)
  • A SIP REFER message is sent to the SIP peer (e.g., contact center or SIP service provider) who initiated the original call (the Cognigy Voice Gateway leaves the path after transfer is successful)
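To make the second option more concrete, the sketch below assembles a minimal SIP REFER request as plain text. The header names, including `Refer-To`, follow the SIP specifications, but the function name, host names, tags, and branch value are placeholders, and this is not how the Cognigy Voice Gateway itself builds messages. A real SIP stack would also manage dialog state, authentication, and retransmissions.

```python
def build_refer(peer_uri: str, transfer_target_uri: str, call_id: str, cseq: int) -> str:
    """Build a minimal SIP REFER request asking the peer to call the transfer target."""
    lines = [
        f"REFER {peer_uri} SIP/2.0",
        "Via: SIP/2.0/UDP gateway.example.com;branch=z9hG4bK-example",  # placeholder host
        "Max-Forwards: 70",
        f"To: <{peer_uri}>;tag=peer-tag",                                # placeholder tag
        "From: <sip:bot@gateway.example.com>;tag=bot-tag",               # placeholder identity
        f"Call-ID: {call_id}",
        f"CSeq: {cseq} REFER",
        "Contact: <sip:bot@gateway.example.com>",
        f"Refer-To: <{transfer_target_uri}>",   # where the call should go next
        "Content-Length: 0",
    ]
    # SIP separates lines with CRLF and ends the header section with an empty line.
    return "\r\n".join(lines) + "\r\n\r\n"
```

The peer that receives this request is expected to place a new call to the `Refer-To` address itself, which is why the gateway can leave the call path once the transfer succeeds. With the INVITE option, the gateway would instead place the new call and stay in the path.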

TTS Caching

TTS (text-to-speech) caching, often misspelled as TTS cashing, is an administrator-controlled feature that enables Conversational AI to check text input against the TTS cache and return the matching cached entry as the output. This improves TTS performance, reduces processing power, and cuts costs, since TTS vendors charge by the number of characters processed. This feature is disabled by default. Once enabled, a TTS cache has a default lifetime of 24 hours, which can be configured, and the Bot Developer can disable caching per text message.
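The behaviour described above can be sketched as a small cache keyed by the text to be spoken. The class and parameter names are hypothetical, and `synthesize` stands in for whatever call actually reaches the TTS vendor. The 24-hour default lifetime and the per-message opt-out mirror the description, while everything else is an illustrative assumption.

```python
import time
from typing import Callable


class TTSCache:
    """Cache synthesized audio by text, with entries expiring after a lifetime."""

    def __init__(self, synthesize: Callable[[str], bytes],
                 lifetime_seconds: float = 24 * 60 * 60,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._synthesize = synthesize      # stand-in for a call to a TTS vendor
        self._lifetime = lifetime_seconds  # 24 hours by default, as described above
        self._clock = clock
        self._entries: dict[str, tuple[float, bytes]] = {}

    def speak(self, text: str, use_cache: bool = True) -> bytes:
        """Return audio for text, reusing a fresh cached entry when allowed."""
        now = self._clock()
        entry = self._entries.get(text)
        if use_cache and entry is not None and now - entry[0] < self._lifetime:
            return entry[1]                # cache hit: no vendor call, no cost
        audio = self._synthesize(text)
        if use_cache:
            self._entries[text] = (now, audio)
        return audio
```

Frequently repeated prompts such as greetings benefit most, since they are synthesized once per lifetime instead of once per call. Passing `use_cache=False` corresponds to disabling caching for a single text message, for instance when the text contains personal data.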

U

UiPath

UiPath is a global company that specializes in software for robotic process automation (RPA). A 2005 start-up with 10 peop...

UiPath

UiPath is a global company that specializes in software for robotic process automation (RPA). A 2005 start-up with 10 people, UiPath has grown to approx. 3000 employees, making it the most rapidly growing enterprise software company in history.

UiPath is best known for their industry-leading RPA platform, which utilizes artificial intelligence, machine learning, process mining, and analytics to provide powerful hyperautomation capabilities. The UiPath RPA platform enables organizations to identify automation opportunities, build bots of varying complexity, manage and deploy bots, run tests, communicate with bots, and measure bot performance. UiPath is also known for UiPath Academy, an online platform that offers hundreds of hours of free RPA courses.

Cognigy.AI seamlessly integrates with the UiPath technology stack and enables simplifying processes through conversational automation and deployment of powerful virtual agents.

V

Virtual Agent

A virtual agent is like the human agent's AI-powered work friend. The chatbot found on most websites today is a kind of vi...

Voice Assistant (VA)

A voice assistant (VA) is an intelligent application that uses natural language processing, voice recognition, and voice s...

Voice Automation

Voice automation entails the use of spoken human language to trigger and automate processes in software, hardware, and mac...

Voice Bot

Voice bots are similar to chatbots; both use artificial intelligence to enable machines to communicate with humans in natu...

Voice concierge

In the hospitality industry, a concierge is a person who helps guests make reservations, book tickets to shows, arrange fo...

Voice stream (RTP)

RTP (Real-time Transport Protocol) is a network protocol that delivers voice and video over IP networks through data pack...

Voice Engagement Channel Component

The Voice Engagement Channel Component, based on Cognigy Voice Gateway’s SBC (Session Border Controller) module interfaces...

Virtual Agent

A virtual agent is like the human agent's AI-powered work friend. The chatbot found on most websites today is a kind of virtual agent. It can perform the human agent's work for them, and can even partner with them during conversations with customers by providing useful contextual information on the go. Additionally, it can both learn and use knowledge to respond to different kinds of queries. And in case it needs extra help, it can direct customers to a human agent, while continuously working behind the scenes to assist the human agent during live calls.

Voice Assistant (VA)

A voice assistant (VA) is an intelligent application that uses natural language processing, voice recognition, and voice synthesis to communicate with users and execute user requests. Voice assistants are integrated into most of the devices that people use daily – smartphones, computers, speakers, etc. Among the best-known VAs are Apple Siri, Amazon Alexa, Google Home, and Microsoft Cortana.

Voice assistants started to become wildly popular around 2010, when Siri was developed. Other well-known assistants shortly followed, and today more than three billion VAs are in use. While many VAs today are used in a home setting, VAs are also valuable in a business setting. Organizations can use a VA in meetings to take notes and record action items. A VA can also execute simple tasks such as setting up meetings on calendars, creating lists, and finding contact information.

Voice assistants are always improving; they are becoming more intelligent and able to understand more language nuances such as accents and slang. It is expected that VA use will continue to grow in upcoming years as technology continues to improve.

Voice Automation

Voice automation entails the use of spoken human language to trigger and automate processes in software, hardware, and machines. Voice automation also relies on artificial intelligence, which is used to create voice systems that can understand human voice commands and execute tasks accordingly.

Voice automation is commonly used for smart home assistants such as Alexa, Siri, and Google Assistant. However, voice automation also has applications in various sectors of business. Voice automation has been used for everything from aiding software development to improving customer service. As consumers increasingly expect to be able to communicate with businesses and execute tasks via voice command, voice automation will become increasingly prevalent in both business and personal life.

Voice Bot

Voice bots are similar to chatbots; both use artificial intelligence to enable machines to communicate with humans in natural language. Voice bots and chatbots should be able to understand human conversation and respond appropriately. The main difference between voice bots and chatbots is that voice bots process spoken human language and translate it into text, while chatbots process written human language.

Voice bots can be used to take Interactive Voice Response (IVR) systems to the next level. Instead of having to listen to menu options and prompts, users can interact with a voice bot to resolve their specific needs more quickly. A high performing voice bot is nearly indistinguishable from a human; unlike a traditional IVR system, it can understand customer demands, provide solutions, and multitask.

Voice bots can help businesses improve and quickly scale their customer service operations. A voice bot platform can interact with thousands of customers simultaneously, provide personalized support to each, and free up human agents to focus on more complex service issues.

Voice concierge

In the hospitality industry, a concierge is a person who helps guests make reservations, book tickets to shows, arrange for transport, etc. Similarly, a voice concierge is a virtual AI-powered software program that can communicate with users in their spoken language, provide recommendations for food, hotels, shows, etc. and also book tickets. Apart from hospitality, it also finds application in travel and customer service. A voice concierge can provide customers with convenience by being relevant and on-time, offering personalized and content-rich journeys, and the best available, swift services and transactions.

Voice stream (RTP)

RTP (Real-time Transport Protocol) is a network protocol that delivers voice and video over IP networks through data packets structured in a way that they can be delivered at high speeds and then reassembled into a suitable stream that delivers media naturally. It is useful in various applications such as VoIP, video conferencing, WebRTC, television, telephony, and web-based push-to-talk features.

In Conversational AI, RTP streams are used in voice processing for voice bot [glossary link] applications.
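The reassembly mentioned above relies on a small fixed header at the start of every RTP packet. The sketch below parses that 12-byte header with Python's standard `struct` module, following the field layout of the RTP specification (RFC 3550). The function name and the returned dictionary keys are chosen here for illustration, and optional CSRC entries and header extensions are not handled.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Parse the fixed 12-byte RTP header (RFC 3550) from the start of a packet."""
    if len(packet) < 12:
        raise ValueError("an RTP packet needs at least 12 header bytes")
    # Network byte order: two single bytes, 16-bit sequence, 32-bit timestamp, 32-bit SSRC.
    first, second, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": first >> 6,            # top 2 bits
        "padding": bool(first & 0x20),
        "extension": bool(first & 0x10),
        "csrc_count": first & 0x0F,
        "marker": bool(second & 0x80),
        "payload_type": second & 0x7F,    # e.g. 0 is PCMU (G.711 mu-law) audio
        "sequence": sequence,             # lets the receiver reorder packets
        "timestamp": timestamp,           # lets the receiver time playback
        "ssrc": ssrc,                     # identifies the stream's source
    }
```

A voice bot's media layer would read the sequence number and timestamp to put packets back in order before handing the audio payload to speech recognition.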

Voice Engagement Channel Component

The Voice Engagement Channel Component, based on Cognigy Voice Gateway’s SBC (Session Border Controller) module, interfaces with voice engagement channels. It inherits all SBC abilities such as SIP interoperability, security, media handling, scalability, and high availability. It processes media (RTP) and SIP signalling traffic and converts it into HTTP so that it can be routed to specific bots in specific frameworks.

W

Watson Assistant

Watson Assistant is a service that enables software developers to create conversational interfaces for applications across...

Webchat

A webchat is a communication channel that allows users to communicate using easy to engage web interfaces that often come ...

WebRTC

WebRTC, or Web Real-Time Communication makes real-time communication through simple APIs (Application Programming Interfa...

Watson Assistant

Watson Assistant is a service that enables software developers to create conversational interfaces for applications across any device or channel. Watson Assistant is cloud-based and has access to Watson AI, which provides machine learning and natural language processing capabilities.

Watson Assistant has three industry-specific solutions included: Watson Assistant for Automotive, Watson Assistant for Hospitality, and Watson Assistant for Industry. Ultimately, all these solutions provide a framework for embedding a virtual assistant that can engage with users and execute tasks such as answering customer questions.

Watson Assistant can be used as a stand-alone NLU as it exposes its functionality via API. This makes it easy for external applications offering third party NLU features such as Cognigy.AI to run their conversation intent mapping from pre-built Watson intents. Watson Assistant is a flexible solution with broad business applications that can be used to streamline operations, provide personalized customer service, and reduce costs.

Webchat

A webchat is a communication channel that allows users to communicate using easy-to-engage web interfaces that often come in the form of pop-ups at the bottom of a webpage. Webchats can receive text messages and respond intelligently, present visual content, and provide interactive inputs in various ways to improve the user experience. They can also be designed to seamlessly hand over interactions to human agents.

For enterprises, webchat is often a starting point for Conversational AI initiatives. It plugs easily into existing websites and comes with comparatively low impact on infrastructure. Still, webchat can empower comprehensive self-service with 24/7 availability and provide very valuable data and insights into customers’ pain points and needs.

WebRTC

WebRTC, or Web Real-Time Communication, makes real-time communication possible for mobile applications and web browsers through simple APIs (Application Programming Interfaces). It enables audio and video communication within web pages with direct peer-to-peer contact and without installing plugins or apps. It aims to assist in developing high-quality RTC applications for IoT (Internet of Things) devices, browsers, and mobile platforms, and to enable them to communicate with a common protocol set.

In the field of Conversational AI, WebRTC can help in transferring a webchat to a phone conversation. It enables Conversational AI to hold automated phone conversations or transfer the call to a live agent. WebRTC can also enhance real-time communication by enabling live streaming, screen sharing, traditional telephony integration, document sharing, etc.