The crucial role of explainable AI (XAI) in trustworthy AI applications

With its initiative to establish ethical guidelines for trustworthy AI (“Ethics guidelines for trustworthy AI”), launched in 2018, the European Union set out an important principle: in the future, the development and use of artificial intelligence should take place within defined, controllable framework conditions. Alongside ethical and moral guidelines, the explainability of AI models is of decisive importance. A concept that comes up repeatedly in this context is "explainable AI" (XAI), which arose in response to the so-called "black box" problem.

Tricked by your own AI: This is what makes black-box AI models tick

The importance of comprehensible and transparent AI modeling can be illustrated with the following example: In 2017, a system for the automatic recognition of animals was developed on the basis of neural networks. The tool was trained on a dataset of many thousands of horse images and quickly achieved an image recognition accuracy that amazed even the developers. To adapt the model to other types of image recognition, they tried to decipher the complex arithmetic operations of the algorithms - without success. Only later did they accidentally discover the actual reason for the outstanding result: the AI had cheated. It turned out that the horse images in the training dataset often contained copyright watermarks. So the AI only jumped as high as it had to: it looked for watermarks instead of learning the features that horses have in common.

The above example is typical of so-called “black box” problems in the AI domain. The term originates from behaviorism and, applied to the AI sector, states that one can measure the inputs and the outputs of an AI model, but the intervening decision-making processes remain hidden. In particularly complex areas of AI and deep learning, models often end up as black boxes because their intervening processes simply cannot be explained.

Explainable AI (XAI) brings light into the black box

Explainable AI is a methodical approach to presenting the decisions of AI in such a transparent, explainable and interpretable way that a human can understand them. To this end, the XAI concept envisions two different approaches: either algorithms are designed to act transparently from the beginning and all along the way ("ante-hoc"), or they are made explainable after a certain behaviour has been observed ("post-hoc").

In concrete terms, this means that not only the technical functioning of the algorithms must be comprehensible, but above all the weighting of individual features and computing operations in the model itself. Only when it is clear how and why an algorithm has weighted a particular feature more or less heavily in the decision-making process can the overall model be described and reproduced.

The importance of explainable AI for the trustworthiness and credibility of the entire industry - and thus also for the ever-increasing number of companies that use AI - can be illustrated with an everyday example: Let's say you want to take out a real estate loan, but your bank rejects the application several times. Naturally, you want to know the reasons. But the bank's decision is usually based on a complex rating or scoring model in which people are rarely directly involved. If you do not receive a comprehensible explanation of why your application was rejected, not only does the credibility of the bank suffer - you may even lose confidence in the entire system.
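To make the idea of feature weighting in the loan example more tangible, here is a minimal sketch of a scoring model whose decision can be broken down feature by feature. The feature names, weights, bias and threshold are invented for illustration and do not describe any real bank's scoring model.

```python
# Hypothetical transparent scoring model: every weight is visible and
# every decision can be broken down into per-feature contributions.
WEIGHTS = {
    "income_k": 0.4,          # yearly income in thousands
    "years_employed": 2.0,
    "existing_debt_k": -0.5,  # outstanding debt in thousands
}
BIAS = -20.0
THRESHOLD = 0.0  # scores at or above this value are approved


def explain_decision(applicant):
    """Return (approved, score, contributions) for one applicant."""
    contributions = {
        name: WEIGHTS[name] * applicant[name] for name in WEIGHTS
    }
    score = BIAS + sum(contributions.values())
    return score >= THRESHOLD, score, contributions


applicant = {"income_k": 50, "years_employed": 3, "existing_debt_k": 40}
approved, score, contributions = explain_decision(applicant)

print("approved:", approved, "score:", score)
# List the features from most harmful to most helpful for this applicant.
for name, value in sorted(contributions.items(), key=lambda kv: kv[1]):
    print(f"{name:>16}: {value:+.1f}")
```

With such a breakdown, a rejected applicant could be told which factor weighed most heavily against the application, which is exactly the kind of explanation a black-box score cannot provide on its own.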

To avoid a loss of trust in artificial intelligence, establishing fully explainable models is certainly the best way forward, even if this is not always possible without exorbitant effort, especially once a certain level of complexity is reached. The advantages of explainable AI are obvious:

Optimization of AI models: The comprehensibility of AI models is elementary for their continuous improvement.
Positive influence on processes and decisions: AI is predominantly used to make data-driven decisions or improve processes. The quality of decisions and process improvements is directly related to the explainability of AI models.
Identification of bias: In the context of the ethical assessment of AI models, there is much talk about discriminatory AI. Discrimination can be introduced - consciously or unconsciously - as early as the programming of the algorithms. There are examples where AI models have discriminated against people based on gender or skin color. Here, too, explainable AI helps to detect and avoid this so-called "bias" from the outset.
Producing AI standards: With the AIC4 criteria catalog, the German Federal Office for Information Security (BSI) has developed criteria for auditing trustworthy AI. An important component of the auditing criteria is the explainability of AI models. Companies whose AI is AIC4-audited thus meet minimum standards that create trust among end-users. This establishes better conditions for the widespread use of AI services.
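As a rough illustration of the bias point in the list above, one simple check compares how often a model decides positively for different groups. The sketch below computes this gap (often called the demographic parity difference); the metric is a swapped-in example and not part of the AIC4 catalog as described here, and the group labels and decision log are made up.

```python
from collections import defaultdict


def approval_rates(decisions):
    """Map each group to its share of positive decisions.

    `decisions` is a list of (group, approved) pairs.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, approved in decisions:
        totals[group] += 1
        positives[group] += int(approved)
    return {group: positives[group] / totals[group] for group in totals}


def parity_gap(decisions):
    """Largest difference in approval rate between any two groups."""
    rates = approval_rates(decisions)
    return max(rates.values()) - min(rates.values())


# Made-up audit log of model decisions, purely for illustration.
log = [
    ("group_a", True), ("group_a", True), ("group_a", True), ("group_a", False),
    ("group_b", True), ("group_b", False), ("group_b", False), ("group_b", False),
]

rates = approval_rates(log)
gap = parity_gap(log)
print(rates, gap)
```

A large gap does not prove discrimination by itself, but it flags where a closer look at the model's explanations is warranted.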

Ante-hoc approaches for explainable AI (XAI): Using algorithms wisely

In the so-called "ante-hoc" approach, only explainable algorithms are used. In the field of data science, there are some basic models such as regressions, decision trees or random forests, which are designed to be comprehensible from the outset.
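What "explainable by construction" can mean is easiest to see with a decision tree. The following sketch uses a tiny hand-written tree (the features, thresholds and labels are hypothetical) and returns not only the prediction but also every test that led to it.

```python
# Hypothetical hand-written tree: internal nodes test one feature against
# a threshold, leaves carry the final label.
TREE = {
    "feature": "has_mane", "threshold": 0.5,
    "below": {"label": "not a horse"},
    "above": {
        "feature": "height_cm", "threshold": 140,
        "below": {"label": "pony"},
        "above": {"label": "horse"},
    },
}


def predict_with_path(node, sample):
    """Walk the tree and record every test that led to the decision."""
    path = []
    while "label" not in node:
        name, threshold = node["feature"], node["threshold"]
        value = sample[name]
        if value > threshold:
            path.append(f"{name} = {value} > {threshold}")
            node = node["above"]
        else:
            path.append(f"{name} = {value} <= {threshold}")
            node = node["below"]
    return node["label"], path


label, path = predict_with_path(TREE, {"has_mane": 1, "height_cm": 160})
print(label)
for step in path:
    print("  because", step)
```

Because the explanation is simply the path through the tree, no separate tooling is needed to justify a decision.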

In addition, so-called "GAMs" (Generalized Additive Models) are used to identify the weighting of variables and to display them on a heat map. This form of visualization makes it particularly easy for people to understand what influence the individual variables have on the overall result.

Another ante-hoc approach is to develop hybrid models. Here, smaller, manageable subtasks are solved by black-box algorithms because they often arrive at the result very quickly. At the same time, transparently programmed algorithms try to open the black box and make the models interpretable.

Post-hoc approaches for explainable AI (XAI): Penetrating the code with a magnifying glass

In the context of explainable AI, "post-hoc" describes a procedure for explaining black-box AI models after a certain behaviour has been observed. To facilitate this approach, tools can be used during training to log the processes. However, depending on the complexity of the model, it can be very difficult even for experts to understand these logs.

Therefore, additional methods are usually applied in the post-hoc area that examine AI models after they have reached a certain result and try to quantify how that result came about. One widely used method is called LIME (Local Interpretable Model-Agnostic Explanations). In the context of XAI, it claims to make black-box models explainable to humans who have no prior knowledge of the individual algorithms used. Another approach is called "Counterfactual Analysis" and specifically manipulates the input until the output changes. In terms of effort, however, this is comparable to brute-force methods in hacking, where one tries to find out a password, for example, by trying out all possible combinations. The LRP method (Layer-wise Relevance Propagation), on the other hand, is somewhat more targeted and attempts to trace the operations of the AI model back from the output, layer by layer.
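The counterfactual idea described above can be sketched in a few lines: treat the model as an opaque function and nudge one input until the decision flips. The `black_box` function, its features and the step size are assumptions made for this example; real counterfactual tools search far more cleverly than this naive loop.

```python
def black_box(applicant):
    """Stand-in for an opaque model; the search only looks at its output."""
    score = 0.4 * applicant["income_k"] - 0.5 * applicant["debt_k"]
    return score >= 10


def find_counterfactual(model, sample, feature, step, max_steps=100):
    """Change one feature in fixed steps until the model's output flips.

    Returns the modified sample, or None if no flip occurs within max_steps.
    """
    original = model(sample)
    candidate = dict(sample)  # work on a copy, leave the input untouched
    for _ in range(max_steps):
        candidate[feature] += step
        if model(candidate) != original:
            return candidate
    return None


applicant = {"income_k": 50, "debt_k": 40}
flipped = find_counterfactual(black_box, applicant, "debt_k", step=-5)
print("original decision:", black_box(applicant))
print("first input that flips the decision:", flipped)
```

The result reads as an explanation in its own right: the application would have been approved had the debt been this much lower. The loop also shows why the effort grows quickly once several features have to be varied together.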

A relatively new but very promising method is called "rationalization" or post-hoc rationalization. It tries to embed black-box AI models into an infrastructure that allows their actions to be explained in real time - similar to a mathematician who makes their calculations transparent to the audience while working through them.

Conversational AI as a prime example of the relevance of explainable AI models

There is hardly an application area where establishing trust in AI is more important than in direct human-machine communication. Conversational AI providers such as Cognigy provide virtual assistants in the form of voicebots and chatbots for more than 400 global companies that conduct millions of fully automated conversations with customers every day. Here, explainable AI models are crucial in several ways: First, the transparency of the algorithms influences the conversation design. Virtual assistants are trained to reliably recognize so-called “intents”, i.e., the intentions behind customer queries, which allows them to return the correct answer. The more qualified conversation designers are, the better they understand why the algorithms in use make the corresponding decisions. Qualified designers can hence specifically intervene in the flow of conversation and bring about improvements.

Second, post-hoc approaches in particular can be used to regularly check the continuously self-optimizing Conversational AI models for neutrality. Avoiding bias is crucial for trust in virtual assistants - after all, no company wants to discriminate against its customers when using voicebots or chatbots.

Cognigy.AI, one of the world's leading conversational AI platforms, has introduced several measures to establish explainability in its AI models. These include:

A low-code conversational editor that can be used not only to develop AI-driven virtual agents but also to check them for possible vulnerabilities in the training data and/or AI model using the integrated intent analyzer.
So-called “snapshots” that precisely document the development status of a virtual agent. Similar to backups, this allows two different stages to be compared. In this way, changes in the AI models can be identified through an actual code comparison.
Lexicons and intent example sentences to avoid unequal treatment by voicebots and chatbots. The main focus of this measure is to facilitate correct recognition and processing of natural language (so-called “Natural Language Understanding” or “Natural Language Processing”). Developers and conversation designers can classify potentially discriminatory language features, e.g. dialects or terms used only in their masculine form, from the outset in such a way that they are correctly recognized and processed by the AI model.

Due to Cognigy's extensive efforts to make AI models transparent, explainable and interpretable ("explainable AI"), the Conversational AI provider has been AIC4-audited for its Customer Service Automation platform Cognigy.AI. In addition, Cognigy.AI was implemented as a central AI control center in the Telekom Cloud and was explicitly AIC4-audited again for this setup. As a pioneer of trustworthy AI, Cognigy has made it its mission to educate enterprises and end-users about the current state of trustworthy AI and has launched the Cognigy Trust Center for this purpose.
