CHAPTER 57
Algorithm Assurance

By Christian Spindler

CEO, DATA AHEAD ANALYTICS

Artificial intelligence (AI) is penetrating more and more elements of both personal lives and business processes. Especially in financial services, a business that is at once data-driven and heavily regulated and privacy-sensitive, the demand for assurance, interpretability and fairness of AI algorithms is rising. Regulators continue to call for a better understanding and control of the risks that come with AI.

The Financial Stability Board says that overall, “AI and machine learning applications show substantial promise if their specific risks are properly managed”. In November 2018, the Monetary Authority of Singapore published principles to foster fairness, ethics, accountability and transparency (FEAT) in the application of artificial intelligence and data analytics in the financial sector. The principles demand, for instance, that AI models be free of unintended bias and that companies using AI models be responsible for both internally and externally developed components. Swiss regulator FINMA’s Circular 2013/8 states: “Supervised institutions that engage in algorithmic trading (see margin no. 18) must employ effective systems and risk controls to ensure that this cannot result in any false or misleading signals regarding the supply of, demand for or market price of securities. Supervised institutions must document the key features of their algorithmic trading strategies in a way that third parties can understand.”

Today, the application of algorithms in financial services has broadened from mere algorithmic trading to many elements of a bank’s value chain. Marketing and sales may ask how to use recommender engines to engage with customers, enhance their experience, reach new customers or price products. Customer service is concerned with chatbots and natural language processing (NLP) to increase customer satisfaction and retain more customers. Risk and compliance functions want to know how to use AI models to manage various types of risk and to ensure compliance in client onboarding and management.

Imagine the loan issuing process in a bank. At various steps in the process, the bank employs third-party AI-powered software to support the scoring process. Onboarding a new client involves anti-money laundering (AML) and know-your-client (KYC) processes that can be supported by AI. Linked data and graph-based learning, for instance, can detect relationships spanning several hops between market participants and reveal connections that could prove problematic for a compliant business relationship.
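
To make this concrete, here is a minimal sketch of multi-hop screening using the networkx library; the graph, entity names and watchlist are entirely hypothetical.

```python
# Toy illustration of multi-hop relationship detection for AML/KYC.
# The graph, entity names and watchlist are purely hypothetical.
import networkx as nx

# Undirected graph of business relationships (ownership links,
# shared directors, transaction counterparties, ...).
g = nx.Graph()
g.add_edges_from([
    ("applicant_co", "holding_a"),
    ("holding_a", "offshore_b"),
    ("offshore_b", "sanctioned_entity_x"),
    ("applicant_co", "supplier_c"),
])

watchlist = {"sanctioned_entity_x"}

# Flag any watchlisted party reachable within a few hops of the applicant.
for target in watchlist:
    if nx.has_path(g, "applicant_co", target):
        path = nx.shortest_path(g, "applicant_co", target)
        if len(path) - 1 <= 3:  # at most three relationship hops
            print("Potential exposure via: " + " -> ".join(path))
```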

Once onboarded, a loan request issued by a customer typically triggers a rating process for both the applicant and the project to be funded (e.g. consumer loan, mortgage, commercial loan, project financing). Machine learning is successfully employed to identify counterparty risks in both retail and commercial segments, enabling robust scoring even when information is missing. With a growing number of alternative data sources, e.g. from the Internet of Things (IoT), the performance of scoring algorithms is likely to increase further.
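
As a hedged illustration of scoring under missing information: scikit-learn’s HistGradientBoostingClassifier accepts NaN feature values natively, learning a default split direction for them. The data below is synthetic.

```python
# Sketch: counterparty scoring that tolerates missing feature values.
# The data is synthetic; scikit-learn's HistGradientBoostingClassifier
# handles NaN entries natively by learning a default split direction.
import numpy as np
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 6))           # e.g. income, leverage, IoT signals
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=1000) > 0).astype(int)
X[rng.random(X.shape) < 0.2] = np.nan    # simulate 20% missing information

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = HistGradientBoostingClassifier().fit(X_tr, y_tr)
print("Default probabilities:", model.predict_proba(X_te[:3])[:, 1])
print("Holdout accuracy:", round(model.score(X_te, y_te), 2))
```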

After counterparty and project scoring are performed, pricing and repayment dynamics may be inferred with a separate algorithm, based on the risk score on the one hand and on factors such as the long-term client relationship opportunity on the other. KPIs such as the latter are particularly hard to model explicitly, and it is here that complex machine learning plays out its biggest strength: combining as much relevant data as possible into a model-free prediction function.
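
A toy sketch of such a two-stage setup, in which the risk score from the first stage and a relationship factor feed a separate pricing model; every feature, unit and effect below is assumed purely for illustration.

```python
# Toy sketch of the second stage: pricing inferred from the risk score
# plus a relationship factor. All features, units and effects are invented.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
risk_score = rng.uniform(0, 1, 500)            # output of the scoring stage
relationship_years = rng.integers(0, 30, 500)  # long-term client opportunity
X = np.column_stack([risk_score, relationship_years])

# Hypothetical observed margins: riskier and newer clients pay more.
margin_bps = (150 + 300 * risk_score - 2 * relationship_years
              + rng.normal(scale=10, size=500))

pricing_model = GradientBoostingRegressor().fit(X, margin_bps)
print(pricing_model.predict([[0.3, 12]]))      # quoted margin in basis points
```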

This biggest strength turns out to be a weakness if we want to ask how a particular decision came about, or how the overall decision-making process looks in general. As a rule of thumb, the more complex machine learning models are, the less likely they are to be interpretable in an easily comprehensible manner. For data, a similar rule seems to hold: the larger the scope of a data set (e.g. the more independent data sources are considered), the higher the risk that unintended biases remain undiscovered.

Consequently, the risk of bias in a model’s prediction increases. For machine learning and AI to spread further in the financial industry, we must find ways to measure the risks and assure compliance of algorithms and data.

In Europe, the regulatory risks of AI application are mostly determined by the General Data Protection Regulation (GDPR), the standard for data protection and privacy within the EU and the European Economic Area (EEA), which also addresses the export of personal data outside the EU and EEA. GDPR sets rules for automated decision-making (Art. 22 GDPR): users, e.g. bank clients, must explicitly consent to automated decision-making and otherwise have the right not to be subject to a decision based solely on automated processing, including profiling.

If the client gives consent, they may still ask for interpretability of the AI – or, in the words of the Regulation, parties responsible for data processing must inform their clients adequately and must be able to prove GDPR compliance. GDPR also sets rules for procurement assurance and compliance: banks that make decisions based on procured third-party AI, or on outsourced AI, must carry out due diligence when sourcing the service (Art. 28 GDPR).

So how can we mitigate or minimize the risk of non-compliance and secure AI algorithms and their outputs against unwanted ethical and compliance issues? Various approaches are currently being pursued in practice.

  1. Companies are trading off performance against interpretability. White-box algorithms, such as linear models or decision tree models, are interpretable by design. The downside is their low performance for complex relationships in the data. Moreover, decision tree models are prone to overfitting and are thus limited in their overall performance. If more complex algorithms, such as neural networks, turn out to yield better performance, companies must apply additional means of interpretation.

    The black box nature of such algorithms does not necessarily result from a lack of understanding of the mathematical function that drives a model’s decision – in fact, every parameter of a neural network is known in operation. However, knowing the parameters is no longer sufficient to yield a digestible interpretation of which variables, in which combination and with which respective weights, led to the model’s outcome. Currently applied approaches for interpreting black box models rely on machine learning itself for model interpretation. One such approach uses so-called Shapley values, invented back in 1953 for fairly assigning credit to “players” in a cooperative game. A method called SHAP makes Shapley values computationally tractable, with fast exact algorithms for tree ensembles such as XGBoost, which makes the approach suitable for practical applications (see the first sketch after this list).

    Another widely known approach is LIME, which locally approximates the complex model around a given prediction, e.g. a credit score, with a simple, interpretable surrogate model (see the second sketch after this list). Both approaches are simple in their application but may not always seamlessly adapt to a bank’s business targets. For instance, a business may want to approve a target of, say, 20% of applicants (the “score space”), while the credit model’s actual output comes in a so-called “margin space” of 0 to 1. The non-linear transformation from margin to score space is not straightforward for Shapley values. A satisfactory solution to the problem of interpretability of models in financial applications is still outstanding.

  2. The second major topic concerning AI application is bias risk, which is mostly determined by the discipline of data curation in the early stage of the AI value chain. Various metrics, such as the statistical parity difference, the equal opportunity difference or the Theil index, are useful for measuring bias along certain dimensions. De-biasing can be achieved either by (a) simple means, such as weighting training data points differently to ensure fairness before classification, or by (b) more complex approaches, such as training a classifier that maximizes performance while simultaneously minimizing the ability to recover any protected variable (e.g. gender) from its predictions (see the third sketch after this list).
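
The first sketch illustrates SHAP on a tree model, using the shap and xgboost Python packages; the data, model and feature count are synthetic and purely illustrative.

```python
# Sketch 1: assigning Shapley-value credit to the features of a credit model.
# Everything below is synthetic; shap's TreeExplainer computes exact Shapley
# values efficiently for tree ensembles such as XGBoost.
import numpy as np
import shap
import xgboost

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = (X[:, 0] - X[:, 2] + rng.normal(scale=0.3, size=500) > 0).astype(int)

model = xgboost.XGBClassifier(n_estimators=50).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:1])   # one applicant's prediction
print(shap_values)  # additive per-feature contributions to the prediction
```

Note that these contributions live in the model’s raw margin space rather than in the final score space, which is exactly the mismatch discussed above.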
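
The second sketch fits a local surrogate with LIME, using the lime Python package; data, model and feature names are invented for illustration.

```python
# Sketch 2: explaining one prediction with a local, interpretable surrogate.
import numpy as np
from lime.lime_tabular import LimeTabularExplainer
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
model = RandomForestClassifier(random_state=0).fit(X, y)   # the "black box"

explainer = LimeTabularExplainer(
    X,
    feature_names=["income", "leverage", "tenure"],   # hypothetical features
    class_names=["default", "repay"],
    discretize_continuous=True,
)
# LIME perturbs the instance, queries the black box and fits a weighted
# linear model around the point of interest.
explanation = explainer.explain_instance(X[0], model.predict_proba, num_features=3)
print(explanation.as_list())   # local feature weights of the surrogate
```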
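
The third sketch, for point 2, computes the statistical parity difference by hand and shows the reweighting idea (a), in the spirit of Kamiran and Calders’ reweighing scheme; all decisions and group labels are illustrative assumptions.

```python
# Sketch 3: measuring and reducing bias along a protected attribute.
# All decisions and group labels below are invented for illustration.
import numpy as np

y_pred = np.array([1, 0, 1, 1, 1, 1, 0, 0, 1, 0])   # 1 = loan approved
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 1, 0, 1])   # observed outcomes
group  = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])   # 1 = unprivileged group

# Statistical parity difference:
#   P(approve | unprivileged) - P(approve | privileged); 0 means parity.
spd = y_pred[group == 1].mean() - y_pred[group == 0].mean()
print(f"Statistical parity difference: {spd:+.2f}")

# De-biasing by reweighting (idea (a)): weight each (group, label) cell by
# P(group) * P(label) / P(group, label), so that group and label become
# independent in the weighted training data.
weights = np.empty(len(y_true), dtype=float)
for g in (0, 1):
    for t in (0, 1):
        cell = (group == g) & (y_true == t)
        weights[cell] = (group == g).mean() * (y_true == t).mean() / cell.mean()
print(weights)   # pass as sample_weight to most scikit-learn classifiers
```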

Besides the technological means for algorithm assurance and risk mitigation, the ethical design and application of AI is a major concern in banks today. This need is addressed by frameworks such as Algo.Rules, which formulate principles for the design of algorithmic systems. With approaches for interpretation, de-biasing and ethical AI design, key elements for algorithm assurance are or will soon be in place. Finally, is it also possible to monitor and certify the application of this technology and these rules? We think it is. For GDPR, audit and certification solutions are already on the market, and AI applications require similar approaches to certification. Robust and transparent algorithm assurance is on the horizon.