5.6 Other Interpretable Models
The list of interpretable models is ever-growing and of unknown size. It includes simple models such as linear models, decision trees and naive Bayes, but also more complex models that combine or modify non-interpretable machine learning models to make them interpretable. Publications on the latter type of model in particular appear so frequently that it is hard to keep up with the developments. We only tease a few additional ones in this chapter, focusing on the simpler and more established candidates.
5.6.1 Naive Bayes Classifier
The naive Bayes classifier makes use of Bayes' theorem of conditional probabilities. For each feature, it computes the probability of a class given the value of that feature. The naive Bayes classifier does so for each feature independently, which amounts to a strong (= naive) assumption that the features are independent. Naive Bayes is a conditional probability model and models the probability of a class \(C_k\) in the following way:
\[P(C_k|x)=\frac{1}{Z}P(C_k)\prod_{i=1}^{n}P(x_i|C_k)\]
The term \(Z\) is a scaling parameter that ensures that the probabilities over all classes sum to 1. The probability of a class, given the features, is the class prior probability times the product of the probabilities of each feature given that class, normalized by \(Z\). This formula can be derived using Bayes' theorem.
Naive Bayes is an interpretable model because of the independence assumption. It is interpretable on the modular level: for each classification, it is very clear how much each feature contributes towards a certain class prediction.
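To make the formula concrete, here is a minimal from-scratch sketch in Python. The toy data, the function name `naive_bayes_posterior` and the use of Laplace smoothing are assumptions for illustration, not part of the text above; the sketch assumes binary features. Each feature contributes one independent factor \(P(x_i|C_k)\) to the product, which is exactly what makes the per-feature contributions easy to read off:

```python
import numpy as np

# Hypothetical toy data: two binary features, binary class (made up for illustration)
X = np.array([[1, 0], [1, 1], [0, 1], [0, 0], [1, 1], [0, 1]])
y = np.array([1, 1, 0, 0, 1, 0])

def naive_bayes_posterior(x, X, y, alpha=1.0):
    """Compute P(C_k | x) for each class, with Laplace smoothing alpha.
    Assumes binary features (hence 2 * alpha in the denominator)."""
    classes = np.unique(y)
    unnormalized = []
    for c in classes:
        Xc = X[y == c]
        prior = len(Xc) / len(X)  # P(C_k)
        # One independent factor P(x_i | C_k) per feature
        likelihoods = [
            (np.sum(Xc[:, i] == x[i]) + alpha) / (len(Xc) + 2 * alpha)
            for i in range(len(x))
        ]
        unnormalized.append(prior * np.prod(likelihoods))
    Z = sum(unnormalized)  # the scaling term Z from the formula
    return classes, [p / Z for p in unnormalized]

classes, posterior = naive_bayes_posterior(np.array([1, 1]), X, y)
print(dict(zip(classes, posterior)))  # e.g. {0: 0.2, 1: 0.8}
```

Inspecting the individual entries of `likelihoods` for a class shows how much each feature pulls the prediction towards that class.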
5.6.2 K-Nearest Neighbours
The k-nearest neighbour method can be used for regression and classification and uses the closest neighbours of a data point for prediction. For classification, it assigns the most common class among the \(k\) closest neighbours of an instance; for regression, it takes the average of the outcomes of the neighbours. The tricky parts are finding the right \(k\) and deciding how to measure the distance between instances, which ultimately defines the neighbourhood.
This algorithm differs from the other interpretable models presented in this book, since it is an instance-based learning algorithm. How is k-nearest neighbours interpretable? For starters, it has no parameters to learn, so there is no interpretability on a modular level, as in linear models. It also lacks global model interpretability, since the model is inherently local and no global weights or structures are learned explicitly by the k-nearest neighbour method. Maybe it is interpretable on a local level? To explain a prediction, you can always retrieve the k neighbours that were used for the prediction. Whether the method is interpretable depends solely on whether you can 'interpret' a single instance in the dataset. If the dataset consists of hundreds or thousands of features, then it is not interpretable, I'd argue. But if you have few features or a way to reduce your instance to the most important features, presenting the k nearest neighbours can give you good explanations.
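As a minimal sketch of this kind of local explanation, the following Python snippet classifies a query point and returns the k neighbours that produced the vote. The toy data, the choice of Euclidean distance and the helper name `knn_explain` are all assumptions for illustration:

```python
import numpy as np

# Hypothetical toy data; any other distance function could be substituted
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.9, 1.1], [5.0, 5.0], [5.1, 4.9]])
y = np.array([0, 1, 1, 0, 0])

def knn_explain(x_query, X, y, k=3):
    """Classify x_query by majority vote and return the k neighbours
    that were used, which double as the local explanation."""
    distances = np.linalg.norm(X - x_query, axis=1)  # Euclidean distance
    neighbour_idx = np.argsort(distances)[:k]        # indices of the k closest
    votes = y[neighbour_idx]
    prediction = np.bincount(votes).argmax()         # majority class
    return prediction, X[neighbour_idx], votes

pred, neighbours, votes = knn_explain(np.array([1.0, 0.9]), X, y)
print(pred)        # predicted class
print(neighbours)  # the instances that explain the prediction
```

Presenting `neighbours` (or their most important features) to the user is the local explanation discussed above.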
5.6.3 And so many more …
Many algorithms can produce interpretable models, and not all of them can be listed here. If you are a researcher or just a big fan and user of a certain interpretable method that is not listed here, get in touch with me and add the method to this book!