# ASE2021 PyExplainer Live-Demo

Explaining instances with PyExplainer is simple. First, we need to load the necessary libraries and prepare the datasets.

## Load Data and Prepare Datasets

# Import for Load Data
from os import listdir
from os.path import isfile, join
import pandas as pd

# Import for Split Data into Training and Testing Samples
from sklearn.model_selection import train_test_split

train_dataset = pd.read_csv("../../datasets/lucene-2.9.0.csv", index_col='File')
test_dataset = pd.read_csv("../../datasets/lucene-3.0.0.csv", index_col='File')

outcome = 'RealBug'
features = ['OWN_COMMIT', 'CountClassCoupled', 'AvgLine', 'RatioCommentToCode']
# OWN_COMMIT - code ownership of the file
# CountClassCoupled - # of classes that interact or couple with the class of interest
# AvgLine - average # of lines of code
# RatioCommentToCode - the ratio of lines of comments to lines of code

# Encode the outcome labels as 0 and 1
train_dataset[outcome] = pd.Categorical(train_dataset[outcome])
train_dataset[outcome] = train_dataset[outcome].cat.codes

test_dataset[outcome] = pd.Categorical(test_dataset[outcome])
test_dataset[outcome] = test_dataset[outcome].cat.codes

X_train = train_dataset.loc[:, features]
X_test = test_dataset.loc[:, features]

y_train = train_dataset.loc[:, outcome]
y_test = test_dataset.loc[:, outcome]


class_labels = ['Clean', 'Defective']

X_train.columns = features
X_test.columns = features
training_data = pd.concat([X_train, y_train], axis=1)
testing_data = pd.concat([X_test, y_test], axis=1)
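
Before fitting a model, it can be useful to quickly inspect the prepared data. The optional sanity check below (not part of the original demo) prints the dataset shapes and the class balance of the outcome:

# Optional sanity check: dataset shapes and class balance of the outcome
print(training_data.shape, testing_data.shape)
print(y_train.value_counts(normalize=True))
training_data.head()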

Then, we construct a Random Forests model as the predictive (black-box) model to be explained.

(1) Please construct a Random Forests model using the code cell below.

Tips


from sklearn.ensemble import RandomForestClassifier

rf_model = RandomForestClassifier(random_state=0)
rf_model.fit(X_train, y_train)

from sklearn.ensemble import RandomForestClassifier

# Please fit your Random Forests model here!
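
Once the model has been fitted, an optional sanity check of its predictive performance on the testing data can be helpful before explaining its predictions. A minimal sketch, assuming rf_model has been fitted as shown in the tip above:

# Optional sanity check (assumes rf_model is already fitted on X_train and y_train)
from sklearn.metrics import roc_auc_score

pred_prob = rf_model.predict_proba(X_test)[:, 1]  # probability of the defective class
print('Test AUC:', roc_auc_score(y_test, pred_prob))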

## PyExplainer

PyExplainer [PTJ+21] is a rule-based model-agnostic technique that utilises a local rule-based regression model to learn the associations between the characteristics of the synthetic instances and the predictions from the black-box model. Given a black-box model and an instance to explain, PyExplainer performs four key steps to generate an instance explanation as follows:

  • First, PyExplainer generates synthetic neighbors around the instance to be explained using the crossover and mutation techniques (a conceptual sketch of this idea follows the list below)

  • Second, PyExplainer obtains the predictions of the synthetic neighbors from the black-box model

  • Third, PyExplainer builds a local rule-based regression model

  • Finally, PyExplainer generates an explanation from the local model for the instance to be explained
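
To build intuition for the first step, the sketch below illustrates the general idea of interpolation-based neighbor generation: synthetic points are sampled on the line segments between the instance to be explained and randomly selected training instances. This is only a conceptual illustration (the sketch_synthetic_neighbors helper is hypothetical), not PyExplainer's actual implementation.

import numpy as np
import pandas as pd

def sketch_synthetic_neighbors(x_explain, X_train, n_samples=100, random_state=0):
    """Hypothetical helper: interpolate between the instance to be explained and
    randomly chosen training instances to create synthetic neighbors (conceptual only)."""
    rng = np.random.default_rng(random_state)
    parents = X_train.to_numpy()[rng.integers(0, len(X_train), size=n_samples)]
    alphas = rng.uniform(0, 1, size=(n_samples, 1))  # interpolation weights
    neighbors = alphas * x_explain.to_numpy() + (1 - alphas) * parents
    return pd.DataFrame(neighbors, columns=X_train.columns)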

Tips


import numpy as np
from pyexplainer.pyexplainer_pyexplainer import PyExplainer

np.random.seed(0)

# PyExplainer Step 1 - Construct a PyExplainer object
pyexp = PyExplainer(X_train = X_train,
                    y_train = y_train,
                    indep = X_train.columns,
                    dep = outcome,
                    blackbox_model = rf_model)

# PyExplainer Step 2 - Generate the rule-based explanation of an instance to be explained
exp_obj = pyexp.explain(X_explain = X_test.loc[file_to_be_explained,:].to_frame().transpose(),
                        y_explain = pd.Series(bool(y_test.loc[file_to_be_explained]),
                                              index = [file_to_be_explained],
                                              name = outcome),
                        search_function = 'crossoverinterpolation',
                        max_iter=1000,
                        max_rules=20,
                        random_state=0,
                        reuse_local_model=True)

# Print rule 
exp_obj['top_k_positive_rules'][:1]

# Please use the code below to visualise the generated PyExplainer explanation (What-If interactive visualisation).
pyexp.visualise(exp_obj, title="Why is this file predicted as defect-introducing?")

# Import for PyExplainer
from pyexplainer.pyexplainer_pyexplainer import PyExplainer

file_to_be_explained = 'src/java/org/apache/lucene/index/DocumentsWriter.java'

# PyExplainer Step 1 - Construct a PyExplainer object


# PyExplainer Step 2 - Generate the rule-based explanation of an instance to be explained


# Print rule 


# Please use the code below to visualise the generated PyExplainer explanation (What-If interactive visualisation).