Until now, the go-to library for Permutation Importance was ELI5 ([ELI5 official documentation](https://eli5.readthedocs.io/en/latest/blackbox/permutation_importance.html)). (ELI5 stands for "Explain Like I'm 5".) Recently, Permutation Importance was implemented in scikit-learn 0.22. Until now I could not tell which features contributed after fitting a support vector machine, but with Permutation Importance I can now see which features matter.
To put it simply, Permutation Importance selects one feature and shuffles its values so that they become meaningless. The model's score is then computed on the shuffled data and compared with the score on the intact data, and the size of the drop tells you how much the selected feature affects the model.
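The idea above can be sketched by hand in a few lines (a minimal illustration, not scikit-learn's implementation; `model`, `X`, `y` are whatever fitted estimator and evaluation data you have):

```python
# Minimal sketch of the permutation-importance idea:
# shuffle one column at a time and measure the drop in score.
import numpy as np

def manual_permutation_importance(model, X, y, n_repeats=10, seed=0):
    rng = np.random.default_rng(seed)
    baseline = model.score(X, y)               # score on the intact data
    importances = np.empty((X.shape[1], n_repeats))
    for j in range(X.shape[1]):
        for r in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])  # destroy column j
            importances[j, r] = baseline - model.score(X_perm, y)
    return importances.mean(axis=1), importances.std(axis=1)
```

A feature whose shuffling barely changes the score has little influence on the predictions; a large drop means the model relies on it heavily.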
It was quite easy to compute. All I had to do was import `permutation_importance` from `sklearn.inspection` and pass the instance `optimised_regr`, a support vector machine whose parameters I had optimized with Optuna, together with the dataset as arguments to `permutation_importance`.
```python
# From here, sklearn's permutation_importance
from sklearn.inspection import permutation_importance

result = permutation_importance(optimised_regr, X_test_std, y_test,
                                n_repeats=10, n_jobs=-1, random_state=0)

# Put the result in a pandas DataFrame and display it
df = pd.DataFrame([boston.feature_names, result.importances_mean, result.importances_std],
                  index=['Feature', 'mean', 'std']).T
df_s = df.sort_values('mean', ascending=False)
print(df_s)
```
I loaded the result into pandas and made a table.
| | Feature | mean | std |
|---|---|---|---|
| 5 | RM | 0.466147 | 0.066557 |
| 12 | LSTAT | 0.259455 | 0.0525053 |
| 8 | RAD | 0.141846 | 0.0203266 |
| 9 | TAX | 0.113393 | 0.0176602 |
| 7 | DIS | 0.0738827 | 0.0178893 |
| 10 | PTRATIO | 0.0643727 | 0.0205021 |
| 6 | AGE | 0.0587429 | 0.010226 |
| 4 | NOX | 0.0521941 | 0.0235265 |
| 2 | INDUS | 0.0425453 | 0.0185133 |
| 0 | CRIM | 0.0258689 | 0.00711088 |
| 11 | B | 0.017638 | 0.00689625 |
| 3 | CHAS | 0.0140639 | 0.00568843 |
| 1 | ZN | 0.00434593 | 0.00582095 |
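Beyond a table, the per-repeat importances in `result.importances` can also be plotted. A sketch using matplotlib (built on a small synthetic model here rather than the Boston data, so the names and values are illustrative only):

```python
# Sketch: plot mean permutation importances with their std as error bars.
# Synthetic data/model stand in for the real ones used in this post.
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs headless
import matplotlib.pyplot as plt
import numpy as np
from sklearn.inspection import permutation_importance
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = 2 * X[:, 0] + rng.normal(scale=0.1, size=100)
model = SVR().fit(X, y)

result = permutation_importance(model, X, y, n_repeats=10, random_state=0)

order = result.importances_mean.argsort()      # least to most important
names = np.array(["x0", "x1", "x2"])[order]
fig, ax = plt.subplots()
ax.barh(range(len(order)), result.importances_mean[order],
        xerr=result.importances_std[order])
ax.set_yticks(range(len(order)))
ax.set_yticklabels(names)
ax.set_xlabel("decrease in score")
fig.savefig("importances.png")
```

Replacing the synthetic pieces with `optimised_regr`, `X_test_std`, `y_test`, and `boston.feature_names` gives the same chart for the results in the table above.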
Until now it was not possible to know which features mattered in a model built with a support vector machine, but now that `permutation_importance` has been implemented in scikit-learn, it is possible to see which features affect the predictions.