Until now, computing permutation importance required the ELI5 library ([ELI5 official documentation](https://eli5.readthedocs.io/en/latest/blackbox/permutation_importance.html)). (ELI5 stands for "Explain Like I'm 5," i.e. explain it to me like a 5-year-old.) Recently, Permutation Importance was implemented in Scikit-Learn 0.22. Until now I had no way to tell how much each feature contributed after fitting a support vector machine, but with Permutation Importance I can now see which features are important.
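For comparison, the ELI5 version of the same calculation looks roughly like this (a sketch only; `model`, `X_test`, and `y_test` stand in for whatever fitted estimator and data you have):

```python
# Permutation importance the old way, via ELI5
import eli5
from eli5.sklearn import PermutationImportance

# model, X_test, y_test are placeholders for your fitted estimator and data
perm = PermutationImportance(model, random_state=0).fit(X_test, y_test)
eli5.show_weights(perm)  # renders an importance table in a notebook
```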
To put it simply, Permutation Importance picks one feature at a time and shuffles its values so that the feature becomes meaningless. The model's accuracy on that shuffled data is then compared with its accuracy on the intact data, and the size of the drop tells you how much the selected feature affects the result.
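The idea is simple enough to write by hand. Here is a minimal sketch of that shuffle-and-compare loop (assuming a fitted estimator `model` with a `score` method and NumPy arrays `X` and `y`; the names are placeholders, not part of scikit-learn):

```python
import numpy as np

def single_feature_importance(model, X, y, col, n_repeats=10, seed=0):
    """Importance of one feature = drop in score after shuffling that column."""
    rng = np.random.default_rng(seed)
    baseline = model.score(X, y)  # score on the intact data
    drops = []
    for _ in range(n_repeats):
        X_perm = X.copy()
        # Destroy only this feature's information by permuting its values
        X_perm[:, col] = rng.permutation(X_perm[:, col])
        drops.append(baseline - model.score(X_perm, y))
    return np.mean(drops), np.std(drops)
```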
It was pretty easy to calculate.
Import permutation_importance from sklearn.inspection.
All I had to do was call permutation_importance with the estimator `optimised_regr` (a support vector machine whose parameters I had tuned with Optuna) and the dataset as arguments.
```python
# From here, sklearn's permutation_importance
# (optimised_regr, X_test_std, y_test, and boston come from the earlier sections)
import pandas as pd
from sklearn.inspection import permutation_importance

result = permutation_importance(optimised_regr, X_test_std, y_test,
                                n_repeats=10, n_jobs=-1, random_state=0)

# Put the result in a pandas DataFrame and display it
df = pd.DataFrame([boston.feature_names, result.importances_mean, result.importances_std],
                  index=['Feature', 'mean', 'std']).T
df_s = df.sort_values('mean', ascending=False)
print(df_s)
```
I loaded the result into pandas and made a table.
| | Feature | mean | std |
|---|---|---|---|
| 5 | RM | 0.466147 | 0.066557 |
| 12 | LSTAT | 0.259455 | 0.0525053 |
| 8 | RAD | 0.141846 | 0.0203266 |
| 9 | TAX | 0.113393 | 0.0176602 |
| 7 | DIS | 0.0738827 | 0.0178893 |
| 10 | PTRATIO | 0.0643727 | 0.0205021 |
| 6 | AGE | 0.0587429 | 0.010226 |
| 4 | NOX | 0.0521941 | 0.0235265 |
| 2 | INDUS | 0.0425453 | 0.0185133 |
| 0 | CRIM | 0.0258689 | 0.00711088 |
| 11 | B | 0.017638 | 0.00689625 |
| 3 | CHAS | 0.0140639 | 0.00568843 |
| 1 | ZN | 0.00434593 | 0.00582095 |
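If you prefer a picture to a table, the per-repeat importances in `result.importances` (shape `(n_features, n_repeats)`) plot nicely as a boxplot. A rough sketch with matplotlib:

```python
import matplotlib.pyplot as plt

# Sort features by mean importance so the plot reads from least to most important
sorted_idx = result.importances_mean.argsort()
plt.boxplot(result.importances[sorted_idx].T,
            vert=False, labels=boston.feature_names[sorted_idx])
plt.xlabel('Decrease in score')
plt.title('Permutation importance (test set)')
plt.tight_layout()
plt.show()
```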
Until now, there was no way to know which features influenced a calculation done with a support vector machine, but now that permutation_importance has been implemented, it is possible to see which features matter.