Abstract: To address low classification accuracy and high computational cost on qualitative (categorical) data, a categorical variable recognition method is proposed that improves class separability by combining traditional classifiers with different mapping techniques. The initial features (categorical attributes) are mapped to a real-valued space using the chi-square (C-S) distance as the dissimilarity measure, which increases the dimension of the feature space and improves class separability. The t-distributed stochastic neighbor embedding algorithm (t-SNE) is then used to reduce the data to two or three features, which shortens the computation time of the learning method. Experiments on common categorical data sets show that C-S mapping combined with t-SNE maintains recognition accuracy while greatly reducing the computational cost of the recognition task. Moreover, when only C-S mapping is applied to a data set, class separability is enhanced, which significantly improves the performance of the learning algorithm.
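The mapping step can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the code below assumes one common reading: each sample is one-hot encoded and represented by its chi-square distances to a set of prototype samples (here, the data set itself), with each one-hot column weighted by the inverse of its relative frequency. The function name `chi_square_map`, the toy data, and the choice to skip categories that never occur among the prototypes are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import sqrt


def chi_square_map(samples, prototypes=None):
    """Map categorical samples to real vectors of chi-square distances.

    Assumption: each sample is treated as a one-hot row profile, and the
    distance weights each one-hot column by the inverse of its mass
    (relative frequency among the prototypes).
    """
    prototypes = samples if prototypes is None else prototypes
    n, m = len(prototypes), len(prototypes[0])
    # Count how often each (attribute index, category) pair occurs.
    counts = Counter((j, row[j]) for row in prototypes for j in range(m))

    def dist(x, y):
        total = 0.0
        for j in range(m):
            if x[j] == y[j]:
                continue  # identical categories contribute nothing
            # A mismatch differs in two one-hot columns; each row profile
            # entry equals 1/m, so each column adds (1/m)^2 / column mass.
            for value in (x[j], y[j]):
                mass = counts.get((j, value), 0) / (n * m)
                if mass > 0:  # unseen categories are skipped (assumption)
                    total += (1.0 / m) ** 2 / mass
        return sqrt(total)

    return [[dist(x, p) for p in prototypes] for x in samples]


# Toy data: 4 samples with 2 categorical attributes each.
data = [("red", "small"), ("red", "large"),
        ("blue", "small"), ("blue", "large")]
mapped = chi_square_map(data)  # 4 samples, now 4 real-valued features each
```

In this toy case, two categorical attributes become four real-valued features, which is the kind of dimension increase the abstract describes. For the reduction step, a matrix like `mapped` could be passed to an off-the-shelf t-SNE implementation such as scikit-learn's `TSNE(n_components=2)` before a conventional classifier is trained; that pairing is an illustration, not the authors' stated setup.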