Digital Communications and Networks
Nowadays short texts can be widely found in various social data in relation to the 5G-enabled Internet of Things (IoT). Short text classification is a challenging task due to its sparsity and the lack of context. Previous studies mainly tackle these problems by enhancing the semantic information or the statistical information individually. However, the improvement achieved by a single type of information is limited, while fusing various information may help to improve the classification accuracy more effectively. To fuse various information for short text classification, this article proposes a feature fusion method that integrates the statistical feature and the comprehensive semantic feature together by using the weighting mechanism and deep learning models. In the proposed method, we apply Bidirectional Encoder Representations from Transformers (BERT) to generate word vectors on the sentence level automatically, and then obtain the statistical feature, the local semantic feature and the overall semantic feature using Term Frequency-Inverse Document Frequency (TF-IDF) weighting approach, Convolutional Neural Network (CNN) and Bidirectional Gate Recurrent Unit (BiGRU). Then, the fusion feature is accordingly obtained for classification. Experiments are conducted on five popular short text classification datasets and a 5G-enabled IoT social dataset and the results show that our proposed method effectively improves the classification performance.
Luo, Xiong; Yu, Zhijian; Zhao, Zhigang; Zhao, Wenbing; and Wang, Jenq-Haur, "Effective Short Text Classification via the Fusion of Hybrid Features for IoT Social Data" (2022). Electrical Engineering & Computer Science Faculty Publications. 516.
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.