Abstract: To address the out-of-vocabulary (OOV) problem in Chinese natural language processing, researchers often exploit fine-grained features of Chinese characters, such as strokes, radicals, and Pinyin, to improve a model's learning ability. To find the best combination of these features, this paper statistically analyzes the syllable, first stroke, radical, tone, word frequency, stroke count, and other features of Chinese characters, and proposes a cross-quadrant mnemonic mapping model that integrates multiple character features. The model automatically provides a reversible mapping among Chinese characters, words, and code sequences composed of the 26 Latin letters. In text classification experiments with character-level models, it achieves good results. In addition, the codes are moderate in length and remain readable. The model can be used for text annotation in special settings and can also provide an equal amount of parallel corpus data for Chinese text. It is therefore a useful auxiliary model for natural language processing.
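As a rough illustration of what a reversible mapping between Chinese characters and Latin-letter codes involves, the following minimal sketch uses a tiny hand-written codebook. It is not the cross-quadrant mnemonic mapping model itself: the codebook entries, the code layout (Pinyin syllable, then a tone letter, then one extra feature letter), and the names `CODEBOOK`, `build_inverse`, `encode`, and `decode` are all hypothetical and chosen only for this example.

```python
# Hypothetical toy codebook (an assumption, not the paper's actual codes).
# Layout assumed here: Pinyin syllable + tone letter (a-d for tones 1-4)
# + one arbitrary letter standing in for a further feature such as the radical.
CODEBOOK = {
    "中": "zhongaq",
    "文": "wenbk",
    "字": "zidm",
}


def build_inverse(codebook):
    """Build the code -> character table, checking that the mapping is reversible."""
    inverse = {}
    for char, code in codebook.items():
        # Codes must consist only of the 26 lowercase Latin letters.
        if not (code.isascii() and code.isalpha() and code.islower()):
            raise ValueError(f"code {code!r} is not made of lowercase Latin letters")
        # Two characters sharing one code would make decoding ambiguous.
        if code in inverse:
            raise ValueError(f"code {code!r} is assigned to more than one character")
        inverse[code] = char
    return inverse


def encode(text, codebook):
    """Map each character to its code; codes are separated by single spaces."""
    return " ".join(codebook[ch] for ch in text)


def decode(coded, inverse):
    """Map a space-separated code sequence back to the original characters."""
    return "".join(inverse[code] for code in coded.split())


INVERSE = build_inverse(CODEBOOK)
coded = encode("中文字", CODEBOOK)
restored = decode(coded, INVERSE)
```

The uniqueness check in `build_inverse` is what makes the round trip lossless in this sketch: as long as no two characters share a code, `decode(encode(text))` returns the original text for any text covered by the codebook.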