Abstract: High-resolution remote sensing images have been widely used in practical applications, so research on semantic segmentation methods for high-resolution images has important practical value. Recently, remote sensing image annotation methods based on deep convolutional networks have shown better performance than traditional methods. However, context acquisition based on a fixed receptive-field size does not explicitly exploit inter-pixel constraints, leading to inconsistent semantic labels within a single object. Based on the assumption that pixels in the same region have a higher probability of belonging to the same category, this paper introduces a consistency constraint on semantic labels within image regions to improve the ability of existing deep convolutional neural networks to describe context information. On top of an existing fully convolutional network (FCN) model, a loss function representing the consistency of pixel features within the same region is defined using the features of the last CNN layer; this loss is combined with the softmax loss, and the network parameters are obtained by joint training. The proposed method is validated on the Vaihingen 2D semantic labeling dataset of the ISPRS (International Society for Photogrammetry and Remote Sensing). Experimental results show that the proposed method achieves better classification results than existing convolutional neural network models in most categories, with an overall accuracy of 85.18%. The proposed FCN model with intra-region pixel labeling consistency effectively captures the context information of pixel consistency within a region, rectifies classification conflicts in traditional FCNs, and yields more consistent classification results, thereby improving the image annotation quality.
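To make the joint objective concrete, the following is a minimal numpy sketch of one plausible form of the combined loss described above: a standard softmax cross-entropy term plus an intra-region consistency term that penalizes the variance of last-layer features within each region. The function names, the variance-based form of the consistency term, and the weighting factor `lam` are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean softmax cross-entropy. logits: (N, C), labels: (N,) integer classes."""
    z = logits - logits.max(axis=1, keepdims=True)          # stabilize exponentials
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def region_consistency_loss(features, regions):
    """Penalize feature spread within each region (an assumed variance form).

    features: (N, D) last-layer CNN features, one row per pixel.
    regions:  (N,) region id per pixel, e.g. from an over-segmentation.
    """
    region_ids = np.unique(regions)
    loss = 0.0
    for r in region_ids:
        f = features[regions == r]
        # mean squared distance of each pixel's feature to the region mean
        loss += ((f - f.mean(axis=0)) ** 2).sum(axis=1).mean()
    return loss / len(region_ids)

def joint_loss(logits, labels, features, regions, lam=0.1):
    """Combined objective used for joint training; lam is an assumed weight."""
    return softmax_cross_entropy(logits, labels) + lam * region_consistency_loss(features, regions)
```

Pixels whose features already agree within their region contribute nothing to the consistency term, so the gradient of this term pushes features in the same region toward each other, which is the intuition behind the consistency constraint.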