Deep learning has gained popularity in a variety of computer vision tasks. Recently, it has also been successfully applied for hyperspectral image classification tasks. Training deep neural networks such as a convolutional neural network (CNN) for classification requires a large number of labeled samples. However, in remote sensing applications, we usually only have a small amount of labeled data for training because they are expensive to collect, although we still have abundant unlabeled data. In this paper, we propose semi-supervised deep learning for hyperspectral image classification - our approach uses limited labeled data and abundant unlabeled data to train a deep neural network. More specifically, we use deep convolutional recurrent neural networks (CRNN) for hyperspectral image classification by treating each hyperspectral pixel as a spectral sequence. In the proposed semi-supervised learning framework, the abundant unlabeled data are utilized with their pseudo labels (cluster labels). We propose to use all the training data together with their pseudo labels to pre-train a deep CRNN, and then fine-tune using the limited available labeled data. Further, to utilize spatial information in the hyperspectral images, we propose a constrained Dirichlet process mixture model (C-DPMM), a non-parametric Bayesian clustering algorithm, for semi-supervised clustering which includes pairwise must-link and cannot-link constraints - this produces high-quality pseudo-labels, resulting in improved initialization of the deep neural network. We also derived a variational inference (VI) model for the C-DPMM for efficient inference. Experimental results with real hyperspectral image datasets demonstrate that the proposed semi-supervised method outperforms state-of-the-art supervised and semi-supervised learning methods for hyperspectral classification.