Conventional autoencoder networks use only cross-sectional supervisory control and data acquisition (SCADA) data when monitoring wind turbine conditions, which gives the network too little information to learn the temporal characteristics of the data. Therefore, a method for monitoring wind turbine gearbox conditions based on a spatiotemporal autoencoder network is proposed. First, a one-dimensional convolutional neural network cascaded with a bidirectional long short-term memory network is used as the encoder layer to extract, in sequence, the spatial and temporal characteristics of the panel data. Second, the reconstruction error of the input is used as the warning index to realize online condition monitoring. Finally, the method is verified using actual data from a wind farm in Hebei Province. The results demonstrate that the spatiotemporal autoencoder network raises an alarm 20 days before the recorded fault time, and that its fault detection rate and number of false alarms are better than those of conventional methods. Analysis of the contribution rate of each component of the reconstruction error shows that the main abnormal parameters of the gearbox fault are oil pressure and oil pool temperature.
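
As an illustration of the warning index and the contribution-rate analysis described above, the following minimal sketch computes a squared reconstruction error per sample and the share of that error attributable to each monitored parameter. It is not the implementation used in this work: the function names `reconstruction_error` and `alarm_threshold`, the squared-error form, and the mean-plus-k-standard-deviations threshold rule are all assumptions made for the example, and the numbers are toy values rather than wind farm data.

```python
import numpy as np


def reconstruction_error(x, x_hat):
    """Return per-sample squared error and per-parameter contribution rates.

    x, x_hat: arrays of shape (n_samples, n_parameters) holding the
    network input and its reconstruction (hypothetical interface).
    """
    sq = (np.asarray(x, dtype=float) - np.asarray(x_hat, dtype=float)) ** 2
    total = sq.sum(axis=1)
    # Avoid division by zero when a sample is reconstructed exactly.
    denom = np.where(total > 0.0, total, 1.0)
    contribution = sq / denom[:, None]
    return total, contribution


def alarm_threshold(healthy_errors, k=3.0):
    """Assumed rule: mean + k * std of the errors observed on healthy data."""
    healthy_errors = np.asarray(healthy_errors, dtype=float)
    return healthy_errors.mean() + k * healthy_errors.std()


# Toy example with three hypothetical parameters (not measured data).
x = np.array([[1.0, 2.0, 3.0]])
x_hat = np.array([[1.0, 1.0, 1.0]])
error, contribution = reconstruction_error(x, x_hat)
```

In this sketch a sample would be flagged when its error exceeds the threshold, and the parameter with the largest contribution rate would be the first candidate for the abnormal variable.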