Deep neural networks have enabled super-resolution of fluid data, successfully expanding it from 2D to 3D. However, resolving the temporal incoherence between the super-resolved frames is non-trivial. In this paper, we introduce a new frame-interpolation method for smoke simulation based on a conditional generative adversarial network. Our model generates several intermediate frames between two consecutive original frames to remove this incoherence. Specifically, we design a new generator that consists of residual blocks and a U-Net architecture. The residual blocks enable the generator to accurately recover high-resolution volumetric data from the down-sampled input. We then feed the two recovered frames and their corresponding velocity fields into the U-Net, which warps and linearly fuses them to generate the intermediate frames. Additionally, we design our temporal discriminator with a slow-fusion model, which allows the adversarial network to merge a series of consecutive frames progressively, step by step. Experiments demonstrate that our model produces high-quality intermediate frames for smoke simulation, efficiently removing the incoherence from the original fluid data.
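To make the warp-and-fuse step concrete, the following is a minimal NumPy sketch of velocity-guided frame interpolation. It is an illustration only, not the paper's network: it uses a 2D scalar field (the method operates on 3D volumes), nearest-neighbour semi-Lagrangian warping instead of learned warping, and the helper names (`warp`, `interpolate`) are hypothetical.

```python
import numpy as np

def warp(field, vel, dt):
    """Semi-Lagrangian backward warp of a 2D scalar field by vel * dt.

    field: (H, W) density; vel: (H, W, 2) velocity in grid units per step.
    Illustrative helper; a learned model would replace this warping.
    """
    H, W = field.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    # Trace back along the velocity to find each sample's source position.
    src_y = np.clip(ys - dt * vel[..., 0], 0, H - 1)
    src_x = np.clip(xs - dt * vel[..., 1], 0, W - 1)
    # Nearest-neighbour sampling keeps the sketch short; production code
    # would use (tri)linear interpolation.
    return field[src_y.round().astype(int), src_x.round().astype(int)]

def interpolate(frame0, frame1, vel0, vel1, t):
    """Warp both endpoint frames toward time t, then blend linearly."""
    fwd = warp(frame0, vel0, t)         # frame0 advected forward by t
    bwd = warp(frame1, vel1, -(1 - t))  # frame1 advected backward by 1 - t
    return (1 - t) * fwd + t * bwd
```

In the paper, the U-Net learns this warping and fusion jointly from the two recovered frames and their velocity fields, rather than applying the fixed advection-plus-linear-blend rule above.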