Vision navigation is an alternative to the Global Positioning System (GPS) in environments where GPS access is denied. However, the classical feature-based method must extract feature observations from the image as input, and it performs poorly in environments with few features. To address this problem, a fusion of the feature-based method and the direct method is designed: the feature-based method is used in feature-rich regions, while the direct method is used in regions with few features. This fusion of the two methods improves the environmental adaptability of vision navigation. To improve robustness to outliers, the Huber weight function is applied; a nonlinear optimization method is then used to obtain the optimal camera pose. Experimental results demonstrate that the proposed method meets the needs of real-time autonomous navigation.
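To illustrate the robust weighting step mentioned above, the following is a minimal sketch of the standard Huber weight function, which keeps small residuals at full weight and down-weights large (outlier) residuals; the threshold value `delta` here is a hypothetical tuning parameter, not one specified in the text:

```python
import numpy as np

def huber_weight(residuals, delta=1.345):
    """Standard Huber weights for robust least squares:
    w(r) = 1 for |r| <= delta, and delta / |r| otherwise,
    so outlier residuals contribute less to the optimization."""
    r = np.abs(residuals)
    return np.where(r <= delta, 1.0, delta / r)

# Small residuals keep weight 1; a large residual of 5.0
# is down-weighted to delta / 5.0
weights = huber_weight(np.array([0.5, 1.0, 5.0]))
```

In an iteratively reweighted nonlinear optimization, such weights would scale each residual's contribution at every iteration, so spurious feature matches or photometric outliers have limited influence on the estimated camera pose.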