The Hierarchical Schur Complement method (HSC), and the HSC-extension, have significantly accelerated the evaluation of the retarded Green's function, particularly the lesser Green's function, for two-dimensional nanoscale devices. In this work, the HSC-extension is applied to determine the solution of non-equilibrium Green's functions (NEGF) on three-dimensional nanoscale devices. The operation count for the HSC-extension is analyzed for a cuboid device. When a cubic device is discretized with $N \times N \times N$ grid points, the state-of-the-art Recursive Green Function (RGF) algorithm takes $\mathcal{O}(N^7)$ operations, whereas the HSC-extension only requires $\mathcal{O}(N^6)$ operations. %Realistic operation counts also depend on the system dimensions in $xyz$-directions and the form of contact self-energy matrix. Operation counts and runtimes are also studied for three-dimensional nanoscale devices of practical interest: a graphene-boron\- nitride-graphene multilayer system, a silicon nanowire, and a DNA molecule. The numerical experiments indicate that the cost for the HSC-extension is proportional to the solution of one linear system (or one LU-factorization) and that the runtime speed-ups over RGF exceed three orders of magnitude when simulating realistic devices, such as a graphene-boron nitride-graphene multilayer system with 40,000 atoms.