The MPI_Barrier() call can be crucial for many applications and has been the target of various optimizations for several decades. The best known solution to the barrier problem scales with O(log₂ N) and uses the dissemination principle. This paper demonstrates a new method that uses an enhanced dissemination principle and inherent network parallelism. The new approach speeds up barrier performance by 40% relative to the best published algorithm. It is shown that the inherent hardware parallelism inside the InfiniBand™ network can be leveraged to lower the latency of the MPI_Barrier() operation without additional cost. The principle of sending multiple messages in (pseudo-)parallel can be built into a well-known algorithm to decrease the number of rounds and speed up the overall operation.
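To make the round-count argument concrete, the following Python sketch simulates a dissemination barrier in which each process sends several notifications per round. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the names `dissemination_rounds`, `simulate_barrier`, and `fanout` are hypothetical, and the peer-selection rule (process p notifies p + j·(fanout+1)^r in round r) is one plausible way to generalize the classic scheme. With `fanout=1` it reduces to the standard dissemination barrier.

```python
def dissemination_rounds(num_procs, fanout=1):
    """Rounds needed when each process sends `fanout` messages per round.

    Assumption: the set of arrivals a process knows about grows by a
    factor of (fanout + 1) per round, so we need the smallest k with
    (fanout + 1) ** k >= num_procs.
    """
    rounds, reach = 0, 1
    while reach < num_procs:
        reach *= fanout + 1
        rounds += 1
    return rounds


def simulate_barrier(num_procs, fanout=1):
    """Simulate the rounds; return True if every process has learned
    that every other process reached the barrier."""
    known = [{p} for p in range(num_procs)]  # arrivals known to each process
    step = 1
    for _ in range(dissemination_rounds(num_procs, fanout)):
        # All sends of a round carry the state from the start of that round.
        snapshot = [set(k) for k in known]
        for p in range(num_procs):
            for j in range(1, fanout + 1):
                peer = (p + j * step) % num_procs
                known[peer] |= snapshot[p]
        step *= fanout + 1
    everyone = set(range(num_procs))
    return all(k == everyone for k in known)


if __name__ == "__main__":
    for fanout in (1, 3):
        print(fanout, dissemination_rounds(64, fanout), simulate_barrier(64, fanout))
```

For 64 processes, this model needs 6 rounds with one message per round and 3 rounds with three messages per round. Whether fewer rounds translate into lower latency depends on how cheaply the network can issue the extra messages concurrently, which is the hardware property the abstract refers to.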