Compiler and Runtime Optimization Techniques for Implementing Scalable Parallel Applications