Creating New Technology for the Data Center or HPC - HPCaaSS (High Performance Compute as a Self Ser

May 30, 2019

We haven’t updated our blog recently because we have been working hard on some new technology. When you look at the efficiency of a data center or HPC (High Performance Compute) center, the numbers are truly awful. In our experience, utilization of the center is usually around 30%. Imagine if you could only use 30% of your car, phone, or desktop. Not very good performance. So why do we settle for this in our centers?

One reason is that infrastructure stacks in data centers are static, while the workloads submitted against them vary widely, so the stacks are rarely a good fit. Recomposing resources into usable stacks can be complex and time consuming, and it is typically beyond the skill set of the user, so most centers keep very static stacks. Reconfiguring a stack to match a workload requires admins and a lot of time, which makes it expensive and impractical.

Network fabric technology is also very expensive and not flexible enough to allow these resources to be rapidly configured and deployed. In many cases the network fabric of the center is the bottleneck for implementing what we call re-composable infrastructure, or Rack Scale Design (RSD). More use cases in AI, ML/DL, and Big Data are demanding this flexibility in the infrastructure so that jobs and workflows run quicker and with higher performance.

Given that, we have been working on two technologies and combining them into a solution that provides this capability and puts it into the hands of end users through an on-demand infrastructure scheduling and provisioning portal. We are calling it HPCaaSS. Intel, Google and many others have been touting the ability to scale out data and HPC centers with RSD, so that pools of resources (compute, storage, memory, GPUs, etc.) can be used more efficiently. Taking idle servers, flash or GPUs and “moving” them to other stacks on demand offers much higher utilization rates and speeds jobs up across these data and HPC centers. Letting users build a custom infrastructure stack that fits their job or workflow, instead of making their job fit the infrastructure, will make HPC centers and data centers more efficient and allow a larger group of users to start consuming these resources.
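As a rough illustration of the pooling idea, here is a minimal Python sketch. It is a toy model under stated assumptions, not the FabreX or CloudShell API: `ResourcePool`, `compose`, `release`, and all of the resource counts are hypothetical names and numbers chosen only to show how drawing stacks from a shared pool can be tracked.

```python
from dataclasses import dataclass, field


@dataclass
class ResourcePool:
    """Toy model of a shared pool of composable resources (hypothetical)."""

    total: dict[str, int]
    in_use: dict[str, int] = field(default_factory=dict)

    def free(self, kind: str) -> int:
        """Number of units of `kind` not currently assigned to any stack."""
        return self.total.get(kind, 0) - self.in_use.get(kind, 0)

    def compose(self, request: dict[str, int]) -> bool:
        """Reserve everything in `request`, or nothing if any part is unavailable."""
        if any(self.free(kind) < count for kind, count in request.items()):
            return False
        for kind, count in request.items():
            self.in_use[kind] = self.in_use.get(kind, 0) + count
        return True

    def release(self, request: dict[str, int]) -> None:
        """Return a previously composed stack to the pool (no validation here)."""
        for kind, count in request.items():
            self.in_use[kind] = self.in_use.get(kind, 0) - count

    def utilization(self) -> float:
        """Crude utilization: units in use divided by total units, all kinds mixed."""
        return sum(self.in_use.values()) / sum(self.total.values())


# Assumed example inventory and job requests.
pool = ResourcePool(total={"server": 8, "gpu": 8, "flash": 4})
training_stack = {"server": 2, "gpu": 6, "flash": 1}
analytics_stack = {"server": 4, "gpu": 0, "flash": 3}

first_ok = pool.compose(training_stack)
second_ok = pool.compose(analytics_stack)
oversized_ok = pool.compose({"gpu": 4})  # rejected: only 2 GPUs are still free

print(first_ok, second_ok, oversized_ok, pool.utilization())
```

Once the training stack is released, its GPUs go back into the pool and the rejected request can be composed. That is the "moving" of idle resources described above, reduced to bookkeeping.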

We have taken the approach of using PCIe as a new network fabric, and it is showing some disruptive innovation effects. GigaIO (a TSI vendor) has a new patented fabric technology called FabreX, based on PCIe, that we have combined with Quali’s CloudShell LaaS environment to produce a self-service portal. The portal lets end users build their own stack from the pool of resources in the center, schedule and reserve it, and deploy their job or workflow all by themselves. This offers a much higher utilization rate, allows users to customize the infrastructure to fit their workload, and allows admins to easily manage the lifecycle of the center. We think this could help your data or HPC center as well. To learn more, reach out to us or click here for our new white paper describing this disruptive innovation. If you would like to view a video of HPCaaSS in action, click here. If this is of interest, we welcome your thoughts about the need and how best to address it. Hope to hear from you soon!