IT News & Events

News about IT at Indiana University and the world


IU Scalable Compute Archive’s march toward sustainable solutions

Given the ubiquity of computers and their attendant programs, interfaces, and apps, chances are we’ve all experienced that comparatively trivial, but still poignant, sense of loss when we find out that our favorite software is no longer supported. For some, the problem might be as simple as learning a new set of commands, but for some scientists, the loss can be far more profound. The Indiana University Scalable Compute Archive (SCA) group within UITS Research Technologies, as part of its mission to enable scientific work, wants to forestall this frustration by keeping mature but aging software running with sustainable methods and modern best practices.

The SCA team develops and maintains streamlined web portals that let researchers easily and reliably access data archives and custom scientific workflows without direct knowledge of the technical aspects of these processes. Large-scale compute and storage solutions are abstracted behind a point-and-click web interface, so that researchers can simply “get the science done” without additional overhead or training. As with all things, such solutions age as new technology springs up to replace the old, and legacy codebases become increasingly unwieldy to maintain. SCA embraces secure and sustainable solutions so that important tools will remain accessible for years to come.

Arvind Gopu, Manager of the Scalable Compute Archive group in RT

One of SCA’s flagship systems, the One Degree Imager-Portal, Pipeline, and Archive (ODI-PPA), is a complex collection of applications designed to provide astronomers with a single point of access to the data produced by the One Degree Imager (a very powerful camera attached to a very powerful telescope), as well as processing pipelines and visualization tools. While the technology behind this suite was once state-of-the-art, parts of the codebase have since been superseded by more modern tools and practices, which increases the effort needed to monitor and maintain the service on a standard deployment. To keep a large codebase that relies on legacy components maintainable, SCA has leveraged Docker containers (standard units of software that package everything an application needs to run, from code to settings) to pack each component into its own container image.


The WIYN One Degree Imager. Photo courtesy of Michael L. Weasner, Copyright ©2017

But it’s not enough to simply ease the deployment and configuration of this kind of software stack; any web-facing application raises important security concerns, especially one built on a legacy codebase. SCA’s solution uses virtual networks within Docker and funnels web traffic through a secure Nginx proxy, which reduces the attack surface and isolates sensitive components from the outside world. The SCA team is also keen to take advantage of advancements in hardware and virtualization, in addition to modernizing the software stack. Jetstream, the National Science Foundation's first production cloud system, provides cloud-based, on-demand computing and data analysis resources, and offers a pathway to tackle the issue of insufficient hardware refreshment funds.
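For readers curious about what that network isolation amounts to, here is a minimal Python sketch. It is only a toy model, not SCA's configuration: the service names (`nginx-proxy`, `portal`, `database`), the network names (`frontend`, `backend`), and the port number are all assumptions made up for illustration. The idea it shows is that only the proxy publishes a port to the outside world, while the other containers are reachable solely over a shared internal network.

```python
# Toy model of a proxied, network-isolated container layout.
# All service names, network names, and ports are hypothetical;
# none of them come from the actual ODI-PPA deployment.
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    networks: frozenset          # virtual networks the container joins
    published_ports: tuple = ()  # host ports exposed to the outside world


SERVICES = [
    Service("nginx-proxy", frozenset({"frontend", "backend"}), (443,)),
    Service("portal", frozenset({"backend"})),
    Service("database", frozenset({"backend"})),
]


def externally_exposed(services):
    """Return the names of services that publish a port on the host."""
    return [s.name for s in services if s.published_ports]


def reachable_from(source, services):
    """Return the names of services sharing a network with `source`."""
    return [
        s.name
        for s in services
        if s.name != source.name and s.networks & source.networks
    ]


if __name__ == "__main__":
    proxy = SERVICES[0]
    print("Exposed to the internet:", externally_exposed(SERVICES))
    print("Reachable from the proxy:", reachable_from(proxy, SERVICES))
```

In this model only the proxy appears in the exposed list, so an outside client has exactly one way in, and everything behind it can be reached only through the internal network.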

For more information, check out the conference poster and associated paper titled “Toward sustainable deployment of distributed services on the cloud: dockerized ODI-PPA on Jetstream” by Raymond Perigo, Arvind Gopu, et al.