August 3, 2004

CLUSTERED STORAGE: MAXIMIZING THE RENDERING CAPABILITIES OF LINUX CLUSTERS

[Image: The Panasas Storage Shelf.]

With each new feature film release, the impact of advances in computer-generated animation becomes more evident. From full-length animated features like "Finding Nemo" to effects-augmented films such as the epic "Lord of the Rings" trilogy, actors are sharing top billing with computer systems that tirelessly render characters and effects in scene after scene. Colorful underwater landscapes, elaborate monsters, stunning pyrotechnics, and even something as seemingly simple as a character's hair become more realistic with every new release. Since 1995, when Disney/Pixar's "Toy Story" became the first full-length computer-generated feature, Hollywood's production houses have rushed to push the state of the art "to infinity and beyond!" It is the combination of artistic ingenuity, sophisticated computer modeling and animation tools, and increasingly powerful rendering engines running on large Linux clusters that allows artists to infuse hit films like "Shrek 2" and Pixar's upcoming "Incredibles" with creativity and imagination that amaze and entertain us.

These Linux-based render farms deliver affordable, scalable and powerful computing capabilities to handle the technical demands of artists and directors, and to deliver increasingly realistic entertainment products to savvy movie-goers. The continued growth of these clusters, more capable processors, and the drive to develop more realistic models and textures together place an increasing burden on the traditional SAN- and NAS-based storage systems that serve these clusters. Fortunately, a new storage approach is available that combines a cluster file system, a scalable object-storage architecture and commodity components to deliver the same price-performance benefits for storage that Linux clusters deliver for computing.

FOR REAL

Enhanced realism is the ultimate goal for many of these animated and effects-enhanced productions. But enhanced realism requires large model datasets and computationally intense processing, which generate large, high-resolution outputs. Digital artists, using advanced graphics tools such as Alias' Maya, create three-dimensional models of characters, trees, buildings and many other objects that appear in computer-generated scenes. These models are typically created and stored in a shared networked file system, where animators access and modify them to create various scenes. Animators in effect act as "digital directors," placing the various modeled objects within a scene, providing lighting and animation instructions that guide objects' movement, and sending the models and instructions to massive render farms that painstakingly create the scenes a frame at a time. The rendering is computationally intense. Each frame in a scene may require an hour or more of CPU time to complete. Movies are projected at a rate of 24 frames per second, so a ten-second scene contains 240 frames and might require hundreds of CPU hours. It's no wonder then that the larger studios have deployed compute clusters consisting of many hundreds, and in some cases thousands, of "nodes." That ten-second scene, rendered at high resolution, might consume 5 to 10 gigabytes of storage. Multiply that by the number of scenes actively being worked on, by variants in the models or animation instructions (that is, different "takes"), and by the number of active productions in a large animation shop, and you have some serious storage requirements. It is not uncommon for a large animation shop to have hundreds of terabytes online.
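The arithmetic in the paragraph above can be worked through in a few lines of Python. This is only a back-of-the-envelope sketch: the 24 frames per second, the hour of CPU time per frame, and the 5 to 10 gigabytes per scene come from the text, while the node count and the numbers of scenes, takes and productions are hypothetical values chosen purely for illustration.

```python
import math

FRAMES_PER_SECOND = 24  # projection rate quoted in the text


def scene_frames(seconds):
    """Number of frames in a scene of the given length."""
    return seconds * FRAMES_PER_SECOND


def cpu_hours(seconds, hours_per_frame=1.0):
    """Total CPU time, using the text's 'an hour or more' per frame as a floor."""
    return scene_frames(seconds) * hours_per_frame


def wall_clock_hours(seconds, nodes, hours_per_frame=1.0):
    """Elapsed time if each node renders one frame at a time.

    Assumes perfectly parallel rendering with no storage or scheduling
    delays, which is the best case and not a measured figure.
    """
    batches = math.ceil(scene_frames(seconds) / nodes)
    return batches * hours_per_frame


def online_storage_tb(gb_per_scene, scenes, takes, productions):
    """Storage in terabytes (1 TB taken as 1000 GB)."""
    return gb_per_scene * scenes * takes * productions / 1000.0


if __name__ == "__main__":
    print("frames in a 10 s scene:", scene_frames(10))
    print("CPU hours at 1 h/frame:", cpu_hours(10))
    # Hypothetical farm of 100 nodes.
    print("wall-clock hours on 100 nodes:", wall_clock_hours(10, nodes=100))
    # Hypothetical shop: 500 active scenes, 5 takes each, 4 productions.
    print("online storage (TB):", online_storage_tb(10, 500, 5, 4))
```

Even with these modest illustrative inputs the total reaches 100 terabytes, which is in line with the "hundreds of terabytes" the article reports for large shops.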

Scalable Linux compute clusters provide the requisite computational capabilities to meet the demands of the most aggressive render farms. The incremental addition of rack-mount servers creates an on-demand model for increasing a rendering farm's capability. Unfortunately, networked storage systems have not scaled in the same way. Current SAN- and NAS-based storage systems are limited in the number of clients (render nodes) they can support, or in the aggregate I/O operations they can serve.

The result is that artists and rendering nodes often have to wait to get the data they need. To relieve this bottleneck, many organizations are forced to distribute their data across multiple storage systems, perhaps employing a tiered storage model to contain costs. Data must then be moved between distinct "islands of storage" to accomplish routine tasks. This time-consuming overhead detracts from the overall productivity of artists and rendering engines alike, and raises the overall cost of production.

GETTING REAL BEHIND THE SCENES

Enter object-based storage, a new networked storage architecture capable of supporting the most demanding digital animation productions. At the core of this architecture are storage objects, fundamental containers that house both application data and an extensible set of storage attributes. Traditional user and application files (models, textures, and rendered frames) are decomposed into a set of storage objects and distributed across one or more "smart disks," called Object-based Storage Devices (OSDs). Like Linux clusters, which perform processing in parallel, object-based storage clusters spread data across the OSDs and provide parallel data access through a standard file system protocol. It is this parallel access that delivers unprecedented performance and scalability - both in bandwidth (MB/s) and I/O operations (IOPS). Capacity - and accompanying performance - is added to the system through the incremental provisioning of additional OSDs, which each contribute processing, memory, disk, and networking bandwidth. The OSDs form a single managed pool of storage where a production operation's entire collection of assets - models, textures, and generated scenes ("digital dailies") - can be housed. And because OSDs are constructed from commodity computing components (Serial ATA disks and Gigabit Ethernet), they lower the per-terabyte cost of installed storage and reduce overall production expenses.
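To make the idea of spreading a file across OSDs concrete, here is a minimal Python sketch of one possible layout: fixed-size stripe units assigned to OSDs in round-robin order. The article does not describe the actual placement scheme, so the round-robin policy, the function names `locate` and `split_read`, and the 64 KB stripe unit are all assumptions made for illustration.

```python
def locate(offset, stripe_unit, num_osds):
    """Map a file byte offset to (osd_index, offset_within_that_osd's_object).

    Assumes simple round-robin striping: stripe unit 0 goes to OSD 0,
    unit 1 to OSD 1, and so on, wrapping around after the last OSD.
    """
    if stripe_unit <= 0 or num_osds <= 0:
        raise ValueError("stripe_unit and num_osds must be positive")
    unit_index = offset // stripe_unit
    osd = unit_index % num_osds
    # Each full pass over all OSDs adds one stripe unit to every object.
    object_offset = (unit_index // num_osds) * stripe_unit + offset % stripe_unit
    return osd, object_offset


def split_read(offset, length, stripe_unit, num_osds):
    """Break one file read into per-OSD segments of (osd, object_offset, length).

    Segments that land on different OSDs could be requested in parallel.
    """
    segments = []
    end = offset + length
    while offset < end:
        osd, object_offset = locate(offset, stripe_unit, num_osds)
        # Stop at the end of the current stripe unit or the end of the read.
        chunk = min(stripe_unit - offset % stripe_unit, end - offset)
        segments.append((osd, object_offset, chunk))
        offset += chunk
    return segments


if __name__ == "__main__":
    STRIPE_UNIT = 64 * 1024  # hypothetical 64 KB stripe unit
    for segment in split_read(0, 4 * STRIPE_UNIT, STRIPE_UNIT, num_osds=4):
        print(segment)
```

In this sketch a 256 KB read over four OSDs becomes four 64 KB requests, one per device. That is the sense in which adding OSDs adds bandwidth as well as capacity: a larger pool lets a single large read or write be served by more devices at once.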

THAT'S A WRAP

Studios will continue to employ Linux compute clusters and powerful modeling and rendering software to produce increasingly realistic computer-generated animation. As their productions include more realistic characters, scenes, and effects, high-performance storage systems will play an increasingly important role in the production process. Digital animation studios can use emerging object-based storage architectures to satisfy these compute clusters' growing appetite for storage capacity and performance, resulting in more realistic creations in less time and at lower cost.