Data Explosion Calls for Automated Storage and Data Management

Agility and scale call for data management across any storage environment.

The IT Press Tour had the opportunity to meet with StrongBox Data Solutions to learn about their StrongLink software and understand the business problems it solves.

Given the exponential expansion of data, there is no one-size-fits-all storage solution. This results in storage and vendor silos, more manual processes, higher costs, reduced efficiency, difficult collaboration, and painful data migrations.

This raises a series of data and storage questions. What data do I have? Where is it? How can users access it? Is it protected? Is it compliant? Where should it be now? Depending on your perspective, this is either a storage management problem, a data management problem, or both.

StrongLink provides a solution by bridging storage silos and automating cross-platform data management. It aggregates file system metadata, selected rich-file headers, and user-created metadata. These power policy engines that automate routine IT operations, identify and rectify serious issues, streamline collaboration between groups and teams, maximize the ROI of IT resources, and provide horizontal scalability.
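To make the idea of a metadata-driven policy engine concrete, here is a minimal sketch in Python. It is not StrongLink's API or implementation. The names FileRecord, Policy, and evaluate, the tier labels, the project_status tag, and the 90-day threshold are all hypothetical, chosen only to show how file system metadata and user-created tags might together drive automated placement decisions.

```python
from dataclasses import dataclass, field
from typing import Callable

DAY = 86_400  # seconds in a day


@dataclass
class FileRecord:
    """Hypothetical catalog entry combining file system and user metadata."""
    path: str
    size_bytes: int
    last_access_epoch: float
    tier: str                                  # e.g. "nas", "tape", "cloud"
    tags: dict = field(default_factory=dict)   # user-created metadata


@dataclass
class Policy:
    """Hypothetical rule: files matching `condition` belong on `target_tier`."""
    name: str
    condition: Callable[[FileRecord, float], bool]
    target_tier: str


def evaluate(catalog, policies, now):
    """Return planned moves as (path, from_tier, to_tier, policy_name).

    Policies are checked in order and the first match wins, so a
    higher-priority rule can pin a file in place.
    """
    moves = []
    for rec in catalog:
        for policy in policies:
            if policy.condition(rec, now):
                if rec.tier != policy.target_tier:
                    moves.append((rec.path, rec.tier, policy.target_tier, policy.name))
                break
    return moves


policies = [
    # A user tag keeps active project data on primary storage.
    Policy("keep-active-projects",
           lambda r, now: r.tags.get("project_status") == "active", "nas"),
    # File system metadata (last access) sends cold data to tape.
    Policy("archive-cold-data",
           lambda r, now: now - r.last_access_epoch > 90 * DAY, "tape"),
]

now = 1_000 * DAY  # fixed "current time" so the example is deterministic
catalog = [
    FileRecord("/proj/a/run1.dat", 10**9, now - 200 * DAY, "nas"),
    FileRecord("/proj/b/run2.dat", 10**9, now - 200 * DAY, "nas",
               {"project_status": "active"}),
    FileRecord("/proj/c/run3.dat", 10**9, now - 5 * DAY, "nas"),
]

planned_moves = evaluate(catalog, policies, now)
for move in planned_moves:
    print(move)
```

In this sketch only the first file is scheduled to move to tape: the second is equally cold but pinned by its user tag, and the third was accessed recently. A real engine would also have to execute the moves and keep the namespace consistent, which is outside the scope of the example.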

StrongLink connects to storage both in and out of the data path. There are multiple ways to access and manage files. Users have multi-protocol access to files on any storage. Users may access files in the global namespace through SMB/CIFS, NFS, and S3, in addition to the StrongLink CLI and API. All file actions are the same as if directly accessing storage. Policy-based data movement is transparent to users.

One solution controls multi-vendor, multi-platform storage environments for data movement, production, optimization, and workflow automation. There is automated file copy management for versioning, self-service recovery, and cross-platform protection.

StrongLink's reference architecture uses multiple nodes to provide self-healing high availability. The number of StrongLink nodes may be increased at any time based on I/O requirements. StrongLink software can be installed on any qualified hardware or VM.

StrongLink also provides data protection choices with disaster recovery (DR) between primary data centers, public clouds, WANs, and DR sites.

Use Cases

The NASA Research Data Management Project needed to provide global access to distributed datasets for thousands of researchers working on hundreds of projects. The goal was to provide a central catalog of all digital assets across all storage types, with the ability to tag all files with multiple types of identifying information and classification levels. The StrongLink Global Namespace served as a seamless abstraction layer across existing storage. This enabled database mapping between StrongLink and Elasticsearch-based applications. Users access existing NASA applications via API to run global queries, initiate data movement, and add rich metadata. This enables new storage types to be added and storage resources to be managed centrally without interrupting users, as well as expansion to the cloud.

The German Climate Computing Center (DKRZ) needed to decommission HPSS and replace the proprietary HPSS tape format with open-standard LTFS for 150PB of research data. They wanted to expand the performance and scale of online and offline research data storage management to accommodate 120PB per year in HPC workflows, while preserving existing workflows on the new system with no more than five days of downtime during the transition. StrongLink ingested the HPSS database in two days, making data visible to researchers across both HPSS and LTFS formats. New data is written in LTFS while the physical migration from HPSS to LTFS happens in the background. A 1PB cache provided StrongLink 3FS with NVMe and high-performance disk. Researcher workflows were maintained with the StrongLink CLI.

An autonomous vehicle research data management project needed to handle extreme growth in research data. There was 120PB of data on Isilon with more than 30PB on tape. The project started at 150PB and is now growing at 2PB per day and rising. There was a desire to accommodate increasing data growth without expanding Isilon clusters. StrongLink conducts policy-based offload from Isilon direct-to-tape at a minimum of 2PB per day. This ensures the ability to expand to higher throughput when needed, scaling to multiple exabytes over five years. The project now has continual offload from Isilon direct-to-tape as new data arrives, with no need for additional disk cache. Tapes are formatted with open-standard LTFS. The tape environment has increased to a total of 80 drives of mixed types. StrongLink scaled to handle both daily user access to data and 2PB per day of new data ingest to tape.
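A quick back-of-the-envelope calculation shows what these figures imply. The sketch below makes several assumptions that are not in the source: a decimal petabyte (10^15 bytes), ingest spread evenly over 24 hours, and all 80 drives writing continuously with none reserved for user reads. Under those assumptions, 2PB per day works out to roughly 23 GB/s in aggregate, about 290 MB/s per drive, and 3.65 exabytes over five years if the daily rate stays flat.

```python
# Back-of-the-envelope throughput estimate for the figures quoted above.
# Assumptions (not from the article): decimal units, even 24-hour ingest,
# and every drive dedicated to writing new data.

PB = 10**15                 # bytes in a decimal petabyte
SECONDS_PER_DAY = 86_400

daily_ingest_pb = 2         # PB of new data per day (from the article)
drives = 80                 # total tape drives (from the article)
years = 5

bytes_per_second = daily_ingest_pb * PB / SECONDS_PER_DAY

aggregate_gb_per_s = bytes_per_second / 10**9
per_drive_mb_per_s = bytes_per_second / drives / 10**6
five_year_total_eb = daily_ingest_pb * 365 * years / 1000  # PB -> EB

print(f"Aggregate ingest rate: {aggregate_gb_per_s:.1f} GB/s")
print(f"Average per drive:     {per_drive_mb_per_s:.0f} MB/s")
print(f"Five-year total:       {five_year_total_eb:.2f} EB")
```

Because the same drives also serve daily user access, and the daily rate is described as rising, the sustained per-drive load in practice would be higher than this average.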

The U.S. Library of Congress wanted to replace its legacy content strategy with a new global platform to provide unified access to and understanding of all digital assets. They needed to consolidate up to 50 different data silos of IBM, EMC, Oracle, and other storage into a single global namespace. They also needed to catalog and replicate data to five new data centers, including two in AWS. StrongLink provided the content abstraction layer that gives all users full access. New data is automatically cataloged and replicated across all locations. Storage costs and operational expenses have been reduced.

© 2020 by Tom Smith | @ctsmithiii