Learn about Hydrolix's streaming data lake platform: Revolutionizing big data economics with real-time processing, multi-year retention, and Kibana integration.
In the ever-evolving landscape of big data and analytics, companies constantly struggle to balance the need for comprehensive data insights against escalating data storage and processing costs. Hydrolix, a Portland-based streaming data lake company, is addressing this challenge head-on by reimagining the economics of big data storage and analysis. At a recent IT Press Tour in Boston, Hydrolix executives shared insights into their technology, growth, and vision for the future of data lakes.
The Hydrolix Advantage
Founded in 2018, Hydrolix has quickly gained traction in the market, boasting nearly 300 customers and tenfold growth in the past year. The company's streaming data lake platform is designed to handle massive volumes of data (up to hundreds of terabytes per day) while offering cost-effective storage and real-time query capabilities.
Marty Kagan, CEO and co-founder of Hydrolix, explained the company's genesis: "At my previous company, we were generating close to 2-3 billion records a day, and the infrastructure required to ingest, store, and search that data was tremendously expensive. We realized there had to be a better way to handle data at that scale in a more cost-effective manner."
Key Features for Developers
1. Scalable Ingest and Real-Time Processing
Hydrolix's platform can handle ingest rates of up to 20 million events per second, making it suitable for high-volume, real-time data streams. This capability is especially valuable for industries dealing with massive data influxes, such as media streaming, gaming, and IoT.
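To put that ingest figure in perspective, the short calculation below converts a sustained per-second rate into daily totals. The 20 million events per second comes from the figure cited above; the 500-byte average event size is purely an assumption for illustration, and real traffic rarely stays at peak all day.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def events_per_day(events_per_second: int) -> int:
    """Events ingested in one day at a constant per-second rate."""
    return events_per_second * SECONDS_PER_DAY


def terabytes_per_day(events_per_second: int, avg_event_bytes: int) -> float:
    """Raw (uncompressed) daily volume in decimal terabytes."""
    return events_per_day(events_per_second) * avg_event_bytes / 1e12


peak_rate = 20_000_000      # events per second, as cited in the article
assumed_event_bytes = 500   # assumption for illustration, not a Hydrolix figure

daily_events = events_per_day(peak_rate)
daily_tb = terabytes_per_day(peak_rate, assumed_event_bytes)
print(f"{daily_events:,} events/day, about {daily_tb:,.0f} TB/day raw")
```

Under these assumptions, a full day at peak works out to roughly 1.7 trillion events and several hundred terabytes of raw data, which is in line with the "hundreds of terabytes per day" scale the company describes.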
2. Multi-Year Hot Storage
Unlike traditional data lake solutions, which often tier data to cold storage after a short period, Hydrolix enables companies to keep data "hot" and queryable for extended periods—typically 15 months or more. This feature is crucial for long-term trend analysis, compliance requirements, and machine learning model training.
3. Flexible Schema and ETL on Ingest
The platform supports flexible schemas and performs ETL (Extract, Transform, Load) operations in real-time as data is ingested. This capability allows developers to combine multiple data sources into a single table, simplifying downstream analytics and reducing the need for complex data pipelines.
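As a rough illustration of what transforming on ingest means, the sketch below maps records from two differently shaped sources onto one shared row layout before they are stored. The source names (`cdn`, `app`), field names, and functions are all hypothetical; this is not Hydrolix's transform syntax, only the general idea in plain Python.

```python
from datetime import datetime, timezone


def from_cdn_log(record: dict) -> dict:
    """Map a hypothetical CDN log record onto the shared schema."""
    return {
        "timestamp": datetime.fromtimestamp(record["ts_epoch"], tz=timezone.utc),
        "source": "cdn",
        "client_ip": record["clientIp"],
        "status": int(record["statusCode"]),
        "bytes_sent": int(record["bytes"]),
    }


def from_app_log(record: dict) -> dict:
    """Map a hypothetical application log record onto the shared schema."""
    return {
        "timestamp": datetime.fromisoformat(record["time"]),
        "source": "app",
        "client_ip": record["remote_addr"],
        "status": int(record["status"]),
        # Missing sizes default to 0 so every row has the same columns.
        "bytes_sent": int(record.get("response_size", 0)),
    }


# One transform per source; every transform emits the same columns.
TRANSFORMS = {"cdn": from_cdn_log, "app": from_app_log}


def transform_on_ingest(source: str, record: dict) -> dict:
    """Normalize a raw record as it arrives, before it is written."""
    return TRANSFORMS[source](record)


rows = [
    transform_on_ingest("cdn", {"ts_epoch": 1700000000, "clientIp": "203.0.113.7",
                                "statusCode": "200", "bytes": "5120"}),
    transform_on_ingest("app", {"time": "2023-11-14T22:13:21+00:00",
                                "remote_addr": "203.0.113.7", "status": 500}),
]
```

Because both records end up with identical columns and typed values, they can live in a single table and be queried together without a separate downstream cleanup job.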
4. SQL and Spark Query Interfaces
Hydrolix provides both SQL and Spark query interfaces to suit different user preferences and use cases. This flexibility allows data scientists to work with familiar tools while enabling business analysts to perform ad-hoc queries using standard SQL.
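The kind of ad-hoc SQL an analyst might run against log data can be sketched as follows. To keep the example self-contained it uses Python's built-in `sqlite3` module as a stand-in database, not Hydrolix; the table name, columns, and sample rows are invented, and time functions such as `strftime` are SQLite-specific and will differ in other SQL dialects.

```python
import sqlite3

# In-memory SQLite database standing in for a real query endpoint.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (ts TEXT, status INTEGER, bytes_sent INTEGER)")
conn.executemany(
    "INSERT INTO requests VALUES (?, ?, ?)",
    [
        ("2024-01-01 12:00:05", 200, 5120),
        ("2024-01-01 12:00:40", 500, 312),
        ("2024-01-01 12:01:10", 200, 2048),
        ("2024-01-01 12:01:30", 503, 128),
        ("2024-01-01 12:01:55", 502, 256),
    ],
)

# Requests and server errors per minute: a typical time-bucketed aggregate.
query = """
SELECT strftime('%Y-%m-%d %H:%M', ts) AS minute,
       COUNT(*) AS requests,
       SUM(CASE WHEN status >= 500 THEN 1 ELSE 0 END) AS server_errors
FROM requests
GROUP BY minute
ORDER BY minute
"""
results = conn.execute(query).fetchall()
for minute, requests, server_errors in results:
    print(minute, requests, server_errors)
```

The point of the sketch is the shape of the query (bucket by time, count, filter on status), which carries over to most SQL engines even though function names change.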
5. Kubernetes-Native Architecture
Built on a Kubernetes-native architecture, Hydrolix offers independent scaling of ingest, query, and storage components. This design allows for efficient resource utilization and cost optimization, which is particularly important for handling variable workloads.
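One way to picture independent scaling is that each component gets its own replica count, driven only by its own load. The sketch below is a simplified model with made-up capacities and load numbers; it is not how Hydrolix or Kubernetes autoscaling is actually configured, and storage is left out for brevity.

```python
import math
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    load: float              # current demand, in the component's own unit
    capacity_per_pod: float  # what one pod can handle (assumed value)
    min_pods: int = 1


def desired_pods(component: Component) -> int:
    """Pods needed to cover the load, never dropping below the minimum."""
    needed = math.ceil(component.load / component.capacity_per_pod)
    return max(component.min_pods, needed)


components = [
    Component("ingest", load=2_500_000, capacity_per_pod=100_000),  # events/s
    Component("query", load=40, capacity_per_pod=16),               # concurrent queries
]
plan = {c.name: desired_pods(c) for c in components}
print(plan)

# A burst of queries changes only the query tier; ingest is untouched.
components[1].load = 160
burst_plan = {c.name: desired_pods(c) for c in components}
print(burst_plan)
```

In this model a heavy reporting workload adds query pods without paying for extra ingest capacity, which is the cost argument behind separating the tiers.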
Real-World Applications
Hydrolix's platform is being used across various industries for multiple use cases:
1. Media and Entertainment
One of Hydrolix's notable clients, Paramount, uses the platform to handle peak ingestion rates of 10.8 million rows per second during major events like the Super Bowl. This capability allows real-time monitoring and analysis of streaming performance and viewer behavior.
2. E-commerce and Retail
New Balance, a global athletic footwear and apparel company, leverages Hydrolix through Akamai's TrafficPeak observability platform. This integration enables New Balance to gain real-time insights into their online advertising campaigns, website performance, and user behavior across various digital channels.
3. Cybersecurity
Elkjøp, a Nordic electronics retailer, used Hydrolix to detect and mitigate a DDoS attack during Black Friday, showcasing the platform's real-time threat detection and response capabilities.
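The article does not describe how the attack was detected. As a generic illustration of real-time detection over a request stream, the sketch below flags moments when the number of requests in a sliding time window exceeds a threshold. The class name, window size, and threshold are hypothetical, and a production system would use far richer signals than a raw request count.

```python
from collections import deque


class RateSpikeDetector:
    """Flags when more than `threshold` requests fall in the last `window_ms`."""

    def __init__(self, window_ms: int, threshold: int):
        self.window_ms = window_ms
        self.threshold = threshold
        self._timestamps = deque()

    def observe(self, ts_ms: int) -> bool:
        """Record one request at ts_ms; return True if the window is over threshold."""
        self._timestamps.append(ts_ms)
        cutoff = ts_ms - self.window_ms
        # Drop requests that have aged out of the window.
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) > self.threshold


detector = RateSpikeDetector(window_ms=10_000, threshold=5)

# Steady traffic (one request every 5 s), then a burst of 8 requests in under 1 s.
timeline = [0, 5_000, 10_000, 15_000] + [20_000 + i * 100 for i in range(8)]
alerts = [ts for ts in timeline if detector.observe(ts)]
print(alerts)
```

The steady traffic never trips the detector, while the burst does once the window holds more than five requests. The same logic can also be expressed as a time-bucketed SQL query over recent logs.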
4. Compliance and Auditing
With its ability to store and query large volumes of data for extended periods, Hydrolix is well-suited for compliance use cases, such as PCI DSS, which requires retaining audit log history for at least a year and keeping it available for analysis.
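A retention requirement like the one-year figure above can be checked with simple date arithmetic: compare the oldest data that is still queryable with the oldest date the policy requires. The function names and the example dates below are illustrative assumptions, not part of any compliance tool.

```python
from datetime import date, timedelta


def oldest_required_date(today: date, retention_days: int) -> date:
    """Earliest date the policy requires to still be searchable."""
    return today - timedelta(days=retention_days)


def uncovered_days(today: date, oldest_available: date, retention_days: int) -> int:
    """Days the policy requires but the store no longer holds (0 if covered)."""
    required_start = oldest_required_date(today, retention_days)
    return max((oldest_available - required_start).days, 0)


today = date(2024, 11, 29)
one_year = 365

# A store that keeps only 90 days queryable falls short of a one-year policy.
gap_90 = uncovered_days(today, today - timedelta(days=90), one_year)

# A store with roughly 15 months (456 days) queryable covers it.
gap_456 = uncovered_days(today, today - timedelta(days=456), one_year)

print(gap_90, gap_456)
```

With a 90-day hot window, 275 days of required history would have to be restored from colder storage before an audit query could run; with about 15 months kept queryable, the gap is zero.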
The Hydrolix Ecosystem
Hydrolix is building a robust ecosystem of partners and integrations to enhance its value proposition:
1. Cloud Provider Partnerships
While Akamai is currently Hydrolix's most prominent go-to-market partner, the company is actively working on expanding its presence with other major cloud providers. This strategy aims to make Hydrolix a native service across multiple cloud platforms.
2. Data Visualization and Analytics Tools
Hydrolix supports various data visualization and analytics tools, including Grafana, Looker, and Superset. Recently, the company announced a partnership with Quesma to enable Kibana integration. This allows users of the popular ELK (Elasticsearch, Logstash, Kibana) stack to leverage Hydrolix's cost-effective storage while maintaining their existing dashboards and queries.
3. Data Ingestion and Processing
The platform supports ingestion from various sources, including Kafka, Kinesis, and HTTP streaming endpoints. It also integrates with popular data shippers like Logstash and Beats, making it easy for organizations to adopt Hydrolix without significantly changing their existing data pipelines.
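For the HTTP streaming path, a client typically batches events and posts them to an ingest endpoint. The sketch below builds such a request with Python's standard library but does not send it. The URL, bearer-token authentication, content type, and newline-delimited JSON batching are all assumptions for illustration; consult the Hydrolix documentation for the actual endpoint, headers, and payload format.

```python
import json
import urllib.request


def build_ingest_request(url: str, events: list, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST carrying events as newline-delimited JSON."""
    body = "\n".join(json.dumps(e, separators=(",", ":")) for e in events)
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            # Both header values are assumptions, not a documented contract.
            "Content-Type": "application/x-ndjson",
            "Authorization": f"Bearer {token}",
        },
    )


events = [
    {"timestamp": "2024-01-01T12:00:05Z", "status": 200, "path": "/video/1"},
    {"timestamp": "2024-01-01T12:00:06Z", "status": 503, "path": "/video/2"},
]

# Placeholder host; ".invalid" is a reserved domain that never resolves.
request = build_ingest_request("https://hydrolix.example.invalid/ingest", events, "TOKEN")
print(request.get_method(), request.full_url, len(request.data), "bytes")
```

Sending the batch would be a matter of passing `request` to `urllib.request.urlopen` and checking the response status, with retries and backoff added for production use.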
Addressing the AI and ML Data Challenge
As organizations increasingly leverage AI and machine learning, the demand for large-scale, long-term data retention is growing. Hydrolix is positioning itself as an ideal solution for this challenge, enabling companies to store and query vast amounts of historical data cost-effectively.
"We're seeing ML use cases drive the need for more data retention," explained David Sztykman, Director of Product at Hydrolix. "Before, companies might keep data for 30 or 90 days. Now, they need all the data, and they need it to be queryable for model training and validation."
Future Outlook
Looking ahead, Hydrolix is focusing on several key areas:
1. Expanding Global Reach
The company is expanding its presence in Asia, Europe, and other regions, hiring local sales and support teams to better serve its global customer base.
2. Enhancing Cloud Provider Integrations
While already available on major cloud platforms, Hydrolix is working on deeper integrations to make deployment and management even more seamless for cloud-native organizations.
3. Actionable Insights
The Hydrolix team is exploring ways to help companies visualize data and take automated actions based on insights derived from their data lakes.
4. Expanding Ecosystem Partnerships
Continuing to build out its partner ecosystem is a key priority. The goal is to make Hydrolix the go-to backend for next-generation observability and analytics platforms.
Conclusion
For developers, engineers, and architects working with big data, Hydrolix presents an intriguing solution to the perennial challenge of balancing data retention with cost-effectiveness. Its ability to handle massive ingest rates, perform real-time ETL, and enable long-term hot storage at a fraction of the cost of traditional solutions makes it a compelling option for organizations dealing with high-volume, high-velocity data.
As data volumes grow exponentially and the need for real-time insights becomes ever more critical, platforms like Hydrolix are poised to play a crucial role in the future of data infrastructure. By dramatically reducing the cost of data storage and analysis, Hydrolix is enabling organizations to retain and leverage more of their data, potentially unlocking new insights and capabilities that were previously economically unfeasible.
Whether you're building a real-time analytics pipeline, developing an AI-powered application, or simply looking to optimize your organization's data infrastructure costs, Hydrolix's innovative approach to data lakes is worth exploring. As the company expands its ecosystem and capabilities, it may become a key player in shaping the future of big data analytics and observability.