BlueData Puts ‘Big Data-as-a-Service’ on AWS

Following a “directed availability” trial period, BlueData Software Inc. today announced that its Big-Data-as-a-Service (BDaaS) platform is now generally available on the Amazon Web Services Inc. (AWS) cloud.

In June, the company announced that its BDaaS platform, previously available only for on-premises installations, was coming to AWS, initially in directed availability (targeting a small number of customers to make sure it met expectations). The company’s EPIC software can now be tried free for two weeks. It is not yet available in the AWS Marketplace, and the company did not indicate when or how it would be.

BlueData emphasized EPIC’s ability to be deployed on-premises or in the cloud, saying it is the “first and only” BDaaS offering that allows both. EPIC uses embedded, managed Docker containers, which make it portable across different infrastructure: identical Docker images for projects such as Apache Hadoop or Apache Spark can be used in house or in the cloud.

The company claims that its AWS offering provides enterprise-class security and cost control for multi-tenant deployments. It lets customers tap into Amazon S3 storage or local storage, such as a Hadoop Distributed File System (HDFS) data lake.

The company said the new AWS offering was enhanced by insights from the directed availability program, which was conducted with selected customers. Company executive Anant Chintamaneni said the QA team of a major data integration software vendor wanted to test its new product on AWS against different commercial Hadoop distributions (such as Cloudera CDH and Hortonworks HDP). BlueData EPIC allowed the team to quickly create Docker application images using its own code and other Hadoop artifacts, which reduced QA cycle time and improved team productivity.

According to Chintamaneni, the key benefits of the product, as identified through the directed availability program, are:

  • A simplified user experience for administrators and data scientists that abstracts away AWS-specific infrastructure, letting them focus on their Big Data needs.
  • Faster AWS onboarding for multiple teams and Big Data workloads, eliminating the need for DevOps expertise and reducing the cost and time involved.
  • Greater flexibility and agility through self-service clusters on Amazon EC2 for Spark, Hadoop, Kafka and Cassandra.
  • Reduced AWS costs through fine-grained resource limits, start/stop controls and cost reporting in multi-tenant environments.
  • Faster time to insights through pre-built cluster integrations with Amazon S3, as well as in-place analysis against on-premises data.
  • Improved data governance through integration with Amazon VPC (including site-to-site VPN) and with Active Directory and Kerberos for authentication.

“BlueData is also a BDaaS solution that allows data analysts, developers, and data scientists to work with their data frameworks, including Spark standalone; Hadoop distributions from Cloudera, Hortonworks and MapR; other data frameworks such as Kafka, Cassandra, Jupyter, Zeppelin notebooks, Python and R libraries; and other data science tools and analytics tools,” the company said in today’s statement.