Type: Private
Industry: Data Science
Founded: 2011
Headquarters: Palo Alto, CA, United States
Key people: Jonathan Gray (Founder & CEO), Nitin Motgi (Founder & CTO), Andreas Neumann (Chief Architect), Vikram Bhan (COO) [1]
Investors: Safeguard Scientifics, Battery Ventures, Ignition Partners, and Andreessen Horowitz
Number of employees: 50
Related certifications: Certificate in Data Science Industry Overview

Cask, formerly Continuuity, is a cloud-based Big Data application platform that helps developers build big data applications. Rather than providing a cloud service for writing and running Hadoop programs, it lets developers build, deploy, and manage Big Data applications on top of components within the Hadoop ecosystem.
The Cask Data Application Platform (CDAP) enables developers to build data applications on top of Hadoop and other sources in days or weeks. It is an abstraction layer on top of Hadoop and other open source infrastructure such as HBase, Hive, Tephra, and Tigon. This abstraction reduces the cost and complexity of developing and managing applications for Hadoop and improves time to value. CDAP creates reusable abstractions, integrates the underlying Hadoop and Spark infrastructure technologies, and provides simple, easy-to-use APIs and a graphical UI to build, deploy, and manage complex data analytics applications on-premises or in the cloud.[2]
CDAP has been certified by Cloudera, Hortonworks, and MapR. The company is backed by leading investors including Safeguard Scientifics, Battery Ventures, Ignition Partners, and Andreessen Horowitz.[3]


[Figure: CDAP architecture (Cdap arch.png)]

Cask's product is the Cask Data Application Platform (CDAP), with Cask Hydrator and Cask Tracker as extensions.

The Cask Data Application Platform (CDAP)

CDAP enhances production Hadoop and Spark applications with enterprise-class governance capabilities, portability, and security. It provides organizations with a standardized platform backed by a broad set of choices for production deployments on-premises and in the cloud. It also provides a container architecture for data and applications on Hadoop, which increases productivity and quality in order to accelerate development and reduce time-to-production. The three main containers are:

  • Data Containers - CDAP Datasets provide a standardized, logical container and runtime framework for data in varied storage engines. They integrate with other systems for instant data access and allow the creation of complex, reusable data patterns.
  • Program Containers - CDAP Programs provide a standardized, logical container and runtime framework for compute in varied processing engines. They simplify testing and operations through a standard lifecycle and can consistently interact with any data container.
  • Application Containers - CDAP Applications provide a standardized packaging system and runtime framework for Datasets and Programs. They manage the lifecycle of data and apps and simplify the painful integration and operation processes in heterogeneous infrastructure.
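
The relationship between the three containers can be illustrated with a small sketch. This is not the CDAP API: the class names `Dataset`, `Program`, and `Application`, the `keep_errors` function, and the in-memory list used as storage are all hypothetical, chosen only to show how an application might package data and compute behind uniform interfaces.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Dataset:
    """Hypothetical data container: a logical name in front of a storage engine.

    A plain list stands in for the storage engine here.
    """
    name: str
    records: List[dict] = field(default_factory=list)

    def write(self, record: dict) -> None:
        self.records.append(record)

    def read(self) -> List[dict]:
        return list(self.records)


@dataclass
class Program:
    """Hypothetical program container: compute with a simple run lifecycle."""
    name: str
    logic: Callable[[Dataset, Dataset], None]
    status: str = "STOPPED"

    def run(self, source: Dataset, sink: Dataset) -> None:
        self.status = "RUNNING"
        try:
            self.logic(source, sink)
        finally:
            # The lifecycle always returns to STOPPED, even on failure.
            self.status = "STOPPED"


@dataclass
class Application:
    """Hypothetical application container: packages datasets and programs."""
    name: str
    datasets: Dict[str, Dataset] = field(default_factory=dict)
    programs: Dict[str, Program] = field(default_factory=dict)

    def add_dataset(self, dataset: Dataset) -> None:
        self.datasets[dataset.name] = dataset

    def add_program(self, program: Program) -> None:
        self.programs[program.name] = program

    def run_program(self, program: str, source: str, sink: str) -> None:
        # Programs only see logical dataset names, not storage details.
        self.programs[program].run(self.datasets[source], self.datasets[sink])


def keep_errors(source: Dataset, sink: Dataset) -> None:
    """Example program logic: copy ERROR-level records to the sink."""
    for record in source.read():
        if record.get("level") == "ERROR":
            sink.write(record)


def build_demo() -> Application:
    """Assemble a tiny application with two datasets and one program."""
    app = Application("log-analytics")
    app.add_dataset(Dataset("raw"))
    app.add_dataset(Dataset("errors"))
    app.add_program(Program("filter", keep_errors))
    app.datasets["raw"].write({"level": "INFO", "msg": "started"})
    app.datasets["raw"].write({"level": "ERROR", "msg": "disk full"})
    return app


if __name__ == "__main__":
    demo = build_demo()
    demo.run_program("filter", source="raw", sink="errors")
    print(demo.datasets["errors"].read())
```

Because the program addresses datasets only by logical name, the storage behind a `Dataset` could in principle be swapped without changing the program, which is the portability idea the container architecture describes.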

CDAP provides these essential capabilities:

  • Abstraction of data in the Hadoop environment through logical representations of underlying data;
  • Portability of applications through decoupling underlying infrastructures;
  • Services and tools that enable faster application creation in development;
  • Integration of the components of the Hadoop ecosystem into a single platform; and
  • Higher degrees of operational control in production through enterprise best practices.

[4] [5]

Cask Hydrator

Cask Hydrator is a self-service, reconfigurable, extendable open source framework to visually develop, run, automate, and operate data pipelines. It enables developers, data engineers, and data scientists to get started quickly with data ingestion, exploration, and transformation capabilities available through a rich, self-service user interface, extensive REST APIs, and an interactive shell. The Hydrator interface integrates with CDAP, allowing drill-down debugging of pipelines and the creation of metrics dashboards to closely monitor pipelines.[6]

Cask Tracker

Cask Tracker is a self-service CDAP extension that automatically captures rich metadata and gives users visibility into how data flows into, out of, and within a Data Lake. It allows them to perform impact and root cause analysis, and provides an audit trail for auditability and compliance. It enables IT to oversee changes while delivering trusted, secured data in a complex Data Lake environment. Tracker provides access to structured information that describes, explains, and locates datasets, making them easier to retrieve, use, and manage. It offers data engineers and data scientists a simple UI to discover data and track its provenance, audits, and metadata.[7]
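
Impact and root cause analysis over captured lineage can be pictured as graph traversal. The sketch below is an assumption-laden illustration, not Tracker's implementation or API: the `LineageGraph` class, its method names, and the dataset names are hypothetical. It records "data flowed from A to B" edges, then walks them downstream (impact) or upstream (root cause).

```python
from collections import defaultdict, deque
from typing import Dict, Set


class LineageGraph:
    """Hypothetical lineage store: directed edges between dataset names."""

    def __init__(self) -> None:
        self._downstream: Dict[str, Set[str]] = defaultdict(set)
        self._upstream: Dict[str, Set[str]] = defaultdict(set)

    def record(self, source: str, target: str) -> None:
        """Record that data flowed from `source` into `target`."""
        self._downstream[source].add(target)
        self._upstream[target].add(source)

    @staticmethod
    def _reachable(start: str, edges: Dict[str, Set[str]]) -> Set[str]:
        """Breadth-first search returning every node reachable from `start`."""
        seen: Set[str] = set()
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in edges.get(node, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        seen.discard(start)  # exclude the start node if a cycle leads back to it
        return seen

    def impact(self, dataset: str) -> Set[str]:
        """Datasets that could be affected by a change to `dataset`."""
        return self._reachable(dataset, self._downstream)

    def root_causes(self, dataset: str) -> Set[str]:
        """Datasets that feed, directly or indirectly, into `dataset`."""
        return self._reachable(dataset, self._upstream)


def build_lineage() -> LineageGraph:
    """Assemble a small example graph with made-up dataset names."""
    graph = LineageGraph()
    graph.record("raw_logs", "clean_logs")
    graph.record("clean_logs", "daily_report")
    graph.record("clean_logs", "alerts")
    graph.record("users", "daily_report")
    return graph


if __name__ == "__main__":
    lineage = build_lineage()
    print(sorted(lineage.impact("raw_logs")))
    print(sorted(lineage.root_causes("daily_report")))
```

Keeping both edge directions makes the two analyses symmetric: the same traversal answers "what breaks if this changes?" and "where did this come from?".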


No Controversies


  • 2011 - Cask was founded
  • 2014 - Cask, Formerly Continuuity, Goes Open Source
  • Feb 2015 - Cloudera links up with Hadoop developer Cask
  • Nov 2015 - Cask Announces $20 Million Series B Financing Led by Safeguard Scientifics
  • Jun 2016 - Cask Receives Strategic Investment From Ericsson


Top 5 Recent News Headlines

  • Cask Receives Strategic Investment From Ericsson - The company said that the minority investment will help accelerate delivery of its Apache Hadoop based solutions and services for a range of vertical industries. [8]
  • Cask Data Named a "Cool Vendor" in Pervasive Integration, 2016 by Gartner - Vendors Selected for the "Cool Vendor" Report Are Innovative, Impactful and Intriguing. [9]
  • Cask Data collects $20M to help devs brew packaged apps on Hadoop - Cask secured $20 million in a Series B financing round led by Safeguard Scientifics, Inc. The round, which also saw participation from Battery Ventures, Ignition Partners, and other existing investors, brings the three-and-a-half-year-old startup’s total funding to $32.5 million following a Series A round back in November 2012. [10]
  • Cask extends Big Data app development platform - Cask has updated the platform and expanded beyond Hadoop through a new partnership with Cassandra company DataStax. [11]
  • Big data startup Continuuity becomes Cask and is now completely open source - Continuuity, the big data PaaS startup that’s been busy open sourcing various pieces of technology over the past several months, is taking the plunge, going totally open source, and changing its name to Cask. The company’s flagship technology, Continuuity Reactor, is also getting a name change and will be referred to as the Cask Data Application Platform (CDAP). [12]

Top 5 Lifetime News Headlines

  • Hadoop Market Growth Forecast at 59.37% CAGR to 2020 - Analysts forecast the global Hadoop market to grow at a CAGR of 59.37% during the period 2016-2020. The report covers the present scenario and the growth prospects of the global Hadoop market for 2016-2020. [13]


