Fast, easy, and collaborative
Over the past five years, Apache Spark has emerged as the open source standard for advanced analytics, machine learning, and AI on Big Data. With a massive community of over 1,000 contributors and rapid adoption by enterprises, Spark’s popularity continues to rise.
Azure Databricks is designed in collaboration with Databricks, whose founders started the Spark research project at UC Berkeley that later became Apache Spark. The goal of Azure Databricks is to help customers accelerate innovation and simplify the process of building Big Data & AI solutions by combining the best of Databricks and Azure.
Azure Databricks has three design principles.
First, enhance user productivity in developing Big Data applications and analytics pipelines. Azure Databricks’ interactive notebooks enable data science teams to collaborate using popular languages such as R, Python, Scala, and SQL and create powerful machine learning models by working on all their data, not just a sample data set. Native integration with Azure services further simplifies the creation of end-to-end solutions. These capabilities have enabled companies to boost the productivity of their data science teams by over 50 percent.
Second, enable customers to scale globally without limits by working on big data with a fully managed, cloud-native service that automatically scales to meet their needs, without high cost or complexity. Azure Databricks not only provides an optimized Spark platform, which is much faster than vanilla Spark, but it also simplifies the process of building batch and streaming data pipelines and deploying machine learning models at scale.
Third, ensure that customers get the enterprise security and compliance they have come to expect from Azure. Azure Databricks protects customer data with enterprise-grade SLAs, simplified security and identity, and role-based access controls with Azure Active Directory integration. As a result, organizations can safeguard their data without compromising the productivity of their users.
Azure is the best place for Big Data & AI
Azure Databricks has been added to the Azure portfolio of data services and is highly integrated with other Azure services to unlock key customer scenarios.
High-performance connectivity to Azure SQL Data Warehouse, a petabyte-scale, elastic cloud data warehouse, allows organizations to build Modern Data Warehouses to load and process any type of data at scale for enterprise reporting and visualization with Power BI. It also enables data science teams working in Azure Databricks notebooks to easily access high-value data from the warehouse to develop models.
Integration with Azure IoT Hub, Azure Event Hubs, and Azure HDInsight Kafka clusters enables enterprises to build scalable streaming solutions for real-time analytics scenarios such as recommendation engines, fraud detection, predictive maintenance, and many others.
Integration with Azure Blob Storage, Azure Data Lake Store, Azure SQL Data Warehouse, and Azure Cosmos DB allows organizations to use Azure Databricks to clean, join, and aggregate data no matter where it sits.
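To make "clean, join, and aggregate" concrete, here is a minimal sketch in plain Python. It is an illustration of the three steps only, not Azure Databricks or Spark code: the sample records, field names, and helper functions (clean, join, aggregate) are all hypothetical, and on Azure Databricks the same steps would typically be expressed as Spark DataFrame operations over data read from the stores listed above.

```python
from collections import defaultdict

# Hypothetical sample records standing in for data read from two different stores.
orders = [
    {"customer_id": "c1", "amount": "120.50"},
    {"customer_id": "c2", "amount": None},   # missing amount, dropped during cleaning
    {"customer_id": "c1", "amount": "30.00"},
    {"customer_id": "c3", "amount": "75.25"},
]
customers = [
    {"customer_id": "c1", "region": "EMEA"},
    {"customer_id": "c3", "region": "APAC"},
]


def clean(rows):
    """Drop rows with no amount and convert the remaining amounts to float."""
    return [
        {**row, "amount": float(row["amount"])}
        for row in rows
        if row["amount"] is not None
    ]


def join(left, right, key):
    """Inner join: keep left rows whose key also appears in right."""
    lookup = {row[key]: row for row in right}
    return [{**l, **lookup[l[key]]} for l in left if l[key] in lookup]


def aggregate(rows, group_key, value_key):
    """Sum value_key per distinct group_key."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_key]] += row[value_key]
    return dict(totals)


# Clean the orders, attach each customer's region, then total revenue per region.
revenue_by_region = aggregate(
    join(clean(orders), customers, "customer_id"), "region", "amount"
)
print(revenue_by_region)
```

The point of the sketch is the shape of the pipeline: each step takes rows in and hands rows out, so the steps compose regardless of which service the rows originally came from.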
With Azure Databricks and its native integration with other services, Azure is the one-stop destination to easily unlock powerful new analytics, machine learning, and AI scenarios.