Fabric and Databricks Platform
We did not enable Photon acceleration in Azure Databricks because comparable capabilities are not yet available in Fabric Spark. This keeps the comparison fair.
The Crucible of Testing
Our testing was divided into two phases: converting 300 GB of TPC-H data to Delta format, and executing 22 queries on the resulting Delta tables. We ran this process three times on each platform and included cluster startup times in the measurements, so that the results reflect real-world usage and actual deployment scenarios.
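The repeated-trial measurement described above can be sketched as a small timing harness. This is a minimal illustration, not the harness we actually used: `time_trials` and `placeholder_step` are hypothetical names, and the placeholder merely sleeps where a real run would start the cluster, convert the tables, or execute the queries.

```python
import time
from statistics import mean
from typing import Callable, List


def time_trials(step: Callable[[], None], trials: int = 3) -> List[float]:
    """Run `step` `trials` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(trials):
        start = time.perf_counter()
        step()
        durations.append(time.perf_counter() - start)
    return durations


def placeholder_step() -> None:
    # Hypothetical stand-in for a real phase (cluster startup plus
    # Delta conversion, or the 22-query run). Replace with the real work.
    time.sleep(0.01)


durations = time_trials(placeholder_step)
print(f"runs: {len(durations)}, mean: {mean(durations):.3f} s")
```

Timing the whole step from the outside, rather than reading per-query metrics, is what lets startup time count toward the total.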
Revelatory Findings
The results were illuminating. Microsoft Fabric consistently outperformed Azure Databricks in both conversion and query execution times. When the operational costs of each platform are taken into account, Microsoft Fabric was also the more cost-effective option in our estimate, without compromising on computational power. This, combined with Fabric’s suite of tools, including Fabric KQL, Warehouses, Semantic Modeling, and OneLake, makes Fabric a go-to solution.
Results
From starting the cluster to the conversion of all tables, Databricks took 1137 seconds. In contrast, Microsoft Fabric completed this step in 1059 seconds, making Fabric 78 seconds faster than Databricks.
Moving on to the 22 queries, Databricks took 1138 seconds for end-to-end execution, while Microsoft Fabric took 1056 seconds, making Fabric 82 seconds faster. These results held across multiple runs, with Microsoft Fabric consistently outperforming Databricks, and they also held when we tested on a dataset internal to MandelBulb Technologies.
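To put the two timing gaps in relative terms, the reported seconds can be turned into percentages. The snippet below is a small sketch using only the figures quoted above; the helper name `time_saved` is our own.

```python
def time_saved(baseline_s: int, candidate_s: int) -> tuple:
    """Return (seconds saved, percent saved) relative to the baseline."""
    saved = baseline_s - candidate_s
    return saved, 100 * saved / baseline_s


# (Databricks seconds, Fabric seconds) as reported in this article
results = {"conversion": (1137, 1059), "queries": (1138, 1056)}

for phase, (databricks_s, fabric_s) in results.items():
    saved, pct = time_saved(databricks_s, fabric_s)
    print(f"{phase}: Fabric faster by {saved} s ({pct:.1f}%)")
```

Both gaps come out at roughly 7% of the Databricks time.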
As previously mentioned, our focus was not just on performance but also on associated costs. Assuming jobs run 8 hours a day, 5 days a week, totalling 160 hours a month, Databricks in Central US would cost $2672. Even if you pick the cost-friendly Job Compute option, the total would be $1872. This is just the Azure Databricks cost. For all practical purposes, you will also need tools such as Azure Data Factory for data ingestion and orchestration, storage accounts, and other services. In comparison, the Fabric F-32 SKU costs $921. It is important to note that Fabric F-32 provides significantly more compute power and services. For instance, MandelBulb Technologies helped a customer revolutionize its data platform with Microsoft Fabric by utilizing Fabric Spark, Fabric KQL, Data Pipelines, and Power BI, running in production on F-32 capacity, serving hundreds of customers in parallel, and querying billions of records with an average response time under a second. Even if we double the capacity to Fabric F-64 (available as a free trial for all users), the cost would only be $1843, which is approximately 30% less than the $2672 Databricks estimate.
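The cost comparison is simple arithmetic, sketched below with the monthly figures quoted above. The dictionary labels are our own shorthand, and the 160-hour month assumes four working weeks; no pricing beyond what this article states is used.

```python
# 8 hours/day, 5 days/week, assuming 4 weeks per month
HOURS_PER_MONTH = 8 * 5 * 4

# Monthly estimates in USD, as quoted in this article
databricks_usd = {"Databricks": 2672, "Databricks Job Compute": 1872}
fabric_usd = {"Fabric F-32": 921, "Fabric F-64": 1843}


def percent_less(cost: float, reference: float) -> float:
    """How much cheaper `cost` is than `reference`, in percent."""
    return 100 * (reference - cost) / reference


print(f"hours per month: {HOURS_PER_MONTH}")
for fabric_name, fabric_cost in fabric_usd.items():
    for dbx_name, dbx_cost in databricks_usd.items():
        pct = percent_less(fabric_cost, dbx_cost)
        print(f"{fabric_name} vs {dbx_name}: {pct:.1f}% less")
```

The "approximately 30%" figure corresponds to F-64 against the $2672 estimate; against the $1872 Job Compute estimate, the F-64 price is much closer.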
In practical terms, the Microsoft Fabric platform surpasses Databricks in overall value. Fabric Spark’s performance is comparable to Databricks Spark, even when the difference in cluster startup time is discarded. Startup averaged 195 seconds for Databricks, compared to 5 seconds for Fabric Spark.
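For readers who want to see the effect of discarding startup time, the subtraction can be sketched as follows. This assumes the average startup figures apply directly to the reported conversion totals, which were measured from cluster start; per-run startup times are not given in this article, so treat the output as a rough estimate only.

```python
# Reported conversion totals (seconds), measured from cluster start
conversion_total_s = {"Databricks": 1137, "Fabric": 1059}

# Average cluster startup time (seconds), as reported
avg_startup_s = {"Databricks": 195, "Fabric": 5}

# Assumption: subtracting the average startup approximates compute-only time
compute_only_s = {
    platform: conversion_total_s[platform] - avg_startup_s[platform]
    for platform in conversion_total_s
}

for platform, seconds in compute_only_s.items():
    print(f"{platform}: about {seconds} s excluding startup")
```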
Disclaimer
It’s important to note that the benchmarking and testing in this analysis were carried out with the intention of mirroring real-world scenarios, akin to how any data engineer might operate. While we’ve strived to maintain objectivity and practicality throughout our methodology, these results are not official benchmarks endorsed by Microsoft Fabric or Databricks. Instead, they represent our practical findings derived from rigorous testing and benchmarking procedures. As such, readers are encouraged to consider these insights as indicative of potential performance outcomes rather than definitive assessments. For further inquiries or detailed information about our testing process or platform details, feel free to reach out to us at sales@mandelbulbtech.com.