Mage: Is It a Futuristic Data Pipeline Tool?
Everyone is talking about Mage. Is it going to be a game changer in data pipeline development and orchestration?
I’m sharing my views and my experience with Mage so far.
How is Mage making a difference at the UI/UX level?
Mage comes with a rich UI that gives end-to-end control over a data pipeline. From developing a pipeline to launching it in production, you can do your entire workflow in Mage, with version control tools integrated.
How is Mage making a difference in the development experience?
Mage has a very user-friendly, interactive notebook UI for development. We can write code in Python, R and SQL. In my experience we end up writing less exception-handling code, because Mage does most of the heavy lifting there. We create the relationships between the blocks of a pipeline through the UI itself, rather than writing additional code for block dependencies. Mage also lets us preview the result of the pipeline we are developing, so there is no need to wait until the pipeline is deployed in the orchestration layer to verify the result. The result preview can visualise the data too.
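To make the idea of block dependencies concrete, here is a minimal plain-Python sketch of what a dependency graph between blocks represents. It does not use Mage's API: the block names (`load_orders`, `clean_orders`, `summarise`), the sample rows and the `run_pipeline` helper are all made up for illustration, and the ordering comes from the standard library's `graphlib`.

```python
from graphlib import TopologicalSorter


# Hypothetical blocks: each one receives the outputs of its upstream blocks.
def load_orders():
    return [
        {"id": 1, "amount": 20.0},
        {"id": 2, "amount": None},
        {"id": 3, "amount": 5.5},
    ]


def clean_orders(orders):
    # Drop rows with a missing amount.
    return [row for row in orders if row["amount"] is not None]


def summarise(orders):
    return {"count": len(orders), "total": sum(row["amount"] for row in orders)}


BLOCKS = {
    "load_orders": load_orders,
    "clean_orders": clean_orders,
    "summarise": summarise,
}

# Each block maps to the blocks it depends on (its upstream blocks).
UPSTREAM = {
    "load_orders": [],
    "clean_orders": ["load_orders"],
    "summarise": ["clean_orders"],
}


def run_pipeline(blocks, upstream):
    """Run every block after its upstream blocks and keep each output."""
    outputs = {}
    for name in TopologicalSorter(upstream).static_order():
        inputs = [outputs[dep] for dep in upstream[name]]
        outputs[name] = blocks[name](*inputs)
    return outputs


outputs = run_pipeline(BLOCKS, UPSTREAM)
print(outputs["clean_orders"])  # intermediate result, available for preview
print(outputs["summarise"])
```

Keeping the output of every block, not only the last one, is what makes a per-block preview possible: any intermediate result can be inspected without deploying anything.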
What types of data pipelines does Mage support?
Currently Mage supports data integration, standard batch and streaming pipelines.
Data Integration: Mage supports connectors for different types of third-party sources (a source can be a third-party API, SaaS, database, data warehouse, or data lake). For data integrations Mage uses the data engineering community standard called the Singer spec. In addition, Mage further standardises the spec and provides common classes and methods to make implementing connectors easier and faster.
Standard Batch: Mage supports standard batch pipelines built from our own code. We can write that code in Python, R and SQL.
Streaming: Mage currently supports streaming pipelines for Kafka and Azure Event Hubs.
Mage also has built-in integration with dbt, so you can write dbt code in Mage and deploy it along with the rest of the pipeline.
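For readers who have not met the Singer spec mentioned above, here is a minimal sketch of its core idea: a source (a "tap") writes JSON messages, one per line, of type `SCHEMA`, `RECORD` and `STATE`. This is plain Python and not Mage's helper classes; the `users` stream, its fields, the `last_id` bookmark and the `run_tap` function are hypothetical.

```python
import io
import json
import sys


def write_message(message, out):
    # Singer messages are JSON objects, one per line.
    out.write(json.dumps(message) + "\n")


def run_tap(rows, out=None):
    """Emit a SCHEMA message, one RECORD per row, then a STATE bookmark."""
    out = out if out is not None else sys.stdout
    write_message(
        {
            "type": "SCHEMA",
            "stream": "users",
            "schema": {
                "type": "object",
                "properties": {
                    "id": {"type": "integer"},
                    "name": {"type": "string"},
                },
            },
            "key_properties": ["id"],
        },
        out,
    )
    for row in rows:
        write_message({"type": "RECORD", "stream": "users", "record": row}, out)
    if rows:
        # The state value is free-form; here it remembers the last id synced.
        write_message({"type": "STATE", "value": {"users": {"last_id": rows[-1]["id"]}}}, out)


SAMPLE_ROWS = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Linus"}]

buffer = io.StringIO()
run_tap(SAMPLE_ROWS, buffer)
messages = [json.loads(line) for line in buffer.getvalue().splitlines()]
print([message["type"] for message in messages])
```

A destination (a "target") reads the same lines from its input, which is why any source that follows the spec can be paired with any destination that follows it.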
How is Mage making a difference in pipeline management?
Mage provides a complete end-to-end orchestration platform for pipeline management. We can deploy and schedule a pipeline, monitor it, and get alerts when something goes wrong. Mage gives observability into the pipeline with proper logging, presented in a user-friendly way. It’s really easy to debug a pipeline failure, because Mage logs at block level, which gives more granularity.
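To illustrate why block-level logging makes a failure easy to locate, here is a small generic sketch using Python's standard `logging` module. It is not how Mage implements logging; the `run_blocks` helper and the three blocks (`load`, `invert`, `export`) are hypothetical, and one of them fails on purpose.

```python
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(name)s: %(message)s")


def run_blocks(blocks):
    """Run blocks in order, log per block, and stop at the first failure."""
    statuses = {}
    data = None
    for name, func in blocks:
        # One logger per block, so every log line carries the block name.
        log = logging.getLogger(f"pipeline.{name}")
        started = time.perf_counter()
        try:
            data = func(data)
        except Exception:
            log.exception("block failed")
            statuses[name] = "failed"
            break
        log.info("block completed in %.3fs", time.perf_counter() - started)
        statuses[name] = "completed"
    return statuses


def load(_):
    return [1, 2, 0]


def invert(values):
    # Raises ZeroDivisionError on the 0 to simulate a failing block.
    return [1 / value for value in values]


def export(values):
    return len(values)


statuses = run_blocks([("load", load), ("invert", invert), ("export", export)])
print(statuses)
```

Because each log line names its block, the traceback points straight at `invert`, and the status map shows that `export` never ran.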
Summary
In my experience Mage is a go-to tool for teams that are looking for a single tool for building and orchestrating pipelines. It offers a user-friendly UI with a great developer and orchestration experience. It has many built-in aggregations and transformations that most data engineers and analysts look for. I also really love the way it shows pipeline run details, its logging, and the graphical timeline of each block in the pipeline. In my opinion Mage is going to be the clear winner in data pipeline development and orchestration. Last but not least, Mage has a great community, and its creators are very clear about their vision for this open source product.
I will share a more detailed blog post very soon on a real implementation of Mage for data pipeline integrations, along with the deployment and integration steps.
Mage GitHub: https://github.com/mage-ai/mage-ai