// SNEHAL.INIT()

SNEHAL GALANDE

Data Engineer | Cloud & Big Data Specialist | 2.6+ YOE

snehal@portfolio: ~

snehal@portfolio:~$ ./execute_intro.sh

Loading modules... [OK]

Connecting to data warehouse... [OK]

Hello, world! I engineer data ecosystems.

snehal@portfolio:~$ _

01. Who Am I?

I'm Snehal Galande, a Computer Engineer with a passion for Data Engineering, Big Data, Cloud Technologies, and Analytics development. I hold a Bachelor's degree in Computer Engineering from Savitribai Phule Pune University with an SGPA of 9.51/10.0.

With 2.6+ years of hands-on project experience, I specialize in building scalable ETL pipelines, cloud-native data lakehouse architectures, distributed data processing systems, and analytics platforms using PySpark, SQL, Databricks, AWS, Azure, Delta Lake, and Apache Airflow. My expertise spans data pipeline development, data transformation, cloud data engineering, and analytics-driven solutions.

Want to work together? Reach out by e-mail, or scroll down to the contact section.

10TB+

Data Processed

99.9%

Pipeline Uptime

02. Tech Stack

Core Languages

Python SQL

Big Data Processing

PySpark Databricks

Cloud Platforms

AWS Microsoft Azure

Data Warehouses & Orchestration

Amazon Redshift Azure Synapse Apache Airflow

Streaming Technologies

Apache Kafka Amazon Kinesis

DevOps & Practices

Git CI/CD Linux

03. Featured Pipelines

Cloud-Native Lakehouse Pipeline

Designed and built a scalable cloud-native data lakehouse architecture using PySpark, Databricks, Delta Lake, and AWS S3. Implemented Medallion Architecture, incremental ETL processing, CDC pipelines, and Airflow orchestration for enterprise-scale analytics workloads.

PySpark Databricks AWS S3 Delta Lake Apache Airflow Redshift ETL CDC Medallion Architecture
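The core of the incremental ETL/CDC work above is an upsert of change events into the silver layer. A minimal sketch of that pattern, using plain Python dicts in place of Delta tables (in the real pipeline this is a Delta Lake MERGE on Databricks; all names here are illustrative):

```python
# Sketch of the CDC upsert step in a Medallion pipeline. A dict keyed by
# primary key stands in for the silver Delta table; "I"/"U"/"D" are the
# change-event operation codes. Hypothetical names, not production code.

def apply_cdc_batch(silver, changes):
    """Apply a batch of CDC events (insert/update/delete) to the silver table."""
    for event in changes:
        if event["op"] in ("I", "U"):
            silver[event["key"]] = event["row"]   # upsert: insert or overwrite
        elif event["op"] == "D":
            silver.pop(event["key"], None)        # delete if the key exists
    return silver

silver = {1: {"name": "alice", "amount": 10}}
batch = [
    {"op": "U", "key": 1, "row": {"name": "alice", "amount": 15}},
    {"op": "I", "key": 2, "row": {"name": "bob", "amount": 7}},
    {"op": "D", "key": 3, "row": None},
]
silver = apply_cdc_batch(silver, batch)
```

The same semantics map directly onto Delta Lake's `MERGE INTO ... WHEN MATCHED ... WHEN NOT MATCHED`, which is what makes the processing incremental rather than full-reload.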

Enterprise Knowledge Data Platform

Built an enterprise-scale data platform for centralized analytics and processing of structured and semi-structured datasets. Developed scalable ETL workflows, optimized Spark transformations, and cloud-native data pipelines to improve reliability and downstream reporting accessibility.

PySpark Databricks AWS SQL ETL Pipelines Data Processing Data Validation Cloud Architecture
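The data-validation step mentioned above boils down to partitioning incoming rows into accepted and rejected sets before they reach reporting tables. A simplified sketch, with hypothetical field names and rules:

```python
# Illustrative data-quality gate: rows failing schema/quality checks are
# quarantined with a reason instead of flowing downstream. The "id" and
# "amount" fields and the rules are assumptions, not the production schema.

def validate_rows(rows):
    valid, rejected = [], []
    for row in rows:
        if row.get("id") is None:
            rejected.append((row, "missing id"))
        elif not isinstance(row.get("amount"), (int, float)) or row["amount"] < 0:
            rejected.append((row, "bad amount"))
        else:
            valid.append(row)
    return valid, rejected

rows = [
    {"id": 1, "amount": 120.5},
    {"id": None, "amount": 3.0},
    {"id": 2, "amount": -9},
]
good, bad = validate_rows(rows)
```

In Spark the equivalent is a pair of `filter` expressions (or a library such as Great Expectations), but the pass/quarantine split is the same idea.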

Sales Analytics Repository Platform

Developed a centralized sales analytics repository in Databricks enabling KPI tracking, reporting, and business intelligence workflows. Created optimized SQL transformations and reporting-ready datasets to improve data accessibility and decision-making efficiency.

Python SQL Data Modelling
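A reporting-ready KPI dataset of the kind described above is typically a grouped aggregation over the raw sales facts. A small sketch using SQLite in place of Databricks SQL, with illustrative table and column names:

```python
# Sketch of a KPI rollup: aggregate raw sales rows into a reporting-ready
# per-region summary. SQLite stands in for the warehouse; schema is invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount REAL);
    INSERT INTO sales VALUES ('east', 100), ('east', 50), ('west', 80);
""")
kpis = conn.execute("""
    SELECT region, SUM(amount) AS total_sales, COUNT(*) AS orders
    FROM sales
    GROUP BY region
    ORDER BY total_sales DESC
""").fetchall()
# kpis -> [('east', 150.0, 2), ('west', 80.0, 1)]
```

Materializing the result of a query like this as a table (or Delta table) is what turns ad-hoc SQL into a dataset BI tools can point at directly.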

04. Certifications

05. Init Connection

I'm currently open to new opportunities to build scalable systems. Whether you have a question or just want to say hi, my inbox is always open!

Ping Me