// SNEHAL.INIT()

SNEHAL GALANDE

Data Engineer | Cloud & Big Data Specialist | 2.6+ YOE

snehal@portfolio: ~

snehal@portfolio:~$ ./execute_intro.sh

Loading modules... [OK]

Connecting to data warehouse... [OK]

Hello, world! I engineer data ecosystems.

snehal@portfolio:~$ _

01. Who Am I?

I'm Snehal Galande, a Computer Engineer with a passion for Data Engineering, Big Data, Cloud Technologies, and Analytics development. I hold a Bachelor's degree in Computer Engineering from Savitribai Phule Pune University with an SGPA of 9.51/10.0.

With 2.6+ years of hands-on project experience, I specialize in building scalable ETL pipelines, cloud-native data lakehouse architectures, distributed data processing systems, and analytics platforms using PySpark, SQL, Databricks, AWS, Azure, Delta Lake, and Apache Airflow. My expertise spans data pipeline development, data transformation, cloud data engineering, and analytics-driven solutions.

Want to work together? Reach out by e-mail, or scroll down to the contact section.

10TB+

Data Processed

99.9%

Pipeline Uptime

02. Tech Stack

Core Languages

Python SQL

Big Data Processing

PySpark Databricks

Cloud Platforms

AWS Microsoft Azure

Data Warehouses & Orchestration

Amazon Redshift Azure Synapse Apache Airflow

Streaming Technologies

Apache Kafka Amazon Kinesis

DevOps & Practices

Git CI/CD Linux

03. Featured Pipelines

Cloud-Native Lakehouse Pipeline

Designed and built a scalable cloud-native data lakehouse architecture using PySpark, Databricks, Delta Lake, and AWS S3. Implemented Medallion Architecture, incremental ETL processing, CDC pipelines, and Airflow orchestration for enterprise-scale analytics workloads.

PySpark Databricks AWS S3 Delta Lake Apache Airflow Redshift ETL CDC Medallion Architecture
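The core of the incremental ETL/CDC work above is an upsert of change events into the silver layer. A minimal sketch of that pattern, using plain Python dicts in place of Delta tables (in the real pipeline this is a Delta Lake MERGE on Databricks; all names here are illustrative):

```python
# Sketch of the CDC upsert step in a Medallion pipeline. A dict keyed by
# primary key stands in for the silver Delta table; "I"/"U"/"D" are the
# change-event operation codes. Hypothetical names, not production code.

def apply_cdc_batch(silver, changes):
    """Apply a batch of CDC events (insert/update/delete) to the silver table."""
    for event in changes:
        if event["op"] in ("I", "U"):
            silver[event["key"]] = event["row"]   # upsert: insert or overwrite
        elif event["op"] == "D":
            silver.pop(event["key"], None)        # delete if the key exists
    return silver

silver = {1: {"name": "alice", "amount": 10}}
batch = [
    {"op": "U", "key": 1, "row": {"name": "alice", "amount": 15}},
    {"op": "I", "key": 2, "row": {"name": "bob", "amount": 7}},
    {"op": "D", "key": 3, "row": None},
]
silver = apply_cdc_batch(silver, batch)
```

The same semantics map directly onto Delta Lake's `MERGE INTO ... WHEN MATCHED ... WHEN NOT MATCHED`, which is what makes the processing incremental rather than full-reload.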

Enterprise Knowledge Data Platform

Built an enterprise-scale data platform for centralized analytics and processing of structured and semi-structured datasets. Developed scalable ETL workflows, optimized Spark transformations, and cloud-native data pipelines to improve reliability and downstream reporting accessibility.

PySpark Databricks AWS SQL ETL Pipelines Data Processing Data Validation Cloud Architecture
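The data-validation step mentioned above boils down to partitioning incoming rows into accepted and rejected sets before they reach reporting tables. A simplified sketch, with hypothetical field names and rules:

```python
# Illustrative data-quality gate: rows failing schema/quality checks are
# quarantined with a reason instead of flowing downstream. The "id" and
# "amount" fields and the rules are assumptions, not the production schema.

def validate_rows(rows):
    valid, rejected = [], []
    for row in rows:
        if row.get("id") is None:
            rejected.append((row, "missing id"))
        elif not isinstance(row.get("amount"), (int, float)) or row["amount"] < 0:
            rejected.append((row, "bad amount"))
        else:
            valid.append(row)
    return valid, rejected

rows = [
    {"id": 1, "amount": 120.5},
    {"id": None, "amount": 3.0},
    {"id": 2, "amount": -9},
]
good, bad = validate_rows(rows)
```

In Spark the equivalent is a pair of `filter` expressions (or a library such as Great Expectations), but the pass/quarantine split is the same idea.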

Sales Analytics Repository Platform

Developed a centralized sales analytics repository in Databricks enabling KPI tracking, reporting, and business intelligence workflows. Created optimized SQL transformations and reporting-ready datasets to improve data accessibility and decision-making efficiency.

Python SQL Data Modelling
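A reporting-ready KPI dataset of the kind described above is typically a grouped aggregation over the raw sales facts. A small sketch using SQLite in place of Databricks SQL, with illustrative table and column names:

```python
# Sketch of a KPI rollup: aggregate raw sales rows into a reporting-ready
# per-region summary. SQLite stands in for the warehouse; schema is invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount REAL);
    INSERT INTO sales VALUES ('east', 100), ('east', 50), ('west', 80);
""")
kpis = conn.execute("""
    SELECT region, SUM(amount) AS total_sales, COUNT(*) AS orders
    FROM sales
    GROUP BY region
    ORDER BY total_sales DESC
""").fetchall()
# kpis -> [('east', 150.0, 2), ('west', 80.0, 1)]
```

Materializing the result of a query like this as a table (or Delta table) is what turns ad-hoc SQL into a dataset BI tools can point at directly.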

04. Certifications

05. Init Connection

I'm currently open to new opportunities to build scalable systems. Whether you have a question or just want to say hi, my inbox is always open!

Ping Me