Nitinx12/README.md

Nitin - Data Analyst to Data Engineer

Typing SVG


╔══════════════════════════════════════════════════════════════════════════════╗
║                         SYSTEM PROFILE :: NITIN.EXE                          ║
╠══════════════════════════════════════════════════════════════════════════════╣
║  ROLE       │ Data Analyst → Data Engineer (transition in progress)          ║
║  EXP        │ 1.5 years analytics + actively building DE stack               ║
║  STACK      │ SQL · Python (PySpark/Pandas) · Airflow · dbt · Power BI       ║
║  PROJECTS   │ E-Commerce ETL · Retail DW · Bike Store DB · Airflow+dbt       ║
║  LOCATION   │ Delhi, India  │  Remote-ready  │  Hybrid-open                  ║
║  SEEKING    │ DA & DE Roles │ Real problems, not toy datasets                ║
║  MOTTO      │ "Not just dashboards - pipelines that actually run."           ║
╠══════════════════════════════════════════════════════════════════════════════╣
║  DE PIPELINE  [████████████████░░░░░░░░░░░░]  70%  →  Full Stack Engineer    ║
╚══════════════════════════════════════════════════════════════════════════════╝


> CURRENT_MISSION.sh

#!/bin/bash
# What I'm building toward, no fluff

CURRENT_FOCUS="Production-grade ETL pipelines from scratch"
TARGET_STACK=("Airflow" "dbt" "PySpark" "Kafka" "AWS")
APPROACH="Learn by building real things, not just following tutorials"
OPEN_TO="Data Analyst & Data Engineer roles"

echo "[✔] Analyst foundations: SQL, Python, Power BI, EDA"
echo "[✔] Relational DB design: Hospital Mgmt + Bike Store DB shipped"
echo "[✔] Data warehousing: Retail DW (star schema, dimensional model)"
echo "[✔] First pipeline shipped: E-Commerce Analytics (PostgreSQL → DuckDB)"
echo "[⚡] Currently: Airflow orchestration + dbt medallion transforms"
echo "[⏳] Next: Kafka streaming + cloud deployment on AWS"
echo "[🎯] End goal: Full-stack data engineer who understands the business"


> ls ./tech_stack/

// CORE LANGUAGES

Python SQL

// DATA ENGINEERING - ACTIVE BUILD ZONE ⚡

Apache Airflow dbt Apache Kafka PySpark Docker

// ANALYTICS & BI

Power BI Tableau Pandas DuckDB NumPy

// DATABASES & INFRA

PostgreSQL SQLAlchemy GitHub Excel



> cat ./projects/featured.log


🛒 End-to-End E-Commerce Sales Analytics

[STATUS: SHIPPED ✔]  |  View Repo →

PIPELINE ARCHITECTURE:
──────────────────────────────────────────────────────────────────
 Raw CSVs  ──►  PostgreSQL  ──►  Python/Pandas  ──►  DuckDB
   (src)       (ingestion)      (clean+dedupe)     (OLAP queries)
                                                         │
                                                         ▼
                                                Seaborn + Matplotlib
                                                (visual storytelling)
──────────────────────────────────────────────────────────────────

What makes this real:

  • Handles inconsistent, dirty real-world CSVs, not clean Kaggle data
  • SQLAlchemy ORM for reliable, repeatable ingestion into PostgreSQL
  • DuckDB for fast in-process analytical queries (no spinning up a warehouse)
  • Full EDA with visual storytelling, not just charts for the sake of charts

PostgreSQL Python Pandas DuckDB SQLAlchemy Seaborn
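
The clean-and-dedupe step and the analytical query at the end of this pipeline can be illustrated with a small, dependency-free sketch. This is not the project's code: the sample rows, the `clean_rows` helper, and the `orders` table are hypothetical, and the standard-library `csv` and `sqlite3` modules stand in for Pandas, PostgreSQL, and DuckDB so the example runs anywhere.

```python
import csv
import io
import sqlite3

# Hypothetical raw export: stray whitespace, inconsistent casing,
# a duplicated order, and a row with a missing amount.
RAW_CSV = """order_id,customer,amount
1001, Alice ,250.00
1002,bob,99.50
1002,bob,99.50
1003,ALICE,
1004,Carol,40
"""


def clean_rows(text):
    """Trim and normalise fields, drop rows without an amount, dedupe on order_id."""
    seen = set()
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        order_id = row["order_id"].strip()
        amount = row["amount"].strip()
        if not amount or order_id in seen:
            continue  # skip incomplete rows and repeated orders
        seen.add(order_id)
        cleaned.append((int(order_id), row["customer"].strip().title(), float(amount)))
    return cleaned


rows = clean_rows(RAW_CSV)

# sqlite3 is only a stand-in for the analytical store (assumption, see lead-in).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

revenue_by_customer = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
print(revenue_by_customer)
```

Cleaning before loading keeps the analytical table free of duplicates, so the aggregate query does not need defensive `DISTINCT` logic.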


πŸͺ Retail Data Warehouse

[STATUS: SHIPPED βœ”] Β |Β  View Repo β†’

WAREHOUSE ARCHITECTURE:
──────────────────────────────────────────────────────────────────
 Source Data  ──►  Staging Layer  ──►  Dimensional Model
  (raw retail)     (clean/conform)      (Star Schema)
                                              │
                              ┌───────────────┼───────────────┐
                              ▼               ▼               ▼
                          Fact Table    Dim: Product    Dim: Customer
                         (fact_sales)   Dim: Store      Dim: Date
──────────────────────────────────────────────────────────────────

What makes this real:

  • Star schema dimensional model built for analytical query performance
  • Proper fact and dimension table separation following Kimball methodology
  • Slowly Changing Dimensions (SCD) handling for historical accuracy
  • Designed to plug directly into BI tools like Power BI or Tableau

PostgreSQL SQL Python Power BI
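
As an illustration of the SCD handling mentioned above, here is a minimal sketch of a Type 2 update. It rests on assumptions: the README does not say which SCD type the warehouse uses, and the `apply_scd2` helper, the `dim_customer` rows, and the tracked `city` attribute are all hypothetical.

```python
from datetime import date


def apply_scd2(dimension, incoming, today):
    """Apply a Type 2 change: expire the current row and append a new version."""
    for row in dimension:
        if row["customer_id"] == incoming["customer_id"] and row["is_current"]:
            if row["city"] == incoming["city"]:
                return dimension  # nothing changed, keep history as-is
            row["valid_to"] = today
            row["is_current"] = False
    dimension.append({
        "customer_key": len(dimension) + 1,  # surrogate key for the new version
        "customer_id": incoming["customer_id"],
        "city": incoming["city"],
        "valid_from": today,
        "valid_to": None,
        "is_current": True,
    })
    return dimension


# Hypothetical dimension with one current row for customer C001.
dim_customer = [{
    "customer_key": 1, "customer_id": "C001", "city": "Delhi",
    "valid_from": date(2024, 1, 1), "valid_to": None, "is_current": True,
}]

apply_scd2(dim_customer, {"customer_id": "C001", "city": "Mumbai"}, date(2025, 6, 1))
for row in dim_customer:
    print(row["customer_key"], row["city"], row["valid_from"], row["valid_to"], row["is_current"])
```

Fact rows keep pointing at the surrogate key that was current when they were loaded, so historical sales stay attributed to the city the customer lived in at the time.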


🚲 Bike Store Relational Database

[STATUS: SHIPPED ✔]  |  View Repo →

SCHEMA DESIGN:
──────────────────────────────────────────────────────────────────
 Business Requirements  ──►  ERD Design  ──►  Normalized Schema
       (analysis)             (entities)        (3NF tables)
                                                     │
                                  ┌──────────────────┤
                                  ▼                  ▼
                             Stored Procs         Indexes
                             + Triggers          + Views
──────────────────────────────────────────────────────────────────

What makes this real:

  • Full relational design: customers, orders, products, staff, stores, inventory
  • Normalized to 3NF to avoid update anomalies, with enforced referential integrity
  • Stored procedures and triggers for business logic at the DB layer
  • Window functions and JOIN-heavy queries for real analytical reporting

SQL
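
Referential integrity and trigger-based business logic, as described above, can be sketched in a few lines. This is an assumption-laden illustration, not the repository's schema: the `products`, `stocks`, and `order_items` tables and the `trg_reduce_stock` trigger are hypothetical, and SQLite (via Python's `sqlite3`) stands in for the project's database engine. SQLite has no stored procedures, so only the trigger half is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE products (
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE stocks (
    product_id INTEGER PRIMARY KEY REFERENCES products(product_id),
    quantity   INTEGER NOT NULL CHECK (quantity >= 0)
);
CREATE TABLE order_items (
    item_id    INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL REFERENCES products(product_id),
    quantity   INTEGER NOT NULL
);
-- Business rule at the DB layer: selling an item reduces stock.
CREATE TRIGGER trg_reduce_stock AFTER INSERT ON order_items
BEGIN
    UPDATE stocks SET quantity = quantity - NEW.quantity
    WHERE product_id = NEW.product_id;
END;
""")

conn.execute("INSERT INTO products VALUES (1, 'Trail Bike')")
conn.execute("INSERT INTO stocks VALUES (1, 10)")
conn.execute("INSERT INTO order_items (product_id, quantity) VALUES (1, 3)")
remaining = conn.execute("SELECT quantity FROM stocks WHERE product_id = 1").fetchone()[0]

# An order line for a product that does not exist violates the foreign key.
try:
    conn.execute("INSERT INTO order_items (product_id, quantity) VALUES (99, 1)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(remaining, orphan_rejected)
```

Putting the stock update in a trigger means every client that inserts an order line gets the same behaviour, whichever application wrote the row.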


⚡ Airflow + dbt Medallion ETL Pipeline (In Progress)

[STATUS: BUILDING 🔧]

PLANNED ARCHITECTURE:
──────────────────────────────────────────────────────────────────
 CSV / PostgreSQL  ──►  Airflow DAGs  ──►  dbt Transforms
      (sources)         (orchestrate)       (Bronze → Silver → Gold)
                                                     │
                                                     ▼
                                            Power BI Dashboard
                                            (business insights)
──────────────────────────────────────────────────────────────────

Full medallion architecture: Airflow handles scheduling and orchestration, dbt handles all transformation logic with tests and documentation, and Power BI sits at the output layer. Watch this repo.
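
To make the Bronze → Silver → Gold flow concrete, here is a minimal sketch in plain Python. It deliberately uses neither the Airflow nor the dbt API: the standard-library `graphlib.TopologicalSorter` stands in for DAG ordering, and the sample records, layer functions, and `run_pipeline` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical raw records landing in the bronze layer.
RAW_ORDERS = [
    {"order_id": "1", "region": " north ", "amount": "120.5"},
    {"order_id": "2", "region": "South", "amount": "80"},
    {"order_id": "2", "region": "South", "amount": "80"},   # duplicate
    {"order_id": "3", "region": "north", "amount": "bad"},  # unparseable amount
]


def bronze(_state):
    """Land the source data unchanged."""
    return list(RAW_ORDERS)


def silver(state):
    """Type, clean, and dedupe the bronze rows."""
    cleaned, seen = [], set()
    for row in state["bronze"]:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop rows that fail type conversion
        if row["order_id"] in seen:
            continue
        seen.add(row["order_id"])
        cleaned.append({"order_id": int(row["order_id"]),
                        "region": row["region"].strip().lower(),
                        "amount": amount})
    return cleaned


def gold(state):
    """Aggregate silver rows into a business-facing table."""
    totals = {}
    for row in state["silver"]:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return totals


TASKS = {"bronze": bronze, "silver": silver, "gold": gold}
DEPENDENCIES = {"silver": {"bronze"}, "gold": {"silver"}}  # task -> upstream tasks


def run_pipeline():
    """Run each task after its upstream tasks, as an orchestrator would."""
    state = {}
    for name in TopologicalSorter(DEPENDENCIES).static_order():
        state[name] = TASKS[name](state)
    return state


result = run_pipeline()
print(result["gold"])
```

In the planned stack, the dependency map would become an Airflow DAG and the `silver` and `gold` functions would become dbt models. The layering idea is the same.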


📡 Real-Time Streaming Pipeline (Planned)

[STATUS: QUEUED 🟡]

Kafka Producer → Kafka Topics → PySpark Streaming → PostgreSQL / Delta Lake

Event-driven pipeline with real-time processing. The step that separates analysts from engineers.
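
One kind of processing such a pipeline typically performs is windowed aggregation over an event stream. The sketch below shows the idea in plain Python with no Kafka or Spark involved: the `event_stream` generator is a hypothetical stand-in for a topic, and `tumbling_window_counts` is an illustrative helper, not a library call.

```python
from collections import defaultdict


def event_stream():
    """Stand-in for a topic: yields (timestamp_seconds, event_type) in arrival order."""
    yield from [(0, "view"), (3, "view"), (7, "purchase"),
                (11, "view"), (14, "purchase"), (25, "view")]


def tumbling_window_counts(events, window_seconds):
    """Count events per type in fixed, non-overlapping time windows."""
    counts = defaultdict(lambda: defaultdict(int))
    for timestamp, event_type in events:
        # Every event belongs to exactly one window, identified by its start time.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[window_start][event_type] += 1
    return {start: dict(by_type) for start, by_type in sorted(counts.items())}


window_counts = tumbling_window_counts(event_stream(), window_seconds=10)
print(window_counts)
```

A real streaming engine also has to handle late and out-of-order events, which this sketch ignores.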



> ./learning_tracker --verbose

╔══════════════════════════════════════════════════════════════════════════════╗
║                        SKILL ACQUISITION LOG :: 2026                         ║
╠══════════════════════╦════════════════════════════╦══════════╦═══════════════╣
║ TRACK                ║ FOCUS AREAS                ║ PROGRESS ║ STATUS        ║
╠══════════════════════╬════════════════════════════╬══════════╬═══════════════╣
║ Data Engineering     ║ PySpark                    ║          ║               ║
║                      ║ SQLAlchemy                 ║ ▓▓▓▓▓▓░░ ║ ⚡ ACTIVE     ║
║                      ║ Airflow                    ║   75%    ║               ║
║                      ║ dbt                        ║          ║               ║
║                      ║ Medallion Architecture     ║          ║               ║
╠══════════════════════╬════════════════════════════╬══════════╬═══════════════╣
║ Advanced Analytics   ║ Advanced SQL               ║          ║               ║
║                      ║ Pandas                     ║ ▓▓▓▓▓▓▓▓ ║ ✅ SOLID      ║
║                      ║ Power BI                   ║   95%    ║               ║
║                      ║ Tableau                    ║          ║               ║
║                      ║ MySQL / PostgreSQL         ║          ║               ║
╠══════════════════════╬════════════════════════════╬══════════╬═══════════════╣
║ DB Design            ║ Dimensional Modeling       ║          ║               ║
║                      ║ Star / Snowflake Schema    ║ ▓▓▓▓▓▓░░ ║ ✅ SHIPPED    ║
║                      ║ 3NF Normalization          ║   80%    ║               ║
║                      ║ SCD / Slowly Changing Dims ║          ║               ║
╠══════════════════════╬════════════════════════════╬══════════╬═══════════════╣
║ Cloud Infrastructure ║ AWS (S3, Glue, Athena,     ║ ▓▓░░░░░░ ║ 🟢 SCOUTING   ║
║                      ║ Redshift, IAM, Lambda)     ║   25%    ║               ║
╚══════════════════════╩════════════════════════════╩══════════╩═══════════════╝


> cat ./certifications.txt

Google Data Analytics · Microsoft PL-300 · HackerRank SQL Gold



> github --stats








> snake --generate

Snake animation


╔══════════════════════════════════════════════════════════════════════╗
║                                                                      ║
║   $ echo "Open to DA & DE roles. Let's build something real."        ║
║                                                                      ║
║   > Open to DA & DE roles. Let's build something real.               ║
║                                                                      ║
║   $ _                                                                ║
║                                                                      ║
╚══════════════════════════════════════════════════════════════════════╝

LinkedIn · Email

Popular repositories

  1. Nitinx12 (Public)

  2. sql-data-warehouse-project (Public)

    Forked from DataWithBaraa/sql-data-warehouse-project

    A comprehensive guide to building a modern data warehouse with SQL Server, including ETL processes, data modeling, and analytics.

    TSQL

  3. Databricks_Medallion_Warehouse (Public)

    Jupyter Notebook

  4. Retail_data_warehouse (Public)

    Python

  5. Museum-Project (Public)

    Python

  6. Bike-Store-Relational-Database (Public)

    PySpark incremental ETL pipeline from MongoDB to PostgreSQL, with SQL data quality tests, schema auto-evolution, and a 21-script analytics library for a retail store dataset.

    Python