Khaja Mujahiddin Mohammed

Hi, I'm Khaja Mujahiddin Mohammed

Data Scientist & Cloud Engineer — NLP • Computer Vision • MLOps • AWS

NLP • Deep Learning • ML • Data Viz
3+ years • Data Science & Cloud
85% accuracy • Multimodal NLP
30% faster ETL • AWS + Airflow
20–30% faster decisions • Power BI
3+ yrs Experience
85% Fake News Acc.
-30% ETL Time
20–30% Ops Speed

About

Accomplished Data Scientist & Cloud Engineer with 3+ years of experience designing and deploying end-to-end data workflows, machine learning models, and cloud-based analytics solutions. Specialized in NLP, Computer Vision, MLOps, and real-time data engineering pipelines using AWS, Python, and Spark. Proven track record: automated 80% of manual tasks, improved forecast accuracy by 15%, and reduced decision-making time by 30%.

Core Strengths

  • End-to-end ML & MLOps on AWS
  • ETL/Data Engineering with Airflow & Spark
  • Interactive BI: Power BI & Tableau
  • Agile collaboration & stakeholder alignment

Domains

  • Natural Language Processing (NLP)
  • Computer Vision (Segmentation, Detection)
  • Forecasting & Predictive Analytics
  • Real-time Data Pipelines

Live Snippet

# Airflow DAG: simple daily retraining job
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def retrain():
    # load data, retrain model, log metrics
    pass


# `schedule` needs Airflow 2.4+; older releases use `schedule_interval`.
# catchup=False skips backfilling runs between start_date and today.
with DAG("daily_retrain", start_date=datetime(2025, 1, 1),
         schedule="@daily", catchup=False) as dag:
    PythonOperator(task_id="retrain_model", python_callable=retrain)
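
The `retrain` placeholder in the DAG above could be filled in along these lines. This is only a minimal, dependency-free sketch: `load_data` and `fit_line` are hypothetical helpers, the data is synthetic, and a one-feature least-squares fit stands in for whatever model a real job would retrain.

```python
import logging
import random
from statistics import fmean

logger = logging.getLogger("daily_retrain")


def load_data(n=200, seed=0):
    """Hypothetical loader: synthetic (x, y) pairs with y close to 2x + 1."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0, 0.1) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def retrain():
    """Load data, fit on the first 80%, and log the error on the rest."""
    xs, ys = load_data()
    split = int(0.8 * len(xs))
    slope, intercept = fit_line(xs[:split], ys[:split])
    holdout = zip(xs[split:], ys[split:])
    mae = fmean(abs(y - (slope * x + intercept)) for x, y in holdout)
    metrics = {"slope": slope, "intercept": intercept, "holdout_mae": mae}
    logger.info("retrain metrics: %s", metrics)
    return metrics
```

When a function like this is passed as `python_callable`, Airflow pushes its return value to XCom by default, so a downstream task could read the metrics and decide whether to promote the new model.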

Projects

Fake News Detection (Multimodal ML)

NLP · MLOps · TensorFlow · Hugging Face · Airflow

Self-Driving Car: Pedestrian & Cyclist Segmentation

Deep Learning · Computer Vision · U-Net · OpenCV

E-Commerce Sales Dashboard

Power BI · Analytics · DAX · Data Modeling

Customer Segmentation (K-Means + PCA)

Machine Learning · K-Means · PCA · Power BI

Street View House Numbers (SVHN) Classification

Deep Learning · CNN · TensorFlow · Image Processing

Cloud-based ETL Pipeline Optimization

AWS · ETL · Airflow · Lambda · Redshift · EMR

Experience

  1. AWS Cloud Practitioner — EduBridge Learning Pvt. Ltd.

    Feb 2023 — May 2023 · Remote

    • Improved data processing efficiency by 25% through automated ETL pipelines in Python.
    • Leveraged AWS EC2, S3, RDS, Redshift to optimize workflows (reduced system load by 20%).
    • Integrated Apache Airflow & dbt for scheduled, scalable data transformations.
    • Designed cloud-native architectures supporting large-scale analytics on AWS.
    • Performed AWS cost optimization, saving ~15% via rightsizing & utilization monitoring.
  2. Power BI Developer — SRIK Consulting Services Pvt. Ltd.

    Sep 2020 — Feb 2023 · Hyderabad, India

    • Built 20+ interactive Power BI dashboards integrated with MySQL & Excel.
    • Automated pipelines with SQL & Power Query, reducing manual errors.
    • Designed KPI dashboards that improved visibility by 30% and sped up decision-making.
    • Developed advanced DAX measures for time-series & custom metrics.
    • Standardized data models and documentation for reusability across teams.
  3. Data Analytics Intern — ExcelR Solutions

    Oct 2022 — Jan 2023 · Remote

    • Designed Power BI/Tableau dashboards for KPI tracking & trends.
    • Improved extraction accuracy by 30% via optimized SQL queries.
    • Reduced reporting time by 25% through statistical & visual analyses.
    • Built interactive reports to communicate insights to stakeholders.
    • Collaborated in Agile sprints; presented findings to cross-functional teams.

Skills

  • Languages: Python, SQL, Java (basic), JavaScript (basic)
  • Core Python Stack: Pandas, NumPy, Matplotlib, Seaborn, Jupyter, Google Colab
  • ML / DL: TensorFlow, PyTorch, Scikit-learn, Hugging Face Transformers, CNN, U-Net, NLP, Computer Vision, Model Evaluation
  • Data Engineering: ETL Pipelines, Apache Airflow, dbt, Apache Spark, Hadoop, Data Warehousing, Streaming, Data Modeling, MLflow
  • Cloud: AWS (EC2, S3, RDS, Redshift, EMR, Lambda, SageMaker, IAM), Snowflake
  • DevOps: Docker, Kubernetes, Terraform, Git, GitHub, GitLab, CI/CD
  • BI & Analytics: Power BI (DAX, Power Query, KPI Dashboards), Tableau, Advanced Excel
  • Databases: MySQL, PostgreSQL, MongoDB, PL/SQL
  • Other: OpenCV, Power Query, ROI Analysis, Version Control

Education

Master’s in Data Science

University of New Haven, CT, USA · Aug 2023 — May 2025

  • Coursework: Advanced Machine Learning, Deep Learning, Natural Language Processing, Cloud-Based MLOps, Big Data Analytics.
  • Labs & Tools: Spark-based ETL on AWS EMR; automated model retraining with SageMaker Pipelines.
  • Key Projects: Multimodal Fake News Detection (85% accuracy); Pedestrian & Cyclist Segmentation (U-Net).

Certifications

  • AWS Cloud Practitioner — EduBridge
  • Google Data Analytics — Coursera
  • HackerRank SQL — 5 Star
  • MySQL Developer — Udemy
  • PGDCA (Post Graduate Diploma in Computer Applications)
  • Data Analytics & Visualization Virtual Experience — Forage

Contact

The form opens your mail client with a prefilled email.