I'm Farrukh, an ML engineer who enjoys building production-grade ML systems and squeezing models into places they probably shouldn't fit.
I work primarily with PyTorch, TensorFlow, and Hugging Face Transformers, focusing on model optimization, deployment, and efficient AI systems. My background is in mechanical engineering, but I've spent the past year designing and deploying ML pipelines. It turns out optimizing fluid-flow equations isn't that different from optimizing neural networks.
Right now, I’m focusing on projects that make ML systems leaner and easier to ship:
**Production Inference API** – Built a DistilBERT sentiment service on AWS EC2 using FastAPI. Quantized the model to cut size and latency roughly in half, added CI/CD with GitHub Actions, and hardened request handling for stability under load. The goal is to understand what it takes to keep ML systems reliable in production.
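A minimal sketch of that quantize-and-serve pattern, assuming an off-the-shelf SST-2 DistilBERT checkpoint; the model ID, endpoint path, and request schema here are illustrative stand-ins, not the deployed service:

```python
# Hypothetical sketch: dynamically quantized DistilBERT behind FastAPI.
import torch
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_ID = "distilbert-base-uncased-finetuned-sst-2-english"  # assumed checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID).eval()

# Dynamic quantization stores Linear weights as int8, roughly halving
# model size and cutting CPU inference latency without retraining.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

app = FastAPI()

class Review(BaseModel):
    text: str

@app.post("/predict")
def predict(review: Review):
    inputs = tokenizer(review.text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = quantized(**inputs).logits
    label_id = int(logits.argmax(dim=-1))
    return {"label": quantized.config.id2label[label_id]}
```

Served with `uvicorn`, this keeps int8 inference on CPU, which is what makes a small EC2 instance viable.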
**Model Efficiency Research** – Experimenting with model compression and quantization pipelines for edge and low-latency deployments. I'm especially interested in the trade-offs between model size, speed, and interpretability.
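A sketch of the kind of harness that makes those trade-offs measurable; the toy two-layer model and run counts are stand-ins, assuming CPU inference with PyTorch dynamic quantization:

```python
# Illustrative size/latency comparison: fp32 vs. dynamically quantized int8.
import os
import time
import torch

def size_mb(model: torch.nn.Module, path: str = "tmp_weights.pt") -> float:
    """Serialized weight size in megabytes."""
    torch.save(model.state_dict(), path)
    mb = os.path.getsize(path) / 1e6
    os.remove(path)
    return mb

def latency_ms(model: torch.nn.Module, x: torch.Tensor, runs: int = 100) -> float:
    """Mean CPU inference latency over `runs` calls, after warm-up."""
    with torch.no_grad():
        for _ in range(10):   # warm-up iterations
            model(x)
        start = time.perf_counter()
        for _ in range(runs):
            model(x)
    return (time.perf_counter() - start) / runs * 1e3

# Toy stand-in for a real encoder: two Linear layers.
fp32 = torch.nn.Sequential(
    torch.nn.Linear(768, 3072), torch.nn.ReLU(), torch.nn.Linear(3072, 768)
).eval()
int8 = torch.quantization.quantize_dynamic(fp32, {torch.nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 768)
for name, m in [("fp32", fp32), ("int8", int8)]:
    print(f"{name}: {size_mb(m):6.1f} MB  {latency_ms(m, x):6.2f} ms")
```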
I was part of the UraanAI Techathon 2025, where our team built an integrated AI framework for manufacturing: computer vision for defect detection (99.6% accuracy), a BiLSTM-GRU for predictive maintenance, and LightGBM for demand forecasting. The focus was deployment under real industrial constraints, with limited compute, bandwidth, and cost.
I've also worked on model compression, taking a ResNet-based model from 45M parameters down to 180K (a 99.6% reduction) through knowledge distillation while keeping 94% accuracy. The resulting 4× speedup made real-time inference viable on resource-limited hardware.
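For reference, the core of a distillation objective in PyTorch; the temperature, loss weighting, and stand-in linear models below are illustrative assumptions, not the actual ResNet teacher/student setup:

```python
# Sketch of a knowledge-distillation training step (Hinton-style soft targets).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.7):
    """Blend soft-target KL divergence with standard cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2          # rescales gradients to match the hard loss
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# One training step: the frozen teacher provides soft targets.
teacher = torch.nn.Linear(512, 10).eval()   # stand-in for the large model
student = torch.nn.Linear(512, 10)          # stand-in for the compact model
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-3)

x, y = torch.randn(32, 512), torch.randint(0, 10, (32,))
with torch.no_grad():
    teacher_logits = teacher(x)

optimizer.zero_grad()
loss = distillation_loss(student(x), teacher_logits, y)
loss.backward()
optimizer.step()
```

The temperature softens both distributions so the student learns from the teacher's relative confidences, not just its top prediction.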
| Project | Description | Tech Stack | Highlights |
|---|---|---|---|
| PakIndustry-4.0 | Integrated AI system for manufacturing: computer vision, predictive maintenance, and demand forecasting | PyTorch • LightGBM • FastAPI | 99.6% defect detection • RUL prediction (MAE = 13.4) • Edge deployment |
| Sentiment-MLOps | Production-ready DistilBERT inference API on AWS | Hugging Face • FastAPI • Docker • AWS • GitHub Actions | Quantized model (−50% size/latency) • CI/CD pipeline |
| Model-Compression | Knowledge distillation and quantization pipeline for compact deep learning models | PyTorch • ONNX • NumPy | 99.6% parameter reduction (45M → 180K) • 4× faster inference |
📧 smfarrukhm@gmail.com • 💼 LinkedIn • 💡 Open to ML engineering opportunities