chonzadaniel/README.md

👋 Hi there, I'm Emmanuel Daniel Chonza

🚀 Data Scientist | Generative AI Practitioner | LLM/RAG Engineer

I build end-to-end AI systems that make real-world impact, ranging from LLM fine-tuning and image classification apps to retrieval-augmented generation (RAG) pipelines and AI-assisted job search agents. My work integrates machine learning, deep learning, and natural language processing (NLP) with tooling such as OpenAI, HuggingFace, Streamlit, CrewAI, and ChromaDB.


🧠 What I Do

  • βš™οΈ Train and fine-tune LLMs for domain-specific tasks (e.g., sentiment analysis, resumes, instructions).
  • πŸ€– Develop computer vision applications using ResNet, VGG, and EfficientNet.
  • πŸ’» Train and Deploy Supervised Machine Learning Build regression and predictive Machine learning models.
  • πŸ” Build advanced RAG pipelines using ChromaDB, FAISS, and OpenAI APIs.
  • πŸ§ͺ Experiment with PEFT techniques (LoRA, QLoRA, IA3, DPO) on real-world datasets.
  • πŸ“Š Design data science workflows: MLflow tracking, feature engineering, and model evaluation.
  • 🌐 Deploy AI apps with Streamlit, FastAPI, Slack bots, and RESTful APIs.
  • δ·’ Design and Build M&E Systems driven by results-based management approach, craft Theory of Change, Results Frameworks, M&E Plans, Analyze data, Visualize, build Dynamic Dashboards.

🔭 Current Work

  • 🔬 Fine-tuning & evaluating LLMs on domain-specific sentiment and intent classification.
  • 🧱 Implementing MLOps/LLMOps pipelines for scalable experimentation.
  • 📈 Improving fairness in ML models trained on imbalanced datasets.
  • 🧠 Prompt engineering for grounded and hallucination-free AI output.
  • 🎯 Deploying AI apps with powerful frontends using Streamlit + LangChain + LlamaIndex.

🧩 Featured Projects

Streamlit-powered GenAI App to retrieve and summarize research papers (PDFs) using a multi-vector retriever, ChromaDB, GPT-4o, and web-augmented generation.
PDF Ingestion → Chunking → Embedding → Retrieval → Generation → UI
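The stages above can be sketched end to end without any external service. This is a toy illustration, not the app's code: a bag-of-words counter stands in for a real embedding model, an in-memory list stands in for ChromaDB, and the generation step only assembles the prompt that would be sent to an LLM. All function names here are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Chunking stage: split text into overlapping windows of words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Embedding stage (assumption): word counts instead of a learned model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Retrieval stage: return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(query, context_chunks):
    """Generation stage (prompt only): ground the question in retrieved text."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
```

In a real pipeline, `embed` would call an embedding model and `retrieve` would query a vector store; the control flow stays the same.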


Fine-tuned ResNet50 model (transfer learning) trained on the 120 Stanford Dog Breeds with >80% validation accuracy. The Streamlit UI predicts the breed from an uploaded .jpeg or .png image.


💼 [Resume & Job Application Advisor]

Agentic Streamlit App powered by CrewAI + Open-source LLMs. Guides users in:

  • Resume feedback.
  • Tailored job openings.
  • Cover letter generation.
  • Interview Q&A.

💳 [Credit Card Fraud Detector]

Robust ML pipeline for highly imbalanced datasets, including:

  • Stratified train/test splitting.
  • Oversampling (SMOTE).
  • GridSearch + XGBoost.
  • ROC-AUC, confusion matrix.
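Two of the steps above, stratified splitting and the evaluation metrics, are small enough to write out in plain Python. This is a sketch for illustration only, with hypothetical function names; in practice scikit-learn's `train_test_split(..., stratify=y)` and `roc_auc_score` cover the same ground, and resampling such as SMOTE should be applied to the training split only.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with each class split in the same ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Keep at least one example of every class in the test set.
        n_test = max(1, round(len(idxs) * test_fraction))
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return sorted(train_idx), sorted(test_idx)


def roc_auc(y_true, scores):
    """ROC-AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary 0/1 labels."""
    tn = fp = fn = tp = 0
    for y, p in zip(y_true, y_pred):
        if y == 1 and p == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif p == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp
```

The pairwise AUC above is quadratic in the number of examples, which is fine for a demonstration but not for a full fraud dataset.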

🐦 [Racist Tweet Classifier]

NLP workflow with:

  • SymSpell spell correction.
  • Stratified cross-validation.
  • Oversampling.
  • Streamlit UI for public demo.
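Combining stratified cross-validation with oversampling has one common pitfall: duplicated minority examples must not leak into the validation fold. The sketch below shows one way to order the two steps. It assumes plain random oversampling (the project's exact method is not stated here), and the function names are hypothetical.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, n_splits=5, seed=0):
    """Yield (train_idx, val_idx) pairs with every class spread across folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(n_splits)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Deal each class round-robin so fold class ratios stay close.
        for pos, i in enumerate(idxs):
            folds[pos % n_splits].append(i)
    for k in range(n_splits):
        val = sorted(folds[k])
        train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        yield train, val


def random_oversample(indices, labels, seed=0):
    """Repeat minority-class indices until each class matches the largest."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i in indices:
        by_class[labels[i]].append(i)
    target = max(len(v) for v in by_class.values())
    balanced = []
    for idxs in by_class.values():
        balanced.extend(idxs)
        balanced.extend(rng.choices(idxs, k=target - len(idxs)))
    return balanced


def balanced_folds(labels, n_splits=5, seed=0):
    """Split first, then oversample the training part only."""
    for train, val in stratified_kfold(labels, n_splits, seed):
        yield random_oversample(train, labels, seed), val
```

Because oversampling happens after the split, each validation fold keeps the original class imbalance and contains no copies of training examples.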

🚗 [Used Car Price Prediction]

Regression pipeline using XGBoost, feature engineering, and marketplace data (brand, model, mileage, engine size, etc.).
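As a rough illustration of the feature engineering step, the sketch below turns one raw listing into numeric features a tree-based regressor could consume. The field names, the log transform of mileage, and the one-hot brand encoding are assumptions chosen for the example, not the project's actual feature set.

```python
import math


def engineer_features(listing, known_brands):
    """Map one listing dict to a flat dict of numeric features.

    `listing` is assumed to carry "brand", "mileage", and "engine_size" keys.
    """
    features = {
        # log1p compresses the long right tail typical of mileage values.
        "log_mileage": math.log1p(listing["mileage"]),
        "engine_size": float(listing["engine_size"]),
    }
    # One-hot encode the brand; unseen brands map to all zeros.
    for brand in known_brands:
        features[f"brand_{brand}"] = 1.0 if listing["brand"] == brand else 0.0
    return features
```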


🧪 [Parameter Efficient Fine-Tuning (PEFT)]

Experiments with LoRA, QLoRA, IA3, and DPO on binary sentiment tasks using HuggingFace Transformers + bitsandbytes.
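To make the "parameter efficient" part concrete, here is the arithmetic behind LoRA in plain Python. A frozen weight matrix W (d × k) is adapted as W + (alpha / r) · B · A, where only B (d × r) and A (r × k) are trained. This is a sketch of the published formulation with hypothetical helper names, not the experiment code, which uses HuggingFace libraries.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, with the rank r taken from A's row count."""
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)  # (d x r) @ (r x k) -> (d x k), same shape as W
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(d, k, r):
    """Return (full fine-tuning count, LoRA adapter count) for one d x k layer."""
    return d * k, r * (d + k)
```

For a 4096 × 4096 layer at rank 8, `trainable_params(4096, 4096, 8)` gives 16,777,216 weights for full fine-tuning against 65,536 for the adapter, which is 256 times fewer. With B initialised to zeros, as in the LoRA paper, the adapted weight starts out equal to W.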


🖼️ [FoodVision & DogVision]

Custom CNN and pretrained ResNet models trained on:

  • 🍣 Food101 (sushi, pizza, steak...).
  • 🐕 Stanford Dog Breeds (with label mapping & confidence overlay).
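The label mapping and confidence step can be illustrated independently of any model: raw class scores (logits) are turned into probabilities with softmax, then the top entries are mapped back to readable names. The sketch below is a minimal version with hypothetical function names and an illustrative three-class label map; the real models output one score per class in the dataset.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities; subtracting the max avoids overflow."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_predictions(logits, index_to_label, k=3):
    """Return the k most likely (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [(index_to_label[i], probs[i]) for i in ranked]


# Illustrative label map (assumption); a real one has an entry per class index.
labels = {0: "beagle", 1: "pug", 2: "husky"}
for name, prob in top_k_predictions([2.0, 0.5, 1.0], labels, k=2):
    print(f"{name}: {prob:.1%}")
```

The formatted strings are what a UI would draw next to or on top of the uploaded image.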

📦 Coming Soon

  • 💬 Multi-turn chatbot with memory + web search + RAG.
  • 🧑‍💼 Job Application Assistant v2 (LangGraph-powered).
  • 🛰️ LLM inference microservices (FastAPI + LangServe).
  • 🧬 BGE-Large + Llama3 RAG for scientific documents.

📫 Reach Me


🛠️ Tech Stack

Languages: Python, R, SQL, Markdown

Algorithms: LLMs, ML, NLP, Transformers, CNNs, ANNs, RNNs, LSTMs, GANs

Frameworks & Tools: PyTorch, scikit-learn, Transformers, Streamlit, MLflow, FastAPI, LangChain, LlamaIndex, ChromaDB, OpenAI, HuggingFace, Plotly, Matplotlib, seaborn, CrewAI, crewai-tools, APIs

MLOps: MLflow, wandb, Docker, Conda, Git, Kaggle, AWS

Deployment: HuggingFace Spaces, Streamlit Cloud, Slack, Local API, Render, AWS

IDEs and Editors: Jupyter, Google Colab, PyCharm, Visual Studio Code, Kaggle, Sublime Text, Thonny


✨ Motto

β€œBuild. Evaluate. Iterate. Deploy. Share.”

Let’s collaborate on AI that matters. Feel free to explore my work or reach out!

Popular repositories

  1. ChatGPT-repository (Public)

  2. ChatGPT (Public)

  3. notebook (Public)

    Forked from jupyter/notebook

    Jupyter Interactive Notebook

    Jupyter Notebook

  4. MLproject (Public)

    Project Coding

    Jupyter Notebook

  5. khu-FinalProject (Public)

    Jupyter Notebook

  6. Credit-card-FraudDetection (Public)

    Submission of Project

    Jupyter Notebook