Data Scientist | Generative AI Practitioner | LLM/RAG Engineer
I build end-to-end AI systems that make real-world impact, ranging from LLM fine-tuning and image classification apps to retrieval-augmented generation (RAG) pipelines and AI-assisted job search agents. My work integrates machine learning, deep learning, and natural language processing (NLP) with cutting-edge tooling like OpenAI, HuggingFace, Streamlit, CrewAI, and ChromaDB.
- Train and fine-tune LLMs for domain-specific tasks (e.g., sentiment analysis, resumes, instructions).
- Develop computer vision applications using ResNet, VGG, and EfficientNet.
- Train and deploy supervised machine learning models, including regression and other predictive models.
- Build advanced RAG pipelines using ChromaDB, FAISS, and OpenAI APIs.
- Experiment with PEFT techniques (LoRA, QLoRA, IA3) and preference optimization (DPO) on real-world datasets.
- Design data science workflows: MLflow tracking, feature engineering, and model evaluation.
- Deploy AI apps with Streamlit, FastAPI, Slack bots, and RESTful APIs.
- Design and build monitoring and evaluation (M&E) systems driven by a results-based management approach: craft Theories of Change, results frameworks, and M&E plans; analyze and visualize data; and build dynamic dashboards.
- Fine-tuning & evaluating LLMs on domain-specific sentiment and intent classification.
- Implementing MLOps/LLMOps pipelines for scalable experimentation.
- Improving fairness in ML models trained on imbalanced datasets.
- Prompt engineering for grounded and hallucination-free AI output.
- Deploying AI apps with powerful frontends using Streamlit + LangChain + LlamaIndex.
Streamlit-powered GenAI app that retrieves and summarizes research papers (PDFs) using a multi-vector retriever, ChromaDB, GPT-4o, and web-augmented generation.
PDF Ingestion → Chunking → Embedding → Retrieval → Generation → UI
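As a rough illustration of the chunking, embedding, and retrieval stages in the flow above, here is a minimal, dependency-free Python sketch. It is not the app's actual code: the bag-of-words `embed` function stands in for a real embedding model, a plain Python list stands in for ChromaDB, and every function name (`chunk_text`, `embed`, `cosine`, `retrieve`, `build_prompt`) is hypothetical. The generation step is left out; `build_prompt` only shows how retrieved chunks might be handed to an LLM.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into overlapping windows of `chunk_size` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy embedding: term counts (assumption: stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query, context_chunks):
    """Assemble a grounded prompt; sending it to an LLM is out of scope here."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
```

A typical call sequence would be `chunks = chunk_text(paper_text)`, then `build_prompt(question, retrieve(question, chunks))`, where `paper_text` and `question` are placeholders for the extracted PDF text and the user's query.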
Fine-tuned ResNet50 model (transfer learning) trained on 120 Stanford Dog Breeds with >80% validation accuracy. The app UI, built with Streamlit, predicts the breed from an uploaded `.jpeg`/`.png` image.
Agentic Streamlit app powered by CrewAI and open-source LLMs. It guides users through:
- Resume feedback.
- Tailored job openings.
- Cover letter generation.
- Interview Q&A.
Robust ML pipeline for highly imbalanced datasets, including:
- Stratified train/test splitting.
- Oversampling (SMOTE).
- GridSearch + XGBoost.
- ROC-AUC, confusion matrix.
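To make two of the steps above concrete, the sketch below shows a stratified train/test split and a ROC-AUC computation in plain Python. It is an illustration under stated assumptions, not the project's code: a real pipeline would more likely rely on scikit-learn, imbalanced-learn, and XGBoost, and neither SMOTE nor the grid search is reproduced here. The function names `stratified_split` and `roc_auc` are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) so each class keeps roughly its share in both sets."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Keep at least one test example per class, even for rare classes.
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def roc_auc(y_true, scores):
    """ROC-AUC as the chance a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Splitting by class matters most when positives are rare: a purely random split can leave the test set with almost no minority examples, which makes ROC-AUC and the confusion matrix unreliable.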
NLP workflow with:
- SymSpell spell correction.
- Stratified cross-validation.
- Oversampling.
- Streamlit UI for public demo.
Regression pipeline using XGBoost, feature engineering, and marketplace data (brand, model, mileage, engine size, etc.).
Experiments with LoRA, QLoRA, IA3, and DPO on binary sentiment tasks using HuggingFace Transformers + bitsandbytes.
Custom CNN and pretrained ResNet models trained on:
- Food101 (sushi, pizza, steak...).
- Stanford Dog Breeds (with label mapping & confidence overlay).
- Multi-turn chatbot with memory + web search + RAG.
- Job Application Assistant v2 (LangGraph-powered).
- LLM inference microservices (FastAPI + LangServe).
- BGE-Large + Llama3 RAG for scientific documents.
- GitHub Projects
- LinkedIn Profile
- Email: chonzadaniel@yahoo.com
Languages: Python, R, SQL, Markdown
Algorithms: LLMs, ML, NLP, Transformers, CNNs, ANNs, RNNs, LSTMs, GANs
Frameworks & Tools: PyTorch, scikit-learn, Transformers, Streamlit, MLflow, FastAPI, LangChain, LlamaIndex, ChromaDB, OpenAI, HuggingFace, Plotly, Matplotlib, seaborn, CrewAI, crewai-tools, APIs
MLOps: MLflow, wandb, Docker, Conda, Git, Kaggle, AWS
Deployment: HuggingFace Spaces, Streamlit Cloud, Slack, Local API, Render, AWS
IDEs and Editors: Jupyter, Google Colab, PyCharm, Visual Studio Code, Kaggle, Sublime Text, Thonny
“Build. Evaluate. Iterate. Deploy. Share.”
Let's collaborate on AI that matters. Feel free to explore my work or reach out!
