A compact Retrieval-Augmented Generation (RAG) API on Cloudflare Workers (Python) that:
- embeds questions with Workers AI,
- searches a Vectorize index of course data,
- and generates concise answers with a chat model.
It’s designed to power a simple frontend chat with strict CORS, clear JSON errors, and environment-driven config for staging/production.
What it does
-
Ingests a
courses.csv, generates embeddings with Workers AI, and writes strict NDJSON lines ({ id, values, metadata }) for bulk insert. -
Uploads those vectors into a Cloudflare Vectorize index.
-
Exposes an API:
GET /health– livenessGET /version– app version (from env)POST /ask– RAG pipeline →{ "answer": "..." }
git clone https://github.com/vishal-codes/course-hero-rag-bot
cd course-hero-rag-bot
pip install -r requirements.txt
Create .dev.vars (or .env) with your Cloudflare account credentials:
CLOUDFLARE_ACCOUNT_ID=acc_XXXXXXXXXXXXXXXXXXXXXXXXX
CLOUDFLARE_API_TOKEN=cf_api_token_with_ai_vectorize_permissions
# Optional app vars (entry.py will read these)
ALLOWED_ORIGINS=http://localhost:3000,your frontend URL
APP_VERSION=2025.09.02-1
DEBUG=1
CF_EMBED_MODEL=@cf/baai/bge-base-en-v1.5
CF_CHAT_MODEL=@cf/meta/llama-3.1-8b-instruct-fast
CF_TOPK=5
GEN_TEMPERATURE=0.2
GEN_MAX_TOKENS=350Create a wrangler.jsonc file in root directory
Important: envs do not inherit. Repeat
ai,vectorize, andvarsunderenv.staging.
data_loader/
└─ vector_builder.py
Build NDJSON (embeds with Workers AI) and optionally upload:
# Build + upload to Vectorize in one step
python data_loader/vector_builder.py \
--csv data_loader/courses.csv \
--out data_loader/vectors_to_upload.ndjson \
--index csuf-courses \
--insert \
--env ./.dev.varsVerify vectors exist:
npx wrangler vectorize list-vectors csuf-courses --limit 5Start dev server using staging environment and bind Vectorize to prod:
npx wrangler dev --env staging --experimental-vectorize-bind-to-prod
# → Ready on http://localhost:8787Quick checks (in another terminal):
curl -s http://localhost:8787/health
curl -s http://localhost:8787/version
curl -i -X OPTIONS "http://localhost:8787/ask" \
-H "Origin: http://localhost:3000" \
-H "Access-Control-Request-Method: POST" \
-H "Access-Control-Request-Headers: content-type"
curl -i -X POST "http://localhost:8787/ask" \
-H "Origin: http://localhost:3000" \
-H "Content-Type: application/json" \
--data '{"question":"How imp is this 131 course? is it a prereq for any other course?"}'Logs:
npx wrangler tail --env stagingStaging
npx wrangler deploy --env stagingProduction
npx wrangler deploy-
Workers (Python): Serverless runtime at the edge. Your API lives here (
src/entry.py). -
Workers AI: Hosted inference endpoints for embeddings & LLMs (
env.AI.run(model, inputs)). -
Vectorize: Managed vector database. You insert NDJSON vectors and query via bindings (
env.COURSES.query(...)). -
Bindings: How a Worker accesses platform resources:
aibinding →env.AIvectorizebinding →env.COURSES
-
Environments:
staging/productionblocks inwrangler.jsoncwith separate vars/bindings. -
Compatibility Date: Locks runtime features for deterministic behavior.
-
data_loader/vector_builder.py- Loads
courses.csv→ cleans text & metadata. - Embeds course “documents” in batches with Workers AI.
- Writes strict NDJSON lines:
{"id": "...","values":[...768 floats...],"metadata":{...}} - Optional:
--insertPOSTs NDJSON to Vectorize (/vectorize/v2/indexes/<index>/insert).
- Loads
-
src/entry.py(Worker)-
CORS: Handles
OPTIONSpreflight; restricts toALLOWED_ORIGINS. -
POST /ask:-
Embed question with Workers AI (
CF_EMBED_MODEL). -
Query Vectorize (
env.COURSES.query) for top-K matches (returning metadata). -
Build a context block from matches.
-
Generate answer with chat model (
CF_CHAT_MODEL), strip bracketed refs, return:{ "answer": "...", "sources": [] }
-
-
Errors are returned as JSON with appropriate status codes (400/403/502/500).
-
rag_answer.py&example_query.pyshow the same flow using REST (useful for sanity checks).
Request
POST /ask
Content-Type: application/json
{
"question": "how imp is this 131 course? is it a prereq for any other course?
?",
"topK": 5 // optional, 1..10 (default from env)
}
Response
{
"answer": "The CPSC 131 course is a fundamental course in the Computer Science program at CSU Fullerton. It is a prerequisite for several courses, including Compilers and Languages, File Structures and Database Systems, and likely others. Its importance lies in providing a solid foundation in programming concepts and principles. As a prerequisite, it is essential for students to have a strong understanding of programming before taking these courses. Without this course, students may struggle to keep up with the material in the subsequent courses.",
"sources": [
{
"id": "CPSC_323_Mohamadreza_Ahmadnia_42",
"score": 0.688,
"course": "CPSC 323",
"courseName": "Compilers and Languages",
"instructor": "Mohamadreza Ahmadnia"
},
{
"id": "CPSC_323_Shohrat_Geldiyev_55",
"score": 0.68,
"course": "CPSC 323",
"courseName": "Compilers and Languages",
"instructor": "Shohrat Geldiyev"
},
{
"id": "CPSC_332_Shawn_Wang_54",
"score": 0.675,
"course": "CPSC 332",
"courseName": "File Structures and Database Systems",
"instructor": "Shawn Wang"
},
{
"id": "CPSC_332_David_Heckathorn_10",
"score": 0.669,
"course": "CPSC 332",
"courseName": "File Structures and Database Systems",
"instructor": "David Heckathorn"
},
{
"id": "CPSC_323_Song_Choi_58",
"score": 0.667,
"course": "CPSC 323",
"courseName": "Compilers and Languages",
"instructor": "Song Choi"
}
]
}
{ "$schema": "node_modules/wrangler/config-schema.json", "name": "python-rag-bot", "main": "src/entry.py", "compatibility_date": "2025-09-02", "compatibility_flags": ["python_workers"], "observability": { "enabled": true }, // PROD (top-level) "ai": { "binding": "AI" }, "vectorize": [{ "binding": "COURSES", "index_name": "csuf-courses" }], "vars": { "ALLOWED_ORIGINS": "your-frontend-origin", "DEBUG": "0", "APP_VERSION": "2025.09.02-1", "CF_EMBED_MODEL": "@cf/baai/bge-base-en-v1.5", "CF_CHAT_MODEL": "@cf/meta/llama-3.1-8b-instruct-fast", "CF_TOPK": "5", "GEN_TEMPERATURE": "0.2", "GEN_MAX_TOKENS": "350" }, // STAGING "env": { "staging": { "ai": { "binding": "AI" }, "vectorize": [{ "binding": "COURSES", "index_name": "csuf-courses" }], "vars": { "ALLOWED_ORIGINS": "http://localhost:3000,your-frontend-origin", "DEBUG": "1", "APP_VERSION": "2025.09.02-1", "CF_EMBED_MODEL": "@cf/baai/bge-base-en-v1.5", "CF_CHAT_MODEL": "@cf/meta/llama-3.1-8b-instruct-fast", "CF_TOPK": "5", "GEN_TEMPERATURE": "0.2", "GEN_MAX_TOKENS": "350" } } } }