Lead Scoring Backend (FastAPI + LangChain + LangGraph + uv)
Implements a backend service to score leads' buying intent (High / Medium / Low) using a combination of rule-based logic and AI reasoning, orchestrated with LangChain and LangGraph, and served with the high-performance uv
ASGI server.
Features
- Upload a Product/Offer JSON
- Upload Leads via CSV
- Score using a rule layer (0–50) + AI layer (0–50)
- Retrieve results (JSON) or export CSV
- Interactive API docs at /docs
Project Structure
- app/models.py: Pydantic models, in-memory storage
- app/scoring_logic.py: Deterministic rule-based scoring
- app/llm_chain.py: LangChain prompt + Gemini model + response parsing
- app/scoring_workflow.py: LangGraph workflow orchestrating steps
- app/main.py: FastAPI app and endpoints
Setup
Prereqs:
- Python 3.12+
- uv tool (for local dev) or Docker (for containerized run)
- Create a
.env
in repo root with your Gemini API key:
GEMINI_API_KEY=your_key_here
- Install dependencies for local dev (using uv):
uv pip install -e .
Run
Launch the API server (hot reload):
uv run app.main:app --reload --port 8000
Then open http://localhost:8000/docs
Endpoints (Quick Reference)
- POST /offer: Upload product/offer JSON
- POST /leads/upload: Upload leads CSV (columns: name, role, company, industry, location, linkedin_bio)
- POST /score: Execute scoring workflow for uploaded leads
- GET /results: Get scored leads as JSON
- GET /results/csv: Download scored leads as CSV
Notes
- For assignment scope, data is stored in-memory and resets when the server restarts.
- Rule layer: role relevance, ICP overlap, completeness (max 50).
- AI layer: Gemini classifies intent + reasoning; intent mapped to points (High=50, Medium=30, Low=10).
API usage examples (cURL)
Assumes server running at http://localhost:8000
- Upload Offer
curl -X POST http://localhost:8000/offer \
-H 'Content-Type: application/json' \
-d '{
"name": "Acme Analytics",
"value_props": ["Faster dashboards", "Cost-effective"],
"ideal_use_cases": ["SaaS analytics", "data-driven marketing teams"]
}'
- Upload Leads (CSV)
CSV must have columns: name, role, company, industry, location, linkedin_bio.
curl -X POST http://localhost:8000/leads/upload \
-H 'Content-Type: multipart/form-data' \
-F 'file=@/path/to/leads.csv'
- Run Scoring
curl -X POST http://localhost:8000/score
- Get Results (JSON)
curl http://localhost:8000/results
- Download Results (CSV)
curl -L -o scored_leads.csv http://localhost:8000/results/csv
Rule logic explained
The deterministic rule score (0–50), implemented in app/scoring_logic.py
, comprises:
- Role relevance (0–20):
- Title contains decision-maker keywords (e.g., head, director, vp, cxo, ceo, cto, cmo, founder, owner, lead): +20
- Title contains influencer keywords (e.g., manager, specialist, analyst, coordinator, associate): +10
- Otherwise: +0
- ICP/industry match (0–20):
- Token overlap between
offer.ideal_use_cases
and the lead'sindustry
string. - 2+ overlapping tokens: +20; exactly 1 token: +10; else: +0
- Token overlap between
- Data completeness (0–10):
- All of name, role, company, industry, location, linkedin_bio are non-empty.
The final rule score is bounded to 0..50.
Prompt and AI mapping
The LLM prompt (see app/llm_chain.py
) requests a fixed two-line response:
Intent: <High|Medium|Low>
Reasoning: <one or two concise sentences>
We parse the intent via regex and map it to points:
- High → 50 points
- Medium → 30 points
- Low → 10 points
These AI points combine with the rule score to produce the final_score
.
Docker
Build the image:
docker build -t assignment-backend:latest .
Run the container (requires GEMINI_API_KEY
):
docker run --rm -p 8000:8000 \
-e GEMINI_API_KEY=your_key_here \
assignment-backend:latest
Now open http://localhost:8000/docs