Company: James Chase
Client: Attercop
Product: 24/7 Digital Analyst (Pangolin AI)
Role: Engineer · full-stack ownership
Period: January 2026 to Present
Type: Consulting · RAG / agentic Q&A
The 24/7 Digital Analyst
Embedded with Attercop through James Chase to build Pangolin AI from the ground up: a RAG Q&A platform that grounds answers in proprietary analyst research, streams them through an agentic LangGraph orchestrator, and ships with enterprise identity, indexing, and observability baked in.
FastAPI · LangGraph · Celery · RAG · React · PostgreSQL · Redis · Azure
4 core platform layers delivered
~50K lines authored on services I owned
24/7 analyst-grade Q&A for customers
The product
Attercop's 24/7 Digital Analyst is a Q&A experience backed by Pangolin AI. It ingests CMS research into vector and structured indexes, retrieves and reranks under per-customer entitlement rules, runs a streaming agentic orchestrator that fans out across unstructured retrieval and Text2SQL tools, and returns answers to a React UI with inline citations.
Identity, billing, and access flow through an external Identity Platform (VIP) and a CMS-backed provider (Sitefinity). I owned end-to-end design and delivery across the services I touched — backend, infrastructure, and selected frontend work.
What I built
Identity bridge (Sitefinity ↔ VIP ↔ Pangolin)
- Built Pangolin's auth layer end-to-end: Sitefinity opaque-token introspection → VIP /resolve → per-tenant session caching in Redis.
- Two-layer cache (L1 token, L2 session) with TTL tuned so token top-ups reflect in under ~30 seconds — without hammering VIP on every request.
- Three-state access model so token-exhausted users get a read-only UI instead of a hard 403 that cleared tokens and looped the SDK through a redirect.
- Fixed the blocked-user redirect loop by separating authentication errors from authorisation errors and adding a deterministic /auth/status poll plus a no-access page that makes zero further API calls.
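A minimal sketch of how the three-state access model and the authentication/authorisation split above might fit together. This is not the production code: the state names, the `Session` fields, and the helper functions are assumptions for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AccessState(Enum):
    # Hypothetical state names; the source only says there are three states.
    FULL = "full"            # may read history and ask new questions
    READ_ONLY = "read_only"  # token balance exhausted: history stays visible
    BLOCKED = "blocked"      # no entitlement: route to the no-access page


class AuthenticationError(Exception):
    """Identity could not be established (would map to HTTP 401)."""


@dataclass(frozen=True)
class Session:
    # Hypothetical shape of a resolved session.
    user_id: str
    entitled: bool
    tokens_remaining: int


def access_state(session: Optional[Session]) -> AccessState:
    """Classify a session; only a missing identity is an authentication error."""
    if session is None:
        raise AuthenticationError("no valid session")
    if not session.entitled:
        # Authorisation outcome, not an auth failure: never clear tokens here.
        return AccessState.BLOCKED
    if session.tokens_remaining <= 0:
        return AccessState.READ_ONLY
    return AccessState.FULL


def can_read(state: AccessState) -> bool:
    return state is not AccessState.BLOCKED


def can_write(state: AccessState) -> bool:
    return state is AccessState.FULL
```

Returning a state instead of raising for exhausted or blocked users is what lets the frontend poll a status endpoint and render a read-only or no-access view without triggering another login redirect.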
Indexing pipeline
- Celery + RabbitMQ workflow with separate IO, CPU, and image worker fleets — orchestrate → fetch → chunk → embed (batched cross-document) → vision-LLM enrichment → upsert.
- DLQ on every stage with per-stage retry policies; idempotency via watermarks and integrity checks after upsert.
- Cross-document embedding batching cut average API latency per chunk by roughly 6× versus single-document batching.
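The cross-document batching idea can be sketched as follows, assuming an embedding client that accepts a list of texts per call. The function name, the `embed_batch` callable, and the default batch size are hypothetical; the point is only that chunks from several documents share one request and are then routed back to their owners.

```python
from typing import Callable, Dict, List, Optional, Sequence

Vector = List[float]


def embed_cross_document(
    docs: Dict[str, List[str]],
    embed_batch: Callable[[Sequence[str]], List[Vector]],
    batch_size: int = 64,
) -> Dict[str, List[Optional[Vector]]]:
    """Embed chunks from many documents using shared, fixed-size batches."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    # Flatten to (doc_id, chunk_index, text) so a batch can span documents.
    flat = [
        (doc_id, index, text)
        for doc_id, chunks in docs.items()
        for index, text in enumerate(chunks)
    ]
    # Pre-size the output so each vector lands at its original chunk position.
    result: Dict[str, List[Optional[Vector]]] = {
        doc_id: [None] * len(chunks) for doc_id, chunks in docs.items()
    }

    for start in range(0, len(flat), batch_size):
        batch = flat[start:start + batch_size]
        vectors = embed_batch([text for _, _, text in batch])
        if len(vectors) != len(batch):
            raise ValueError("embedding client returned a mismatched batch")
        for (doc_id, index, _), vector in zip(batch, vectors):
            result[doc_id][index] = vector
    return result
```

With many short documents, per-document batching sends one mostly empty request per document, while the shared batches above stay full, which is where the per-chunk latency saving would come from.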
Streaming orchestrator & observability
- Stream-publisher primitives bridging LangGraph nodes to Atterchat SSE, including an inline-marker parser that strips [N] from the wire while preserving markers for persistence.
- OTel + Langfuse traces nested under LangGraph node spans; prompt-version provenance carried through every chain.
- Citation reload and historical markers: hydration endpoints so conversation reload repopulates citations and splices inline chips from Pangolin's marker-bearing answers — not Atterchat's marker-free message store.
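As an illustration of the inline-marker parser described above, here is a small streaming sketch. It assumes markers are ASCII digits in square brackets and that a marker can be split across stream chunks; the class and method names are hypothetical and the real parser may differ.

```python
from typing import List


class MarkerStripper:
    """Strip [N] markers from streamed text while keeping the raw text."""

    def __init__(self) -> None:
        self._pending = ""             # possible partial marker, e.g. "[1"
        self._raw_parts: List[str] = []

    def feed(self, chunk: str) -> str:
        """Return the part of `chunk` that is safe to put on the wire."""
        self._raw_parts.append(chunk)
        out: List[str] = []
        for ch in chunk:
            if not self._pending:
                if ch == "[":
                    self._pending = "["   # hold back until we know what it is
                else:
                    out.append(ch)
            elif "0" <= ch <= "9":
                self._pending += ch
            elif ch == "]" and len(self._pending) > 1:
                self._pending = ""        # complete marker: drop it
            else:
                # Not a marker after all, so release the held-back text.
                out.append(self._pending)
                self._pending = ""
                if ch == "[":
                    self._pending = "["
                else:
                    out.append(ch)
        return "".join(out)

    def flush(self) -> str:
        """Release any held-back text once the stream ends."""
        tail, self._pending = self._pending, ""
        return tail

    @property
    def raw(self) -> str:
        """Marker-bearing text, suitable for persistence."""
        return "".join(self._raw_parts)
```

Each streamed chunk goes through `feed` before it is published, and `raw` is what gets stored so that a later reload can splice citation chips back in at the marker positions.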
Platform hygiene
- OpenAPI and Celery task-reference generators that walk the live FastAPI route graph and task registry, with a --check flag for CI so the public surface stays diff-reviewable.
- Access-state UX: read-only mode for token-exhausted users and frontend state machines that respect the three dependency variants on write vs read paths.
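The `--check` pattern for the generators can be sketched with the standard library alone. In this sketch the walk over the FastAPI routes or Celery registry is replaced by a plain mapping passed in by the caller; `render_reference`, `sync_reference`, the `--output` option, and the default path are all assumptions for illustration.

```python
import argparse
import difflib
from pathlib import Path
from typing import Dict, List, Optional


def render_reference(tasks: Dict[str, str]) -> str:
    """Render a deterministic reference; sorting keeps diffs stable."""
    lines = ["Task reference", ""]
    for name in sorted(tasks):
        lines.append(f"{name}: {tasks[name]}")
    return "\n".join(lines) + "\n"


def sync_reference(path: Path, rendered: str, check: bool) -> int:
    """Write the reference, or in check mode fail if the file is stale."""
    current = path.read_text(encoding="utf-8") if path.exists() else ""
    if current == rendered:
        return 0
    if check:
        diff = difflib.unified_diff(
            current.splitlines(keepends=True),
            rendered.splitlines(keepends=True),
            fromfile=str(path),
            tofile="generated",
        )
        print("".join(diff), end="")
        return 1  # non-zero exit fails the CI job without touching the file
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(rendered, encoding="utf-8")
    return 0


def main(argv: Optional[List[str]] = None,
         tasks: Optional[Dict[str, str]] = None) -> int:
    parser = argparse.ArgumentParser(description="Generate the task reference.")
    parser.add_argument("--check", action="store_true",
                        help="exit 1 if the committed file is out of date")
    parser.add_argument("--output", type=Path, default=Path("docs/tasks.txt"))
    args = parser.parse_args(argv)
    # A real generator would collect `tasks` from the live registry here.
    return sync_reference(args.output, render_reference(tasks or {}), args.check)
```

Developers run the generator without the flag and commit the result; CI runs it with `--check`, so any change to the public surface has to show up as a reviewed diff.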
Constraints I designed for
- External dependencies on the critical path. Sitefinity and VIP sit on different teams and release cadences — defensive Pydantic boundaries, null-tolerant payload models, and stale-fallback on VIP outage so partner issues don't surface as hard errors for users.
- Polyglot scope. Features like read-only mode spanned Python (FastAPI, Redis TTL behaviour), TypeScript (access-gate state machine, polling), and SSE handler invariants — one feature, three surfaces.
- Two systems, one conversation. Atterchat owns message metadata; Pangolin owns citations and marker-bearing answers. The fix for historical reload was to overlay citations at render time from the source of truth, not to double-write into Atterchat.
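The stale-fallback behaviour mentioned for VIP outages can be sketched as a small cache wrapper. This is an in-memory stand-in, not the Redis-backed implementation: the class name, the exception type, and the 600-second stale window are assumptions, while the 30-second fresh window echoes the top-up latency target described earlier.

```python
import time
from typing import Callable, Dict, Tuple


class UpstreamUnavailable(Exception):
    """Raised by the fetch function when the upstream cannot be reached."""


class StaleFallbackCache:
    """Serve fresh entries; on upstream failure, serve a recent stale one."""

    def __init__(
        self,
        fetch: Callable[[str], dict],
        fresh_ttl: float = 30.0,    # refresh after this many seconds
        stale_ttl: float = 600.0,   # assumed limit on how old a fallback may be
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._fresh_ttl = fresh_ttl
        self._stale_ttl = stale_ttl
        self._clock = clock
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, key: str) -> dict:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._fresh_ttl:
            return entry[1]
        try:
            value = self._fetch(key)
        except UpstreamUnavailable:
            # Outage: prefer a slightly old session over a hard error.
            if entry is not None and now - entry[0] < self._stale_ttl:
                return entry[1]
            raise
        self._entries[key] = (now, value)
        return value
```

Serving a stale entry does not reset its timestamp, so a long outage eventually surfaces as an error instead of pinning users to an arbitrarily old session.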
"Build clean boundaries between the systems you don't control — Sitefinity, VIP, Atterchat — and the system you do. Most of the bugs came from leaky boundaries; most of the fixes came from making them explicit."