Semantic car search with an LLM ranking pipeline

An AI search platform for used cars in Argentina. It continuously scrapes and scores marketplace listings, then answers natural-language queries with vector retrieval and LLM ranking.

Overview

Under the hood

The platform is built as a pnpm monorepo: a Next.js app on Vercel plus Python workers on Railway that continuously scrape MercadoLibre and Kavak, using residential proxies and bot-detection evasion. Free-text queries like 'first car under $10M' run through an LLM pipeline that extracts structured criteria, retrieves candidates with Voyage embeddings stored in Postgres with pgvector, and has Claude rank each listing on price and quality.

The hot path is tuned for latency and cost: structured-extraction calls run on a faster model, database work is parallelized, and the most expensive recommendation call is moved out of the blocking response, so listings render first and the analysis fills in progressively. Bulk enrichment goes through the Anthropic Message Batches API. Redis provides caching, and a deterministic pre-parser (guarded by an eval) skips the LLM on high-confidence queries.
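
The parallel-then-deferred shape described above can be sketched as follows. This is a minimal illustration in Python with asyncio, not the project's code: the function names (`fetch_listings`, `fetch_price_stats`, `recommend`) are hypothetical, and the sleeps stand in for database and model latency.

```python
import asyncio

# All three coroutines are hypothetical stand-ins; the sleeps simulate latency.
async def fetch_listings(query: str) -> list[str]:
    await asyncio.sleep(0.01)  # stands in for a database query
    return [f"{query} #{i}" for i in range(3)]

async def fetch_price_stats(query: str) -> dict[str, float]:
    await asyncio.sleep(0.01)  # stands in for a second, independent query
    return {"median_price": 9_500_000.0}

async def recommend(listings: list[str]) -> str:
    await asyncio.sleep(0.05)  # stands in for the slow, expensive LLM call
    return f"Recommended: {listings[0]}"

async def search(query: str):
    # Independent database work runs concurrently instead of back to back.
    listings, stats = await asyncio.gather(
        fetch_listings(query), fetch_price_stats(query)
    )
    # The expensive call is started but not awaited, so the caller can
    # send the listings right away and attach the analysis when it lands.
    recommendation_task = asyncio.create_task(recommend(listings))
    return {"listings": listings, "stats": stats}, recommendation_task

async def main():
    response, pending = await search("first car")
    events = ["listings rendered"]   # the fast part is available here
    events.append(await pending)     # the analysis fills in afterwards
    return response, events
```

Calling `asyncio.run(main())` returns the listings payload first and the recommendation second, which mirrors the progressive rendering described above.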

Next.js · Python scrapers · pgvector · Claude + Voyage embeddings

Natural-language search

Free text becomes structured criteria, a ranked shortlist, and an expert recommendation.
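
As an illustration of how a deterministic pre-parser might decide when the LLM can be skipped, here is a minimal sketch. Everything in it is an assumption made for the example: the price-cap regex, the `KNOWN_MAKES` vocabulary, and the rule that a query is high-confidence only when every word is accounted for.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical patterns and vocabulary, chosen only for this example.
PRICE_CAP = re.compile(r"\bunder\s+\$?(\d+(?:\.\d+)?)\s*([mk])?\b", re.IGNORECASE)
MULTIPLIER = {"k": 1_000, "m": 1_000_000}
KNOWN_MAKES = {"toyota", "ford", "fiat", "renault", "volkswagen"}

@dataclass
class Criteria:
    max_price: Optional[int] = None
    makes: list = field(default_factory=list)
    confident: bool = False  # True means the LLM extraction step can be skipped

def pre_parse(query: str) -> Criteria:
    match = PRICE_CAP.search(query)
    if match is None:
        return Criteria()  # nothing recognized: defer to the LLM
    amount, suffix = match.groups()
    max_price = int(float(amount) * MULTIPLIER.get((suffix or "").lower(), 1))
    # Whatever the price pattern did not consume must be fully understood,
    # otherwise the query is handed to the LLM.
    leftover = (query[: match.start()] + " " + query[match.end():]).lower().split()
    makes = [word for word in leftover if word in KNOWN_MAKES]
    unknown = [word for word in leftover if word not in KNOWN_MAKES]
    return Criteria(max_price=max_price, makes=makes, confident=not unknown)
```

Under these assumptions, `pre_parse("toyota under $10M")` is marked confident, while `pre_parse("first car under $10M")` still extracts the price cap but falls through to the LLM because "first car" is not something the rules understand.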

Vector retrieval

Candidates are recalled with Voyage embeddings over Postgres and pgvector, then re-ranked by the LLM.
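
The recall-then-re-rank flow can be sketched in memory as below. This is an illustration only: the toy two-dimensional vectors stand in for Voyage embeddings, the sort stands in for a pgvector query (pgvector exposes cosine distance as the `<=>` operator), and the `score` callback is a hypothetical placeholder for the LLM ranking step.

```python
import math

def cosine_distance(a, b):
    # Same quantity pgvector computes for `embedding <=> query`.
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms

def recall(query_vec, listings, k):
    # In-memory stand-in for: ORDER BY embedding <=> :query LIMIT :k
    return sorted(
        listings, key=lambda item: cosine_distance(query_vec, item["embedding"])
    )[:k]

def rerank(candidates, score):
    # `score` is a placeholder for the LLM's per-listing judgment.
    return sorted(candidates, key=score, reverse=True)

# Toy data, invented for the example.
LISTINGS = [
    {"id": "a", "embedding": [1.0, 0.0], "price": 9_000_000},
    {"id": "b", "embedding": [0.9, 0.1], "price": 7_500_000},
    {"id": "c", "embedding": [0.0, 1.0], "price": 5_000_000},
]

candidates = recall([1.0, 0.05], LISTINGS, k=2)
shortlist = rerank(candidates, score=lambda item: -item["price"])
```

The cheap vector step narrows the field to the nearest listings, so the more expensive scorer only ever sees `k` candidates. Here the distant listing `c` never reaches the re-ranker even though it is the cheapest.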

Resilient scraping

Headless Chromium with stealth measures and residential proxies gets past bot protection at volume.

Tuned hot path

Model tiering, parallel queries, batch enrichment, and Redis caching keep search fast and cheap.
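
To illustrate the caching piece, here is a minimal cache-aside sketch. It assumes results are cached per normalized query string. The `InMemoryCache` class is a hypothetical stand-in for Redis so the example runs without a server, and `run_pipeline` represents whatever expensive search call sits behind the cache.

```python
import hashlib
import json
import time

class InMemoryCache:
    """Hypothetical stand-in for Redis: string values with a time-to-live."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired entries count as misses
            return None
        return value

    def set(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl_seconds, value)

def cache_key(query):
    # Collapse case and whitespace so trivially different queries share a key.
    normalized = " ".join(query.lower().split())
    return "search:" + hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def cached_search(query, cache, run_pipeline):
    key = cache_key(query)
    hit = cache.get(key)
    if hit is not None:
        return json.loads(hit)       # cache hit: the pipeline is not called
    result = run_pipeline(query)     # cache miss: pay for the full pipeline
    cache.set(key, json.dumps(result))
    return result
```

With this shape, a repeated query that differs only in casing or spacing is served from the cache and never reaches the model.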

Role: Full-stack & AI · sole engineer

Stack: Next.js · Python scrapers · pgvector · Claude + Voyage embeddings

Client: Independent

Years: 2026

Natural-language Search · LLM Pipeline · pgvector · Voyage Embeddings · Playwright Scrapers · Anti-bot Evasion · Message Batches API · Redis Caching · Progressive Loading · Real-time Alerts
