PDF.chat API
One HTTP request turns an image or PDF into clean text, Markdown, tables and JSON, in 100+ languages. Pay per page, no surprises.
Overview
The PDF.chat API is a small REST interface. First you POST a document to ingest it and get back a job with the document's text and a per-page breakdown (text, bounding boxes, confidence). Then you POST questions against that job and get answers grounded in the document, each citing the page it came from. Jobs of 5 pages or fewer return inline; larger jobs return immediately with a pending status that you poll until done.
- Base URL: https://pdf.chat
- Documents in: PDF, plus Word, PowerPoint, text and images (PNG, JPG, WEBP, GIF, BMP, TIFF)
- Chat out: answers with page citations; transcript via the history endpoint
- Processed text out: txt, md, docx, pdf, csv, json
- Read engines: cpu (fast, printed docs) and vlm (premium AI, handwriting, complex layouts, math)
Authentication
Send your API token (you can find it on your account page) as a Bearer header:
Authorization: Bearer YOUR_API_TOKEN
You can also pass ?api_token=… as a query parameter. Usage is metered against your account's page balance.
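The two ways of passing the token can be sketched as small helpers. This is a minimal illustration using only the Python standard library; the function names `bearer_headers` and `with_query_token` are hypothetical and not part of the API.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def bearer_headers(token: str) -> dict:
    """Header form: Authorization: Bearer <token>."""
    if not token:
        raise ValueError("API token is empty")
    return {"Authorization": f"Bearer {token}"}


def with_query_token(url: str, token: str) -> str:
    """Query form: append api_token to a URL, keeping any existing parameters."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("api_token", token))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(bearer_headers("YOUR_API_TOKEN"))
print(with_query_token("https://pdf.chat/api/v1/ocr/", "YOUR_API_TOKEN"))
```

The header form is usually the safer default, since URLs with query parameters tend to end up in server and proxy logs.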
Submit a document
POST /api/v1/ocr/, sent as multipart form data.
curl -X POST https://pdf.chat/api/v1/ocr/ \
-H "Authorization: Bearer YOUR_API_TOKEN" \
-F "file=@invoice.pdf" \
-F "tier=vlm" \
-F "language=auto"
Returns the job. For files of 5 pages or fewer it is already done and includes the text; larger files come back as pending/processing, so poll the status endpoint.
{
"uuid": "9f2c1b7e4a...",
"status": "done",
"tier": "vlm",
"language": "auto",
"page_count": 1,
"mean_confidence": 0.98,
"text": "INVOICE\nAcme Corp\nTotal: 215.00 USD",
"markdown": "# INVOICE\n\n**Acme Corp** ...",
"pages": [ { "index": 0, "text": "...", "blocks": [ { "text": "...", "bbox": [x0,y0,x1,y1], "confidence": 0.98 } ] } ]
}
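One way to use the per-page breakdown is to flag blocks whose confidence falls below a threshold so they can be reviewed by hand. The sketch below assumes the job has the shape shown in the response above; the function name, the 0.9 threshold and the sample data are invented for illustration.

```python
def low_confidence_blocks(job: dict, threshold: float = 0.9) -> list:
    """Return (page_index, text, confidence) for every block below the threshold."""
    flagged = []
    for page in job.get("pages", []):
        for block in page.get("blocks", []):
            conf = block.get("confidence")
            if conf is not None and conf < threshold:
                flagged.append((page.get("index"), block.get("text"), conf))
    return flagged


# Made-up sample in the same shape as the response above.
sample_job = {
    "status": "done",
    "pages": [
        {
            "index": 0,
            "text": "INVOICE\nT0tal: 215.00 USD",
            "blocks": [
                {"text": "INVOICE", "bbox": [10, 10, 200, 40], "confidence": 0.98},
                {"text": "T0tal: 215.00 USD", "bbox": [10, 60, 300, 90], "confidence": 0.62},
            ],
        }
    ],
}

for page_index, text, conf in low_confidence_blocks(sample_job):
    print(f"page {page_index}: {text!r} (confidence {conf})")
```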
Get a result
GET /api/v1/ocr/<uuid>/, poll until status is done or failed.
curl https://pdf.chat/api/v1/ocr/9f2c1b7e4a.../ \
-H "Authorization: Bearer YOUR_API_TOKEN"
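A polling loop is easier to reason about with an explicit deadline, so a stuck job cannot block the caller forever. The following is a sketch, not an official client: `wait_for_job`, the 2-second interval and the 300-second timeout are assumptions, and the HTTP call is passed in as a function so the loop itself stays independent of any HTTP library.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"done", "failed"}


def wait_for_job(fetch_job: Callable[[], dict], interval: float = 2.0,
                 timeout: float = 300.0, sleep=time.sleep,
                 clock=time.monotonic) -> dict:
    """Call fetch_job() until the job reaches done or failed, or the timeout passes."""
    deadline = clock() + timeout
    while True:
        job = fetch_job()
        if job["status"] in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job still {job['status']!r} after {timeout} seconds")
        sleep(interval)


# Simulated status sequence standing in for real GET requests.
_responses = iter([{"status": "pending"}, {"status": "processing"},
                   {"status": "done", "text": "..."}])
print(wait_for_job(lambda: next(_responses), sleep=lambda s: None)["status"])
```

With the requests library, `fetch_job` could be something like `lambda: requests.get(f"{BASE}/ocr/{uuid}/", headers=H).json()`.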
Download a format
GET /api/v1/ocr/<uuid>/download/?format=md, exports the result. format is one of txt, md, docx, pdf, csv, json.
curl -L "https://pdf.chat/api/v1/ocr/9f2c1b7e4a.../download/?format=docx" \
-H "Authorization: Bearer YOUR_API_TOKEN" -o result.docx
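The download request can also be sketched in Python. The helper below validates the format against the list above before building the URL; `download_url` and `save_download` are hypothetical names, and `save_download` uses only the standard library (it is defined here but not called, since it needs a real token and job).

```python
import shutil
import urllib.request
from urllib.parse import urlencode

FORMATS = ("txt", "md", "docx", "pdf", "csv", "json")


def download_url(uuid: str, fmt: str, base: str = "https://pdf.chat/api/v1") -> str:
    """Build the download URL for a job, rejecting formats not listed in the docs."""
    if fmt not in FORMATS:
        raise ValueError(f"format must be one of {', '.join(FORMATS)}")
    return f"{base}/ocr/{uuid}/download/?{urlencode({'format': fmt})}"


def save_download(uuid: str, fmt: str, token: str, path: str) -> None:
    """Stream the exported file to disk (urlopen follows redirects, like curl -L)."""
    request = urllib.request.Request(
        download_url(uuid, fmt),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        shutil.copyfileobj(response, out)


print(download_url("UUID", "docx"))
```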
Chat with a document
Ask questions about a completed job. Answers are grounded only in the extracted text and cite the source page. Requires an account token, because chat usage is metered per account.
POST /api/v1/chat/<uuid>/, body {"message": "your question"}.
curl -X POST https://pdf.chat/api/v1/chat/9f2c1b7e4a.../ \
-H "Authorization: Bearer YOUR_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{"message": "What is the invoice total and due date?"}'
Returns the assistant message with its answer and a list of citations:
{"conversation": "a1b2…", "message": {
"role": "assistant",
"content": "The total is $42, due on March 3 (p. 1).",
"citations": [{"page": 1, "cited_text": "The invoice total is $42…", "document_id": "9f2c1b7e4a…"}]
}}
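A reply in this shape can be turned into a readable answer with its sources listed underneath. The sketch below assumes the response structure shown above; `format_answer` is a hypothetical helper and the sample is copied from that example.

```python
def format_answer(reply: dict) -> str:
    """Render the assistant's answer followed by one line per citation."""
    message = reply["message"]
    lines = [message["content"]]
    for citation in message.get("citations", []):
        lines.append(f'  [p. {citation["page"]}] {citation["cited_text"]}')
    return "\n".join(lines)


sample_reply = {
    "conversation": "a1b2…",
    "message": {
        "role": "assistant",
        "content": "The total is $42, due on March 3 (p. 1).",
        "citations": [
            {"page": 1, "cited_text": "The invoice total is $42…", "document_id": "9f2c1b7e4a…"}
        ],
    },
}

print(format_answer(sample_reply))
```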
GET /api/v1/chat/<uuid>/history/, fetches the full conversation transcript for the job.
Code examples
import requests, time
BASE = "https://pdf.chat/api/v1"
H = {"Authorization": "Bearer YOUR_API_TOKEN"}
# 1. Upload a PDF
with open("contract.pdf", "rb") as f:
    job = requests.post(BASE + "/ocr/", headers=H, files={"file": f}).json()
# 2. Wait until it's ready to chat
while job["status"] in ("pending", "processing"):
    time.sleep(2)
    job = requests.get(f"{BASE}/ocr/{job['uuid']}/", headers=H).json()
# 3. Ask questions — every answer is cited to the page
ans = requests.post(f"{BASE}/chat/{job['uuid']}/", headers=H,
                    json={"message": "What is the termination notice period?"}).json()
print(ans["message"]["content"])
print(ans["message"]["citations"])
import fs from "fs";
const BASE = "https://pdf.chat/api/v1";
const H = { Authorization: "Bearer YOUR_API_TOKEN" };
// 1. Upload a PDF
const form = new FormData();
form.append("file", new Blob([fs.readFileSync("contract.pdf")]), "contract.pdf");
let job = await (await fetch(`${BASE}/ocr/`, { method: "POST", headers: H, body: form })).json();
// 2. Wait until it's ready to chat
while (["pending", "processing"].includes(job.status)) {
  await new Promise(r => setTimeout(r, 2000));
  job = await (await fetch(`${BASE}/ocr/${job.uuid}/`, { headers: H })).json();
}
// 3. Ask questions — every answer is cited to the page
const ans = await (await fetch(`${BASE}/chat/${job.uuid}/`, {
  method: "POST", headers: { ...H, "Content-Type": "application/json" },
  body: JSON.stringify({ message: "What is the termination notice period?" })
})).json();
console.log(ans.message.content, ans.message.citations);
# 1. Upload a PDF
curl -X POST https://pdf.chat/api/v1/ocr/ \
-H "Authorization: Bearer YOUR_API_TOKEN" \
-F "file=@contract.pdf"
# 2. Ask questions (use the uuid from step 1) — answers cited to the page
curl -X POST https://pdf.chat/api/v1/chat/UUID/ \
-H "Authorization: Bearer YOUR_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{"message": "What is the termination notice period?"}'
Parameters
| Field | Type | Description |
|---|---|---|
| file | file | Required. The image or PDF to process. |
| tier | string | cpu (default, fast/printed) or vlm (premium AI: handwriting, layouts, math). |
| language | string | auto (default) or a language code (en, ch, ja, ar, ...). |
| tool | string | Optional tool slug (e.g. summarize-pdf, ask-pdf) to pre-frame the chat for that task. |
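The optional fields in the table can be assembled and checked before the upload is sent. This is a sketch under the defaults listed above; `ocr_form_fields` is a hypothetical helper, and the language code is passed through unchecked because the full list of codes is not given here.

```python
from typing import Optional

TIERS = ("cpu", "vlm")


def ocr_form_fields(tier: str = "cpu", language: str = "auto",
                    tool: Optional[str] = None) -> dict:
    """Build the non-file form fields for POST /api/v1/ocr/."""
    if tier not in TIERS:
        raise ValueError(f"tier must be one of {TIERS}")
    fields = {"tier": tier, "language": language}
    if tool:
        # Only send the tool slug when one was chosen.
        fields["tool"] = tool
    return fields


print(ocr_form_fields(tier="vlm", tool="summarize-pdf"))
```

With the requests library, these fields would go alongside the file, for example `requests.post(BASE + "/ocr/", headers=H, files={"file": f}, data=ocr_form_fields(tier="vlm"))`.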
Errors and limits
| Code | Meaning |
|---|---|
| 400 | No file, unsupported type, or file too large. |
| 401 | Missing or invalid API token. |
| 402 | Out of pages, daily/monthly free limit reached, or no credits. The body includes used/cap. |
| 404 | Job UUID not found. |
| 409 | Download requested before the job is done. |
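Client code often maps these status codes to a single exception type so callers can handle them in one place. The sketch below is one possible approach, not part of the API: `PdfChatError` and `raise_for_api_error` are hypothetical names, and it assumes the 402 body carries `used` and `cap` as top-level keys, which the table does not spell out.

```python
from typing import Optional


class PdfChatError(Exception):
    """Raised for any error status returned by the API."""

    def __init__(self, status: int, message: str):
        super().__init__(f"{status}: {message}")
        self.status = status


_MESSAGES = {
    400: "no file, unsupported type, or file too large",
    401: "missing or invalid API token",
    404: "job UUID not found",
    409: "conflict on the download request",
}


def raise_for_api_error(status: int, body: Optional[dict] = None) -> None:
    """Do nothing for success codes; raise PdfChatError for 4xx/5xx."""
    if status < 400:
        return
    if status == 402:
        body = body or {}
        # Assumed keys: "used" and "cap" at the top level of the error body.
        used, cap = body.get("used", "?"), body.get("cap", "?")
        raise PdfChatError(402, f"out of pages or credits (used {used} of {cap})")
    raise PdfChatError(status, _MESSAGES.get(status, "unexpected error"))


try:
    raise_for_api_error(402, {"used": 5, "cap": 5})
except PdfChatError as err:
    print(err)
```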
Each processed page costs credits (1 per page on the fast tier, more on premium). Paid plans lift per-file page caps and add priority. See Pricing.
Frequently asked questions
Use language=auto to detect the language automatically, or set a specific language code.