PDF.chat API

Upload a PDF and chat with it from your own app: ask questions and get answers cited to the page, in 100+ languages. Usage is metered per page, with no surprises.

Overview

The PDF.chat API is a small REST interface. First you POST a document to ingest it and get back a job containing the extracted text and a per-page breakdown (text, bounding boxes, confidence). Then you POST questions against that job and get answers grounded in the text, each citing the page it came from. Jobs of 5 pages or fewer return inline; larger jobs return immediately with a pending status that you poll until it is done.

  • Base URL: https://pdf.chat
  • Inputs: PDF, as well as Word, PowerPoint, text, and images (PNG, JPG, WEBP, GIF, BMP, TIFF)
  • Chat: answers with page citations; transcripts via the history endpoint
  • Extracted text: txt, md, docx, pdf, csv, json
  • Reading engines: cpu (fast, printed documents) and vlm (premium AI: handwriting, complex layouts, math)

Authentication

Authenticate with your API token (available once you sign in to your account) as a Bearer header:

Authorization: Bearer YOUR_API_TOKEN

You can also pass ?api_token=… as a query parameter. Usage is metered against your account's page balance.
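The two authentication styles above can be sketched as small helpers. This is a minimal illustration using only the Python standard library; the helper names bearer_headers and with_query_token are hypothetical and not part of the API.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

BASE = "https://pdf.chat/api/v1"


def bearer_headers(token: str) -> dict:
    """Build the Authorization header for the Bearer style."""
    return {"Authorization": f"Bearer {token}"}


def with_query_token(url: str, token: str) -> str:
    """Append api_token to a URL, keeping any query string already present."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("api_token", token))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

The header form is generally the safer default, since query strings tend to end up in server logs and browser history.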

Submit a document

POST /api/v1/ocr/, as a multipart form upload.

curl -X POST https://pdf.chat/api/v1/ocr/ \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -F "file=@invoice.pdf" \
  -F "tier=vlm" \
  -F "language=auto"

Returns the job. Files of ≤5 pages come back already done, with the text included; larger files come back as pending/processing, and you poll the status endpoint.

{
  "uuid": "9f2c1b7e4a...",
  "status": "done",
  "tier": "vlm",
  "language": "auto",
  "page_count": 1,
  "mean_confidence": 0.98,
  "text": "INVOICE\nAcme Corp\nTotal: 215.00 USD",
  "markdown": "# INVOICE\n\n**Acme Corp** ...",
  "pages": [ { "index": 0, "text": "...", "blocks": [ { "text": "...", "bbox": [x0,y0,x1,y1], "confidence": 0.98 } ] } ]
}
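Since every block carries its own confidence, a client may want to flag weakly recognized text for review. The sketch below walks the pages and blocks arrays shown in the response above; the function name and the 0.9 threshold are illustrative assumptions, not values defined by the API.

```python
def low_confidence_blocks(job: dict, threshold: float = 0.9) -> list:
    """Return (page_index, text, confidence) for each block below the threshold."""
    flagged = []
    for page in job.get("pages", []):
        for block in page.get("blocks", []):
            confidence = block.get("confidence")
            # Skip blocks that carry no confidence value at all.
            if confidence is not None and confidence < threshold:
                flagged.append((page.get("index"), block.get("text", ""), confidence))
    return flagged
```

Calling low_confidence_blocks(job) on a finished job gives a short list of spots worth checking by hand before relying on the text.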

Retrieve the result

GET /api/v1/ocr/<uuid>/, polling until status is done or failed.

curl https://pdf.chat/api/v1/ocr/9f2c1b7e4a.../ \
  -H "Authorization: Bearer YOUR_API_TOKEN"
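A polling loop is easier to reuse with a timeout and an explicit stop on failed. The following is a sketch under assumptions: wait_for_job is a hypothetical helper, the 2-second interval and 300-second timeout are arbitrary defaults, and the HTTP call is passed in as fetch_job so the loop itself stays independent of any HTTP library.

```python
import time
from typing import Callable


def wait_for_job(fetch_job: Callable[[], dict],
                 interval: float = 2.0,
                 timeout: float = 300.0,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_job() until the job status is done or failed.

    Raises TimeoutError if the job is still pending/processing once
    `timeout` seconds have elapsed.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job()
        if job["status"] in ("done", "failed"):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {job['status']} after {timeout}s")
        sleep(interval)
```

With the requests library, fetch_job could be a lambda that performs the GET on /api/v1/ocr/<uuid>/ with the Bearer header and returns .json(). The caller should still check whether the returned status is failed before moving on to chat.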

Download a format

GET /api/v1/ocr/<uuid>/download/?format=md exports the result. format is one of txt, md, docx, pdf, csv, json.

curl -L "https://pdf.chat/api/v1/ocr/9f2c1b7e4a.../download/?format=docx" \
  -H "Authorization: Bearer YOUR_API_TOKEN" -o result.docx
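Validating the format on the client avoids a wasted round trip. This sketch builds the download URL and a matching local filename; download_target is a hypothetical helper, and naming the file after the job uuid is just one possible convention.

```python
from urllib.parse import quote

BASE = "https://pdf.chat/api/v1"
FORMATS = ("txt", "md", "docx", "pdf", "csv", "json")


def download_target(uuid: str, fmt: str = "md") -> tuple:
    """Return (url, filename) for exporting a finished job in the given format."""
    if fmt not in FORMATS:
        raise ValueError(f"format must be one of {', '.join(FORMATS)}; got {fmt!r}")
    url = f"{BASE}/ocr/{quote(uuid, safe='')}/download/?format={fmt}"
    return url, f"{uuid}.{fmt}"
```

The returned URL still needs the Bearer header (or api_token query parameter) when it is fetched.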

Chat with a document

Ask questions of a finished job. Answers are grounded only in the extracted text and cite the source page. An account token is required; the chat feature is account-gated.

POST /api/v1/chat/<uuid>/, with the JSON body {"message": "your question"}.

curl -X POST https://pdf.chat/api/v1/chat/9f2c1b7e4a.../ \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"message": "What is the invoice total and due date?"}'

Returns the assistant message with the answer and a list of cited pages:

{"conversation": "a1b2…", "message": {
   "role": "assistant",
   "content": "The total is $42, due on March 3 (p. 1).",
   "citations": [{"page": 1, "cited_text": "The invoice total is $42…", "document_id": "9f2c1b7e4a…"}]
}}
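To show an answer together with its sources, a client can read the citations array from the reply. The helpers below are a minimal sketch based only on the fields visible in the sample response above (content, citations, page, cited_text); the function names are hypothetical.

```python
def cited_pages(reply: dict) -> list:
    """Return the sorted, de-duplicated page numbers cited in a chat reply."""
    citations = reply["message"].get("citations", [])
    return sorted({c["page"] for c in citations})


def format_reply(reply: dict) -> str:
    """Render the answer followed by one indented line per citation."""
    message = reply["message"]
    lines = [message["content"]]
    for c in message.get("citations", []):
        lines.append(f"  [p. {c['page']}] {c['cited_text']}")
    return "\n".join(lines)
```

Printing format_reply(ans) in place of the raw JSON keeps the quoted source text next to each answer.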

GET /api/v1/chat/<uuid>/history/ fetches the full conversation transcript for the job.

Example requests

Python

import requests, time

BASE = "https://pdf.chat/api/v1"
H = {"Authorization": "Bearer YOUR_API_TOKEN"}

# 1. Upload a PDF
with open("contract.pdf", "rb") as f:
    job = requests.post(BASE + "/ocr/", headers=H, files={"file": f}).json()

# 2. Wait until it's ready to chat
while job["status"] in ("pending", "processing"):
    time.sleep(2)
    job = requests.get(f"{BASE}/ocr/{job['uuid']}/", headers=H).json()

# 3. Ask questions — every answer is cited to the page
ans = requests.post(f"{BASE}/chat/{job['uuid']}/", headers=H,
    json={"message": "What is the termination notice period?"}).json()
print(ans["message"]["content"])
print(ans["message"]["citations"])

Node.js

import fs from "fs";

const BASE = "https://pdf.chat/api/v1";
const H = { Authorization: "Bearer YOUR_API_TOKEN" };

// 1. Upload a PDF
const form = new FormData();
form.append("file", new Blob([fs.readFileSync("contract.pdf")]), "contract.pdf");
let job = await (await fetch(`${BASE}/ocr/`, { method: "POST", headers: H, body: form })).json();

// 2. Wait until it's ready to chat
while (["pending", "processing"].includes(job.status)) {
  await new Promise(r => setTimeout(r, 2000));
  job = await (await fetch(`${BASE}/ocr/${job.uuid}/`, { headers: H })).json();
}

// 3. Ask questions — every answer is cited to the page
const ans = await (await fetch(`${BASE}/chat/${job.uuid}/`, {
  method: "POST", headers: { ...H, "Content-Type": "application/json" },
  body: JSON.stringify({ message: "What is the termination notice period?" })
})).json();
console.log(ans.message.content, ans.message.citations);

cURL

# 1. Upload a PDF
curl -X POST https://pdf.chat/api/v1/ocr/ \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -F "file=@contract.pdf"

# 2. Ask questions (use the uuid from step 1) — answers cited to the page
curl -X POST https://pdf.chat/api/v1/chat/UUID/ \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"message": "What is the termination notice period?"}'

Parameters

Parameter  Type    Description
file       file    Required. The image or PDF to process.
tier       string  cpu (default, fast, for printed text) or vlm (premium AI: handwriting, layouts, math).
language   string  auto (default) or a language code (en, ch, ja, ar, …).
tool       string  Optional tool slug (for example summarize-pdf, ask-pdf) to pre-configure the chat for that job.
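The optional parameters can be assembled and checked before the upload. This is a sketch only: upload_fields is a hypothetical helper, and it assumes tool is sent as an ordinary form field next to tier and language, which the table suggests but does not state outright.

```python
TIERS = ("cpu", "vlm")


def upload_fields(tier: str = "cpu", language: str = "auto", tool: str = None) -> dict:
    """Build the non-file form fields for POST /api/v1/ocr/."""
    if tier not in TIERS:
        raise ValueError(f"tier must be one of {TIERS}; got {tier!r}")
    fields = {"tier": tier, "language": language}
    if tool:
        # Only include the tool slug when one was actually requested.
        fields["tool"] = tool
    return fields
```

With the requests library, the returned dict would go in the data= argument, alongside files={"file": f} for the document itself.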

Status codes

Code  Meaning
400   Missing file, unsupported type, or file too large.
401   Missing or invalid API token.
402   Out of pages: daily/monthly free limit reached, or no credits. The response body includes used/cap.
404   Job UUID not found.
409   Download requested before the job finished.
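Client code usually needs to turn these codes into a decision. The mapping below is one possible interpretation, not behavior defined by the API: the action names are hypothetical, and treating 409 as "wait and retry" is an inference from its description.

```python
ACTIONS = {
    400: "fix_request",   # missing file, unsupported type, or file too large
    401: "fix_token",     # missing or invalid API token
    402: "out_of_pages",  # free limit reached or no credits
    404: "unknown_job",   # job UUID not found
    409: "retry_later",   # download requested before the job finished
}


def classify_status(code: int) -> str:
    """Map an HTTP status code to a coarse client-side action."""
    if 200 <= code < 300:
        return "ok"
    return ACTIONS.get(code, "unexpected")
```

Of these, only retry_later is worth repeating automatically; the others need a change to the request, the token, or the account balance first.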

Each page submitted costs credits (1 per page on the fast tier, more on premium). Paid plans raise the per-file page caps and add priority. See pricing.

Frequently asked questions

Create a free account; once you are signed in, your API token is shown with a copy button.

Files of 5 pages or fewer return the full result inline in the POST response, so no polling is needed for most images and short PDFs.

More than 100 languages are supported, including Latin, CJK, Arabic, Cyrillic, and Indic scripts. Use language=auto to detect the language, or pass a specific code.

Uploads are processed only to answer your questions and are deleted automatically. We never sell, share, or train on your documents.

Usage is metered per page against your account balance: unauthenticated calls get a daily per-IP allowance, free accounts get a monthly allowance, and paid plans use purchased credits with higher per-file page caps and priority. When you run out, you get a 402 with used/cap in the body.

You can send PNG, JPG, WEBP, GIF, BMP, TIFF, and multi-page PDF. Download results as TXT, MD, DOCX, PDF (searchable), CSV, or JSON from the download endpoint using the format parameter.

400 is a missing file, unsupported type, or file too large; 401 a missing or invalid token; 402 out of pages; 404 an unknown job UUID; and 409 a download requested before the job finished.

A job carries status, tier, language, page_count, and mean_confidence, plus the full text and markdown. The pages array breaks each page down into blocks with their text, bounding box (bbox), and per-block confidence.

Use cpu (default) for fast, low-cost recognition of clean printed documents. Use vlm, the premium AI engine, for handwriting, complex or multi-column layouts, math, and translation, where it is considerably more accurate.

Pass a tool slug (for example, summarize-pdf or ask-pdf) to pre-configure the chat for that job, so the assistant is primed to summarize or answer questions about the document.

Files of 5 pages or fewer return inline in the POST response. Larger files return immediately as pending or processing, and you poll GET /api/v1/ocr/<uuid>/ until the status is done or failed. Paid plans raise the per-file page cap.

The API is plain REST over HTTPS, so it works from any language with an HTTP client; see the Python, Node.js, and cURL examples above. There is no SDK to install; a few lines of standard HTTP code are all you need.