PDF.chat API

Upload a PDF and chat with it from your own app: ask questions and get answers cited to the page, in 100+ languages. Metered per page, with no surprises.

Overview

The PDF.chat API is a small REST interface. First you POST a document for extraction and get back a job with the document text and a per-page breakdown (text, bounding boxes, confidence). You can then POST questions to that job and get answers grounded in the document, each citing the page it came from. Jobs of 5 pages or fewer return inline; larger jobs return immediately with a pending status that you poll until done.

  • Base URL: https://pdf.chat
  • Documents in: PDF, plus Word, PowerPoint, text, and images (PNG, JPG, WEBP, GIF, BMP, TIFF)
  • Chat out: answers with page citations; transcript via the history endpoint
  • Text out: txt, md, docx, pdf, csv, json
  • Tiers: cpu (fast, printed documents) and vlm (premium AI: handwriting, complex layouts, math)

Authentication

Authenticate with your account API token (find it on your account page), sent as a Bearer header:

Authorization: Bearer YOUR_API_TOKEN

You can also pass ?api_token=… as a query parameter. Usage is metered against your account's page balance.
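
As a rough illustration of the two authentication styles described above, the following Python sketch builds (but does not send) a request for each, using only the standard library. The helper names bearer_request and query_token_url are made up for this example and are not part of the API.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE = "https://pdf.chat/api/v1"


def bearer_request(path: str, token: str) -> Request:
    """Build a request that carries the token as a Bearer header."""
    return Request(BASE + path, headers={"Authorization": f"Bearer {token}"})


def query_token_url(path: str, token: str) -> str:
    """Build a URL that carries the token as the api_token query parameter."""
    return f"{BASE}{path}?{urlencode({'api_token': token})}"


# Nothing is sent over the network; we only inspect what would be sent.
req = bearer_request("/ocr/", "YOUR_API_TOKEN")
print(req.get_header("Authorization"))
print(query_token_url("/ocr/", "YOUR_API_TOKEN"))
```

The header form is usually the safer default, since query strings tend to end up in logs and browser history.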

Submit a document

POST /api/v1/ocr/, multipart form upload.

curl -X POST https://pdf.chat/api/v1/ocr/ \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -F "file=@invoice.pdf" \
  -F "tier=vlm" \
  -F "language=auto"

Returns a job. For files of ≤5 pages the job is already done and includes the text; larger files come back as pending/processing, so poll the status endpoint.

{
  "uuid": "9f2c1b7e4a...",
  "status": "done",
  "tier": "vlm",
  "language": "auto",
  "page_count": 1,
  "mean_confidence": 0.98,
  "text": "INVOICE\nAcme Corp\nTotal: 215.00 USD",
  "markdown": "# INVOICE\n\n**Acme Corp** ...",
  "pages": [ { "index": 0, "text": "...", "blocks": [ { "text": "...", "bbox": [x0,y0,x1,y1], "confidence": 0.98 } ] } ]
}
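
One possible use of the per-block confidence in the response above is to flag text that may need a manual check. The sketch below assumes the pages/blocks layout shown in the example; the function name, the 0.9 threshold, and the sample values (including the bbox numbers) are invented for illustration.

```python
def low_confidence_blocks(job: dict, threshold: float = 0.9) -> list:
    """Return (page index, block text, confidence) for blocks below threshold."""
    flagged = []
    for page in job.get("pages", []):
        for block in page.get("blocks", []):
            if block["confidence"] < threshold:
                flagged.append((page["index"], block["text"], block["confidence"]))
    return flagged


# Hypothetical job payload shaped like the response above; values are made up.
sample_job = {
    "status": "done",
    "pages": [
        {"index": 0, "text": "INVOICE\nTotal: 215.00 USD", "blocks": [
            {"text": "INVOICE", "bbox": [40, 30, 200, 60], "confidence": 0.99},
            {"text": "Total: 215.00 USD", "bbox": [40, 400, 260, 430], "confidence": 0.72},
        ]},
    ],
}

for index, text, confidence in low_confidence_blocks(sample_job):
    print(f"page {index}: {text!r} ({confidence:.2f})")
```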

Retrieve the result

GET /api/v1/ocr/<uuid>/, poll until status is done or failed.

curl https://pdf.chat/api/v1/ocr/9f2c1b7e4a.../ \
  -H "Authorization: Bearer YOUR_API_TOKEN"
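
A polling loop is easier to reason about with an upper time limit. The following is a minimal sketch, not an official client: wait_for_job takes any callable that returns the job dict (in practice, a GET to the status endpoint), and the 2-second interval and 300-second timeout are assumptions chosen for the example.

```python
import time
from typing import Callable


def wait_for_job(fetch_job: Callable[[], dict], interval: float = 2.0,
                 timeout: float = 300.0) -> dict:
    """Call fetch_job until status is done or failed, or until timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job()
        if job["status"] in ("done", "failed"):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {job['status']} after {timeout} seconds")
        time.sleep(interval)


# Stand-in for GET /api/v1/ocr/<uuid>/ so the sketch runs without network access.
responses = iter([{"status": "pending"}, {"status": "processing"}, {"status": "done"}])
job = wait_for_job(lambda: next(responses), interval=0.01)
print(job["status"])
```

Callers should still check whether the returned status is done or failed before going on to download or chat.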

Download formats

GET /api/v1/ocr/<uuid>/download/?format=md, exports the result. format is one of txt, md, docx, pdf, csv, json.

curl -L "https://pdf.chat/api/v1/ocr/9f2c1b7e4a.../download/?format=docx" \
  -H "Authorization: Bearer YOUR_API_TOKEN" -o result.docx
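
Because format accepts only a fixed set of values, a client can reject typos before making the call. This small sketch builds the download URL and a matching local filename; download_url and output_name are hypothetical helper names, and JOB_UUID is a placeholder.

```python
FORMATS = ("txt", "md", "docx", "pdf", "csv", "json")


def download_url(uuid: str, fmt: str) -> str:
    """Return the download URL for a job, rejecting formats the docs do not list."""
    if fmt not in FORMATS:
        raise ValueError(f"format must be one of {', '.join(FORMATS)}; got {fmt!r}")
    return f"https://pdf.chat/api/v1/ocr/{uuid}/download/?format={fmt}"


def output_name(stem: str, fmt: str) -> str:
    """Pick a local filename whose extension matches the requested format."""
    return f"{stem}.{fmt}"


print(download_url("JOB_UUID", "docx"))
print(output_name("result", "docx"))
```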

Chat with a document

Ask questions about a finished job. Answers are grounded only in the extracted text and cite the source page. An account token is required; the chat feature is account-gated.

POST /api/v1/chat/<uuid>/, JSON body {"message": "your question"}.

curl -X POST https://pdf.chat/api/v1/chat/9f2c1b7e4a.../ \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"message": "What is the invoice total and due date?"}'

Returns the assistant message with the answer and a list of cited pages:

{"conversation": "a1b2…", "message": {
   "role": "assistant",
   "content": "The total is $42, due on March 3 (p. 1).",
   "citations": [{"page": 1, "cited_text": "The invoice total is $42…", "document_id": "9f2c1b7e4a…"}]
}}

GET /api/v1/chat/<uuid>/history/, retrieves the full conversation transcript for the job.
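
To show how the citations in a chat response might be presented to a user, here is a short sketch that renders the answer followed by one line per cited page. It assumes the response shape from the example above; format_answer is an illustrative name, not part of the API.

```python
def format_answer(response: dict) -> str:
    """Render the assistant's answer followed by one line per citation."""
    message = response["message"]
    lines = [message["content"]]
    for citation in message.get("citations", []):
        lines.append(f"  p. {citation['page']}: {citation['cited_text']}")
    return "\n".join(lines)


# Payload copied from the example chat response above.
sample = {"conversation": "a1b2…", "message": {
    "role": "assistant",
    "content": "The total is $42, due on March 3 (p. 1).",
    "citations": [{"page": 1, "cited_text": "The invoice total is $42…",
                   "document_id": "9f2c1b7e4a…"}],
}}
print(format_answer(sample))
```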

Code examples

Python

import requests, time

BASE = "https://pdf.chat/api/v1"
H = {"Authorization": "Bearer YOUR_API_TOKEN"}

# 1. Upload a PDF
with open("contract.pdf", "rb") as f:
    job = requests.post(BASE + "/ocr/", headers=H, files={"file": f}).json()

# 2. Wait until it's ready to chat
while job["status"] in ("pending", "processing"):
    time.sleep(2)
    job = requests.get(f"{BASE}/ocr/{job['uuid']}/", headers=H).json()

# 3. Ask questions — every answer is cited to the page
ans = requests.post(f"{BASE}/chat/{job['uuid']}/", headers=H,
    json={"message": "What is the termination notice period?"}).json()
print(ans["message"]["content"])
print(ans["message"]["citations"])

Node.js

import fs from "fs";

const BASE = "https://pdf.chat/api/v1";
const H = { Authorization: "Bearer YOUR_API_TOKEN" };

// 1. Upload a PDF
const form = new FormData();
form.append("file", new Blob([fs.readFileSync("contract.pdf")]), "contract.pdf");
let job = await (await fetch(`${BASE}/ocr/`, { method: "POST", headers: H, body: form })).json();

// 2. Wait until it's ready to chat
while (["pending", "processing"].includes(job.status)) {
  await new Promise(r => setTimeout(r, 2000));
  job = await (await fetch(`${BASE}/ocr/${job.uuid}/`, { headers: H })).json();
}

// 3. Ask questions — every answer is cited to the page
const ans = await (await fetch(`${BASE}/chat/${job.uuid}/`, {
  method: "POST", headers: { ...H, "Content-Type": "application/json" },
  body: JSON.stringify({ message: "What is the termination notice period?" })
})).json();
console.log(ans.message.content, ans.message.citations);

cURL

# 1. Upload a PDF
curl -X POST https://pdf.chat/api/v1/ocr/ \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -F "file=@contract.pdf"

# 2. Ask questions (use the uuid from step 1) — answers cited to the page
curl -X POST https://pdf.chat/api/v1/chat/UUID/ \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"message": "What is the termination notice period?"}'

Parameters

Field     Type    Description
file      file    Required. Image or PDF to process.
tier      string  cpu (standard, fast/printed) or vlm (premium AI: handwriting, layout, math).
language  string  auto (default) or a language code (en, ch, ja, ar, ...).
tool      string  Optional. A tool slug (e.g. summarize-pdf, ask-pdfchat) to pre-frame the chat for this job.
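
The optional fields in the table above can be collected in one place before the upload. This is a sketch under assumptions: ocr_form_fields is a hypothetical helper, and it leaves out any field you do not set rather than guessing at server-side defaults.

```python
from typing import Optional

TIERS = ("cpu", "vlm")


def ocr_form_fields(tier: Optional[str] = None, language: Optional[str] = None,
                    tool: Optional[str] = None) -> dict:
    """Collect the optional form fields for POST /api/v1/ocr/, skipping unset ones."""
    if tier is not None and tier not in TIERS:
        raise ValueError(f"tier must be one of {TIERS}; got {tier!r}")
    fields = {"tier": tier, "language": language, "tool": tool}
    return {name: value for name, value in fields.items() if value is not None}


print(ocr_form_fields(tier="vlm", language="auto"))
```

With the requests library, the resulting dict could be passed as data= alongside files={"file": f} in the upload call.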

Errors and limits

Code  Meaning
400   Missing file, unsupported type, or file too large.
401   Missing or invalid API token.
402   Over the page limit, free daily/monthly limit reached, or out of credits. Body includes used/cap.
404   Job UUID not found.
409   Download requested before the job finished.

Each processed page costs credits (1/page on the fast tier, more on premium). Paid plans raise the per-file page cap and add priority. See pricing.
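
Client code can turn the status codes above into a single exception type so callers handle errors in one place. The sketch below is illustrative only: PdfChatError and check_response are hypothetical names, and the used and cap keys in the sample 402 body are an assumption based on the description above, with made-up numbers.

```python
ERROR_MEANINGS = {
    400: "missing file, unsupported type, or file too large",
    401: "missing or invalid API token",
    402: "over the page limit or out of credits",
    404: "job UUID not found",
    409: "download requested before the job finished",
}


class PdfChatError(Exception):
    """Hypothetical exception type for this sketch; not defined by the API."""

    def __init__(self, status: int, body: dict):
        self.status = status
        self.body = body
        super().__init__(f"HTTP {status}: {ERROR_MEANINGS.get(status, 'unexpected error')}")


def check_response(status: int, body: dict) -> dict:
    """Return body for 2xx statuses, otherwise raise PdfChatError."""
    if 200 <= status < 300:
        return body
    raise PdfChatError(status, body)


try:
    check_response(402, {"used": 50, "cap": 50})  # made-up quota numbers
except PdfChatError as err:
    print(err, err.body)
```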

Frequently asked questions

To get an API token, create a free account and open your account page; your token is shown there with a copy button.

More than 100 languages are supported, including Latin, CJK, Arabic, Cyrillic, and Indic scripts. Use language=auto to detect the language, or pass a specific code.

Uploads are processed only to answer your questions and are deleted automatically. We never sell, share, or train on your documents.

Usage is metered per page against your account balance: anonymous calls get a per-IP daily allowance, free accounts get a monthly allowance, and paid plans use purchased credits with a higher per-file page cap and priority. When you run out, you get a 402 with used and cap in the body.

You can send PNG, JPG, WEBP, GIF, BMP, TIFF, and multi-page PDFs. Results download as txt, md, docx, pdf (searchable), csv, or json via the format parameter of the download endpoint.

Error codes: 400 is a missing file, an unsupported type, or a file that is too large; 401 a missing or invalid token; 402 out of pages; 404 an unknown job UUID; and 409 a download requested before the job finished. Error bodies include a short message.

The OCR response is a job object with status, tier, language, page_count, and mean_confidence, plus the full text and markdown. The pages array breaks each page into blocks with their text, bounding box (bbox), and per-block confidence.

Use cpu (standard) to recognize clean printed documents quickly and cheaply. Use vlm, the premium AI engine, for handwriting, complex or multi-column layouts, math, and translation, where it is more accurate.

Pass tool with a slug (for example summarize-pdf or ask-pdf) to pre-frame the chat for that task, so the assistant is tuned to summarize or answer questions about the document.

Files of 5 pages or fewer are returned in the POST response. Larger files return immediately as pending or processing, and you poll GET /api/v1/ocr/<uuid>/ until status is done or failed. Paid plans raise the per-file page cap.

The API is plain REST over HTTPS, so it works from any language with an HTTP client; see the Python, Node.js, and cURL examples above. There is no SDK to install; a few lines of standard HTTP code are all you need.