CatchTheTornado/text-extract-api
by CatchTheTornado · PDF Extraction · updated 6mo ago
Document (PDF, Word, PPTX, ...) extraction and parsing API built on state-of-the-art OCR engines and Ollama-supported models. Anonymizes documents, removes PII, and converts any document or picture to structured JSON or Markdown.
39 momentum · 3,104 stars · 276 forks · rank #114
anonymization · api · extract · json · llm · ocr · ocr-python · pdf · pii