yfedoseev/pdf_oxide
by yfedoseev · PDF Extraction · updated today
The fastest PDF library for Python and Rust. Text extraction, image extraction, markdown conversion, PDF creation & editing. 0.8ms mean, 5× faster than industry leaders, 100% pass rate on 3,830 PDFs. MIT/Apache-2.0.
Momentum: 65 · Stars: 823 · Forks: 94 · Rank: #32
Tags: data-extraction, document-processing, fast, image-extraction, llm, markdown, pdf, pdf-editor, pdf-generation, pdf-library, pdf-parser, pdf-to-markdown