huggingface/chug

by huggingface · Document Parsing · updated 2y ago

Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.

25 momentum · 162 stars · 10 forks · rank #159
computer-vision · dataloading · datasets · distributed-training · document-understanding · multi-modal-learning · pdf-document · webdataset