OCR + transformers
Hi everyone, I have a freelance data science project where I need to read 200k medical-record PDFs and extract the text from them. The current accuracy of the project is around 60%. The client is hell-bent on using some sort of transformer model to extract the data.
If anyone has experience with OCR and transformers and is interested in collaborating with me on this project, please email your resume and contact number to remotelute36@gmail.com
Hi, can you please share what you did in this project?
My team has worked on a similar project with the same problem statement: extracting name, age, doses, and ICD codes (and also normalising them) from hospital records.
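To give an idea of what the ICD normalising step can look like once OCR has produced plain text, here is a rough Python sketch. It is not what our pipeline actually ran; the function name `extract_icd10_codes` and the regex are made up for illustration, and the pattern assumes an ICD-10-style shape (a letter, two digits, an optional dot, then up to four more characters). Real code sets have more rules, and a loose pattern like this will also pick up look-alikes such as "B12".

```python
import re

# Assumed ICD-10-style shape: one letter, two digits, then an optional dot
# followed by one to four alphanumerics. This is a simplification.
ICD10_PATTERN = re.compile(r"\b([A-Za-z])(\d{2})\.?([A-Za-z0-9]{1,4})?\b")


def extract_icd10_codes(text):
    """Return unique ICD-10-like codes from text, normalised to 'A00.0' form."""
    codes = []
    for match in ICD10_PATTERN.finditer(text):
        letter, digits, suffix = match.group(1), match.group(2), match.group(3)
        code = letter.upper() + digits
        if suffix:
            # Re-insert the dot so "E119" and "e11.9" end up identical.
            code += "." + suffix.upper()
        if code not in codes:
            codes.append(code)
    return codes


if __name__ == "__main__":
    print(extract_icd10_codes("Dx: e11.9, I10 and E119 noted"))
```

In practice you would probably also check every candidate against an official code list, since OCR tends to confuse characters like O and 0.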
Most transformer language models are text-to-text, so they can't read the scanned pages directly. Try ViT (Vision Transformer). It's supposed to work well on image classification tasks, so maybe you'd have to draw bounding boxes around each character, classify each one, and append the result to the output text.
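The "classify each box and append" part could be sketched roughly like this in Python. Everything here is an assumption for illustration: `boxes_to_text`, the `(x, y, width, height)` box format, and the `line_tolerance` parameter are hypothetical, and `classify` stands in for whatever model (ViT or otherwise) maps one character box to one character. The sketch only covers putting the predictions back in reading order, not detection or classification.

```python
def boxes_to_text(boxes, classify, line_tolerance=0.5):
    """Classify character boxes and join the results in reading order.

    boxes: list of (x, y, width, height) tuples in pixels (assumed format).
    classify: hypothetical callable that maps one box to one character.
    line_tolerance: fraction of box height within which two vertical
        centres are treated as belonging to the same text line.
    """
    if not boxes:
        return ""

    def centre_y(box):
        return box[1] + box[3] / 2

    # Sort top to bottom, then start a new line on each large vertical jump.
    ordered = sorted(boxes, key=centre_y)
    lines = [[ordered[0]]]
    for box in ordered[1:]:
        prev = lines[-1][-1]
        if abs(centre_y(box) - centre_y(prev)) <= line_tolerance * prev[3]:
            lines[-1].append(box)
        else:
            lines.append([box])

    # Within each line, read left to right.
    return "\n".join(
        "".join(classify(box) for box in sorted(line, key=lambda b: b[0]))
        for line in lines
    )


if __name__ == "__main__":
    # Stand-in classifier: a lookup table instead of a real model.
    fake = {(0, 0, 8, 10): "H", (10, 1, 8, 10): "I",
            (0, 20, 8, 10): "O", (10, 21, 8, 10): "K"}
    print(boxes_to_text(list(fake), fake.get))
```

Grouping by vertical centre is a simple heuristic; skewed scans or multi-column pages would need something sturdier.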