Including Text Recognition in App

Hey guys!
So my plan is to create a real-time text recognition app.

To start out, I am currently trying to make a simple on-device OCR app (not via a cloud API) which takes a photo from the CameraView, extracts the text, and then displays it on the screen. After looking into possible solutions a lot, I came to the conclusion that Tesseract might be the best choice. Since I am coming from the web dev world, I have no experience working with foreign code like ObjC and did not even know where to start. It feels like I would need to spend much, much more time learning ObjC and Uno in order to solve a fairly niche issue. Being optimistic, I believe there has to be a better and more efficient way to do this, and that's why I am asking you for help.

I found out that Tesseract actually also provides a JS version (tesseract.js).
The problem is that I am not really sure how to use npm packages in Fuse, or whether it is even going to work, since it probably depends heavily on Node.js core features…
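For what it's worth, in plain Node the call would look roughly like this. This is just a sketch assuming tesseract.js's v2 API (`Tesseract.recognize` accepting a data URI and resolving with `{ data: { text } }`); the helper names are my own, and whether the `require` works inside Fuse's JS runtime is exactly the open question:

```javascript
// Pure helper: wrap a raw base64 string as a data URI.
function toDataUri(base64, mime) {
  return "data:" + (mime || "image/jpeg") + ";base64," + base64;
}

// Hypothetical usage (needs `npm install tesseract.js`; untested in Fuse):
function recognizeBase64(base64) {
  var Tesseract = require("tesseract.js");
  return Tesseract.recognize(toDataUri(base64), "eng")
    .then(function (result) { return result.data.text; });
}
```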

Would love to hear your thoughts! Thanks

Just had a quick look at this now to see how far I’d get, here’s where I got to: (161.1 KB)

I was basically trying to process a base64 image as a test; if you can get that going, then getting an image from the CameraView isn't a hard stretch.
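For the camera-to-base64 step, something like this might work in Fuse. This is a sketch assuming the `FuseJS/Camera` and `FuseJS/ImageTools` modules from Fuse's docs (`takePicture` resolving with an image, `getBase64FromImage` resolving with a base64 string); `looksLikeBase64` and `captureAsBase64` are hypothetical helper names of mine:

```javascript
// Pure helper: crude sanity check that a string looks like raw base64.
function looksLikeBase64(s) {
  return typeof s === "string" && /^[A-Za-z0-9+/]+={0,2}$/.test(s);
}

// Hypothetical wiring (not runnable outside the Fuse runtime):
function captureAsBase64() {
  var Camera = require("FuseJS/Camera");
  var ImageTools = require("FuseJS/ImageTools");
  return Camera.takePicture()
    .then(function (image) { return ImageTools.getBase64FromImage(image); });
}
```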

Got stuck on the worker.js part but hopefully this gives you a boost to getting started, all the best and lemme know how it goes.

I see someone also asked a similar Q: Writing app with stylus input and OCR

This will probably be better and more performant:

The new Firebase ML Kit is definitely the way to go. Tesseract has long since been deprecated.

Thanks so much! Unfortunately I was not able to get it working yet…

Thanks, I thought about that too. The problem is that ML Kit only provides Swift/ObjC code, and judging by some YouTube tutorials you also have to do quite a lot in Xcode. Tbh it is very hard for me to get into ObjC, let alone implement it via Uno in Fuse…

ML Kit does Java too :stuck_out_tongue: but you're probably looking for a very high-level way to go about it, right?

Yes I saw that too :stuck_out_tongue: Indeed, I am.

Ok, in that case, I would POST my image to Google's Vision API directly for ML analysis; huge server farms give more performance than a single device could ever yield…
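To make that concrete, a sketch of the call, assuming the Vision API's v1 `images:annotate` REST endpoint with an API key (`buildVisionRequest` and `detectText` are my own helper names, and error handling is omitted):

```javascript
// Build the JSON body the v1 images:annotate endpoint expects.
function buildVisionRequest(base64Image) {
  return {
    requests: [{
      image: { content: base64Image },
      features: [{ type: "TEXT_DETECTION" }]
    }]
  };
}

// Hypothetical usage: POST the image and pull out the full-text annotation.
function detectText(base64Image, apiKey) {
  return fetch("https://vision.googleapis.com/v1/images:annotate?key=" + apiKey, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildVisionRequest(base64Image))
  })
  .then(function (res) { return res.json(); })
  .then(function (json) {
    var ann = json.responses[0].textAnnotations;
    return ann && ann.length ? ann[0].description : "";
  });
}
```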

Yes, you are right! However, I do not want to use the API: once requests exceed a certain quota, Google starts charging. On-device OCR would be enough for my needs, and I would not have to think about what happens if more people than expected use the app… Hm, I might have to move to NativeScript for this project, which is kinda sad.

Yeah, I see: offline usage and a high-level API are key for you, so one of the hybrids is probably better for ya, but you'll be missing out on the awesome performance and dev experience. The native integration is really just connecting the inputs and outputs of the module (ObjC & Java) through Uno to UX or JavaScript. Anyways, all the best with your mission man.
