
My experience in an iOS Hackathon

This is my second hackathon. My first one was on machine learning; if you want to check out that article, follow the link below:
https://thangaayyanar.blogspot.com/2018/02/what-i-learned-from-machine-learning.html

So let's get started

First, let us discuss the idea we were trying to achieve in this hackathon.






From the above image you can see that we are going to recognize the text in an image and use it to work out which field that text belongs to.

We separated this idea into three modules:
  1. Identify the region
  2. Recognize the text 
  3. Field classification
Module I: Identify the region
  • To identify the text region we used the Vision framework (Apple's computer vision framework, which can detect objects such as text in an image).
  • The Vision framework gives us the boundary of each text region (i.e. its frame: x, y, width, height).
  • Using that frame, we crop the region out of the image and pass it to the next module.
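The cropping step can be sketched as plain arithmetic. This is only an illustration in Python, not the Swift code from the hackathon, and it assumes the boxes arrive the way Vision's `boundingBox` reports them: normalized to the 0...1 range with the origin at the bottom-left corner. The names `Rect` and `to_pixel_rect` are made up for this example.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x: float
    y: float
    width: float
    height: float


def to_pixel_rect(normalized: Rect, image_width: int, image_height: int) -> Rect:
    """Convert a normalized, bottom-left-origin box to top-left-origin pixels."""
    width = normalized.width * image_width
    height = normalized.height * image_height
    x = normalized.x * image_width
    # Flip the y axis: the top edge of the box sits (y + height) above the bottom.
    y = (1.0 - normalized.y - normalized.height) * image_height
    return Rect(x, y, width, height)


# A box covering the lower-left quarter of a 200x100 image.
print(to_pixel_rect(Rect(0.0, 0.0, 0.5, 0.5), 200, 100))
```

If the y axis is not flipped like this before cropping, the crop lands on the wrong part of the image, which is one possible way for text regions to come out badly cropped.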
Module II: Recognize the text
  • To recognize the text we tried several methods.
  • First we went with Tesseract (an open source OCR engine).
  • Tesseract takes a little time to configure and its accuracy is OK. It has one advantage: we can train it with our own data.
  • Next we checked out ML Kit from Google, which has text detection.
  • Configuration is not that hard if you follow the Google docs, and its accuracy is better than Tesseract's.
  • To integrate ML Kit we needed to configure Firebase, and it made our project pretty big (> 300 MB).
  • Text detection works both online and offline. We used the offline version.
  • In the end we stuck with Google ML Kit for its accuracy.
Module III: Field classification
  • To classify the field we used a Naive Bayes classifier.
  • We didn't build a full machine learning model here because we didn't have time.
  • So we borrowed code from a GitHub developer who had written a Naive Bayes implementation in Swift.
  • The problem with this approach is that we need to train the classifier every time the app starts.
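For readers who have not met Naive Bayes before, here is a minimal sketch of the idea in Python. It is not the Swift code we borrowed. The class name, the field labels and the training phrases are all made up for illustration, and add-one smoothing is my assumption about how unseen words get handled.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over word tokens with add-one smoothing."""

    def __init__(self):
        self.label_counts = Counter()             # training samples per label
        self.token_counts = defaultdict(Counter)  # token frequencies per label
        self.vocabulary = set()

    def train(self, text, label):
        tokens = text.lower().split()
        self.label_counts[label] += 1
        self.token_counts[label].update(tokens)
        self.vocabulary.update(tokens)

    def classify(self, text):
        tokens = text.lower().split()
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(count / total)
            denom = sum(self.token_counts[label].values()) + len(self.vocabulary)
            for token in tokens:
                score += math.log((self.token_counts[label][token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data: words that tend to appear near each field.
classifier = NaiveBayes()
classifier.train("phone mobile tel", "phone")
classifier.train("email mail", "email")
classifier.train("street road avenue", "address")

print(classifier.classify("mobile phone"))  # phone
print(classifier.classify("main street"))   # address
```

The trained state here is just a few word counts, so in principle it could be saved once and reloaded instead of being rebuilt at every launch. We did not get to try that during the hackathon.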
Once every module was done, we integrated them. The result did not work as well as we expected, and we figured out where the problem was: the Vision module. I think we missed something there, so some text regions were not cropped properly, and then time ran out.

We submitted our project

Well, we learnt a lot of things from this hackathon:
  • Working with machine learning in iOS
  • A model trained in Python can be converted to a model that iOS supports, using Core ML Tools
  • Came across a few new terms: CNN (Convolutional Neural Network), SVM, classification algorithms
  • Frameworks to develop models: Keras, Caffe, TensorFlow, scikit-learn
I will drop a few of the links that I referred to during this hackathon.

Have a great day

Resources: 
Bonus Resources:
