We were able to get precision scores of more than 79% on our AI model – Akshay Ram, PGP AIML

Applying AI at work can help you improve the process. Learn about Akshay’s journey with Great Learning’s PGP Artificial Intelligence and Machine Learning Course in his own words.

I am Akshay from Chennai, employed as a Data Scientist at Neoware Technology Solutions. I was previously a Navigational Officer in the merchant navy for six years.

Problem: Given a string with a heavily varying format, I used AI to extract the required fields. I created a basic annotation tool in a Jupyter notebook to annotate data until a more robust annotation tool could be found. I created annotated data from the strings and built a tool that converted the annotated data into the format required for training.
I then trained the NER model, tested it, and deployed it onto the client’s Ubuntu server using Flask. I enabled multi-threading to serve multiple calls at once and set the application up as a service so that it comes back up after a server restart.
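The conversion step described above can be sketched roughly as follows. This is a minimal illustration, not the tool we built: it assumes the training format is a list of character-offset entity spans, `(text, {"entities": [(start, end, label)]})`, which is the layout used by older spaCy training examples. The function name `to_training_example`, the field labels, and the sample string are all made up for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical shapes: annotations map a label to the exact substring a
# human marked; the output pairs the text with character-offset spans.
Annotation = Dict[str, str]
TrainingExample = Tuple[str, Dict[str, List[Tuple[int, int, str]]]]


def to_training_example(text: str, fields: Annotation) -> TrainingExample:
    """Turn {label: substring} annotations into (start, end, label) spans."""
    spans = []
    for label, value in fields.items():
        start = text.find(value)
        if start == -1:
            # The annotated value does not occur verbatim in the text; skip it.
            continue
        spans.append((start, start + len(value), label))

    # Sort by position and drop overlapping spans, since NER trainers
    # generally reject overlapping entities.
    spans.sort()
    kept = []
    last_end = -1
    for start, end, label in spans:
        if start >= last_end:
            kept.append((start, end, label))
            last_end = end
    return text, {"entities": kept}


# Made-up sample string and labels, for illustration only.
example = to_training_example(
    "Invoice INV-1042 dated 12 Mar 2021 for ACME Ltd",
    {"INVOICE_NO": "INV-1042", "DATE": "12 Mar 2021", "CUSTOMER": "ACME Ltd"},
)
```

Here `example[1]["entities"]` holds one span per field, and slicing the text with each `(start, end)` pair gives back the annotated value.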

Initially, because we had neither a data annotation tool nor the data itself, we were extracting the required fields using very complicated regular expressions and pattern-matching look-ups. Though we were able to extract some information, the performance wasn’t up to the mark: our regular expressions were computationally expensive and consumed a lot of memory and time to process each string. Tools used: spaCy, OpenCV, SBERT, computer-vision models, and TensorFlow-Keras.

The recommendation was to drop regular expressions rather than keep making them more complex for specific cases, and to switch to a production-ready, easy-to-implement NER model. Though many libraries are production-ready, producing proper data for training was a big challenge. We tried multiple open-source annotation tools: some were complete, some were too simple, some were broken, and others produced data in a completely different format from what we required. So we built an interim data annotation tool using Python and a Jupyter notebook.

Using the tool required at least an above-average understanding of Python, and all of our company’s Pythoneers were engaged in annotation while also trying out various annotation tools. Finally, we found one that suited our needs and deployed it to a Red Hat CentOS server using Docker. Once the tool was up and running, annotation became a breeze, and we were producing properly annotated data at almost twice the speed of the original custom annotation tool. Once proper data became available in the right quantity and quality, extraction, conversion, and training came into play. We used an Nvidia 1060 8GB GPU to try out many different iterations until the model we had been looking for all along surfaced!

Once we deployed our very first AI model, we gained the ability to extract data from context instead of matching strings. Given how computationally expensive our regular expressions were, switching to a model returned results in a fraction of a second, compared with more than 3-4 seconds of processing by the regular expressions.

Memory usage fell by over 30-40%. (The regular expressions were really complex and sometimes iterated through the same line 4-5 times to look for patterns. Some functions would scan a line and then go up to 5 lines above and below it to find pattern matches.)
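To give a feel for the kind of windowed look-up described above, here is a small sketch. It is not our production code: the patterns `LABEL` and `VALUE`, the function `find_value_near_label`, and the "total amount" field are hypothetical stand-ins. The point is only that every label hit can trigger a rescan of up to 11 lines (5 above, 5 below, plus the line itself), which is where repeated passes over the same text come from.

```python
import re
from typing import List, Optional

# Hypothetical patterns, for illustration only.
LABEL = re.compile(r"total\s+amount", re.IGNORECASE)
VALUE = re.compile(r"\d+(?:\.\d{2})?")


def find_value_near_label(lines: List[str], window: int = 5) -> Optional[str]:
    """Find a value on the label's line or within `window` lines of it."""
    for i, line in enumerate(lines):
        label_match = LABEL.search(line)
        if not label_match:
            continue
        lo = max(0, i - window)
        hi = min(len(lines), i + window + 1)
        # Check the label's own line first, then neighbours by distance.
        for j in sorted(range(lo, hi), key=lambda j: abs(j - i)):
            text = lines[j]
            if j == i:
                # On the label's line, only look at what follows the label.
                text = text[label_match.end():]
            value_match = VALUE.search(text)
            if value_match:
                return value_match.group(0)
    return None


same_line = find_value_near_label(["Total amount: 20"])
below = find_value_near_label(["Header", "Total Amount", "", "USD", "149.50"])
```

With real data, each new layout tends to need another pattern and another window rule, which is how the look-ups grew so complicated.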

This was my very first time deploying code on a Linux server. Though I had used Linux before, the absence of a GUI on the server scared me. And given that it was the client’s server, where another application was already running, there was almost no room for error. With great encouragement from my boss and patience from the client, I deployed my very first API to the web.

Once that was done, the process of updating models wasn’t as complicated, and things started taking shape. As the training and testing data increased, our model started generalizing well. We were able to get precision scores of more than 79%, and they are still improving!
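For readers unfamiliar with the metric, precision for an NER model can be computed along these lines. This is a generic sketch that assumes exact-match, entity-level scoring (a predicted span counts only if its start, end, and label all match an annotated span); it is not necessarily how our scores were calculated, and the spans below are made up.

```python
from typing import Set, Tuple

Span = Tuple[int, int, str]  # (start, end, label)


def precision(predicted: Set[Span], gold: Set[Span]) -> float:
    """Fraction of predicted entities that exactly match an annotated one."""
    if not predicted:
        return 0.0
    true_positives = len(predicted & gold)
    return true_positives / len(predicted)


# Made-up spans: four predictions, three of which match the annotations.
gold_spans = {(8, 16, "INVOICE_NO"), (23, 34, "DATE"), (39, 47, "CUSTOMER")}
predicted_spans = {
    (8, 16, "INVOICE_NO"),
    (23, 34, "DATE"),
    (39, 47, "CUSTOMER"),
    (0, 7, "CUSTOMER"),  # spurious prediction
}
score = precision(predicted_spans, gold_spans)  # 3 correct out of 4 -> 0.75
```

Precision only measures how many of the model’s extractions are right; it says nothing about fields the model missed, which is what recall captures.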

Great Learning Team
