
Structuring Data To Draw Useful Insights in the Media and Entertainment Industry

HARSHIT PURI

I completed my Bachelor’s in Computer Applications at Swami Vivekanand Subharti University in 2019. After that, I started working as a Data Analyst on a US-based project at a start-up. My area of work is maintaining a database of our clients and handling day-to-day operations. With a tactical knowledge of how data works, my passion is to explore opportunities. So I explored different industry domains, such as e-commerce, fintech, education, and travel, and their applications in analytics beyond the academic arena, to become a better, decision-driven data analyst. I currently work as a Data Analyst at Editorji. Before joining this organization, I worked as a Vocational Trainer in a Delhi Government School, on a project sponsored by the state government.

Problem Statement: We used to obtain a lot of unstructured data that was difficult to analyze and draw inferences from. My role was to structure the data for better analysis. We had to select the right methodology, analyze the data further, and explain the results in the business context. These were the major problems faced at the workplace. Editorji depends largely on user engagement on the platform. Even with changes in methodology, user engagement on the platform is not stable. It is also not easy to predict or analyze user engagement, as there are many outliers in the data. In addition, the database is in test mode right now. For example, Editorji is a digital media news organization, and we upload content to our platform every day. If a piece of content or news is uploaded on Day 1, a user might still view it on the 100th day, so predictions with the data are not possible. The data was not reliable enough to form hypotheses or make predictions. As a result, insights for growing the application and our platform are not on point, because the database is in testing mode, the data is not structured, and a schema cannot be defined for the time being.
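To make the outlier problem concrete, here is a minimal sketch of how a late view (content uploaded on Day 1 but watched around the 100th day) stands out from the rest. The events, the `view_lags` and `iqr_outliers` helpers, and the choice of Tukey's IQR rule are all assumptions made for illustration; the analysis described in this article may have handled outliers differently.

```python
from datetime import date
from statistics import quantiles

# Hypothetical view events: (content upload date, date the user viewed it).
views = [
    (date(2022, 1, 1), date(2022, 1, 1)),
    (date(2022, 1, 1), date(2022, 1, 1)),
    (date(2022, 1, 1), date(2022, 1, 2)),
    (date(2022, 1, 1), date(2022, 1, 2)),
    (date(2022, 1, 1), date(2022, 1, 3)),
    (date(2022, 1, 1), date(2022, 1, 3)),
    (date(2022, 1, 1), date(2022, 1, 4)),
    (date(2022, 1, 1), date(2022, 4, 10)),  # viewed about 100 days after upload
]


def view_lags(events):
    """Number of days between upload and view for each event."""
    return [(viewed - uploaded).days for uploaded, viewed in events]


def iqr_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


lags = view_lags(views)
late_views = iqr_outliers(lags)
print("lags in days:", lags)
print("flagged as outliers:", late_views)
```

Flagging such views separately, rather than dropping them, keeps the long tail visible while preventing it from distorting averages.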

Tools and Techniques Used:

Step 1: As the database is not structured and has no schema, I connected the MongoDB database to Jupyter to get a glimpse of the data. Because the database is in testing mode, I needed to check whether the data in it could be evaluated or whether there was some problem. I used Python to represent the data in a structured form.

Step 2: Identification of relevant information from the structured data. This covered information such as monthly views on the platform and whether the engagement percentage had increased.

Step 3: Using Python, I also identified the number of users visiting the platform, how frequently they came to the platform, and how often they clicked on notifications or watched videos (twice or thrice). An important thing to note is that when a user watches the same video again after a month, the number of outliers is quite high.

Step 4: The data was analyzed for unique users and for how many of them actually went through the media website and clicked on the notifications. Python helped me analyze the user traffic.
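The four steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the sample documents, the field names (`user.id`, `event`, `ts`, `video.id`), and the helper functions are hypothetical and do not reflect Editorji's actual data. With a live database, the documents would typically be read through a MongoDB driver such as `pymongo` instead of the hard-coded list used here.

```python
from collections import Counter

# Hypothetical schemaless records, shaped like documents from a MongoDB
# collection. Fields are inconsistent on purpose: some are missing.
raw_docs = [
    {"user": {"id": "u1"}, "event": "video_view", "ts": "2022-06-03", "video": {"id": "v1"}},
    {"user": {"id": "u1"}, "event": "video_view", "ts": "2022-06-20", "video": {"id": "v1"}},
    {"user": {"id": "u2"}, "event": "notification_click", "ts": "2022-06-21"},
    {"user": {"id": "u2"}, "event": "video_view", "ts": "2022-07-02", "video": {"id": "v2"}},
    {"user": {"id": "u3"}, "event": "video_view", "ts": "2022-07-05", "video": {"id": "v1"}},
    {"user": {"id": "u1"}, "event": "video_view", "ts": "2022-07-09", "video": {"id": "v2"}},
    {"event": "video_view", "ts": "2022-07-10", "video": {"id": "v2"}},  # no user
]

COLUMNS = ["user.id", "event", "ts", "video.id"]


def flatten(doc, parent="", sep="."):
    """Flatten nested dicts into a single level with dotted keys."""
    row = {}
    for key, value in doc.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            row.update(flatten(value, name, sep))
        else:
            row[name] = value
    return row


def to_rows(docs, columns):
    """Step 1: give every record the same columns, filling gaps with None."""
    rows = []
    for doc in docs:
        flat = flatten(doc)
        rows.append({col: flat.get(col) for col in columns})
    return rows


def pct_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100


rows = to_rows(raw_docs, COLUMNS)
view_rows = [r for r in rows if r["event"] == "video_view"]

# Step 2: views per month (ts is an ISO date, so the first 7 chars are YYYY-MM).
monthly_views = Counter(r["ts"][:7] for r in view_rows)
june_to_july = pct_change(monthly_views["2022-06"], monthly_views["2022-07"])

# Step 3: users who watched the same video two or more times.
views_per_pair = Counter((r["user.id"], r["video.id"]) for r in view_rows)
repeat_viewers = {
    user for (user, _), n in views_per_pair.items() if user is not None and n >= 2
}

# Step 4: unique users, and those who clicked a notification.
unique_users = {r["user.id"] for r in rows if r["user.id"] is not None}
notification_clickers = {
    r["user.id"] for r in rows
    if r["event"] == "notification_click" and r["user.id"] is not None
}

print("unique users:", len(unique_users))
print("repeat viewers:", sorted(repeat_viewers))
print("notification clickers:", sorted(notification_clickers))
print("monthly views:", dict(monthly_views))
```

Fixing the column list up front is one simple way to work with a collection that has no schema: records with missing fields still produce complete rows, and the gaps show up as `None` instead of raising errors during analysis.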

Insights: After connecting the database with Python, I found that our user retention has increased on a yearly basis. However, the average rate of installs is lower than the average rate of uninstalls on a yearly basis. The following were the important observations made:

1. How many users visit the platform

2. How many of them click twice/thrice

3. Predictions were difficult as there were many outliers

4. The testing team is able to draw inferences from the analysis done

Solution/Recommendations:

The solution to the problem is to find a different way to increase our reach across India. My team proposed making our platform a Supply-Side Platform (SSP) and a Demand-Side Platform (DSP). The important thing to note here is that the data was useful when unique users were considered for analysis.

Impact Generated: We witnessed an increase in the user retention rate after using Python for analysis. For Q1, it increased up to 13.4%, and in Q2 it rose up to 19.5%. In addition, application downloads increased by 11.49% from June to July 2022. We will most probably be working with big players in the coming time as a Demand-Side Platform (DSP); the work is in progress. This helped my progress as a Data Analyst. I am also exploring other tools, such as MongoDB Compass and Power BI. I understood how data that is not reliable can still give probable insights.

Sandeep T
