
Segregating Data Efficiently Using ANOVA and Hypothesis Techniques


Hi, I am Trishala, and I am currently based in Bangalore. I work predominantly in the biopharmaceutical sector, as part of the R&D unit at Syngene International, one of the leading Contract Research and Manufacturing Organizations (CRMOs) in India. My current work involves research and development on drugs used in cancer treatment. Since we have clients from all over the world, we manage different portfolios and work in sync with them. Practically speaking, graduates with analytics skills can command higher salaries and enjoy their pick of the available jobs. Beyond the personal benefit, working at a CRMO means analytics enables me to process and interpret huge amounts of data so that it can be utilized across various research programs. The ability to think analytically and approach problems in the right way is a skill that's always useful, not just in the professional world but in everyday life as well, isn't it?

Since we have multiple clients, each with a great amount of data, there are safety concerns we need to address. These include safeguarding the data and keeping it aligned and in simpler forms for the client's understanding. When the data is large, it becomes very tiresome to segregate it into a manageable, uncomplicated form; data science helps us do this in a simpler and more efficient way. There is another issue concerning the outcome of a given assessment or experiment that we run: even if prior data from a similar experiment exists and could be referred to, we often cannot do so because the data isn't readily accessible or gets buried under multiple other assignments. These complications were hurting time management, as instead of doing active work we were spending more time segregating data and aligning it for our own or the team's understanding. It was time-consuming and was not adding value. On top of that, past data getting hidden under unwanted paperwork was setting us back and hindering cost management.

In any experimental research, the effect of the variable factors is the key player. The large datasets generated in these experiments can be analysed using Analysis of Variance (ANOVA), which validates the impact of two or more factors by comparing the means of different groups and measuring how far they differ from one another (a minimal code sketch follows the list below). I also used hypothesis testing: since the data generated is the result of several experimental outputs, those outputs can be predicted from prior data by building a matrix. My decision was based entirely on finding the means to get my work done efficiently and to minimise, and if possible completely eliminate, the time spent on activities that add no value to the organization or to my individual tasks. A few key factors I appraised when selecting these techniques were:

  1. Skilful time management
  2. Swift isolation and separation of data
  3. Possessing unambiguous and comprehensible data sets
  4. Adept cost management techniques
  5. Accelerated data recovery
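
To make the ANOVA step concrete, here is a minimal sketch of a one-way ANOVA in Python using scipy.stats.f_oneway. The group names and readings are hypothetical placeholders, not data from any real experiment; the point is simply how the means of several groups are compared under a single null hypothesis.

```python
# Minimal one-way ANOVA sketch on hypothetical assay readings
# from three experimental conditions (illustrative values only).
from scipy import stats

condition_a = [4.1, 4.5, 4.3, 4.8, 4.2]
condition_b = [5.0, 5.4, 5.1, 5.3, 5.2]
condition_c = [4.0, 4.2, 3.9, 4.4, 4.1]

# Null hypothesis: all group means are equal.
f_stat, p_value = stats.f_oneway(condition_a, condition_b, condition_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")

# Reject the null at the 5% level: at least one mean differs.
if p_value < 0.05:
    print("At least one group mean differs significantly.")
else:
    print("No significant difference between group means detected.")
```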

The key steps in solving the problem were:

  1. Identifying the problem: It was a challenge to identify the origin of the problem and where it started before I could address it.
  2. Using the correct tool: The real challenge in selecting a tool was whether it could be agile and scalable, and whether it could combine multiple sources of complex data and analyse them.

In a toxicological study conducted in my organization, chi-square and t-test modules were used. The objective was to assess the effect of a single drug across multiple formulations. The four groups in the study were positive control, low dose, mid dose and high dose. From this study, we were able to determine which group responded best to the drug, in turn indicating its higher efficacy at that dose (a hedged sketch of both tests appears below). My suggestion was to conduct a brainstorming session with all the Operational Unit heads to check whether these tools could be used across the organization. The impact was significant: it helped bring down the overall budget of numerous projects by about 1%, which in turn expanded revenue. It also brought efficiency, coherence and quality to the work we were doing.
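
The snippet below is a hedged sketch of how the two tests mentioned above might be run in Python with scipy. The dose-group values and responder counts are invented for illustration and do not come from the actual study.

```python
# Hypothetical sketch: two-sample t-test and chi-square test
# for a four-group dose study (all numbers are made up).
from scipy import stats

positive_control = [12.1, 11.8, 12.5, 12.0, 11.9]
high_dose = [14.2, 14.8, 13.9, 14.5, 14.1]

# Two-sample t-test: do the mean responses of the two groups differ?
t_stat, p_t = stats.ttest_ind(positive_control, high_dose)
print(f"t-test: t = {t_stat:.2f}, p = {p_t:.4f}")

# Chi-square test of independence on responder counts across the
# four groups (rows: group; columns: responders, non-responders).
observed = [
    [5, 15],   # positive control
    [8, 12],   # low dose
    [12, 8],   # mid dose
    [16, 4],   # high dose
]
chi2, p_chi, dof, expected = stats.chi2_contingency(observed)
print(f"chi-square: chi2 = {chi2:.2f}, p = {p_chi:.4f}, dof = {dof}")
```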

Data Science and Business Analytics are all about solving problems, and the program gave me an overall understanding of how they can benefit my current organisation and how I can use them in my future organisations.

