Every time I think of an animal that mirrors my spirit, I think of cats. Big cats, domesticated cats, but cats! So one day, while writing about the skillset of a data scientist, I wondered which animal would best represent their spirit. A Google search and a few minutes later, I was fixated on a picture of a polar bear. I knew my answer. Now, you may say that I have run out of analogies. But before you dismiss my early morning ramblings, hang on a little longer, and you will see the value in this seemingly ridiculous comparison.
The Arctic ice presents a polar bear with an enormous amount of data, which it processes to hunt, protect, mate, and, most importantly, survive. Similarly, a data scientist remains unperturbed in the face of a million rows and columns of data. It is amazing how well the characteristics of one apply to the other. Big Data has usually been defined by its 5 Vs. Let’s look at how data scientists and polar bears use these to their advantage:

  1. Volume – This is the first real parameter. The sheer amount of data generated every second is no joke. According to best-selling author and leading data expert Bernard Marr, “If we take all the data generated in the world between the beginning of time and the year 2008, it is the same amount we now generate every minute!” Storing and analyzing data at this scale requires a sophistication that traditional database technologies lack. But with big data technologies, data scientists structure and analyze this data to draw insights and add value to businesses. For a polar bear, this is very similar to studying its environment. Females survey the terrain for maternity dens and feeding grounds, and secure the vicinity against potential threats to their cubs. Polar bears are marine mammals: they are born on land but spend most of their time at sea or on ice. At double the size of a Siberian tiger, moving across stretches of ice without giving away a hint of their presence is a mammoth task for these solitary scavengers.
  2. Velocity – A polar bear is under tremendous pressure to feed itself due to the growing scarcity of food and habitat loss in the Arctic. If it doesn’t strike in time, land animals can easily outrun it, and it cannot outswim sea animals either. Yet polar bears still manage to be the top predators in their habitat, thanks to their ability to think and move quickly. Their average speed may be just 5 to 6 km per hour, but they can sprint at up to 40 km per hour if required. Similarly, data scientists build models that analyze a trillion sets of data in real time to predict trends, aid e-commerce and retail businesses, detect fraudulent activities, and make stock investments. They sift through structured and unstructured data generated at great velocity in the form of text, images, videos, and sensor data to find trends, patterns, and valuable insights.
  3. Variability – The data generated in this era is not only unstructured but also dynamic. It is constantly changing in real time. Tweets, images, and videos can go viral within minutes. Variability becomes especially important where words constitute the data, because the meaning of a word changes with context and timing. IBM explains this with a healthcare example: “a diagnosis of ‘CP’ may mean chest pain when entered by a cardiologist or primary care physician but may mean ‘cerebral palsy’ when entered by a neurologist or paediatrician. Because true interoperability is still somewhat elusive in healthcare data, variability remains a constant challenge.” Another example is how a data scientist has to build models and run algorithms to discard inaccurate data in sentiment analysis. A polar bear is also adapting to the ramifications of climate change by feeding on alternate food sources such as birds, crabs, snow geese, and rodents. While it has evolved to survive the coldest environment on the planet, with its two layers of fur and paw pads that provide traction on ice, a polar bear is sadly struggling with this particular V, unlike a data scientist.
  4. Visualization – According to NatGeo, a polar bear’s sense of smell is extraordinary; it can smell a seal or a human presence from up to a mile away or from under 3 feet of ice. Its vision and hearing are as good as a human’s. With these excellent physical senses, polar bears can visualize potential prey, threats, and more. A data scientist uses visualization tools such as Tableau, QlikView, Plotly, or Sisense in the same way: to communicate findings and break insights down into actionable steps for stakeholders. To make others see the value in what they see, data scientists must learn the principles of visualization and present data compellingly. A data scientist’s findings are useless if the teams can’t act upon them.
  5. Value – This is probably the most important V of all for a data scientist, as their effort is in vain if they are unable to translate their analysis into business value. This requires an advanced understanding of how the business works and where to look for solutions. Contrary to what the ‘intuitive versus data-driven’ decision-making debate suggests, data scientists do use intuition to arrive at the right approach to a problem. They define and solve a business problem by going through volumes of data, just as a polar bear scrutinizes its environment for seal resting grounds and birthing lairs to attack. A polar bear can stay still and alert for 18 hours at a stretch near a seal’s breathing hole, waiting to pounce at the right moment. Unlike grizzly and Kodiak bears, polar bears never hibernate!
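
For the curious, here is a tiny Python sketch of the “CP” example from the Variability point above. It is only an illustration of how context can change the meaning of the same token: the lookup table and the function name `expand_cp` are my own assumptions, built solely from the four specialties named in the IBM quote, and not how any real healthcare system resolves abbreviations.

```python
# Hypothetical lookup table, built only from the IBM example quoted above.
CP_BY_SPECIALTY = {
    "cardiologist": "chest pain",
    "primary care physician": "chest pain",
    "neurologist": "cerebral palsy",
    "paediatrician": "cerebral palsy",
}


def expand_cp(specialty):
    """Return the likely meaning of "CP" for a specialty, or None if unknown."""
    # Normalize case and surrounding whitespace before the lookup.
    return CP_BY_SPECIALTY.get(specialty.strip().lower())


if __name__ == "__main__":
    for who in ("Cardiologist", "Neurologist", "Dentist"):
        print(who, "->", expand_cp(who))
```

The same two letters resolve differently depending on who entered them, and a specialty outside the table returns `None` rather than a guess, which is the kind of ambiguity a data scientist has to handle explicitly.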

Do you now agree that a polar bear could be the spirit animal for a data scientist? Both are individualistic, intelligent, and inquisitive beings. But unfortunately, the similarities end here. What remains is an opposite trend: the population of polar bears is diminishing while data scientists are thriving. The ice beneath the polar bears is shrinking fast, leading to existential challenges, courtesy of climate change. But as far as data scientists are concerned, there couldn’t be a better time than now. Data scientists will enjoy the dawn of the new digital (smart) era and more, with higher pay packages, better job profiles, exciting challenges, and faster growth as they set up newer teams in big and small enterprises.


