Application Closes 1st May 2024


PGP-Data Science & Analytics

Kickstart your career in Data Science | Learn In-demand Tools & Languages

Key highlights of the Data Science course

  • Industry-ready curriculum
  • 10 years of excellence
  • 200+ successful batches
  • 1:1 mentorship
  • Dedicated Career Support
  • 150+ hours of learning content
  • Certificate from Great Lakes Executive Learning

Skills you will learn

  • Python
  • Data Mining
  • Tableau
  • Machine Learning
  • SQL
  • ChatGPT

India's trusted education platform

  • 4.8 stars
  • 4.7 stars
  • 4.7 stars
  • 4.97 stars

Our alumni work at top companies

Curriculum

Unit 1

Data Science Foundations

Introduction to Data Science and AI (Self-Paced)

Gain an understanding of how Data Science has evolved over time, its applications across industries, the mathematics and statistics behind it, and the life cycle of building data-driven solutions.

  • The fascinating history of Data Science
  • Transforming Industries through Data Science
  • The Math and Stats underlying the technology
  • Navigating the Data Science Lifecycle

Python for Data Science - 4 Weeks

Python Programming:

Python is a widely used, high-level, interpreted programming language with a simple, easy-to-learn syntax that emphasizes code readability. This module covers the fundamentals of Python programming and the first steps in organizing data with Python.

  • Variables and Datatypes
  • Data Structures
  • Conditional and Looping Statements
  • Functions
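The four topics above can be seen together in one short script. This is a minimal sketch, not course material: the dictionary reuses the module durations stated on this page, and the function names total_weeks and long_modules are hypothetical.

```python
# Variables and data types: a dictionary mapping module names (str) to weeks (int).
module_weeks = {
    "Python for Data Science": 4,
    "Introduction to SQL": 4,
    "Predictive Modeling": 5,
}


def total_weeks(modules):
    """Sum the durations of all modules with a simple loop."""
    total = 0
    for weeks in modules.values():
        total += weeks
    return total


def long_modules(modules, threshold):
    """Return the names of modules that run longer than `threshold` weeks."""
    names = []
    for name, weeks in modules.items():
        if weeks > threshold:  # conditional statement inside a loop
            names.append(name)
    return names


print(total_weeks(module_weeks))
print(long_modules(module_weeks, 4))
```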

Python for Data Science:

NumPy is a Python package for mathematical and scientific computing and involves working with arrays and matrices. Pandas is a fast, powerful, flexible, and simple-to-use open-source library in Python to manipulate and analyse data. This module will cover these important libraries and provide a deep understanding of how to use them to explore data.

  • NumPy arrays and functions
  • Accessing and modifying NumPy arrays
  • Saving and loading NumPy arrays
  • Pandas Series (Creating, Accessing, and Modifying Series)
  • Pandas DataFrames (Creating, Accessing, Modifying, and Combining DataFrames)
  • Pandas Functions
  • Saving and loading datasets using Pandas

Python for Visualization:

Matplotlib is a library for creating static, animated, and interactive visualisations, whereas Seaborn is a Matplotlib-based data visualisation library in Python.

This module will give you a deep understanding of exploring data sets using Matplotlib and Seaborn.

  • Histogram, Boxplots and Bar graphs
  • Line Plot, Scatterplot, and Lmplot
  • Jointplot, Violin Plot, and Stripplot
  • Swarm, catplot, and pairplots
  • Heatmaps, Plotly, and Customizing of Plots

Exploratory Data Analysis (Deep Dive)

Exploratory Data Analysis, or EDA, is the process of examining and visualizing data to uncover patterns, extract meaningful insights, and facilitate storytelling. This module provides deep insight into how to conduct EDA using Python and use the insights extracted to drive business decisions.

  • Data overview
  • Univariate analysis
  • Bivariate/Multivariate analysis
  • Missing value treatment

Introduction to SQL - 4 Weeks

Querying Data With SQL

SQL is a widely used querying language for efficiently managing and manipulating relational databases. This module provides an essential foundation for understanding and working with relational databases. Participants will explore the fundamentals of setting up MySQL, including installation and configuration, gain insight into the principles of database management and Structured Query Language (SQL), and learn how to fetch and filter data using SQL queries, enabling them to extract valuable insights from large datasets efficiently.

  • Getting set up with MySQL
  • Introduction to DB and SQL
  • Fetching data in SQL
  • Filtering data in SQL

 

SQL In-Built Functions

SQL offers a wide range of numeric, string, and date functions. This module provides a comprehensive exploration of the functions available within SQL for data manipulation and analysis, so that participants gain proficiency in using them to perform advanced calculations, string manipulations, and date operations. Participants will also learn how to aggregate data using SQL functions, enabling them to summarize and analyze large datasets effectively.

  • Numeric Functions in SQL
  • String Functions in SQL
  • Date Functions in SQL
  • Aggregating data in SQL

 

Advanced Querying

SQL joins combine data from multiple tables, while window functions enable complex analytical tasks such as ranking, partitioning, and aggregating data within specified windows. Subqueries allow one to nest queries within other queries. This module will equip participants with advanced techniques for querying and analyzing relational databases to extract and manipulate data dynamically.

  • Joins in SQL
  • Window functions in SQL
  • Subqueries
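The joins, aggregation, and subqueries described above can be sketched as follows. This is a minimal illustration that assumes Python's built-in sqlite3 module as a stand-in for MySQL, which is what the module itself uses. The customers and orders tables and all their rows are hypothetical.

```python
import sqlite3

# In-memory database so the example leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical tables and rows, for illustration only.
cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
cur.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Asha"), (2, "Ravi"), (3, "Meera")],
)
cur.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, 250.0), (2, 1, 100.0), (3, 2, 400.0)],
)

# Inner join plus aggregation: total order amount per customer.
# Customers without orders (Meera) do not appear in an inner join.
join_rows = cur.execute(
    """
    SELECT c.name, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()

# Subquery: orders whose amount is strictly above the average order amount.
big_orders = cur.execute(
    """
    SELECT id, amount
    FROM orders
    WHERE amount > (SELECT AVG(amount) FROM orders)
    ORDER BY id
    """
).fetchall()

print(join_rows)
print(big_orders)
conn.close()
```

The same SELECT statements should carry over to MySQL largely unchanged, although data types and setup commands differ between the two systems.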

Unit 2

Data Science Techniques

Inferential Statistics - 4 Weeks

Inferential Statistics Foundations

Inferential statistics is pivotal in statistical analysis and decision-making and involves drawing conclusions about populations based on samples. This module will introduce learners to the common probability distributions and how they are used to make statistically-sound, data-driven decisions.

  • Experiments, Events, and Definition of Probability
  • Introduction to Inferential Statistics
  • Introduction to Probability Distributions (Random Variable, Discrete and Continuous Random Variables, Probability Distributions)
  • Binomial Distribution
  • Normal Distribution
  • z-score
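The binomial distribution, normal distribution, and z-score listed above can be computed with Python's standard library alone. This is a minimal sketch under stated assumptions: the function names and the numbers used (a score of 85 against a mean of 70 and a standard deviation of 10, and 10 fair coin flips) are hypothetical.

```python
from math import comb
from statistics import NormalDist


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above or below the mean."""
    return (x - mean) / std_dev


# Probability of exactly 5 heads in 10 fair coin flips.
p_five_heads = binomial_pmf(5, 10, 0.5)

# z-score of a hypothetical observation, then the share of a normal
# population expected to fall below it.
z = z_score(85, 70, 10)  # 1.5
p_below = NormalDist().cdf(z)

print(p_five_heads, z, round(p_below, 4))
```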

Estimation and Hypothesis Testing

Estimation involves determining likely values for population parameters from sample data, while hypothesis testing provides a framework for drawing conclusions from sample data about the broader population. This module covers the central limit theorem and estimation theory, which are vital for statistical analysis, and the framework for conducting hypothesis tests.

  • Sampling
  • Central Limit Theorem
  • Estimation
  • Introduction to Hypothesis Testing (Null and Alternative hypothesis, Type-I and Type-II errors, alpha, critical region, p-value)
  • Hypothesis Formulation and Performing a Hypothesis Test
  • One-tailed and Two-tailed Tests
  • Confidence Intervals and Hypothesis Testing
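The link between a two-tailed test, its p-value, and a confidence interval can be illustrated with a one-sample z-test. This is a minimal sketch, assuming the population standard deviation is known. The function name one_sample_z_test and the sample figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, pop_mean, pop_std, n, alpha=0.05):
    """Two-tailed z-test of H0: population mean equals pop_mean.

    Returns the z statistic, the p-value, the (1 - alpha) confidence
    interval for the mean, and whether H0 is rejected at level alpha.
    """
    std_error = pop_std / sqrt(n)
    z = (sample_mean - pop_mean) / std_error
    # Two-tailed p-value: probability of a result at least this extreme.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Critical value bounding the central (1 - alpha) region.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (sample_mean - z_crit * std_error, sample_mean + z_crit * std_error)
    return z, p_value, ci, p_value < alpha


# Hypothetical sample: 64 observations averaging 52, tested against a
# claimed mean of 50 with a known standard deviation of 8.
z, p_value, ci, reject = one_sample_z_test(52, 50, 8, 64)
print(z, round(p_value, 4), ci, reject)
```

In this example the claimed mean of 50 falls outside the confidence interval, which agrees with the decision to reject the null hypothesis at the same alpha.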

Common Statistical Tests

Hypothesis tests assess the validity of a claim or hypothesis about a population parameter through statistical analysis. This module introduces learners to the hypothesis tests most commonly used in Data Science and how to choose the right test for a given business claim depending on its context.

  • Common Statistical Tests
  • Test for one mean
  • Test for equality of means (known standard deviation)
  • Test for equality of means (Equal and unknown std dev)
  • Test for equality of means (Unequal and unknown std dev)
  • Test of independence
  • One-way ANOVA

Predictive Modeling - 5 Weeks

Intro to Supervised Learning - Linear Regression

Machine Learning (ML) is a subset of Artificial Intelligence (AI) that focuses on developing algorithms capable of learning patterns in data and making predictions without being explicitly programmed to do so. Linear Regression, one of the most popular supervised ML algorithms, identifies the degree of linear relationship in data. This module introduces participants to ML and explores how linear regression can be used for predictive analysis.

  • Introduction to learning from data
  • Simple and Multiple Linear Regression
  • Evaluating a regression model
  • Pros and Cons of Linear Regression
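Fitting and evaluating a simple linear regression can be sketched with the closed-form least-squares solution. This is a minimal pure-Python illustration rather than the course's approach, which lists Statsmodels and Scikit-Learn among its tools. The function names and data points are hypothetical.

```python
def fit_simple_linear_regression(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical data points.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = fit_simple_linear_regression(xs, ys)
r2 = r_squared(xs, ys, slope, intercept)
print(round(slope, 3), round(intercept, 3), round(r2, 3))
```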

Linear Regression Assumptions and Statistical Inference

The linear regression algorithm has a set of assumptions that need to be satisfied for the model to be statistically valid and for inferences to be drawn from it. This module walks participants through these assumptions, how to check them, what to do in case they are violated, and the statistical inferences that can be drawn based on the model's output.

  • Statistician vs ML Practitioner
  • Linear Regression Assumptions
  • Statistical Inferences from a Linear Regression Model

Machine Learning-1 - 3 Weeks

Logistic Regression

Logistic regression is a statistical modeling technique primarily used for modeling the probability of binary outcomes, and it finds applications in various fields such as medicine, finance, and manufacturing. This module covers the theory behind the logistic regression model, how to assess its performance, and how to draw statistical inferences from it.

  • Introduction to Logistic Regression
  • Interpretation from a Logistic Regression model
  • Changing the threshold of a Logistic Regression model
  • Evaluation of a classification model
  • Pros and Cons

Naive-Bayes, KNN

Bayes' Rule is an important topic in probabilistic reasoning and decision-making. Distance metrics offer a handy way of measuring similarity between data points. This module provides participants with a comprehensive understanding of Bayes' Rule, the Naive Bayes algorithm and its assumptions, different distance metrics, and the K-Nearest Neighbors (KNN) algorithm and its practical applications in classification and regression tasks.

  • Bayes Rule
  • Naive Bayes Algorithm
  • Distance Metrics
  • KNN Algorithm

Decision Tree

Decision trees are supervised ML algorithms that utilize a hierarchical structure for decision making and can be used for both classification and regression problems. This module dives into how a decision tree can be used to model complex, non-linear data and how to improve the performance of decision trees using pruning techniques.

  • Introduction to Decision Tree
  • How a Decision Tree is built
  • Methods of pruning a Decision Tree
  • Different impurity measures
  • Regression Trees
  • Pros and Cons
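The impurity measures mentioned in the decision tree topics can be illustrated with Gini impurity and entropy, two commonly used choices. This is a minimal sketch: the page does not say which measures the module covers, and the function names and labels are hypothetical.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 0 for a pure node, higher for more mixed nodes."""
    n = len(labels)
    return 1 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: 0 for a pure node, 1 for a 50/50 binary node."""
    n = len(labels)
    return -sum(
        (count / n) * log2(count / n) for count in Counter(labels).values()
    )


def weighted_gini(left, right):
    """Impurity of a split: child impurities weighted by child size."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# Hypothetical node with 4 "yes" and 2 "no" labels, and one candidate split.
parent = ["yes", "yes", "yes", "yes", "no", "no"]
left, right = ["yes", "yes", "yes"], ["no", "no", "yes"]

print(gini(parent), weighted_gini(left, right))
```

A tree-building procedure would typically prefer the split that lowers the weighted impurity the most relative to the parent node.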

Machine Learning-2 - 4 Weeks

Bagging and Random Forest

Random forest is a popular ensemble learning technique that comprises several decision trees, each using a subset of the data to understand patterns. The outputs of the individual trees are then aggregated to produce the final prediction. This module will explore how to train a random forest model to solve complex business problems.

  • Introduction to Ensemble Techniques
  • Bagging
  • Random Forests

Boosting:

Boosting models are robust ensemble models that comprise several sub-models, each of which is developed sequentially to improve upon the errors made by the previous one. This module will cover essential boosting algorithms like AdaBoost and XGBoost that are widely used in the industry for accurate and robust predictions.

  • Introduction to Boosting
  • Bagging vs Boosting
  • Different boosting techniques - AdaBoost, Gradient Boosting, XGBoost
  • Stacking

 

Model Tuning

Model tuning is a crucial step in developing ML models and focuses on improving the performance of a model using different techniques like feature engineering, imbalance handling, regularization, and hyperparameter tuning to tweak the data and the model. This module covers the different techniques to tune the performance of an ML model to make it robust and generalized.

  • K-fold cross validation
  • Oversampling and Undersampling
  • Regularization
  • Data Leakage
  • Hyperparameter Tuning
  • GridSearchCV and RandomizedSearchCV
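The idea behind K-fold cross validation can be sketched by generating the train and validation index splits by hand. This is a minimal illustration without shuffling, and the function name k_fold_indices is hypothetical. In practice a library utility such as Scikit-Learn's KFold would normally be used instead.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k (train, validation) index pairs.

    Each sample appears in exactly one validation fold. When n_samples is
    not divisible by k, the first folds receive one extra sample.
    """
    indices = list(range(n_samples))
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    folds = []
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, validation))
        start += size
    return folds


# Hypothetical dataset of 10 samples split into 3 folds.
folds = k_fold_indices(10, 3)
for train, validation in folds:
    print(len(train), validation)
```

A model would be trained on each train split and scored on the matching validation split, with the scores averaged across folds.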

Unsupervised Learning

K-Means Clustering

K-means clustering is a popular unsupervised ML algorithm that is used for identifying patterns in unlabeled data and grouping it. This module dives into the working of the algorithm and the important points to keep in mind when implementing it in practical scenarios.

  • Introduction to Clustering
  • Types of Clustering
  • K-means Clustering
  • Importance of Scaling
  • Silhouette Score
  • Visual Analysis of Clustering
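The working of K-means can be sketched as a loop that alternates between assigning points to the nearest centroid and moving each centroid to the mean of its points. This is a minimal pure-Python illustration that assumes fixed starting centroids. The function names, data points, and starting centroids are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def k_means(points, initial_centroids, n_iterations=10):
    """Run K-means from the given starting centroids.

    Returns the cluster label of each point and the final centroids.
    Assumes n_iterations is at least 1.
    """
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(n_iterations):
        # Assignment step: label each point with its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: centroids stopped moving
        centroids = new_centroids
    return labels, centroids


# Hypothetical 2-D points forming two visibly separate groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels, centroids = k_means(points, [(1, 1), (8, 8)])
print(labels, centroids)
```

Because the algorithm relies on distances, features measured on very different scales would usually be scaled before clustering, which is why scaling appears in the topic list above.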

 

Hierarchical Clustering and PCA

Hierarchical clustering organizes data into a tree-like structure of nested clusters, while dimensionality reduction techniques are used to transform data into a lower-dimensional space while retaining the most important information in it. This module covers the business applications of hierarchical clustering and how to reduce the dimension of data using PCA to aid in visualization and feature selection of multivariate datasets.

  • Hierarchical Clustering
  • Cophenetic Correlation
  • Introduction to Dimensionality Reduction
  • Principal Component Analysis

Unit 3

Visualization and Insights

Data Visualization using Tableau (Self-Paced)

  • Introduction to Data Visualization
  • Introduction to Tableau
  • Basic Charts and Dashboards
  • Descriptive Statistics, Dimensions and Measures
  • Visual Analytics
  • Dashboard Design & Principles
  • Advanced Design Components/Principles
  • Special Chart Types
  • Case Study: Hands-On using Tableau
  • Integrate Tableau with Google Sheets

Unit 4

Capstone Project

You will get your hands dirty with a real-time project under the guidance of industry experts. The capstone project lasts 4 weeks, during which you will apply everything you have learned, from the Data Science foundations to Visualization and everything in between. Successful completion of the project will earn you a post-graduate certificate in data science and analytics.

Upskill from Great Lakes

Earn a PG certificate in Data Science & Analytics

Ranked among India's top 10 business schools, Great Lakes is highly regarded for its analytics programs. A certification from Great Lakes Executive Learning ensures industry credibility and acceptance, providing a robust foundation for your career advancement.


  • Top Standalone Institution, by Outlook India
  • In One Year Programs, by Business World
  • Top B-Schools, by Business India

Industry relevant syllabus

Learn top in-demand tools

Delve deep into Data Science with our program, mastering significant skills and employing powerful tools to analyse data and build predictive models.

  • Python
  • Tableau
  • KNIME
  • NumPy
  • SQL
  • Pandas
  • Seaborn
  • Matplotlib
  • Statsmodels
  • Scikit-Learn

Watch stories of success of our learners

  • Sunita Gupta, Post Graduate Program in Data Science
  • Harshavardhan M, Post Graduate Program in Data Science
  • Abhay Ankit, Post Graduate Program in Data Science

Our mentors

Introducing our dedicated mentors and experienced industry insiders devoted to guiding learners on their Data Science career journey.

Advanced Career Support

  • 1:1 CAREER SESSIONS

    Engage one-on-one with industry experts for valuable insights and guidance.

  • INTERVIEW PREPARATION

    Gain insights into recruiter expectations.

  • RESUME & LINKEDIN PROFILE REVIEW

    Showcase your strengths impressively.

  • E-PORTFOLIO

    Create a professional portfolio demonstrating skills and expertise.

Program Fees

Pay in Installments

Recommended

Low Cost EMI at ₹ 2,851/month

Payment Partners

Propelld, LiquiLoans, Eduvanz

Benefits of learning from us

  • 150+ hours of online content
  • Personalised mentorship sessions
  • Dedicated career support
  • 8+ languages & tools
  • Doubt-Solving with Expert Industry mentors
  • Proactive Program Support
  • Certificate of completion from Great Lakes

Application process

Our admissions close once the requisite number of participants enroll for the upcoming batch. Apply early to secure your seat.

  • 1. Fill the application form

    Apply by filling a simple online application form.

  • 2. Interview Process

    Go through a screening call with the Admission Director’s office.

  • 3. Join program

    An offer letter will be rolled out to the select few candidates. Secure your seat by paying the admission fee.

EARN A PRESTIGIOUS

Post-Graduate Program in Data Science & Analytics

You can also reach out to us at pgpdsba@greatlearning.in or 080-6947-4555.
