Browse by Domains

Introduction to Linear Programming for Data Science

Contributed by: Sarita Upadhya
LinkedIn Profile: https://www.linkedin.com/in/sarita-upadhya-a7b94922/

Introduction

Data Science helps businesses to make informed and data-driven decisions. Current business situations or business scenarios can be explored further using available data, and better insights can be drawn using Descriptive Analytics. Using past data, we can predict the future. Also, the cause and effect relationship between predictors and response variables can be extracted using Predictive Analytics. Predictive models are also used to simulate a future probable outcome by changing and assigning values of the predictors. As a part of Prescriptive Analytics, business starts looking at the different alternatives available to finalise the decision. Along with the available insights, resource constraints are taken into account to get an optimum solution.

Linear Programming (LP) in Operations Research is one of the scientific techniques that is used to get an optimum solution to the given business problem by taking resource scarcity and constraints into account. When decision making pertains to a continuous variable like sales, profit, cost etc. which has a linear relationship with several input variables, Linear Programming is applied to take into account the limitations or constraints associated, and arrive at the best solution. For e.g. if a business wants to maximise sales, there may be a constraint in terms of the quantum of product available for sale; the business may want to maximise the profit by increasing the production however may have a constraint in terms of the capacity of machines etc. LP provides an approach to the above problems to get an optimum solution by considering the constraints.

Read More – What is Data Science?

LP Application Requirements

In order to apply Linear Programming for Data Science, the below requirements have to be met:

  1. We should be able to define the objective or the aim clearly in mathematical terms. 
  2. The input variables that determine the objective should be distinct and quantitative. 
  3. Limitations or constraints that have to be considered should be quantitative and measurable.
  4. Relationship between the objective and the input variables has to be linear in nature.

Formulation of LP Problems

Below are the steps to be followed to formulate LP Problem:

  1. Identify the decision variables that are used to formulate and achieve the objective function
  2. Define the objective function: It defines the business objective in mathematical form
  3. List out the constraints or limitations using the decision variables in mathematical form
  4. Identify and list out the non-negative constraints 

Let’s understand this with a few examples. Also, we will look at the 3 tools (Graphical method, Excel Solver, R) which can be used to solve LP problems.

Linear Programming Examples

Here are the Linear Programming Examples:

Problem Statement 1

The XYZ company during the festival season combines two factors A and B to form a gift pack which must weigh 5 kg. At least 2 kg of A and not more than 4 kg of B should be used. The net profit contribution to the company is Rs. 5 per kg for A and Rs. 6 per kg for B. Formulate LP Model to find the optimal factor mix.

LP Formulation:

Step 1: Objective Function

In the above problem, the objective of the company is to maximise the profit. We are given the net profit contribution for factor A and B.

  • Let x kg of factor A be used
  • Let y kg of factor B be used
  • Objective Function ⬄ maximise z = 5x + 6y

Note: x, y are decision variables and z is the objective function

Step 2: Formulate the constraints

Weight of gift pack has to be 5 kg

  • x + y = 5

At least 2 kg of A and not more than 4 kg of B has to be used

  • x >= 2
  • y <= 4

Step 3: Non-negative constraints

The quantity used for A and B has to be positive

  • x, y >= 0

Summarise the LP Problem:

Objective: Maximise z = 5x+6y given the constraints

  1. x + y = 5
  2. x >= 2
  3. y <= 4
  4. x, y >= 0

Let’s try and solve the above LP problem graphically

Graphical Solution:

Let us consider a 2-dimension graph with x and y-axis:

  • Plot the constraints formulated in the above problem.
    • x + y = 5 is a line that cuts x-axis at (5,0) and y-axis at (0,5). As we have a “=” constraint, the feasible region lies on this line.
    • x >= 2 is the line that cuts the x-axis at (2,0). As we have a “>=” constraint, the feasible region lies to the right of this line
    • y <= 4 is the line that cuts the y-axis at (0,4). As we have a “<=” constraint, the feasible region lies to the bottom of this line
    • As x, y >= 0 we have considered only the positive side of the x and y-axis.
  • Combining all the above constraints, from the above graph the line marked in yellow is the final feasible region where all constraints are satisfied. 
  • The 2 extreme points on this line are the possible options for consideration to maximise the profit i.e. point A (2,3) and point B (5,0). 
  • Let us substitute these values in the objective function to find out which option gives higher profit.
    • Option A (2,3): z = 5x + 6y ⬄ z = (5*2) +(6*3) = 28
    • Option B (5,0): z = 5x + 6y ⬄ z = (5*5) +(6*0) = 25
  • Hence, the optimum solution is Option A i.e. 2kg of factor A and 3kg of factor B to be considered for the gift pack.

Problem Statement 2

A call centre requires a different number of full-time employees on different days of the week. The number of full-time employees required each day is given in the table below. Employment rules state that each full-time employee must work five days consecutively and then receive two days off. The call centre wants to meet its daily requirements using only full-time employees. Formulate an LP that the call centre can use to minimise the number of full-time employees who must be hired.

Days of the Week# of full-time employees required
1=Monday17
2=Tuesday13
3=Wednesday15
4=Thursday19
5=Friday14
6=Saturday16
7=Sunday11

LP Formulation:

Step 1: Objective Function

In the above problem, the objective is to minimise the number of full-time employees to hire.

  • Let x1 be number of employees beginning shifts from Monday
  • Let x2 be number of employees beginning shifts from Tuesday
  • Let x3 be number of employees beginning shifts from Wednesday
  • Let x4 be number of employees beginning shifts from Thursday
  • Let x5 be number of employees beginning shifts from Friday
  • Let x6 be number of employees beginning shifts from Saturday
  • Let x7 be number of employees beginning shifts from Sunday
  • Objective Function ⬄ minimise z = x1+x2+x3+x4+x5+x6+x7

Note: x1, x2…x7 are decision variables and z is the objective function

Step 2: Formulate the constraints

From the above table, we know the number of full-time employees required each day. Constraint to be considered is that a full-time employee should work for 5 consecutive days and have 2 days off. 

  • x1+x4+x5+x6+x7 >=17 (On Monday, employees working on Tuesday & Wednesday get off)
  • x1+x2+x5+x6+x7 >=13 (On Tuesday, employees working on Wednesday & Thursday get off)
  • x1+x2+x3+x6+x7 >=15 (On Wednesday, employees working on Thursday & Friday get off)
  • x1+x2+x3+x4+x7 >=19 (On Thursday, employees working on Friday & Saturday get off)
  • x1+x2+x3+x4+x5 >=14 (On Friday, employees working on Saturday & Sunday get off)
  • x2+x3+x4+x5+x6 >=16 (On Saturday, employees working on Sunday & Monday get off)
  • x3+x4+x5+x6+x7 <=11 (On Sunday, employees working on Monday & Tuesday get off)

Step 3: Non-negative constraints

Number of employees working each day has to be 0 or greater than 0, hence 

  • x1, x2, x3, x4, x5, x6, x7 >= 0

Summarise the LP Problem:

Objective: Minimise z = x1+x2+x3+x4+x5+x6+x7 subject to constraints

  1. x1+x4+x5+x6+x7 >=17 
  2. x1+x2+x5+x6+x7 >=13 
  3. x1+x2+x3+x6+x7 >=15 
  4. x1+x2+x3+x4+x7 >=19 
  5. x1+x2+x3+x4+x5 >=14 
  6. x2+x3+x4+x5+x6 >=16 
  7. x3+x4+x5+x6+x7 >=11
  8. x1, x2, x3, x4, x5, x6, x7 >= 0

As we have multiple decision variables, solving this graphically will be a tedious job. Let us try to solve the above problem using Excel Solver

Solution using Excel Solver:

The solver is a mathematical program or software that takes required input, applies the selected algorithm and provides a mathematical solution to the problem.

Ensure you have Solver add-in available in the Data tab of your Excel Workbook as shown in the figure below:

Step 1: List Constraints in Excel Sheet 

  • Fig 3 shows how the first 7 constraints have to be updated in the excel
  • The coefficients of each constraint are updated below the respective variables

Step 2: Include the Objective Function

  • Fig 4 shows how the objective function has to be included
  • Similar to the constraints include the coefficients of the objective function below the respective variables

Step 3: Include a row to get the best estimate for our decision variables

  • Row 11 in the above figures is the row which will give us the best estimate 
  • Initialise it to 0
  • Once the solver finds the solution the row will be updated with best estimates

Step 4: Include sumproduct() of each constraint with output row (Row 11) to formulate constraints 

  • Observe the “fx” cell at the top in Fig6
  • Column H has the sumproduct() of each row with the best estimate row i.e. row 11
  • Cells marked in green will display the values of each constraint after solver finds a solution
  • The cell marked in red is the output of the objective function i.e. the optimum output will be displayed once the solution is found

Step 5: Invoke Solver and add the constraints

  • From the figure above, observe the cell inputs to be given to the Solver
  • Select “Simplex LP” as the “Solving Method”
  • Click “Solve”

Step 6: Get the optimum solution

Click on ‘Solve’ to get the optimum solution

  • From the above image, we see that the optimum solution is to get 23 full-time employees for the job considering the given constraints (cell H10)
  • Values in Row 11 which are marked in yellow are the best estimates for each decision variable
  • Values in green confirm to us that the constraints have been satisfied

Problem Statement 3

John has a diet chart which gives calories, protein, carbohydrate and fat content for 4 food items. John wants a diet with minimum cost. The diet chart is as follows:


Item1Item2Item3Item4Required
Calories (gms)400200150500500
Proteins (gms)32006
Carbohydrates (gms)224410
Fat (gms)24158
Cost 0.5$0.2$0.3$0.8$

LP Formulation:

Step 1: Objective Function

In the above problem, the objective is to buy food items at minimum cost but meeting the dietary needs.

  • Let x1 be number of Item1 to be bought
  • Let x2 be number of Item2 to be bought
  • Let x3 be number of Item3 to be bought
  • Let x4 be number of Item4 to be bought
  • Objective Function ⬄ minimise z = 0.5×1+0.2×2+0.3×3+0.8×4

Note: x1, x2, x3, x4 are decision variables and z is the objective function

Step 2: Formulate the constraints

Calories required 500

  • 400×1 + 200×2+ 150×3+ 500×4 >= 500

Proteins required 6

  • 3×1 + 2×2 >= 6

Carbohydrates required 10

  • 2×1 + 2×2 +4×3 + 4×4 >= 10

Fat required 8

  • 2×1 + 4×2 +x3 + 5×4 >= 8

Step 3: Non negative constraints

Items bought have to be positive

  • x1, x2, x3, x4 >= 0

Summarise the LP Problem:

Objective: Minimise z = 0.5×1+0.2×2+0.3×3+0.8×4 given the constraints

  1. 400×1 + 200×2+ 150×3+ 500×4 >= 500
  2. 3×1 + 2×2 >= 6
  3. 2×1 + 2×2 +4×3 + 4×4 >= 10
  4. 2×1 + 4×2 +x3 + 5×4 >= 8
  5. x1, x2, x3, x4 >= 0

Solution using R: 

R is a programming language which has a package and relevant function defined to solve the LP problem. The package name is ‘lpSolve’ and the program to solve the above problem is given below:

install.packages(“IpSolve”) library(IpSolve)
#Setting the objective function.
objective.in=c(0.5,0.2,0.3,0.4)
#Constraint Matrix
const.mat-matrix(c(400,200,150,500,3,2,0,0,2,2,4,4,2,4,1,5), nrow = 4, byrow = T)
#defining constraints
const_1=500
const_2=6
const_3=10
const_4=8
#RHS for constraints
const.rhs c(const 1,const_2,const_3,const_4)
#Direction for constraints
const.dir=c(“>=”, “>=”,”>=”,”>=”)
#Finding the optimum solution
opt=lp(direction = “min”,objective.in,const.mat,const.dir,const.rhs)
#Objective values of x1,x2,x3 and x4
opt$solution
#Value of objective function at optimal point
optSobival

From the above we have got the best estimates, 3 – Item2 and 1- Item3 has to be bought to get required nutrients at a minimum cost of 0.9$

Advantages of Linear Programming

  • LP is useful for the business as the decision-maker can obtain an optimum solution by considering the effective use of scarce resources
  • It is a structured technique and helps in making informed data-driven decisions
  • It provides alternate solutions which the decision-maker can analyse further and finalise based on subjective matters that also need to be considered
  • LP can also be used for changing situations. Changed constraints or additional constraints can be included in the model to get revised output.

Application of Linear Programming

  • LP is widely used in Production to decide on Production mix, in Production Planning, Assembly line balancing etc.
  • Oil industries use LP to get optimum quality of oil by blending the right proportion of different crude oils, octane etc.
  • In finance, decisions have to be taken on where the money should be spent and from where the company needs to get the money from to ensure that they maximise the returns keeping the risks under acceptable control. Buying and selling bonds, managing corporate finances, making financial decisions.
  • Media selection, plant location, reducing travelling salesman’s cost are some of the areas where LP is used in the sales and marketing space.
  • In call-centres and hospitals, LP is used for scheduling the shifts for employees.
  • Defence makes use of LP to decide on the optimum level of force deployment to a particular place considering the different situations and constraints.

If you wish to learn more concepts in Data Science for a successful career in the domain, sign up for our data science online course and upskill today.

Reference Books

  • ‘Quantitative Techniques for Decision Making in Business’ by Trueman, R.E. Half Saunders, New York, 1981 
  • ‘Quantitative Techniques for Decision Making’ by Anand Sharma (IIT Delhi)

Find Data Science Course in Top Cities in India

Chennai | Bangalore | Hyderabad | Pune | Mumbai | Delhi NCR
Avatar photo
Great Learning Team
Great Learning's Blog covers the latest developments and innovations in technology that can be leveraged to build rewarding careers. You'll find career guides, tech tutorials and industry news to keep yourself updated with the fast-changing world of tech and business.

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top