Introduction to Linear Programming for Data Science

Contributed by: Sarita Upadhya
LinkedIn Profile: https://www.linkedin.com/in/sarita-upadhya-a7b94922/

Introduction

Data Science helps businesses to make informed and data-driven decisions. Current business situations or business scenarios can be explored further using available data, and better insights can be drawn using Descriptive Analytics. Using past data, we can predict the future. Also, the cause and effect relationship between predictors and response variables can be extracted using Predictive Analytics. Predictive models are also used to simulate a future probable outcome by changing and assigning values of the predictors. As a part of Prescriptive Analytics, business starts looking at the different alternatives available to finalise the decision. Along with the available insights, resource constraints are taken into account to get an optimum solution.

Linear Programming (LP) in Operations Research is one of the scientific techniques that is used to get an optimum solution to the given business problem by taking resource scarcity and constraints into account. When decision making pertains to a continuous variable like sales, profit, cost etc. which has a linear relationship with several input variables, Linear Programming is applied to take into account the limitations or constraints associated, and arrive at the best solution. For e.g. if a business wants to maximise sales, there may be a constraint in terms of the quantum of product available for sale; the business may want to maximise the profit by increasing the production however may have a constraint in terms of the capacity of machines etc. LP provides an approach to the above problems to get an optimum solution by considering the constraints.

LP Application Requirements

In order to apply Linear Programming for Data Science, the below requirements have to be met:

We should be able to define the objective or the aim clearly in mathematical terms.
The input variables that determine the objective should be distinct and quantitative.
Limitations or constraints that have to be considered should be quantitative and measurable.
Relationship between the objective and the input variables has to be linear in nature.

Formulation of LP Problems

Below are the steps to be followed to formulate LP Problem:

Identify the decision variables that are used to formulate and achieve the objective function
Define the objective function: It defines the business objective in mathematical form
List out the constraints or limitations using the decision variables in mathematical form
Identify and list out the non-negative constraints

Let’s understand this with a few examples. Also, we will look at the 3 tools (Graphical method, Excel Solver, R) which can be used to solve LP problems.

Linear Programming Examples

Here are the Linear Programming Examples:

Problem Statement 1

The XYZ company during the festival season combines two factors A and B to form a gift pack which must weigh 5 kg. At least 2 kg of A and not more than 4 kg of B should be used. The net profit contribution to the company is Rs. 5 per kg for A and Rs. 6 per kg for B. Formulate LP Model to find the optimal factor mix.

LP Formulation:

Step 1: Objective Function

In the above problem, the objective of the company is to maximise the profit. We are given the net profit contribution for factor A and B.

Let x kg of factor A be used
Let y kg of factor B be used
Objective Function ⬄ maximise z = 5x + 6y

Note: x, y are decision variables and z is the objective function

Step 2: Formulate the constraints

Weight of gift pack has to be 5 kg

x + y = 5

At least 2 kg of A and not more than 4 kg of B has to be used

x >= 2
y <= 4

Step 3: Non-negative constraints

The quantity used for A and B has to be positive

x, y >= 0

Summarise the LP Problem:

Objective: Maximise z = 5x+6y given the constraints

x + y = 5
x >= 2
y <= 4
x, y >= 0

Let’s try and solve the above LP problem graphically

Graphical Solution:

Let us consider a 2-dimension graph with x and y-axis:

Plot the constraints formulated in the above problem.
- x + y = 5 is a line that cuts x-axis at (5,0) and y-axis at (0,5). As we have a “=” constraint, the feasible region lies on this line.
- x >= 2 is the line that cuts the x-axis at (2,0). As we have a “>=” constraint, the feasible region lies to the right of this line
- y <= 4 is the line that cuts the y-axis at (0,4). As we have a “<=” constraint, the feasible region lies to the bottom of this line
- As x, y >= 0 we have considered only the positive side of the x and y-axis.
Combining all the above constraints, from the above graph the line marked in yellow is the final feasible region where all constraints are satisfied.
The 2 extreme points on this line are the possible options for consideration to maximise the profit i.e. point A (2,3) and point B (5,0).
Let us substitute these values in the objective function to find out which option gives higher profit.
- Option A (2,3): z = 5x + 6y ⬄ z = (5*2) +(6*3) = 28
- Option B (5,0): z = 5x + 6y ⬄ z = (5*5) +(6*0) = 25
Hence, the optimum solution is Option A i.e. 2kg of factor A and 3kg of factor B to be considered for the gift pack.

Problem Statement 2

A call centre requires a different number of full-time employees on different days of the week. The number of full-time employees required each day is given in the table below. Employment rules state that each full-time employee must work five days consecutively and then receive two days off. The call centre wants to meet its daily requirements using only full-time employees. Formulate an LP that the call centre can use to minimise the number of full-time employees who must be hired.

Days of the Week	# of full-time employees required
1=Monday	17
2=Tuesday	13
3=Wednesday	15
4=Thursday	19
5=Friday	14
6=Saturday	16
7=Sunday	11

LP Formulation:

Step 1: Objective Function

In the above problem, the objective is to minimise the number of full-time employees to hire.

Let x1 be number of employees beginning shifts from Monday
Let x2 be number of employees beginning shifts from Tuesday
Let x3 be number of employees beginning shifts from Wednesday
Let x4 be number of employees beginning shifts from Thursday
Let x5 be number of employees beginning shifts from Friday
Let x6 be number of employees beginning shifts from Saturday
Let x7 be number of employees beginning shifts from Sunday
Objective Function ⬄ minimise z = x1+x2+x3+x4+x5+x6+x7

Note: x1, x2…x7 are decision variables and z is the objective function

Step 2: Formulate the constraints

From the above table, we know the number of full-time employees required each day. Constraint to be considered is that a full-time employee should work for 5 consecutive days and have 2 days off.

x1+x4+x5+x6+x7 >=17 (On Monday, employees working on Tuesday & Wednesday get off)
x1+x2+x5+x6+x7 >=13 (On Tuesday, employees working on Wednesday & Thursday get off)
x1+x2+x3+x6+x7 >=15 (On Wednesday, employees working on Thursday & Friday get off)
x1+x2+x3+x4+x7 >=19 (On Thursday, employees working on Friday & Saturday get off)
x1+x2+x3+x4+x5 >=14 (On Friday, employees working on Saturday & Sunday get off)
x2+x3+x4+x5+x6 >=16 (On Saturday, employees working on Sunday & Monday get off)
x3+x4+x5+x6+x7 <=11 (On Sunday, employees working on Monday & Tuesday get off)

Step 3: Non-negative constraints

Number of employees working each day has to be 0 or greater than 0, hence

x1, x2, x3, x4, x5, x6, x7 >= 0

Summarise the LP Problem:

Objective: Minimise z = x1+x2+x3+x4+x5+x6+x7 subject to constraints

x1+x4+x5+x6+x7 >=17
x1+x2+x5+x6+x7 >=13
x1+x2+x3+x6+x7 >=15
x1+x2+x3+x4+x7 >=19
x1+x2+x3+x4+x5 >=14
x2+x3+x4+x5+x6 >=16
x3+x4+x5+x6+x7 >=11
x1, x2, x3, x4, x5, x6, x7 >= 0

As we have multiple decision variables, solving this graphically will be a tedious job. Let us try to solve the above problem using Excel Solver

Solution using Excel Solver:

The solver is a mathematical program or software that takes required input, applies the selected algorithm and provides a mathematical solution to the problem.

Ensure you have Solver add-in available in the Data tab of your Excel Workbook as shown in the figure below:

Step 1: List Constraints in Excel Sheet

Fig 3 shows how the first 7 constraints have to be updated in the excel
The coefficients of each constraint are updated below the respective variables

Step 2: Include the Objective Function

Fig 4 shows how the objective function has to be included
Similar to the constraints include the coefficients of the objective function below the respective variables

Step 3: Include a row to get the best estimate for our decision variables

Row 11 in the above figures is the row which will give us the best estimate
Initialise it to 0
Once the solver finds the solution the row will be updated with best estimates

Step 4: Include sumproduct() of each constraint with output row (Row 11) to formulate constraints

Observe the “fx” cell at the top in Fig6
Column H has the sumproduct() of each row with the best estimate row i.e. row 11
Cells marked in green will display the values of each constraint after solver finds a solution
The cell marked in red is the output of the objective function i.e. the optimum output will be displayed once the solution is found

Step 5: Invoke Solver and add the constraints

From the figure above, observe the cell inputs to be given to the Solver
Select “Simplex LP” as the “Solving Method”
Click “Solve”

Step 6: Get the optimum solution

Click on ‘Solve’ to get the optimum solution

From the above image, we see that the optimum solution is to get 23 full-time employees for the job considering the given constraints (cell H10)
Values in Row 11 which are marked in yellow are the best estimates for each decision variable
Values in green confirm to us that the constraints have been satisfied

Problem Statement 3

John has a diet chart which gives calories, protein, carbohydrate and fat content for 4 food items. John wants a diet with minimum cost. The diet chart is as follows:

	Item1	Item2	Item3	Item4	Required
Calories (gms)	400	200	150	500	500
Proteins (gms)	3	2	0	0	6
Carbohydrates (gms)	2	2	4	4	10
Fat (gms)	2	4	1	5	8
Cost	0.5$	0.2$	0.3$	0.8$

LP Formulation:

Step 1: Objective Function

In the above problem, the objective is to buy food items at minimum cost but meeting the dietary needs.

Let x1 be number of Item1 to be bought
Let x2 be number of Item2 to be bought
Let x3 be number of Item3 to be bought
Let x4 be number of Item4 to be bought
Objective Function ⬄ minimise z = 0.5×1+0.2×2+0.3×3+0.8×4

Note: x1, x2, x3, x4 are decision variables and z is the objective function

Step 2: Formulate the constraints

Calories required 500

400×1 + 200×2+ 150×3+ 500×4 >= 500

Proteins required 6

3×1 + 2×2 >= 6

Carbohydrates required 10

2×1 + 2×2 +4×3 + 4×4 >= 10

Fat required 8

2×1 + 4×2 +x3 + 5×4 >= 8

Step 3: Non negative constraints

Items bought have to be positive

x1, x2, x3, x4 >= 0

Summarise the LP Problem:

Objective: Minimise z = 0.5×1+0.2×2+0.3×3+0.8×4 given the constraints

400×1 + 200×2+ 150×3+ 500×4 >= 500
3×1 + 2×2 >= 6
2×1 + 2×2 +4×3 + 4×4 >= 10
2×1 + 4×2 +x3 + 5×4 >= 8
x1, x2, x3, x4 >= 0

Solution using R:

R is a programming language which has a package and relevant function defined to solve the LP problem. The package name is ‘lpSolve’ and the program to solve the above problem is given below:

install.packages(“IpSolve”) library(IpSolve)
#Setting the objective function.
objective.in=c(0.5,0.2,0.3,0.4)
#Constraint Matrix
const.mat-matrix(c(400,200,150,500,3,2,0,0,2,2,4,4,2,4,1,5), nrow = 4, byrow = T)
#defining constraints
const_1=500
const_2=6
const_3=10
const_4=8
#RHS for constraints
const.rhs c(const 1,const_2,const_3,const_4)
#Direction for constraints
const.dir=c(“>=”, “>=”,”>=”,”>=”)
#Finding the optimum solution
opt=lp(direction = “min”,objective.in,const.mat,const.dir,const.rhs)
#Objective values of x1,x2,x3 and x4
opt$solution
#Value of objective function at optimal point
optSobival

From the above we have got the best estimates, 3 – Item2 and 1- Item3 has to be bought to get required nutrients at a minimum cost of 0.9$

Advantages of Linear Programming

LP is useful for the business as the decision-maker can obtain an optimum solution by considering the effective use of scarce resources
It is a structured technique and helps in making informed data-driven decisions
It provides alternate solutions which the decision-maker can analyse further and finalise based on subjective matters that also need to be considered
LP can also be used for changing situations. Changed constraints or additional constraints can be included in the model to get revised output.

Application of Linear Programming

LP is widely used in Production to decide on Production mix, in Production Planning, Assembly line balancing etc.
Oil industries use LP to get optimum quality of oil by blending the right proportion of different crude oils, octane etc.
In finance, decisions have to be taken on where the money should be spent and from where the company needs to get the money from to ensure that they maximise the returns keeping the risks under acceptable control. Buying and selling bonds, managing corporate finances, making financial decisions.
Media selection, plant location, reducing travelling salesman’s cost are some of the areas where LP is used in the sales and marketing space.
In call-centres and hospitals, LP is used for scheduling the shifts for employees.
Defence makes use of LP to decide on the optimum level of force deployment to a particular place considering the different situations and constraints.

If you wish to learn more concepts in Data Science for a successful career in the domain, sign up for our data science online course and upskill today.

Reference Books

‘Quantitative Techniques for Decision Making in Business’ by Trueman, R.E. Half Saunders, New York, 1981
‘Quantitative Techniques for Decision Making’ by Anand Sharma (IIT Delhi)

Find Data Science Course in Top Cities in India

Introduction to Linear Programming for Data Science

Introduction

LP Application Requirements

Formulation of LP Problems

Linear Programming Examples

Problem Statement 1

LP Formulation:

Summarise the LP Problem:

Graphical Solution:

Problem Statement 2

LP Formulation:

Summarise the LP Problem:

Solution using Excel Solver:

Problem Statement 3

LP Formulation:

Summarise the LP Problem:

Solution using R:

Advantages of Linear Programming

Application of Linear Programming

Reference Books

Top 10 Data Science Leaders You Should Follow in 2024

Top 8 Data Science Podcasts you should be listening to in 2024

8 Top Data Visualisation and BI tools in 2024

Top 6 Data Science Projects To Get You Hired

How Data Science Solves Real Business Problems