Traditionally, meter reading has been a manual process: a worker physically records the reading by looking at the dial. The utility company has been using technology to automate this process, replacing traditional water meters with automated ones. The sensor data is now stored in SQL Server, but the following tasks were still difficult or demanded significant manual effort:
- Anomaly or fault detection
- Usage demand forecasting
- Revenue projection
- Capital planning
In this project, we used AWS managed services to automate the process, migrating data from on-premises to the cloud and supporting various types of analysis, forecasting, and planning.
The volume and variety of the data made manual analysis and prediction difficult, so our solution addresses the following issues:
- Provide visual analysis results in the form of dashboards
- Provide various comparison dashboards that help locate faults
- Provide a way to analyse streaming data for quick answers
- Provide a way to migrate data from on-premises to the cloud
Source of Data
Implementation in Detail
Create an IAM role and an EC2 instance
An IAM role and an EC2 instance were created for this project to migrate data from on-premises to the cloud. We used the default VPC and created a security group to keep the data secure.
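The security-group setup can be sketched as follows. The group ID, port, and CIDR range here are illustrative assumptions, not the project's actual values; the dict mirrors the keyword arguments a boto3 `ec2` client's `authorize_security_group_ingress` call would take.

```python
# Sketch of a security-group ingress rule for the migration EC2 instance.
# Group ID and CIDR are illustrative assumptions; in practice the returned
# dict would be passed to boto3's ec2.authorize_security_group_ingress().

def build_ingress_rule(group_id: str, cidr: str) -> dict:
    """Allow SQL Server traffic (TCP 1433) from the on-premises network only."""
    return {
        "GroupId": group_id,
        "IpPermissions": [{
            "IpProtocol": "tcp",
            "FromPort": 1433,   # SQL Server default port
            "ToPort": 1433,
            "IpRanges": [{"CidrIp": cidr, "Description": "on-premises network"}],
        }],
    }

rule = build_ingress_rule("sg-0123456789abcdef0", "203.0.113.0/24")
```

Restricting ingress to the on-premises CIDR is what keeps the migration path secure while still using the default VPC.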
We created a DynamoDB table to store both relational and non-relational data for analysis, with a schema and tables to hold the data migrated from on-premises.
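A minimal sketch of such a table definition is shown below. The table and attribute names (`water_meter_readings`, `meter_id`, `reading_ts`) are illustrative assumptions; the dict matches the keyword arguments boto3's DynamoDB `create_table` expects.

```python
# Sketch of the DynamoDB table definition used to hold migrated meter data.
# Table and attribute names are assumptions; the dict maps directly to
# boto3's dynamodb client.create_table(**spec).

def build_meter_table_spec(table_name: str) -> dict:
    return {
        "TableName": table_name,
        "KeySchema": [
            {"AttributeName": "meter_id", "KeyType": "HASH"},     # partition key
            {"AttributeName": "reading_ts", "KeyType": "RANGE"},  # sort key
        ],
        "AttributeDefinitions": [
            {"AttributeName": "meter_id", "AttributeType": "S"},
            {"AttributeName": "reading_ts", "AttributeType": "S"},
        ],
        "BillingMode": "PAY_PER_REQUEST",  # no capacity planning needed
    }

spec = build_meter_table_spec("water_meter_readings")
```

Keying on meter ID plus reading timestamp lets each meter's history be queried as a single partition.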
Create a replication instance and a data pipeline
In this project, we created a DMS replication instance with the details below, configured the source and target endpoints, and executed the migration task:
- Name: migrate-sql-instance
- VPC: default VPC
- Availability zone: us-west-2c
We then created a Data Pipeline to export data from DynamoDB to an S3 bucket.
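The replication-instance settings above can be sketched as a parameter dict. The identifier and availability zone come from the project details; the instance class and storage size are illustrative assumptions. The keys map to boto3's DMS `create_replication_instance` keyword arguments.

```python
# Parameters for the DMS replication instance described above.
# Instance class and storage size are assumed values; name and AZ
# come from the project. Maps to boto3's
# dms client.create_replication_instance(**replication_instance).

replication_instance = {
    "ReplicationInstanceIdentifier": "migrate-sql-instance",
    "ReplicationInstanceClass": "dms.t3.medium",  # assumed size
    "AllocatedStorage": 50,                       # GB, assumed
    "AvailabilityZone": "us-west-2c",
    "PubliclyAccessible": False,                  # reachable inside the VPC only
}
```

With no VPC security group specified, DMS places the instance in the default VPC, matching the setup above.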
Descriptive data analysis
Create an S3 bucket
An S3 bucket was created to hold data copied from DynamoDB for further detailed analysis; it can also be used to store logs, and Athena queries it to analyse the data and present results graphically. Another S3 bucket was created for Redshift to store ad-hoc data for future analysis.
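A sketch of the bucket creation parameters follows. The bucket name is an illustrative assumption; the region matches the us-west-2c availability zone used elsewhere in the project. The dict maps to boto3's S3 `create_bucket` keyword arguments.

```python
# Sketch of the analysis bucket creation. Bucket name is an assumption;
# maps to boto3's s3 client.create_bucket(**bucket_params).

bucket_params = {
    "Bucket": "meter-data-analysis",  # assumed name, must be globally unique
    "CreateBucketConfiguration": {"LocationConstraint": "us-west-2"},
}
```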
Create an SNS Topic
An SNS topic was created to track the data migration. The person tracking the success/failure of the migration subscribed to the topic with their email address, so they are notified when the data pipeline process completes. This helps the facility manager get data insights using managed services.
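The notification setup can be sketched as two parameter dicts: one to create the topic and one for the email subscription. The topic name, account ID, and address are illustrative assumptions; the keys map to boto3's SNS `create_topic` and `subscribe` keyword arguments.

```python
# Sketch of the SNS setup for migration status notifications.
# Topic name, account ID in the ARN, and email address are assumptions.
# Maps to boto3's sns client.create_topic(**topic_params) and
# client.subscribe(**subscribe_params).

topic_params = {"Name": "data-migration-status"}  # assumed topic name

subscribe_params = {
    # Placeholder ARN; in practice use the TopicArn returned by create_topic.
    "TopicArn": "arn:aws:sns:us-west-2:123456789012:data-migration-status",
    "Protocol": "email",
    "Endpoint": "facility.manager@example.com",  # assumed subscriber address
}
```

SNS sends a confirmation email first; the subscriber must confirm before status notifications are delivered.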
Create EMR cluster and Launch Hue
An EMR cluster was created, and external tables were defined over the data uploaded to the S3 bucket. Hue was launched to gain insight by querying the complex data; Hadoop lets us store a large volume of data and analyse it quickly.
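An external table definition of the kind run from Hue might look like the DDL below. The table name, columns, and S3 path are illustrative assumptions; an external table only points Hive at the S3 data, so dropping it never deletes the underlying files.

```python
# Sketch of the Hive DDL run in Hue to define an external table over the
# S3 data. Table name, columns, and bucket path are assumptions.

create_external_table = """
CREATE EXTERNAL TABLE IF NOT EXISTS meter_readings (
    meter_id     STRING,
    reading_ts   STRING,
    usage_litres DOUBLE
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 's3://meter-data-analysis/readings/';
"""
```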
Ad-hoc and descriptive data analysis
In this project, we performed ad-hoc data analysis using Athena and QuickSight. Tables were created over the data in the S3 bucket, and Athena pulled the data from S3 for descriptive analysis. We used SQL-based queries for both structured and unstructured data, and built dashboards in QuickSight on top of Athena for faster analysis and querying.
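An ad-hoc query of this kind, aggregating daily usage per meter to spot anomalous consumption, can be sketched as follows. The table, database, and output-bucket names are illustrative assumptions; the dict maps to boto3's Athena `start_query_execution` keyword arguments.

```python
# Sketch of an ad-hoc Athena query: total daily usage per meter, useful
# for spotting anomalous consumption. Table, database, and output bucket
# names are assumptions. Maps to boto3's
# athena client.start_query_execution(**query_params).

query_params = {
    "QueryString": (
        "SELECT meter_id, "
        "       date(from_iso8601_timestamp(reading_ts)) AS day, "
        "       SUM(usage_litres) AS total_usage "
        "FROM meter_readings "
        "GROUP BY 1, 2 "
        "ORDER BY total_usage DESC"
    ),
    "QueryExecutionContext": {"Database": "meter_db"},  # assumed database
    "ResultConfiguration": {
        # Athena writes result files here; QuickSight can query via Athena directly.
        "OutputLocation": "s3://meter-data-analysis/athena-results/",
    },
}
```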
Real-time data analysis
In this project, we used Kinesis for real-time data analysis. When a fault occurs, an alarm is generated that needs to be analysed quickly to minimise risk. Kinesis gave us rapid, continuous data intake and aggregation for simple data analysis and real-time reporting. We used the AWS SDK to push data to a Kinesis stream and analysed it with Kinesis Data Analytics, using the SQL option to query the stream at runtime. An intermediate S3 bucket was created between Kinesis Firehose and Redshift to stage data before loading it into the Redshift cluster.
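Pushing a single reading to the stream via the SDK can be sketched as below. The stream name and record fields are illustrative assumptions; the dict maps to boto3's Kinesis `put_record` keyword arguments. Partitioning by meter ID keeps each meter's readings ordered within a shard.

```python
import json

# Sketch of pushing one meter reading to the Kinesis stream via the AWS SDK.
# Stream name and record fields are assumptions. Maps to boto3's
# kinesis client.put_record(**record).

def build_put_record(stream: str, meter_id: str, usage: float, ts: str) -> dict:
    payload = {"meter_id": meter_id, "usage_litres": usage, "reading_ts": ts}
    return {
        "StreamName": stream,
        "Data": json.dumps(payload).encode("utf-8"),  # Kinesis takes bytes
        "PartitionKey": meter_id,  # same meter -> same shard -> ordered readings
    }

record = build_put_record("meter-stream", "M-1042", 12.5, "2020-05-01T10:00:00Z")
```

Kinesis Data Analytics can then run SQL over the stream, while Firehose drains it to the staging S3 bucket for the Redshift load.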
Business Challenges & Technical Challenges
- Client Satisfaction
- Migrating data to multiple zones
- Untrained employees leading to slow performance, which impacts the business
- As it is a hybrid model, maintaining the data will be costly
- Whether data can be migrated from on-premises to the cloud depends on the nature of the data
- If your application stores and retrieves very sensitive data, you might not be able to maintain it on the cloud. Similarly, compliance requirements could also limit your choices.
- If your existing setup is meeting your needs, doesn’t demand much maintenance, scaling, and availability, and your customers are all happy, why mess with it?
- If some of the technology you currently rely on is proprietary, you may not be legally able to deploy it on the cloud.
- Some operations might suffer from added latency when using cloud applications over the internet.
- If your hardware is controlled by someone else, you might lose some transparency and control when debugging performance issues.
- Noisy “neighbours” can occasionally make themselves “heard” through shared resources.
- Your particular application design and architecture might not completely follow distributed cloud architectures, and may therefore require some modification before moving to the cloud.
- Cloud platform or vendor lock-in: Once in, it might be difficult to leave or move between platforms.
- Downtime: It happens to everyone, but you might not want to feel like your availability is controlled by someone else.
Learnings from Capstone Project
- Data migration from on-premises to the cloud
- Data analytics with QuickSight
- Working with EMR and enabling interface connections
- Working with and exploring Hue
- Real-time data analysis with Kinesis
- Redshift for big data storage
Gunjan Kumar – A senior software developer/mentor with 14+ years of experience in the IT industry, handling multiple projects with responsibilities spanning analysis, design, development, implementation, and testing in an agile way. Skills include Web API, microservices, and SQL Server using the .NET framework and containers, with an interest in solution architecture and big data analytics.
Suneeja Babu – A cyber/information security professional with almost 10 years of experience across various organisations, currently working for the Queensland Government in Australia as a Senior Security Consultant.
This Capstone project is part of Great Learning's PG program in Cloud Computing. If you wish to pursue the program, please connect with us.