Machine learning is the latest technology to make waves in the field of Information Security, and for good reason. Algorithms that ‘learn’ and improve over time are invaluable to human analysts, freeing them to focus on larger tactical fights and to harden security systems. In both routine and structural changes to Information Security, machine learning plays an increasingly important role and will continue to do so in the coming years.
What is Information Security (InfoSec)?
InfoSec refers to the systems, tools and processes that are designed and deployed to shield sensitive and confidential data from being compromised or tampered with. Disruption, modification and destruction of data are some of the more common results of InfoSec breaches. The protection of both digital and non-digital data falls under InfoSec, so security processes must account for the safety of data regardless of its format.
Why is it important?
Technology continues to make significant inroads outside of IT to find acceptance in industries that were not “traditionally” in the purview of technology. The flip side is that there is an alarming increase in big data which, in turn, warrants the increased use of security measures to protect a growing clientele from data breaches and security threats.
These burning issues plaguing an increasingly digital world are indicators of the need for InfoSec:
- Sophisticated attacks: Hackers are using agile technologies and previously unseen malware to breach and compromise data, an issue that traditional security systems are ill-equipped to deal with.
- Spike in the cost of breaches: By 2021, cybercrime is expected to rack up $6 trillion in losses, up from $3 trillion in 2015.
- Weak links in organisational systems: InfoSec was traditionally considered an IT problem, but this couldn’t be further from the truth. Attacks could come through any weak link in the company regardless of hierarchy or department, so it is imperative that the entire enterprise is protected by seamless security programmes.
- Variety of threat categories: The sheer range of causes and sources of security threats makes thorough InfoSec programmes that much more crucial. Threats could stem from deliberate acts of espionage, hardware failures or even basic human errors, so a robust InfoSec system must account for and act on a variety of threats.
Most InfoSec programmes are built around a trifecta referred to as the CIA triad, which stands for Confidentiality, Integrity and Availability.
- Confidentiality: sensitive data is disclosed only to authorised parties who have a right to access and view said data
- Integrity: sensitive data is protected from being deleted or modified by an unauthorised party and, if such data is deleted by an authorised party as a result of human error, the damage can be reversed
- Availability: sensitive data can be accessed by the right people, albeit through secure access channels safeguarded by authentication systems
How has InfoSec been treated over the years?
Hacker attacks date back to the 1970s, even though network computing was still in its nascent stages and the internet was still in the works. The first-ever known method of hacking or targeted attacks was through the infiltration of phone lines that were connected to computers.
The story was no better in the 1980s. In fact, a group of teenagers in the US broke into more than 60 corporate and military systems to siphon off more than $70 million from banks. Since then, security systems have repeatedly failed to thwart threats and keep data safe as hackers have grown more sophisticated, leveraging state-of-the-art technology for their crimes. By 2010, cybercrime was a serious enough offence to warrant decades in prison.
Perimeter protection (think antiviruses and firewalls) was once heavily relied on, but today’s security systems are multi-layered because no matter how high the wall, it is still penetrable. The focus turned to data itself, and how to keep it protected when (not if) a breach occurs. InfoSec transitioned from firewalls and antivirus software on individual computers to encryption of data at multiple levels. Data encryption also evolved to be employed at any stage, from digital file to data transmission.
Multi-factor authentication also came into use as a roadblock that keeps all but authorised personnel from accessing data, even in its encrypted form. This setup combines two or more authentication steps that go beyond passwords and PINs, so an attacker who steals a single credential still cannot gain immediate access.
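To make the idea of a second factor concrete, here is a minimal sketch of a time-based one-time password (TOTP) check, the kind of code an authenticator app generates. It follows the published HOTP/TOTP algorithms (RFC 4226 and RFC 6238) using only the Python standard library. The `verify_login` helper and its parameters are hypothetical names chosen for illustration, not part of any particular product.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226)."""
    # Sign the 8-byte big-endian counter with the shared secret.
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte pick a 4-byte window.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, at_time: Optional[float] = None, step: int = 30) -> str:
    """Time-based one-time password (RFC 6238): HOTP over a time counter."""
    now = time.time() if at_time is None else at_time
    return hotp(secret, int(now // step))


def verify_login(password_ok: bool, secret: bytes, submitted_code: str,
                 at_time: Optional[float] = None) -> bool:
    """Hypothetical helper: grant access only if BOTH factors check out."""
    expected = totp(secret, at_time)
    # compare_digest avoids leaking information through timing differences.
    return password_ok and hmac.compare_digest(expected, submitted_code)
```

A stolen password alone fails this check, because the attacker would also need the shared secret held on the user's device. A production system would additionally tolerate small clock drift and rate-limit attempts, which this sketch leaves out.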
Examples of data breaches
In 2016, nearly 3.2 million debit cards were targeted during what is now known as the 2016 Indian Banks Data Breach. HDFC Bank, State Bank of India, YES Bank and ICICI were the worst hit. As a result of this breach, the country’s biggest card replacement programme was conducted: SBI alone reported the blocking and replacement of 6 lakh debit cards.
The personal data of nearly 50 million Facebook users worldwide was compromised in 2018 after a debilitating InfoSec breach. Facebook was reported to have lost $30 billion as a result, and the firm was investigated both within the US and by the European Union.
Also in 2018, confidential data of nearly 9.4 million Cathay Pacific Airlines passengers was exposed. Although the Hong Kong-based carrier reported no misuse, the leak was a crippling one nonetheless.
How can Machine Learning help Secure Data?
As hacks, threats and breaches grow increasingly sophisticated, the focus has turned to fighting fire with fire and staying one step ahead. Here is how machine learning is crucial to securing data in the face of large-scale breaches:
Finding Network Threats
By continuously monitoring data frameworks for anomalies or breaches, machine learning algorithms can effectively detect and deter threats. The ability of machine learning to process data in real time is highly useful, as it allows threats, insider breaches and malware to be detected as they occur, preventing huge losses.
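As a rough illustration of anomaly detection, the sketch below flags values that stray far from a baseline of normal activity. It uses a simple z-score as a statistical stand-in for a trained model; real deployments typically learn far richer baselines. The function name, the sample counts and the threshold of 3 standard deviations are all assumptions made for this example.

```python
from statistics import mean, stdev


def find_anomalies(baseline, observed, threshold=3.0):
    """Return (index, value) pairs in `observed` that deviate from the
    baseline mean by more than `threshold` standard deviations."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # Flat baseline: any value that differs at all is treated as anomalous.
        return [(i, v) for i, v in enumerate(observed) if v != mu]
    return [(i, v) for i, v in enumerate(observed)
            if abs(v - mu) / sigma > threshold]


# Hypothetical failed-login counts per minute during normal operation.
baseline = [4, 5, 3, 6, 5, 4, 5, 6, 4, 5]
# New traffic to score; the spike at index 3 mimics a brute-force burst.
observed = [5, 4, 6, 48, 5]
alerts = find_anomalies(baseline, observed)
print(alerts)
```

Only the burst of 48 failed logins is flagged here; the ordinary fluctuations between 4 and 6 stay under the threshold.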
Protecting Cloud Data
Organisations are increasingly shifting their databases to the cloud to reduce the load on their own servers and the hassle of maintenance. Machine learning can help secure data stored on the cloud by identifying and analysing suspicious cloud logins and by checking the reputation of the IP addresses behind them.
Homomorphic encryption is a form of encryption that allows computations, including those run by machine learning algorithms, to be performed directly on encrypted data without having to decrypt it. The added perk is that the results are also produced as ciphertext but, when decrypted, match the results that would have been obtained had the operation been performed on the unencrypted data.
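The property described above can be sketched with the Paillier cryptosystem, one well-known scheme that supports addition on encrypted values. The article does not name a specific scheme, so Paillier is chosen here purely for illustration. The primes below are toy-sized and offer no real security, and all function names are hypothetical.

```python
import math
import secrets


def generate_keys(p: int, q: int):
    """Build a Paillier key pair from two primes (toy sizes, NOT secure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                 # standard simple choice of generator
    mu = pow(lam, -1, n)      # modular inverse (Python 3.8+)
    return (n, g), (lam, mu)


def encrypt(pub, m: int) -> int:
    n, g = pub
    n_sq = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1   # random blinding factor in 1..n-1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n_sq) * pow(r, n, n_sq)) % n_sq


def decrypt(pub, priv, c: int) -> int:
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(pub, c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts."""
    n, _ = pub
    return (c1 * c2) % (n * n)


pub, priv = generate_keys(293, 433)      # toy primes for demonstration only
c_total = add_encrypted(pub, encrypt(pub, 1200), encrypt(pub, 345))
total = decrypt(pub, priv, c_total)      # the sum, computed without decrypting inputs
print(total)
```

The party that adds the two ciphertexts never sees 1200 or 345; only the holder of the private key can read the sum. Schemes that support the wider range of operations machine learning needs are considerably more involved than this additive example.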
Evading Hacker Attacks
By using methodologies like behaviour analytics and pattern recognition, machine learning can help prevent data breaches well in advance, a change from scrambling to recover losses after a breach. It helps organisations stay one step ahead of hackers, offsetting potential attacks and strengthening protection beforehand.
Facilitating Endpoint Security
Machine learning can be used to train endpoint security setups to identify anomalies and malicious activities based on what they have already seen and flagged. Since machine learning thrives on large volumes of data, endpoint security can be continuously strengthened against newer threats using past data and repositories.
Examples of Machine Learning being used in InfoSec
A UK-based startup, Darktrace, uses machine learning as the basis for its Enterprise Immune System, which enables third-party organisations to detect malicious intent faster and stall attacks before they occur. The firm said it had mitigated threats to one NHS agency network during the WannaCry ransomware crisis in 2017. To put the threat into perspective, the ransomware had successfully breached the security systems of around 2 lakh victims in more than 150 countries.
MIT’s CSAIL (Computer Science and Artificial Intelligence Lab) created a system, AI2, which reviews crores of logins each day to filter out anomalies and pass them on to a human analyst for review. The experiment that CSAIL carried out in partnership with a startup showed attack detection rates rising to 85% with a decrease in false positives. This development is also an example of machine learning being used to further automation and free analysts for deeper research.
Homomorphic encryption combined with machine learning makes for a great case study in data ethics. There is also the concept of differential privacy, a mathematical framework used to measure the extent to which machine learning algorithms can ‘remember’ information they shouldn’t, and to make the changes needed to strengthen privacy guarantees.
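Differential privacy is easiest to see on a simple query rather than a full model. The sketch below applies the Laplace mechanism, a standard building block of the framework, to a count: noise scaled to 1/epsilon is added so that the presence or absence of any single record has only a bounded effect on the answer. The function names, the sample records and the choice of epsilon are assumptions for illustration; privacy-preserving training of machine learning models builds on the same idea but is more involved.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace distribution centred on 0 with that scale.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(records, predicate, epsilon: float, rng=None) -> float:
    """Return a noisy count of records matching `predicate`.

    Smaller epsilon means more noise and a stronger privacy guarantee.
    """
    rng = rng or random.Random()
    true_count = sum(1 for r in records if predicate(r))
    sensitivity = 1  # adding or removing one record changes a count by at most 1
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical login records: (user, login_failed)
records = [("a", True), ("b", False), ("c", True), ("d", True), ("e", False)]
noisy = private_count(records, lambda r: r[1], epsilon=1.0)
print(noisy)  # close to 3, but deliberately not exact
```

Each individual answer is deliberately imprecise, yet averages over many queries remain useful, which is the trade-off the framework lets an organisation tune.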
Machine learning is all set to drive most Information Security efforts in the coming decade. These algorithms aren’t just providing protection against breaches; they’re also unearthing vital information and patterns that are invaluable to strengthening proactive security systems. As the new decade kicks into gear, organisations would be wise to invest in holistic InfoSec systems that envelop all ends of their setup in multi-layered ML-based ‘bubble wrap’.