Can Machine Learning and Artificial Intelligence Be Used For A Cyber Attack?

Jun 13, 2020

Introduction

Artificial Intelligence (AI) and Machine Learning (ML) have long been regarded as the future of technology. IBM’s Deep Blue first played the world chess champion, Garry Kasparov, in 1996; it won a game but lost that match, and then won the rematch in 1997. ML has grown into a set of methods for automating actions in such a way that machines can learn by themselves to complete the tasks they are assigned. The study of AI, on the other hand, can be traced back to 1943, when Warren McCulloch and Walter Pitts published a journal article called “A Logical Calculus of the Ideas Immanent in Nervous Activity”, which discusses the mathematics of neural networks. Since then, AI has changed how the world looks at technology. A more recent milestone came in 2016, when Google DeepMind’s AlphaGo beat Lee Sedol, a world champion Go player (Builtin, 2019). Go is a complicated ancient Chinese game that requires a lot of critical thinking from both players. The win was a huge advance in the realm of AI, as AlphaGo had to overcome many obstacles to beat a player of that level.

However, AI is not only for playing games. Today, AI is applied in many industries and situations. We generally see it in our phones and on websites such as Amazon, Google, Twitter, and many others. According to the article “Hackers have already started to weaponize Artificial Intelligence” by George Dvorsky, hackers have started to use AI as a method of attack because it helps them widen the scale of an attack. Using AI can also help hackers stay hidden and decrease the risk of being caught by the authorities.

From a hacker’s perspective, AI and ML are good tools for performing attacks more efficiently and effectively without getting caught. AI and ML can be used to plan an attack through decisions such as “what to attack, who to attack, when to attack and so on” (Dvorsky, 2017). We need to understand how hackers use these tools for cyberattacks, as this understanding can be very beneficial in enhancing protection. This article focuses on whether tools such as AI and ML can be used for cyberattacks.

Machine Learning

ML is an application of AI that gives systems the ability to learn automatically and improve from experience without being explicitly programmed (Expert System, 2017). It focuses on the development of computer programs that can access data and use it to learn for themselves. The process can be pictured with the concept of a neural network in which there is an input, a reward, and an output. Depending on whether the machine performed correctly or incorrectly, it is rewarded or penalized. Through this feedback, the machine learns which output action to perform in response to a specific input from the user (Expert System, 2017). A clear representation of this can be seen in Figure 1.

Moreover, the article “What is Machine Learning?” describes ML as consisting of unsupervised and supervised learning (MathWorks, n.d.). Figure 2 explains this further through a map in which unsupervised learning leads to clustering, whereas supervised learning is split into classification and regression.

Unsupervised learning is the machine’s ability to find “hidden patterns or intrinsic structure in data” (MathWorks, n.d.). Its purpose is to draw conclusions from a given data set without the need for a specific response set up by the user. Clustering is the most common unsupervised learning technique used in the real world; it explores the data to find hidden patterns or groupings within it.
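To make clustering concrete, here is a minimal sketch of one-dimensional k-means written with the Python standard library only. The function name `kmeans_1d`, the sample values, and the starting centers are all hypothetical and chosen for illustration; they do not come from the MathWorks article.

```python
def kmeans_1d(values, centers, iterations=10):
    """Tiny 1-D k-means: assign each value to its nearest center, then re-average."""
    centers = list(centers)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            # Index of the center closest to this value.
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its members (keep it if the cluster is empty).
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


# Hypothetical unlabeled data: request sizes in KB. No labels are supplied.
sizes = [1.0, 1.2, 0.9, 1.1, 9.8, 10.1, 10.4, 9.9]
centers, clusters = kmeans_1d(sizes, centers=[0.0, 5.0])
print(centers)   # two centers, one per discovered group
print(clusters)  # the values grouped around each center
```

Nothing tells the algorithm which values are "small" or "large"; the two groups emerge from the data alone, which is the point of unsupervised learning.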

Supervised learning, on the other hand, builds a model that makes predictions based on given evidence. A commonly known supervised learning algorithm is the neural network, which takes a set of input data with known responses and is trained to make reasonable predictions of the response to new data. “Supervised learning uses classification and regression techniques to develop a predictive model” (MathWorks, n.d.). Classification techniques predict discrete responses. A common example is deciding whether a received email is spam or not spam. The classification model “[classifies] input data” (MathWorks, n.d.). It is often used for medical imaging, speech recognition, and credit scoring, where the data can be tagged, categorized, or separated into specific groups or classes. For example, “applications for hand-writing recognition use classification to recognize letters and numbers. In image processing and computer vision, unsupervised pattern recognition techniques are used for object detection and image segmentation” (MathWorks, n.d.). Regression techniques, on the other hand, predict continuous responses. An example is forecasting the weather, where the model calculates a reasonable estimate of what the temperature will be based on the data provided. It is similar to plotting coordinates on a graph and checking how closely the data follows the regression line.
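The difference between a continuous and a discrete response can be illustrated with a small least-squares line fit. This is only a sketch: the helper `fit_line`, the temperature readings, and the 15-degree threshold are made up for the example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical labeled data: hour of day -> temperature in degrees Celsius.
hours = [6, 9, 12, 15]
temps = [10.0, 13.0, 16.0, 19.0]

slope, intercept = fit_line(hours, temps)

# Regression: the prediction is a continuous number.
predicted_temp = slope * 18 + intercept

# Classification: the prediction is a discrete label (threshold is an assumption).
is_warm = predicted_temp >= 15.0

print(predicted_temp, is_warm)
```

The same known responses drive both predictions; only the type of answer differs.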

Artificial Intelligence

AI refers to smart machines that can perform tasks that normally require human intelligence. Such a machine perceives its surroundings and is therefore able to take the appropriate action in its given environment. Ideally, an AI should be able to think rationally like a human. Thinking and acting like a human depends on the machine’s ability to reason. Once the machine is able to reason, it can apply that reasoning. An “AI is a set of algorithms and intelligence to try to mimic human intelligence. ML is one of them, and deep learning is one of those ML techniques” (Builtin, 2019).

AI works similarly to ML; however, its aim is not the highest possible accuracy on a single task. AI instead focuses on simulating natural intelligence to solve complex problems. AI can be seen as a program that does smart work, whereas ML is a concept in which a user supplies data and an algorithm learns from it to output an answer. ML’s goal is to automate a task in a way that maximizes performance on that task. ML attempts to learn new things from data, whereas AI focuses on decision making. AI looks for the optimal solution, whereas ML settles on a solution that completes the task at hand efficiently, whether or not it is optimal (Sharma, 2018).

From the information that we have, the question that needs to be answered is whether AI could be used for a cyberattack. According to the article “Use of ML and AI in BlackHat hacking”, the answer is simple: AI and ML can be used for a cyberattack (Charan, 2019). One way this could occur is through smart botnets, in which bots on the same network communicate with each other and then launch an attack simultaneously as a cluster. These bots share information locally, so the more bots that are present, the more intelligence the botnet gains, and it does not require control by an end user. Such botnets form hivenets, which learn from their past attacks and can therefore carry out better attacks in the future. This allows the collective behavior of the bots to be decentralized, in effect forming an ‘Internet of Things’ of their own.

Moreover, phishing and social engineering come into play, as AI and ML can be used to target a large-scale attack at a specific organization. AI commonly uses techniques such as text-to-speech, speech recognition, and Natural Language Processing (NLP) to carry out social engineering. Through ML methods and recurrent neural networks, an AI can train itself to perform social engineering tactics in the real world, and it can refine its approach to produce a more polished attack than a human would. According to HackerNoon, AI can perform such attacks more successfully than humans: an experiment involving 90 participants resulted in a success rate of 30 to 60 percent. AI-driven social engineering is therefore expected to become an important tool in penetration testing scenarios (Charan, 2019).

Method

While researching AI and ML frameworks and environments, I came across a DEF CON video called “Weaponizing Machine Learning” by Petro and Morris (Fox, 2017). Petro and Morris have posted their creation, called Deep Hack, on GitHub. The purpose of Deep Hack is to perform web application exploitation as part of penetration testing. Deep Hack is a neural network system that analyzes big data sets, with the deep learning part of the system focused on analyzing the current hackathon scene. The information that Deep Hack gathers can be applied to web application exploitation testing.

Deep Hack functions as a web scraper that scrapes all hacking-related news from a user-assigned website and analyzes the data. It compresses the data it finds and imports it into a MongoDB database, where it is analyzed through regression and “various methods of data analysis to find out [the] correlation between different data sets” (Jonathan, 2019). The program’s predictions are approximately 75% accurate, and its authors believe that they can increase the accuracy to 84%.

On a different note, I found an ML framework on GitHub called Leaf. This framework helps users learn how to build classical, deep, or hybrid ML applications of whatever kind they want. At the code level, a simple neural network essentially consists of “if statements” and “loops”. The article “First neural network for beginners explained (with code)” by Arthur Arnx explains how the code behind a neural network works. The article focuses on creating a perceptron, an algorithm that here consists of two neurons as the input and one neuron as the output (Arnx, 2019). This allows the user to create a classifier that is able to distinguish between two groups. Arnx builds this basic neural network in Python. He first imports the libraries that the program will use, then defines the parameters of the neural network and creates a list of the weights that the neural network will adjust. Figure 3 demonstrates how the code should be presented.

Then, Arnx created a function that defines what the output of the neuron should be. In total, it has three parameters: two are the inputs and one is the expected output. “OutputP” inside the code is the variable holding the output “given by the perceptron” (Arnx, 2019). At the end of the function, the error is calculated as the difference between the expected output and the perceptron’s output, and the weights are then modified according to that error. This can be seen in Figure 4.
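Since the figure is not reproduced here, the following is a minimal sketch of what such a setup and perceptron function might look like. It is a reconstruction under assumptions, not Arnx’s exact code: the names `LEARNING_RATE`, `BIAS`, and `perceptron_step`, the fixed random seed, and the returned error value are all choices made for this illustration.

```python
import random

random.seed(0)        # assumption: fixed seed so the sketch is repeatable
LEARNING_RATE = 1.0   # how strongly each error changes the weights
BIAS = 1.0            # constant input that lets the decision boundary shift

# Three weights: one per input neuron plus one for the bias term.
weights = [random.random() for _ in range(3)]


def perceptron_step(input1, input2, expected):
    """Process one training example and nudge the weights toward the expected output."""
    weighted_sum = input1 * weights[0] + input2 * weights[1] + BIAS * weights[2]
    output_p = 1 if weighted_sum > 0 else 0   # output given by the perceptron
    error = expected - output_p               # -1, 0, or 1
    # Each weight moves in proportion to the error and to its own input.
    weights[0] += error * input1 * LEARNING_RATE
    weights[1] += error * input2 * LEARNING_RATE
    weights[2] += error * BIAS * LEARNING_RATE
    return error


before = list(weights)
error = perceptron_step(0, 0, 0)   # "false OR false" should give 0
print(error, before, weights)
```

With both inputs at 0, only the bias weight can change, which shows how the error is pushed back onto whichever inputs contributed to the wrong answer.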

After the function is defined, a loop is created to run it on every situation inside the dataset. This is the learning phase of the neural network, and the number of repetitions can be chosen by the user. However, too many repetitions can result in the model adapting too closely to the specific training data, a problem known as overfitting. The network would then give a different output than intended when it is presented with new data. In other words, the output of the neural network can become flawed because it has fitted itself too tightly to the data used in the learning phase. This can be seen in Figure 5.
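The learning phase can be sketched as a loop over a small dataset for the logical OR function. The details are assumptions made for illustration: the weights start at zero rather than at random values so the run is deterministic, and the loop stops early once a full pass produces no mistakes, which is one simple way to avoid needless repetitions.

```python
# Each entry is ((input1, input2), expected_output) for logical OR.
TRAINING_DATA = [((1, 1), 1), ((1, 0), 1), ((0, 1), 1), ((0, 0), 0)]


def predict(weights, x1, x2, bias=1.0):
    """Heaviside-activated perceptron output: 1 if the weighted sum is positive."""
    return 1 if x1 * weights[0] + x2 * weights[1] + bias * weights[2] > 0 else 0


def train(max_epochs=50, learning_rate=1.0, bias=1.0):
    """Repeat the dataset up to max_epochs times; stop once a pass has no errors."""
    weights = [0.0, 0.0, 0.0]   # assumption: zero start instead of random weights
    epochs_used = 0
    for _ in range(max_epochs):
        epochs_used += 1
        mistakes = 0
        for (x1, x2), expected in TRAINING_DATA:
            error = expected - predict(weights, x1, x2, bias)
            if error != 0:
                mistakes += 1
                weights[0] += error * x1 * learning_rate
                weights[1] += error * x2 * learning_rate
                weights[2] += error * bias * learning_rate
        if mistakes == 0:
            break   # every example is already classified correctly
    return weights, epochs_used


weights, epochs_used = train()
print(weights, epochs_used)
print([predict(weights, x1, x2) for (x1, x2), _ in TRAINING_DATA])
```

On this tiny, linearly separable dataset the loop settles after very few passes, so the remaining repetitions out of the allowed 50 would add nothing.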

The last part of the neural network takes the user’s input. Inside the perceptron code, Arnx uses the Heaviside step function as the activation function because its output is always 0 or 1, which is useful since the neural network is looking for a true or false result. In another example, Arnx uses the sigmoid function to obtain a result that lies between 0 and 1. This can be seen in Figure 6.
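The two activation functions can be compared side by side. In this sketch the trained weights are hypothetical values that implement logical OR, and fixed inputs stand in for what a user would type, so the example runs without interaction.

```python
import math


def heaviside(x):
    """Step activation: exactly 0 or 1."""
    return 1 if x > 0 else 0


def sigmoid(x):
    """Smooth activation: any value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical trained weights (input1, input2, bias) for logical OR.
weights = [1.0, 1.0, -0.5]
BIAS = 1.0


def weighted_sum(x1, x2):
    return x1 * weights[0] + x2 * weights[1] + BIAS * weights[2]


# Fixed inputs in place of interactive user input.
for x1, x2 in [(0, 0), (0, 1), (1, 1)]:
    s = weighted_sum(x1, x2)
    print(x1, "or", x2, "->", heaviside(s), round(sigmoid(s), 3))
```

The Heaviside output reads directly as true or false, while the sigmoid output can be read as how strongly the network leans toward 1.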

The output could be saved for use in a bigger project. Figure 6 is the last step that the user has to complete to make a neural network. The code described above demonstrates how the program learns by itself and how its capabilities can be checked. In future projects, users could modify the number of repetitions in the training loop and let the perceptron do the classification. The neural network created by Arnx is very basic and can serve as a starting point for exploring deeper learning.

Result

As a result of ML and AI, the world of cybersecurity has changed its methods of protection. ML is applied to specific data sets to gain insight and to provide “anomaly detection capabilities” against the malicious traffic that companies face daily (Warmer, 2019). Bigger companies such as Google and Amazon have moved to cloud computing to save resources when using ML or AI. Cloud computing can be faster and cheaper than having a company’s own facilities host ML and AI environments, which require a large amount of space to function efficiently. Cloud computing is therefore often a good way to build an ML and AI environment that works effectively and efficiently.

Moreover, in the eyes of White Hat and Black Hat hackers, AI and ML are very powerful tools for ensuring that security is not breached, or for breaching it. ML environments today work very efficiently at predicting, finding, and even creating cyberthreats. According to the article “How Artificial Intelligence and Machine Learning are Changing Cybersecurity” by James Warmer, AI is improving in such a way that it can analyze data sets from different security tools such as Snort, Wireshark, and others. Its ability to examine data in more detail will also decrease human error in the realm of protecting information.

In addition, there are plenty of areas in cybersecurity where AI and ML have taken hold for companies, governments, and personal use. These areas include Cyber Threat Detection, Password and Authentication, Phishing Detection and Prevention Control, Vulnerability Assessment, Network Security, and Behavior Analytics. In Cyber Threat Detection, ML is useful because companies and governments can use it to identify a threat before an attacker exploits their vulnerabilities. ML allows computers to adapt their algorithms based on the data they are given and have learned from, and so to work out what needs to be done to achieve a goal. Older technologies cannot adapt to new circumstances in the way AI can because of how they are coded and set up, which means they cannot keep up with the efficiency of AI when it comes to threats. Moving forward, governments and companies have chosen to use AI frameworks for threat detection.

Furthermore, passwords have always been a big issue. Many of us are lazy about creating passwords and cannot remember them very well. One solution is biometric authentication, such as the face or thumbprint recognition found in most smart devices today. According to the article “How AI is Changing Cyber Security Landscape and Preventing Cyber Attacks” by Remesh Ramachandran, “developers are using AI to enhance biometric authentication” to get rid of the flaws of biometric authentication systems. A real example is the iPhone X’s face recognition feature, which processes the user’s facial features through “built-in Infra-red sensor and neural engines” (Ramachandran, 2019). The AI essentially creates a model of the user’s face by focusing on the facial features from different angles. Apple claims that there is a one in a million chance of a random person’s face unlocking the phone.

With regard to Phishing Detection and Prevention Control, AI plays a big role in detecting phishing emails. As phishing email is one of the most common cyberattacks today, AI is an important tool for keeping individuals safe. AI and ML can spot and track approximately 10,000 phishing emails or more at a time. After detecting a phishing email, AI can react and remediate the threat. AI can also tell a fake website apart from a legitimate one. Companies nowadays use AI both to run simulated phishing campaigns and to protect themselves from such attacks.

When it comes to vulnerability management, AI and ML are used to automate the assessment of system vulnerabilities and to report how host systems could be exploited. AI can analyze the data set it is given and predict when and where an attack could be made on the vulnerabilities within the system.

Moreover, in network security, AI focuses on network security policies and an organization’s network topology. Having an AI allows this process to be automated quickly because the reconnaissance work is done by the AI. In behavioral analytics, a pattern recognition algorithm flags activity whenever an unusual pattern occurs. These AI functions can be very useful for corporate, government, and personal use, as they can be part of both defending and attacking a system (Ramachandran, 2019).
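The idea of flagging an unusual pattern can be sketched with a simple statistical rule. This is a stand-in for illustration rather than a trained model: the function `is_unusual`, the login counts, and the threshold of three standard deviations are all assumptions, not details from the cited article.

```python
import statistics


def is_unusual(history, new_value, threshold=3.0):
    """Flag new_value if it lies more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        # No variation in the history: anything different is unusual.
        return new_value != mean
    z_score = abs(new_value - mean) / stdev
    return z_score > threshold


# Hypothetical baseline: logins per hour for one user account.
logins_per_hour = [4, 5, 6, 5, 4, 6, 5, 5]

print(is_unusual(logins_per_hour, 6))    # within the normal range
print(is_unusual(logins_per_hour, 40))   # far outside the normal range
```

A production system would learn far richer patterns than a single average, but the principle is the same: learn what normal looks like, then flag what deviates from it.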

Conclusion

In conclusion, AI and ML are a huge step forward for the cyber world, because more and more work will be done through automation. AI and ML are now part of the arsenal of tools that one can use to protect against an attack, or to launch one. It will be interesting to see a future in which everyone uses AI as part of their career. The concepts behind building an ML environment are not very difficult, as demonstrated by Arnx.

In addition, different sectors of the cyber world, such as Cyber Threat Detection, Password and Authentication, Phishing Detection and Prevention Control, Vulnerability Assessment, Network Security, and Behavior Analytics, use AI to perform their tasks. These tasks need to be automated and conducted by AI because hackers today have taken to using both AI and ML to conduct their attacks globally. Using AI or ML also decreases human error in setting up a secure system, increasing protection overall.

Lastly, I think it is really interesting how far AI and ML have come in both helping and challenging cybersecurity. Moving forward, cybersecurity teams would benefit from having their own AI and ML environment as part of their arsenal. Building one will be difficult, as it demands a lot of hardware, such as CPU cores, storage, and GPUs. However, once the AI is created and maintained, it will be a very powerful tool that people can use to make the world a better place.

References

Arnx, A. (2019, January 13). First neural network for beginners explained (with code). Retrieved December 11, 2019, from Medium website: https://towardsdatascience.com/first-neural-network-for-beginners-explained-with-code-4cfd37e06eaf

Builtin. (2019). Artificial intelligence. What is artificial intelligence? How does AI work? Retrieved December 11, 2019, from Builtin website: https://builtin.com/artificial-intelligence

Charan, H. (2019, March 31). Use of ML and AI in blackhat hacking. Retrieved December 11, 2019, from HackerNoon website: https://hackernoon.com/use-of-ml-and-ai-in-blackhat-hacking-737a621e4694

Dvorsky, G. (2017, September 11). Hackers have already started to weaponize artificial intelligence. Retrieved December 11, 2019, from Gizmodo website: https://gizmodo.com/hackers-have-already-started-to-weaponize-artificial-in-1797688425

Expert System. (2017, March 7). What is machine learning? A definition. Retrieved December 11, 2019, from Expert System website: https://expertsystem.com/machine-learning-definition/

Fox, B. (2017, August 10). DEF CON 25 (2017) — weaponizing machine learning — petro, morris — stream — 30July2017 [Video file]. Retrieved from https://www.youtube.com/watch?v=wbRx18VZlYA

Jonathan. (2019, March 11). DeepHack. Retrieved December 11, 2019, from GitHub website: https://github.com/pedsm/deepHack

MathWorks. (n.d.). What is machine learning? 3 things you need to know. Retrieved December 11, 2019, from MathWorks website: https://www.mathworks.com/discovery/machine-learning.html

Ramachandran, R. (2019, September 14). How artificial intelligence is changing cyber security landscape and preventing cyber attacks. Retrieved December 11, 2019, from Entrepreneur website: https://www.entrepreneur.com/article/339509

Sharma, A. (2018, February 19). Difference between machine learning and artificial intelligence. Retrieved December 11, 2019, from GeeksforGeeks website: https://www.geeksforgeeks.org/difference-between-machine-learning-and-artificial-intelligence/

Warmer, J. (2019, May 1). Human life will vastly improve thanks to AI. Retrieved December 11, 2019, from Business.com website: https://www.business.com/articles/machine-learning-artificial-intelligence-takes-off/