DoS Attacks on the Internet

 

Arie Vayner

ariev@vayner.net

 

Feb 2003


 Part 1 – What is an attack

 

1. What is a DoS/DDoS attack

 

DoS (Denial of Service) attacks are part of every Internet user’s life today. They happen all the time, and all Internet users, as a community, have some part in creating them, suffering from them or losing time and money because of them.

DoS attacks have nothing to do with breaking into computers, taking control of remote hosts on the Internet or stealing privileged information such as credit card numbers. In Internet jargon, DoS is neither a hack nor a crack; it is a different subject altogether.

The sole purpose of the attack is to disrupt the services offered by the victim. While the attack is in place, and no action has been taken to fix the problem, the victim would not be able to provide its services on the Internet. DoS attacks are really a form of vandalism against Internet services.

DoS attacks take advantage of weaknesses in the IP protocol stack in order to disrupt Internet services.

 

DoS attacks can take several forms and can be categorized according to several parameters. The main distinction between DoS attack types is where the attack originates.

 

“Normal” DoS attacks are generated by a single host (or a small number of hosts at the same location).

The only way for such a DoS attack to pose a real threat is to exploit some software or design flaw. Such flaws can include, for example, faulty implementations of the IP stack that crash the whole host when it receives a non-standard IP packet (for example, the ping of death).

Such an attack would generally involve lower volumes of data. Unless the victim hosts have unpatched vulnerabilities, a DoS attack should not pose a real threat to high-end services on today’s Internet.

 

DDoS (Distributed Denial of Service) attacks would usually be generated by a very large number of hosts. These hosts might be amplifiers or reflectors of some kind, or might be “zombies” that were planted on remote hosts and have been waiting for the command to “attack” a victim. It is quite common to see attacks generated by hundreds of hosts, producing floods of hundreds of megabits per second.

The main tool of DDoS is bulk flooding, where an attacker or attackers flood the victim with as many packets as they can in order to overwhelm the victim.

The best way to demonstrate what a DDoS attack does to a web server is to think of what would happen if the whole population of a city decided at the same moment to go and stand in line at the local shop. These are all legitimate requests for service, since all the people came to buy something, but there is no chance they would get service, because each of them has a thousand other people standing in line before them!

 

Figure 1 – DDoS attacks

DDoS attacks require a large number of hosts attacking together at the same time (see figure 1). This can be accomplished by infecting a large number of Internet hosts with a “zombie” or agent program, which connects back to a pre-defined master host. This way, an attacker can be anyone with the necessary knowledge and access privileges on the master host (such as the correct password to an IRC channel). All the attacker has to do is enter a few commands, and the whole zombie army would wake up and mount a massive attack against the victim of their choice. [3]

The zombie program can be planted on the infected hosts in a variety of ways, such as attachment to spam email, the latest cool flash movie, a crack to a game, or even the game itself.

Communication from the zombie to its master can be hidden as well by using standard protocols such as HTTP, IRC, ICMP or even DNS.

 

DDoS attacks are quite common today, and they pose the main threat to public services because when a distributed attack is being generated against an Internet service, it is quite hard to block thousands of hosts sending flood data, or even legitimate requests.

Another aspect of most DDoS attacks is that they consume a vast amount of resources from the network infrastructure, such as ISP networks and network equipment. This makes such attacks even more troublesome, because a single attack targeted at a minor web server might bring the whole ISP’s network down, and with it affect service for thousands of users.

 

 

2. Some history of attacks

 

 

The history of DoS attacks can be traced back as far as 1996, when CERT (Computer Emergency Response Team, http://www.cert.org) began publishing alerts about them [1]. Some of the attacks became famous because the attacked service had very high visibility on the Internet.

 

The early attacks were relatively small and used simple tools. In January 1997 a Romanian hacker SYN flooded UnderNet, a large IRC network, bringing it down for a while (http://www.wired.com/news/print/0,1294,1446,00.html).

 

In 1998 a new form of attack became popular. The tool used for the attack is called smurf. This attack brought several ISPs to a halt during early 1998.

 

Among the most famous attacks were those of 7-11 February 2000, when sites like Yahoo!, CNN, eBay and others were taken off the air for hours by DoS attacks against them. (http://www.cnn.com/2000/TECH/computing/02/09/denial.of.service.03/)

These attacks were traced back to a 15-year-old boy from Canada, who used the alias MafiaBoy. He was prosecuted and found guilty.

 

Since then there have been thousands of DoS attacks against public Internet services, including the White House’s web site, Microsoft, the NY Times and many more.

 

CERT’s backscatter research [1] has shown that 12,805 separate attacks were recorded against over 5000 distinct Internet hosts in a 3-week period. This amounts to more than 4000 attacks per week!

This result shows that DoS is a real threat, and the Internet community must be prepared to deal with it.

Today a large number of organizations are interested in fighting DoS attacks. These organizations include ISPs, e-commerce companies, banks, governments and security organizations.

 

3. Why attacks exist

 

The reason for DoS attacks is not completely clear. The most reasonable answer is that it has several different reasons, and DoS creators have their own motives behind the attack.

The most common stated motives include

 

What is certain is that it is very easy to launch DoS attacks, and almost anyone can generate them. This by itself fuels the existence of attacks, because almost anyone would like to “just try once this cool tool I just downloaded”, and the Internet, as a whole, sees a massive number of active attacks running at any point in time.

 


Part 2 – Attack types

 

DoS attacks can be classified into several major attack types. Classification can be made in several ways.

A common way to classify DoS attacks is:

 

Another way to classify attacks is to examine the source IP address of the generated packets. Many attacks use randomly generated or otherwise spoofed source IP addresses.

 

The actual source of the attack can be used to classify it as well. Some attacks originate on the attacker’s own host. Other attacks originate from zombie hosts or other relay servers, and not from the attacker’s host.

 

We will present the means available for generating DoS attacks, and review some of the existing attack forms.

1. Ammunition

 

The Internet infrastructure, operating over the IP stack, offers a large number of possibilities for creating DoS attacks.

 

Some attacks might change standard IP parameters such as the source IP address to fool the victim and make the attack very hard to track (source IP spoofing). Changing the source IP address is possible because routing in the Internet is done using only the destination IP field. The source field is used only at the final destination, for example when the server has to reply to the originator of the session.

 

DoS attacks can use legitimate requests for service, such as sending TCP SYN packets to servers, or sending a large number of DNS requests to a DNS server. Such an attack can change standard parameters in the IP packet in order to make the attack more effective.

If a server gets a connection request (TCP SYN packet), it would handle it as a legitimate request, assign resources to it, and then would try to reply to the source. The source IP might be a randomly assigned IP address, so the reply would be directed to a spoofed destination.

 

When using UDP, an attacker may use the fragmentation capability to send a large number of meaningless fragments. The victim would spend a considerable amount of resources (memory and CPU) trying to assemble the fragments and make sense of them, and in the meantime it would not be able to provide service to legitimate clients.

 

The ICMP protocol is another common tool for generating attacks. For example, the fact that almost any IP host would answer an ICMP ECHO request has led to several attack tools. The ICMP UNREACHABLE messages have created another section of attack tools easily used by attackers.

 

Another way of tweaking IP packet fields is to set any combination of parameters to illegal values. Such an attack can be useful because it might bring down a host with a relatively small number of packets, if the attacker has found a specific vulnerability in the IP stack of the victim.

Another useful aspect of this kind of attack is that it might fool devices which try to protect our host against the attack. For example Cisco routers would ignore all packets with the Fragment bit set while using ACLs (access lists). This way we can slip some packets through the ACL, even though they are not really fragmented, and make the attack effective against the allegedly protected host.

 

Many ready-made DoS and DDoS tools exist today on the Internet. Each tool uses a different way to generate attacks. What all of these tools have in common is that they make it very easy for almost anyone to mount an attack: all you have to do is download the tool and use it.

 

The available attack tools have been developing throughout the years, and some of them are quite complex and dangerous.

 

The first form of DDoS attack tool was the “smurf” attack tool (and later “fraggle” and papa-smurf). This tool enabled users to exploit the fact that all IP hosts answer ICMP ECHO requests (ping). It was released in July 1997.

 

The tools released next include Trin00, TFN (Tribe Flood Network), Stacheldraht (“Barbed Wire”), WinTrin00, Shaft, Trinity and many Windows-based IRC tools. These are pure DDoS flooding tools, which enable the attacker to build a network of “zombies” or daemons that sit and wait for a command from a central master node to attack some host on the Internet.

Each tool has its own way of communicating with the central master host or hosts (often called handlers), and each tool enables the attacker to choose which kind of attack would be executed, including simple ICMP ping floods, TCP SYN attacks, UDP floods etc.

All these tools require the potential attacker to install the tool’s daemon on the “zombie” computers by hacking into them or by exploiting known vulnerabilities to copy and execute commands on remote hosts (for example UNIX “r” commands on misconfigured hosts or older unpatched versions of Windows). Another common way of distribution is using spam email.

 

The next stage of tool development, which started at the beginning of 2001, was the development of the Internet worms such as Ramen, Code Red, Code Red II and Nimda.

These worms were able to infect a large number of servers with the worm’s code using several vulnerabilities in applications like web servers and web browsers, creating a vast number of “zombie” hosts ready for activation as attack generators.

 

Some of the worms (such as Code Red) created a new form of DDoS attack, because the worm itself, after being installed on a host, started scanning large portions of the Internet IP space for other potential victims to infect. Because a large number of hosts were infected, and all infected hosts started scanning the Internet, this increased the overall load on the Internet core (see figure 2).

A special effect was noticed by users of misconfigured Cisco routers, which had a hard time managing their ARP tables because of proxy ARP functionality. Many such routers simply crashed because they ran out of memory.

 

Figure 2 - Code Red infection rate

 

Another breed of DDoS tools, which appeared in mid-2001, is the IRC bots such as “knight” and “kaiten”. These are “zombies” or bots which are distributed to unsuspecting victims, who install them on their own hosts (for example through small games with the zombie installer included, or through spam email).

These zombie programs connect to a pre-defined IRC channel, and report back home. Anyone with knowledge and access to this IRC channel can issue commands to this army of zombies to launch massive DDoS attacks of any kind.

 


2. Weakness exploit attacks

The first type of DoS attack we will describe is the attack that exploits some kind of weakness found in software, an operating system or a protocol.
Most of these attacks rely on the fact that systems have bugs, or that programmers and developers are careless. Chances are that any system has some possible uses that were not planned in advance. Sometimes using it in such an unplanned way may crash the whole system, creating an effective DoS attack.

Usually such attacks do not require high bandwidth or any out-of-the-ordinary resources. Instead, the attacker uses knowledge of an existing weakness to bring a system down.

 

A simple example of such a weakness is the following piece of C code:

 

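        #include <string.h>

        /* Returns 1 when the two strings are equal, 0 otherwise.
           strcmp() has no length bound of its own. */
        int unsafe_func(char *s1, char *s2)  {
                if (strcmp(s1, s2))
                        return 0;
                else
                        return 1;
        }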

 

If this function is fed input received from a user over the network (for example, in URL-parsing code), and the input is formatted in a certain way, it might crash the whole system.

For example, strcmp has no length limit of its own: given a very long URL that exceeds a pre-defined limit and is not properly terminated inside its buffer, the comparison keeps reading past the end of the buffer, and this function call may crash the application.

This is a very simple example of an exploitable weakness, which is easily fixed by good programming habits (in this example, using the strncmp function with the buffer limit as its bound would have fixed the issue).
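
The strncmp fix mentioned above could look roughly like the sketch below. The name bounded_equal and the 256-byte MAX_URL_LEN limit are assumptions made for illustration only; they do not come from any particular application.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_URL_LEN 256  /* assumed buffer limit, chosen for illustration */

/* Returns 1 if the two buffers hold the same string, 0 otherwise.
 * At most MAX_URL_LEN bytes of each buffer are examined, so a missing
 * terminator cannot make the comparison run past that limit. */
int bounded_equal(const char *s1, const char *s2)
{
    assert(s1 != NULL && s2 != NULL);
    return strncmp(s1, s2, MAX_URL_LEN) == 0;
}
```

Note the trade-off: two strings that are identical in their first MAX_URL_LEN bytes compare as equal even if they differ later, so over-long input should normally be rejected before it reaches the comparison.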

 

This part will explore two well-known network exploits: TCP SYN floods and fragmentation floods.

2.1 TCP SYN floods

 

Every TCP connection, before moving to the established state, has to pass through the 3-way handshake stage. This stage is performed by the 2 participating hosts upon a connection request issued by the session originator. Its purpose is to establish a stateful connection which enables data integrity management at the transport layer.

The 3-way handshake is based on sending 3 TCP packets (see figure 3) without data inside them (the last packet may contain actual data). The connection request uses the SYN flag in the TCP header to signal to the other side that the originator would like to open a connection. If the connection is feasible on the receiving host, it sends back a packet with both the SYN and ACK flags turned on. This signals to the originating host that the connection is open, and it answers with a packet with only the ACK flag turned on. This last ACK packet may contain data, and from this stage on the session is active and both sides can pass data over it.

This sequence is often referred to as the SYN, SYN-ACK and ACK sequence.
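
The server side of this sequence can be sketched as a small state machine. This is a simplified illustration, not a real TCP implementation: it ignores sequence numbers, retransmissions and RST handling, and the names conn_state and on_segment are made up for the example. The flag values are the standard SYN and ACK bits of the TCP header.

```c
#include <assert.h>

/* Server-side view of one connection during the handshake. */
enum conn_state { LISTEN, SYN_RECV, ESTABLISHED };

/* SYN and ACK bits of the TCP header flags field. */
#define TCP_FLAG_SYN 0x02u
#define TCP_FLAG_ACK 0x10u

/* Advance the state for one received segment.
 * Unexpected segments leave the state unchanged. */
enum conn_state on_segment(enum conn_state s, unsigned flags)
{
    assert(s <= ESTABLISHED);
    if (s == LISTEN && flags == TCP_FLAG_SYN)
        return SYN_RECV;      /* the server would now send SYN-ACK */
    if (s == SYN_RECV && (flags & TCP_FLAG_ACK) && !(flags & TCP_FLAG_SYN))
        return ESTABLISHED;   /* final ACK of the handshake */
    return s;
}
```

A connection that receives a SYN but never the final ACK stays in SYN_RECV, which is exactly the half-open state discussed in the rest of this section.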

 

Figure 3 – 3-way handshake

 

 

This protocol works quite well when the number of connections created in a given length of time is sane, and the connections are genuine. The server would not have to wait long for the final ACK packet, and would move the new session to established mode. Once it is in established mode, the server would usually provide the user with the requested service.

 

For the attacker, this mechanism offers an opportunity for bringing down the system, especially if it has a weak design.

In older versions of several operating systems (such as Linux with old kernels such as 2.0.x), the TCP layer was implemented in such a way that the kernel held a static buffer, pre-assigned specifically to hold data structures for sessions that were still in the handshake state (SYN-RECV state). This means that each time a server got a SYN packet, it would assign one slot in the buffer and fill it with the session’s information (such as source IP, TCP ports etc.). This information enables the server to process correctly the third packet in the sequence, the ACK, because it keeps the state of the connection as an embryonic, half-open connection (SYN-RECV state).

 

The attacker may exploit this behavior and send a relatively large number of false SYN packets (see figure 4), while forging (“spoofing”) the source IP address and the source TCP port. Upon receiving a fake SYN packet, the server would allocate an embryonic connection for the new incoming request, and would send a SYN-ACK packet to the spoofed IP address. Because the host at the spoofed address may not even be online, and in any case has not initiated a session with the server, no ACK packet would come back to the server (although it may receive RST packets from spoofed hosts that are online). If the timeout is long enough, and the buffer small enough, the server would quickly run out of buffer space and would not be able to service new connections. In such a state, legitimate connection requests would be denied because the server is low on resources (embryonic connection buffer space), and the attacker has created an effective DoS attack.

 

Figure 4 – SYN Flood

 

In UNIX environments this aspect of the kernel is controlled by a parameter called “somaxconn”. Its value has to be set at kernel compilation time, and it tells the kernel how many buffers to assign for embryonic sessions. Older versions had a default of 5, which was very easily overflowed. Newer versions have a default of 128, which can be increased for busy servers.

The same problem exists on Windows servers, and is solved in a similar way.

 

This solution is not enough to solve the problem. Making the buffer larger only makes things harder for the attacker; all the attacker has to do is increase the SYN packet rate, and eventually the new limit would be reached.

Other solutions exist for this problem. In order to decrease the risk of being hit by such an attack the timeout for each embryonic connection can be decreased. The original values for this parameter were based on high latencies and packet loss ratios on the old IP networks, which are not relevant today anymore. Values as low as a few seconds should be enough for current systems.

Another solution to the problem is to minimize the amount of storage used by embryonic connections, so that more of them fit in kernel space, or even to drop their state altogether. In order to still be able to handle the ACK packets that arrive later for these dropped connections, we have to preserve the sequence number assigned to the session and the MSS parameter of the TCP session. Other information is redundant.

This data can be packed into a single 32-bit number (a “cookie”), which the server sends as the initial sequence number of its SYN-ACK. Upon receiving the ACK packet, we recompute a hash function over the connection parameters and compare the result with the cookie echoed in the acknowledgment number; if it matches, we can move the session to the ESTABLISHED state, and only then allocate real system resources for it. [4]
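
One common way to realize this idea is the SYN cookie scheme, in which the 32-bit value travels in the server's initial sequence number and comes back in the acknowledgment number of the final ACK. The sketch below is a loose illustration under several assumptions: the bit layout (5 bits of time counter, 3 bits of MSS index, 24 bits of hash) is chosen for the example, the names are hypothetical, and the mixing function is a toy that is NOT cryptographically secure. A real implementation would use a keyed secure hash.

```c
#include <assert.h>
#include <stdint.h>

/* Connection 4-tuple as seen by the server. */
struct flow { uint32_t saddr, daddr; uint16_t sport, dport; };

/* Toy mixing step, used only to keep the sketch self-contained. */
static uint32_t mix(uint32_t h, uint32_t v)
{
    h ^= v;
    h *= 0x9E3779B1u;
    h ^= h >> 15;
    return h;
}

/* 24-bit hash over the flow, the client's ISN, a secret and a time slot. */
static uint32_t flow_hash(const struct flow *f, uint32_t client_isn,
                          uint32_t secret, uint32_t slot)
{
    uint32_t h = secret;
    h = mix(h, f->saddr);
    h = mix(h, f->daddr);
    h = mix(h, ((uint32_t)f->sport << 16) | f->dport);
    h = mix(h, client_isn);
    h = mix(h, slot);
    return h & 0x00FFFFFFu;
}

/* Build the value the server would use as its initial sequence number. */
uint32_t make_cookie(const struct flow *f, uint32_t client_isn,
                     uint32_t secret, uint32_t now, unsigned mss_index)
{
    uint32_t slot = now & 0x1Fu;
    assert(mss_index < 8);
    return (slot << 27) | ((uint32_t)mss_index << 24)
           | flow_hash(f, client_isn, secret, slot);
}

/* Validate a cookie recovered from the final ACK (acknowledgment number
 * minus one). Returns the MSS index (0..7) if the cookie matches this flow
 * and was issued in the current or previous time slot, otherwise -1. */
int check_cookie(const struct flow *f, uint32_t client_isn, uint32_t secret,
                 uint32_t now, uint32_t cookie)
{
    uint32_t slot = cookie >> 27;
    uint32_t age = (now - slot) & 0x1Fu;
    if (age > 1)
        return -1;
    if ((cookie & 0x00FFFFFFu) != flow_hash(f, client_isn, secret, slot))
        return -1;
    return (int)((cookie >> 24) & 0x7u);
}
```

The design point is that the server keeps no per-connection memory between make_cookie and check_cookie: everything it needs is either carried by the returning ACK or recomputed from the secret.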

 

It is evident that whichever of these ways we choose to fight this type of attack, it stops helping once the rate of the SYN packets grows high enough. Another factor we must take into account is that this might actually be a DDoS attack, originating simultaneously from a large number of hosts, which again increases the attack’s volume. In such cases there is nothing a single server can do about it. However big its buffer is, it would eventually overflow, and new connections would not be passed through to the application layer.

 

For example, let’s say we have a server serving 50 new connection requests per second. This would create about 51 Kbps of incoming traffic for the server (counting the SYN and ACK packets, at 64 bytes each: 50 × 2 × 64 × 8 = 51,200 bits per second).

An attacker can easily create a flood of 1000 false SYN requests per second. This attack would generate another 512 Kbps (1000 × 64 × 8 bits per second), which is noticeable but, from the bandwidth consumption point of view, is not considered a heavy attack.

Let’s see what such an attack would do to the server’s kernel. Assume that the server has been optimized, and that it can hold up to 512 embryonic connections in its table (we assume the timeout for stale connections is 1 second). In such a case, the table would fill up in less than half a second (512 table slots / 1050 connection requests per second ≈ 0.49 seconds).

All connections requested during the following half second would be dropped. Assuming valid and invalid connection requests are evenly mixed, only about 25 out of the total 50 legitimate requests would be handled, effectively blocking service to clients.
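
The arithmetic of this example can be checked with a few lines of C. The helper names below are made up for the illustration, and the model is as crude as the one in the text: fixed 64-byte packets, a constant arrival rate, and no table entry expiring while the table fills.

```c
#include <assert.h>

/* Bandwidth in bits per second used by a stream of fixed-size packets. */
long stream_bps(long packets_per_sec, long packet_bytes)
{
    assert(packets_per_sec >= 0 && packet_bytes >= 0);
    return packets_per_sec * packet_bytes * 8;
}

/* Milliseconds until a table of `slots` embryonic connections is full when
 * `requests_per_sec` SYNs arrive and no entry expires in the meantime. */
long fill_time_ms(long slots, long requests_per_sec)
{
    assert(slots >= 0 && requests_per_sec > 0);
    return slots * 1000 / requests_per_sec;
}
```

With the numbers used above, stream_bps(1000, 64) gives 512,000 bits per second for the attack stream, and fill_time_ms(512, 1050) gives 487 milliseconds for the table to fill.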

 

It is important to note, that mounting such an attack does not require hard-to-get resources from the attacker. A single computer with a decent connection to the Internet (even sub-T1, or actually a 512Kbps link) would be enough, as we have shown above.

Another way to gather enough resources for such an attack is to gather enough infected “zombie” or agent hosts and actually perform a DDoS attack. For our example all it takes are about 18 users with 28,800 bit/sec modem connections. Not a big thing for a motivated attacker.

 

As we have seen, a single server is not a match for a moderate SYN flood attack. Instead of building larger and stronger servers, we could increase the number of servers, so that the load would be distributed between them. In such a way we hope to withstand such attacks, and still be able to provide at least some service to valid requests.

The way to do that is to use a load balancing device, which would be in charge of receiving all the requests (using a VIP – virtual IP address) and forward the requests to the servers.

 

Figure 5 – Load Balancing

 

If we examine our previous example, we can see that this approach can sometimes win. Let’s assume we have a server farm of 4 servers with the same parameters as our good old server, each able to hold up to 512 embryonic connections in its table. The load balancer would assign each server, on average, 263 new connections per second (1050 / 4, rounded up). About 250 of those connections would be invalid requests, and only 13 valid. Each server would be able to accept the 13 valid requests and provide service. Although its embryonic connection table would be quite crowded, the service would not be interrupted, and the attack would be useless.
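
The per-server load in this example can be sketched as follows. The function names are hypothetical, and the model assumes the load balancer spreads requests perfectly evenly and that every table entry lives for exactly the timeout, so the steady-state number of occupied slots is simply rate × timeout.

```c
#include <assert.h>

/* Requests per second each server sees, rounded up, when the total is
 * spread evenly over `servers` machines. */
long per_server_load(long total_per_sec, long servers)
{
    assert(total_per_sec >= 0 && servers > 0);
    return (total_per_sec + servers - 1) / servers;
}

/* 1 if a table of `slots` entries can hold every embryonic connection
 * created at `load_per_sec` with entries living `timeout_sec` seconds. */
int table_survives(long slots, long load_per_sec, long timeout_sec)
{
    assert(slots >= 0 && load_per_sec >= 0 && timeout_sec >= 0);
    return load_per_sec * timeout_sec <= slots;
}
```

For the figures above, per_server_load(1050, 4) is 263 and table_survives(512, 263, 1) is 1, whereas a single server facing all 1050 requests per second does not survive under the same model.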

It is important to note that it is still quite easy to overwhelm our server farm. All the attacker has to do is triple his effort – he would still be able to run the attack from behind a single T1 Internet connection, but this attack would disrupt the service for valid users in our example.

 

Increasing the number of servers in the server farm is always possible, but it might become quite expensive, and we can run into other barriers, such as the number of new connections per second our load balancer can service, memory limits at the load balancer etc.

 

The best way to fight SYN flood attacks is to build a service in a distributed topology, placing the same content all over the Internet (or at least in several locations). In such a way, when a user requests some content from the servers, he would be serviced from the server farm located closest to him. In normal operations, this would reduce latency, and improve performance. If such a service would be SYN flooded, each flooding agent would flood the server closest to him, because that is where he knows the victim is located (for example because of a DNS query). In such a way the attack would be split up between many sites. Even if some sites would go down, other servers, which are still up in different locations, would provide services.

In most cases, a configuration like this would still be able to provide service to at least some parts of the Internet, which is much better than not being able to provide any service at all.

This kind of service is being offered today on the Internet by companies like Akamai, Digital Island and others.

 

Let’s return to our previous example. Now we would use 5 distributed sites, each with 5 servers. Our assumption would be that only 2 sites are hit by the attack, because the zombie computers are located in a limited part of the Internet, adjacent to these 2 sites.

As we have seen, a single 4-server farm withstood the attack by itself, so the 5 servers at each of the 2 affected sites should handle the attack at least as well. The only difference here is that the other 3 sites would provide normal service and would not even be bothered by the attack. Even if we triple the attack, and the 2 affected sites go down, our service would still be available through the other 3 sites.

 

2.2 Fragmentation Flood

 

Another type of attack that can bring service down is the fragmentation flood. What it has in common with SYN floods is that a low-bandwidth, low-volume attack can bring the victim down.

A fragmentation flood attack exploits a feature of the IP protocol enabling fragmentation of packets. This feature exists so that links with smaller MTU values would be able to support IP traffic. When a packet is larger than the MTU parameter of the next-hop link, it would be broken up into several fragments small enough to fit the required MTU.

The IP standard allows any hop in the IP path to fragment a packet, while assembly is being performed only at the destination host. [5]

 

The IP header has a few fields built into it so that fragment reassembly is possible at the destination host. The “Fragment Offset” field (see figure 6 for the IP packet header) specifies the offset of the current fragment’s payload from the beginning of the whole data block, counted in units of 8 bytes. The first fragment in the stream would have an offset of 0.

To be able to tell when we have all the fragments, IP uses the “more-fragments” flag in the Flags field. This flag would be 1 if more fragments should follow. The last fragment would have this flag set to 0.

The “Identification” field is also used, in order to associate fragments with the same data stream.
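
As an illustration of how these fields are read, the sketch below decodes the 16-bit header word that carries the flags and the fragment offset. The struct and function names are made up for the example; the bit positions (0x2000 for the more-fragments flag, a 13-bit offset counted in 8-byte units) are those of the IPv4 header, and both inputs are assumed to be already converted to host byte order.

```c
#include <assert.h>
#include <stdint.h>

#define IP_FLAG_MF  0x2000u  /* "more-fragments" flag */
#define IP_OFF_MASK 0x1FFFu  /* 13-bit fragment offset, in 8-byte units */

struct frag_info {
    uint16_t id;         /* Identification field */
    unsigned offset;     /* payload offset in bytes */
    int more_fragments;  /* 1 if more fragments follow */
};

/* Decode the Identification word and the flags/offset word. */
struct frag_info decode_frag(uint16_t id, uint16_t flags_off)
{
    struct frag_info fi;
    fi.id = id;
    fi.offset = (unsigned)(flags_off & IP_OFF_MASK) * 8u;
    fi.more_fragments = (flags_off & IP_FLAG_MF) != 0;
    return fi;
}

/* A packet is a fragment if it has a non-zero offset or the
 * more-fragments flag set. */
int is_fragment(uint16_t flags_off)
{
    return (flags_off & (IP_FLAG_MF | IP_OFF_MASK)) != 0;
}
```

For example, a word of 0x2000 describes the first fragment of a larger block (offset 0, more fragments to follow), while 0x00B9 describes a last fragment that starts 1480 bytes into the block.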

 

 

Figure 6 – IP packet header

 

The assembly process would start collecting fragments, placing them in their relative place using the Fragment Offset field. When all the holes in the block are full, and we got the last fragment (more-fragments flag set to 0), the original data block is reassembled, and can be sent for layer 4 processing.

 

The critical part of the assembly process is that we start collecting fragments even if we have received only a single fragment. We have to track the state of each block we are trying to assemble. Every distinct combination of source and destination IP addresses, protocol number and identification value makes the IP stack allocate a new block accumulator.

The host performing the reassembly of the fragmented packets has to assign considerable resources in order to be able to keep state for all partial data segments. It has to assign enough buffer memory to hold the partial data buffer. Also, it would have to use CPU processing time to analyze every new fragment in order to be able to identify its correct position.
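
The bookkeeping described here can be sketched as a lookup table keyed by those four values. This is a deliberately tiny illustration with hypothetical names and a limit of 4 accumulators; a real IP stack also stores the partial data, tracks holes and expires old entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_REASSEMBLY 4  /* deliberately tiny limit, for illustration */

/* The values that identify which data block a fragment belongs to. */
struct frag_key { uint32_t src, dst; uint8_t proto; uint16_t id; };

struct reasm_table {
    struct frag_key keys[MAX_REASSEMBLY];
    int used;  /* number of accumulators currently allocated */
};

static int key_equal(const struct frag_key *a, const struct frag_key *b)
{
    return a->src == b->src && a->dst == b->dst &&
           a->proto == b->proto && a->id == b->id;
}

/* Returns the accumulator index for this fragment's data block, allocating
 * a new one if the key has not been seen, or -1 when the table is full. */
int reasm_lookup(struct reasm_table *t, const struct frag_key *k)
{
    int i;
    assert(t != NULL && k != NULL);
    for (i = 0; i < t->used; i++)
        if (key_equal(&t->keys[i], k))
            return i;
    if (t->used == MAX_REASSEMBLY)
        return -1;
    t->keys[t->used] = *k;
    return t->used++;
}
```

Fragments that share a key reuse one accumulator, while every fragment with a new key consumes a fresh one, so a handful of fragments with distinct keys is enough to fill the table.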

Usually such tasks would be performed in the kernel part of the operating system, because this is a low-level functionality of the IP stack. Being a low-level task, it would usually have a high priority in the system.

 

To exploit this functionality, an attacker could send out meaningless fragments, each with different information (such as spoofed source IP addresses). The receiver would be fooled into starting the reassembly of the forged fragmented data segments, and might waste a considerable amount of resources keeping up with the incoming packets.

Using spoofed source IP addresses has another advantage from the attacker’s point of view - it effectively hides his location.

 

The attack can consist of a relatively small number of fragments, each belonging to a different data segment.

Each such fragment could use a random Fragment Offset value, making the victim of the attack allocate randomly sized memory blocks (some implementations might allocate a memory block large enough to accommodate the data preceding the current offset, assuming it should be on its way anyway).

 

Because the packets needed for this attack are small (a minimal IP packet would be enough), such an attack does not require much bandwidth. For example, a single T1 link (1.5 Mbps) would enable an attacker to send about 3000 packets per second (using 64-byte packets). In my experience this packet rate can easily bring a server down.
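
The packet rate quoted here follows from a simple division, sketched below. The helper name is made up, and the calculation ignores link-layer framing overhead, so it is only an upper estimate.

```c
#include <assert.h>

/* Whole packets per second that fit into a link of `link_bps` bits per
 * second when every packet is `packet_bytes` bytes long. */
long packets_per_sec(long link_bps, long packet_bytes)
{
    assert(link_bps >= 0 && packet_bytes > 0);
    return link_bps / (packet_bytes * 8);
}
```

With the nominal T1 rate of 1,544,000 bits per second and 64-byte packets this gives 3015 packets per second; with a round 1,500,000 bits per second it gives 2929.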

 

The devices most affected by such an attack are firewall servers, especially those installed on standard operating systems such as Sun Solaris or Windows NT. These firewalls usually use the system’s kernel services for IP, and only add the needed functionality. The problem is that the OS would try to assemble the bogus fragments, and the firewall would wait for them to be complete so that the layer 3 and 4 information would be available for examination. In order to make the firewall examine the packet, the attacker would use the UDP protocol, which has no real state machine and has to be examined packet by packet on the firewall. This combination would bring a firewall down quite quickly.

 

In order to fix this vulnerability, we have to improve the way fragments are processed by the operating system. Many systems are known to drop fragments on the spot without assembling them. This approach makes the system invulnerable to such attacks, but might break some IP functionality. It is common because real-life fragmentation is quite rare on the Internet, and other methods, such as Path MTU discovery, allow users to use low-MTU links.

Other systems, which are known to be vulnerable to the UDP fragmentation flood might block UDP altogether in some upstream router. This approach makes sense in many cases. For example, if we operate a web server farm, we would not expect to get any UDP traffic anyway, so blocking it would not affect functionality.

 


3. Overload attacks

 

When someone wants to take a victim down, the easiest way to do it is simply to knock it out by brute force. This is the main concept of overload attacks. The attacker overloads the victim in such a way that there is no chance that a normal service level would be possible.

In day-to-day life, an overload attack would be a case where 1000 people decide to go into the local grocery store and ask to buy milk. This would overload the grocery, and it would not be able to offer real service to other “legitimate” clients, who really need milk this morning.

 

On the Internet, overload attacks usually send a large number of packets to the victim in order to fill up the pipe it has to the Internet, or just to overwhelm the server. The attacker does not even try to consider which types of services the victim offers and how they could be exploited in order to stop the service; this is a pure brute force method.

Being a brute force attack, usually it would be almost impossible to block it. If the victim has a T1 connection to the Internet, filling it up with junk traffic would effectively break Internet connectivity. Without cooperation and some good will from upstream providers, there is nothing the victim can hope to accomplish.

 

Several ways exist to mount such an attack. We will review a few known attacks and explore how they work and how they can be defeated, if at all.

 

3.1 Smurf Attack

 

The oldest overload flood attack is the smurf attack, named after the tool which is used to generate it [6].

 

The smurf attack is based on the ICMP protocol, using the ability to send echo packets (ICMP message type 8, code 0) to IP hosts and receiving a response (echo reply, ICMP message type 0, code 0). This is the same capability being used by the ping tool widely used on the Internet.
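As a small illustration of the message format, the following Python sketch builds an ICMP echo request (type 8, code 0) and the matching echo reply (type 0, code 0) as raw bytes. Nothing is sent on the network; the identifier, sequence number and payload are arbitrary values chosen for this example.

```python
import struct

ICMP_ECHO_REQUEST = 8  # ICMP message type 8, code 0
ICMP_ECHO_REPLY = 0    # ICMP message type 0, code 0


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement checksum used by ICMP (see RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return ~total & 0xFFFF


def build_echo(msg_type: int, ident: int, seq: int, payload: bytes) -> bytes:
    """Return an ICMP echo message: type, code, checksum, id, sequence, data."""
    header = struct.pack("!BBHHH", msg_type, 0, 0, ident, seq)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", msg_type, 0, checksum, ident, seq) + payload


# Example values (assumptions): identifier 0x1234, sequence 1, payload "ping".
request = build_echo(ICMP_ECHO_REQUEST, ident=0x1234, seq=1, payload=b"ping")
# A host answering the request copies the id, sequence and data into its reply.
reply = build_echo(ICMP_ECHO_REPLY, ident=0x1234, seq=1, payload=b"ping")
```

Only the type and checksum fields differ between the two messages, which is why a reply can be produced so cheaply by any host.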

RFC 1122 [7], which defines the requirements from any IP host, states that every host must support the ICMP echo functionality, and respond with an ICMP echo reply to any ICMP echo request packet it receives.

 

The attack is quite simple in its operation. The attacker would send a continuous stream of ICMP echo request packets to some IP address on the Internet. This would force the receiver of the echo request packet to reply with echo responses to the source IP found in the request packet.

The first step of making the attack work is spoofing the source IP address of the ICMP echo request packets. The source IP address placed in the request would be that of the victim. This would result in a stream of packets from the attacker to some host on the Internet, and another stream of ICMP echo response packets from this host to the actual victim.

This kind of attack offers the attacker a high degree of anonymity, because the victim only sees the addresses of the replying hosts and has little chance of finding out who the original attacker is. If the attacker changes the intermediate hosts quickly enough, it would be very hard to trace the real origin of the attack.

 

The second step of creating a successful smurf attack is choosing the correct destination IP address for the ICMP echo requests.

The real novelty the smurf attack introduced is the use of broadcast IP addresses to amplify the volume of the attack, so that a low-bandwidth stream of ICMP echo requests eventually generates a high-bandwidth attack against the final victim.

When a host sends an ICMP echo request to a directed broadcast IP address, all the hosts on this subnet receive it. RFC 1122 requires every host to implement the echo function, and although it permits a host to silently discard echo requests sent to a broadcast address, many hosts reply anyway and send ICMP echo response packets to the spoofed source IP address.

This would result in multiple packets being sent to the victim for every single ICMP echo request sent by the original attacker.

If the segment used as amplifier had, for example, 100 hosts on it, the amplification factor of the attack would be 100. An attack of 100Kbps originated by the attacker can result in an attack of 10Mbps, if the amplifier network has enough bandwidth (see figure 7).
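The arithmetic can be sketched in a few lines of Python. The function name and the uplink figures are assumptions made for this example; the calculation simply multiplies the attacker's rate by the number of responding hosts and caps the result at the amplifier network's own uplink.

```python
def amplified_rate_kbps(attacker_kbps, responding_hosts, amplifier_uplink_kbps):
    """Traffic reaching the victim, capped by the amplifier network's uplink."""
    return min(attacker_kbps * responding_hosts, amplifier_uplink_kbps)


# 100 Kbps sent to a segment with 100 responding hosts behind a 45 Mbps uplink.
print(amplified_rate_kbps(100, 100, 45_000))  # 10000 Kbps, i.e. 10 Mbps

# The same attack through an amplifier that only has a T1 (1544 Kbps) uplink.
print(amplified_rate_kbps(100, 100, 1_544))   # 1544 Kbps, the uplink is the limit
```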

Of course the amplifier network would suffer as well, because it would be generating vast amounts of data, often filling its whole pipe to the Internet.

 

 

Figure 7 – smurf attack

 

A directed broadcast is an IP packet sent to a destination IP address which addresses the whole subnet of a certain network. A directed broadcast IP address consists of the subnet network number, with all the host bits set to 1.

In the early days there was no real need to filter such packets, and they were even quite useful for many things, such as looking for Windows hosts on remote IP networks.

An example for a directed broadcast address for the network 192.168.1.0 with a subnet mask of 255.255.255.0 is 192.168.1.255 (all host bits set to 1).
A common mistake is to think that only IP addresses ending in 255 are directed broadcast addresses. If we change the subnet mask a bit, we can see that directed broadcast IP addresses can appear as if they were host IP addresses. An example can be seen for a subnet using 192.168.1.0 but with a subnet mask of 255.255.255.192. This configuration splits the 192.168.1.0 - 192.168.1.255 range into 4 subnets, each having its own directed broadcast IP address (192.168.1.63, 192.168.1.127, 192.168.1.191 and 192.168.1.255).

IP addresses such as 192.168.1.63 can sometimes stand for a normal single IP host, and sometimes represent a directed broadcast address. It all depends on what the subnet mask is.
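The example above can be checked with Python's standard ipaddress module. This is only a small sketch; the variable names are chosen for illustration.

```python
import ipaddress

# The address block with a mask of 255.255.255.0: one subnet, one broadcast.
whole = ipaddress.ip_network("192.168.1.0/255.255.255.0")
print(whole.broadcast_address)  # 192.168.1.255

# A mask of 255.255.255.192 (a /26 prefix) splits the block into four subnets,
# each with its own directed broadcast address.
quarters = list(whole.subnets(new_prefix=26))
broadcasts = [str(net.broadcast_address) for net in quarters]
print(broadcasts)
```

With the longer mask, 192.168.1.63 is no longer an ordinary host address but the broadcast address of the first subnet.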

 

If we try to look for a simple defense against smurf attacks, it would become obvious that blocking directed broadcasts is the easiest way. If we do not allow such packets to go over the Internet, the attacker would not be able to send the ICMP echo requests to the amplifier network, and this would be the end of smurf attacks.

The first issue with this approach is that we might run into some trouble with applications requiring this feature. As it is, such applications are quite rare, and generally can be ignored on the Internet. Some of these applications do exist on the internal corporate networks, but they too have their own solutions (like using WINS servers in Windows NT environments).

 

The more problematic issue with blocking directed broadcasts is that there is no way to know which packet is being sent to a directed broadcast address, and which is being sent to a legitimate IP destination. As we have shown, some IP addresses can act in both roles. The current role can only be resolved by examining the network subnet mask, which is not known in the Internet core.

The only IP device which can block such packets is the last-hop router which is providing routing services to the network in question. This router is configured with the network address and subnet mask, and it has all the information required to know which packets are directed broadcasts, and block them.
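The decision the last-hop router makes can be sketched as follows. This is a simplified model and not any vendor's implementation: the function name is hypothetical, and the router's configuration is represented as a plain list of connected networks.

```python
import ipaddress


def is_directed_broadcast(destination, connected_networks):
    """True if `destination` is the broadcast address of a connected subnet."""
    dst = ipaddress.ip_address(destination)
    for cidr in connected_networks:
        net = ipaddress.ip_network(cidr)
        # /31 and /32 networks have no separate broadcast address.
        if net.prefixlen < 31 and dst == net.broadcast_address:
            return True
    return False


# The same destination, judged by two routers with different configurations.
print(is_directed_broadcast("192.168.1.63", ["192.168.1.0/26"]))  # True
print(is_directed_broadcast("192.168.1.63", ["192.168.1.0/24"]))  # False
```

A router in the Internet core has no such list of connected subnets for the destination, which is why it cannot make this decision.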

 

This way of blocking the attack has some weak points. The attack is being generated by some attacker host, sending packets to a network he plans to use as an attack amplifier. If the router at this network is configured properly, the attack would be blocked, but only after the ICMP echo requests already have been routed over the WAN link connecting this router to the Internet. Sometimes, when the stream of ICMP echo requests is big enough, this can waste a considerable amount of resources on this WAN link.

Another concern with this kind of protection is that the responsibility for blocking the attack is placed into the hands of many network or IT administrators who manage their corporate connections to the Internet. Often, these people are not routing specialists, and are not familiar with all the options a router offers them, and might not even be aware of the risk they pose to the Internet, while operating a public attack amplifier within their network.

 

Another common way to block smurf attacks is to block all incoming ICMP messages. This method would make sure that no ICMP echo requests of any kind would get into the local area network, and we can be sure that this network would not become a smurf amplifier.

This method has a few weak points as well. First of all, it will drop legitimate ICMP messages, making it impossible to perform ping tests from outside the network, and it will even break some other standard IP functions, such as the use of the ICMP “Fragmentation Needed and DF set” message (type 3, code 4) for path MTU discovery [8].

Another issue with this blocking method is that it will not defeat some of the smurf-type attacks, such as “fraggle”, which uses UDP echo packets instead of ICMP echo requests (some implementations of the traceroute tool likewise use UDP rather than ICMP).
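The trade-off can be illustrated by comparing the blanket block with a more selective policy. The selective policy below is an assumption added for comparison, not a method taken from the text: it drops only inbound echo requests and lets other ICMP messages through.

```python
ECHO_REQUEST = 8          # ICMP type 8: echo request
DEST_UNREACHABLE = 3      # ICMP type 3: destination unreachable
FRAGMENTATION_NEEDED = 4  # code 4 of type 3, used by path MTU discovery


def blanket_policy(icmp_type, icmp_code):
    """Block every inbound ICMP message."""
    return False


def selective_policy(icmp_type, icmp_code):
    """Hypothetical alternative: drop inbound echo requests only."""
    return icmp_type != ECHO_REQUEST


for name, policy in (("blanket", blanket_policy), ("selective", selective_policy)):
    ping = policy(ECHO_REQUEST, 0)
    pmtu = policy(DEST_UNREACHABLE, FRAGMENTATION_NEEDED)
    print(name, "echo request allowed:", ping, "PMTU message allowed:", pmtu)
```

Both policies keep the network from answering echo requests, but only the blanket block also discards the message that path MTU discovery depends on.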

 

Whatever kind of filtering is used, some filtering has to be done by all networks connected to the Internet, either on the edge or at the access layer, when clients connect. This requirement has been established by RFC2267 [9], which describes what kind of traffic should be filtered at various points of all networks.
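The central recommendation of RFC2267 is ingress filtering: a provider should accept from a customer link only packets whose source address belongs to that customer, which removes the spoofing that smurf depends on. A minimal sketch of that check follows; the function name is hypothetical and the prefixes are documentation addresses used purely as an example.

```python
import ipaddress


def permit_from_customer(source_ip, customer_prefixes):
    """Accept a packet only if its source address is inside the customer's prefixes."""
    src = ipaddress.ip_address(source_ip)
    return any(src in ipaddress.ip_network(prefix) for prefix in customer_prefixes)


customer = ["192.0.2.0/24"]  # example prefix assigned to the customer link

print(permit_from_customer("192.0.2.17", customer))    # True: genuine source
print(permit_from_customer("198.51.100.7", customer))  # False: spoofed source
```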

 

Many equipment and software vendors have been active in this field as well. Many network equipment manufacturers have added features to block such unwanted traffic on the routers themselves. A good example is Cisco’s command “no ip directed-broadcast”, which can be configured on any interface of the router. It blocks any directed broadcast IP packets destined to the network attached to this interface. After it became evident that this is the behavior desired by the majority of the Internet community, this command was made active by default on all interfaces (since IOS 12.0). Other vendors have created similar features in their equipment as well. This way of operation for routers has been documented in RFC2644 [10].

Many software vendors have created defenses against such attacks as well. For example most UNIX flavors support the option of disabling replies to directed broadcasts received on their network connections.

 

Even today smurf attacks still pose a real danger to the Internet. Even though it is quite simple to render them ineffective by using the methods we have discussed, many networks can still be found which can be used to amplify smurf attacks.

Several organizations are continually monitoring and scanning the Internet to look for such networks, in order to inform the network administrators, and ask them to secure their network. Such organizations include http://www.netscan.org/ and http://www.powertech.no/smurf/ which actively scan the IP address range.

 

 

3.2 Zombie attacks

 

Today, in order to generate a real DDoS overload attack it is not enough to have a few computers connected to the Internet via a T1 leased line connection. Even a DS3 link might not be enough to bring a major web site down.

In order to generate a massive amount of data and overload such a server, the attacker requires a large number of hosts, each with a decent connection to the Internet. Summing up flood attacks from hundreds or even thousands of hosts can eventually generate a flood of hundreds of Mbps, all directed against a single host or service (from my personal experience as a network engineer at an ISP, such attacks are quite common and happen on a weekly basis; even attacks of 100 or 200Mbps can be seen once in a while).
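A rough calculation shows why the number of hosts matters more than the speed of any single one. The per-host rate of 128 Kbps is an assumption chosen for illustration; the T1 and DS3 figures are the standard line rates.

```python
import math

T1_KBPS = 1_544    # T1 line rate
DS3_KBPS = 44_736  # DS3 line rate


def aggregate_flood_mbps(hosts, per_host_kbps):
    """Total flood volume, in Mbps, when every host sends at per_host_kbps."""
    return hosts * per_host_kbps / 1000


def hosts_needed(target_kbps, per_host_kbps):
    """Smallest number of hosts whose combined rate reaches target_kbps."""
    return math.ceil(target_kbps / per_host_kbps)


per_host = 128  # assumed upstream rate of one zombie, in Kbps
print(aggregate_flood_mbps(1000, per_host))  # Mbps from one thousand zombies
print(hosts_needed(T1_KBPS, per_host))       # zombies needed to fill a T1
print(hosts_needed(DS3_KBPS, per_host))      # zombies needed to fill a DS3
```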

The basic way of operation of these attacks can be seen in figure 1.

Such a massive stream of data can overwhelm almost any system, including the network equipment, circuits and finally the servers themselves, even if a load balancer is being used.

It is quite obvious that such attacks usually hit not only the victim of the attack, but also affect the close-by Internet area, because such amounts of traffic would overload many circuits and might crash the routers providing service to other clients as well.

 

We will explore how such attacks are made possible, and how we can protect the Internet against them.

 

Most such attacks operate by building a large “army” of zombies which are spread all over the Internet. This army is composed of compromised Internet hosts which have a “zombie” or a trojan program installed on them.

In order to create such a collection of hosts a way must be found to break into a large number of systems, and install a new program on these hosts.

After installing the software it has to be able to contact some central location in order to receive commands and maybe even software updates. This control channel would enable the attacker to initiate DDoS attacks on any Internet host whenever it is desired [12].

 

There have been several means of spreading such zombie software on the Internet. The simplest way is to break into systems using known (or, even better, unknown) security holes and manually install the software. This method works just fine, but it is slow and cumbersome.

There are many other ways to distribute trojan programs. For example, a hacker can write a nice small free game, and put the trojan’s installer inside. Now all that has to be done is to make sure that the link to the game becomes popular enough, and wait for people to download it.

Any person who downloads the game and runs it would install the zombie trojan on his or her computer, adding it to the zombie army waiting for commands.

The carrier program can take any shape and form, as long as it attracts enough people and makes them run it on their computers. It has become quite common to find MP3 songs and other media files carrying viruses and trojans inside.

Other ways of deployment have been seen on the Internet. Many worms have been introduced into the Internet with a single objective in mind: to spread as widely and as quickly as possible. One of these worms was the Code Red worm [11]. Code Red was a malicious Internet worm which propagated through the Internet using vulnerabilities found in Microsoft IIS servers. It installed itself on any system it found vulnerable, and immediately started scanning for other hosts with the same vulnerability. The spread of this worm was enormous, and its rate can be seen in figure 2. The worm’s effect on infected systems was defacement of the local web pages. Another, even more dangerous, side effect was a portion of code which waited for a specific day and time to attack the White House’s web page, http://www.whitehouse.gov. The attack was meant to generate a large number of TCP port 80 (HTTP) sessions to the web site and overload it with semi-legitimate traffic from many hosts.

The Code Red worm is a good example of software which spread across the Internet by itself, without any further help from its creator. This way of spreading trojans is quite effective, and has been used by worms other than Code Red.

 

After a trojan has been installed on some host, it has to create a channel of communication with its “master” in order to receive commands and maybe even software updates.
This functionality is not a must, but it makes a trojan much more effective and the trouble of spreading it more worthwhile. A trojan without such functionality can be used for a pre-defined set of tasks, but after it has finished, it would not serve any purpose.

 

A trojan may use a large number of ways to communicate back to its master. The easiest way would be to open a TCP connection to a predefined host, and use this connection to receive instructions.

This way of communication makes the trojan quite vulnerable, because it would leave a very obvious trail behind it. Running the ‘netstat’ command on the infected computer would reveal the connection and with it the presence of the trojan and the IP address of the master node.

In order to hide the connection, the trojan might use well-used protocols, such as HTTP or IRC.

By using HTTP, the trojan can open an HTTP connection to a pre-defined web server, and use a specialized CGI page to post information to the server, and request new commands and data. Such a connection might be lost among other normal HTTP connections. Also, the connection can be relatively short, and repeat once every few minutes. This ensures that the trojan stays up to date, but remains hidden from the user.

Another common way to create the connection is to use the IRC protocol. The trojan would connect to a pre-defined IRC channel on some public IRC network. In such a way, any person with access to this IRC channel can send text messages to the trojans, and program and activate attacks in minutes [3]. A famous trojan which is known to operate in such a way is Sub7.

In order to protect the master node, and make it harder for people to break into the system, it is quite common to encrypt the whole sessions using standard encryption software easily found on the Internet.

 

The type of attack generated by the trojan can vary, and it depends on which tools the trojan software is based on. Usually it would enable at least a few types of attacks, including TCP/UDP/ICMP floods. The master can control the type of the attack, the packet lengths, the destination IP and many other parameters. Virtually any kind of known flood attack can be mounted using such zombie trojans.

 

Protecting against such attacks is quite difficult because the volume of the attack may be so huge, that it would saturate many circuits and block all available bandwidth at the victim’s network area. This kind of an attack can hardly be beaten by any technique which can be installed at the victim’s facility. The amount of traffic would simply fill the connection to the ISP, and any devices placed behind it would have no way to deal with all the traffic.

 

The distribution approach we have discussed earlier would offer some protection, because the attack would be divided between many separate locations, hopefully overloading only some of them, leaving the others operational.

 

Another approach for fighting such attacks is dealing with the real root of the problem. If we analyze the main risks behind zombie floods, we find that if there were no zombie hosts, such attacks would be eradicated.

This means that the real solution to the problem is to try to make sure that as few hosts as possible are infected. This would deprive the masters of their zombies.

This can be achieved by a coordinated effort by many organizations. Software companies must provide patches and fixes to known vulnerabilities. System administrators and private users must follow update announcements and install these patches. ISPs and network managers must always be on guard for new worms and viruses, and at least try to contain their spread.

In most cases (as the Code Red worm has demonstrated) the weakest link in this chain is the system administrators of Internet hosts. Many such hosts can be found connected to the Internet after default installations, without the appropriate patches being applied. Many people install Internet services on their home PC just to “play” with them, but forget to configure them properly. Such hosts are easily broken into, and I have seen such hosts compromised only hours after installation.

 

Another way of making sure that trojans are not able to perform their work is to block their upstream communication channel with the master hosts. This can be achieved by using firewalls and personal firewalls at the edge of the network (i.e. enterprise networks or home computers). The idea is to run a strict policy for outgoing traffic, and permit only explicitly allowed traffic. Personal firewalls, installed directly on the host, can have the best effect for this problem. Usually, personal firewalls offer the user the option of allowing or denying traffic based on the application requesting it. In such a way, if an application is not allowed to transmit data to the Internet, and it tries to do so, the user would be notified, and would be able to take care of the situation.
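The per-application policy of a personal firewall can be modeled as a simple allow-list lookup. The application names and the rule format below are hypothetical, chosen only to show the idea; real products keep far richer rule sets.

```python
# Hypothetical allow-list: (application, protocol, destination port).
ALLOWED_OUTBOUND = {
    ("browser.exe", "tcp", 80),
    ("browser.exe", "tcp", 443),
    ("mailclient.exe", "tcp", 25),
}


def check_outbound(application, protocol, dest_port, allowed=ALLOWED_OUTBOUND):
    """Allow listed traffic; anything else is held and the user is asked."""
    if (application, protocol, dest_port) in allowed:
        return "allow"
    return "ask-user"


print(check_outbound("browser.exe", "tcp", 80))     # allow
print(check_outbound("freegame.exe", "tcp", 6667))  # ask-user
```

A trojan hidden in an unlisted program would trigger the prompt the first time it tried to reach its master, whichever port or protocol it used.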

 

The last few defense approaches have a major drawback. They all require the end user to be up to date with the latest operating system patches, to install personal firewalls, and in general become a smart Internet user. This is a big thing to request from the general public, and this is the main reason why we still have such an endless supply of zombie computers.

 

Another DDoS approach that is very close to the zombie flood attack is the client flood attack. The only difference in this kind of an attack is that the owner of the zombie computer is aware of the fact that he is going to be a source for an attack, and actively installs some software or follows some instructions in order to participate in such an attack.

Such attacks can be organized by building a web site explaining why we want to attack our victim (for example some political cause), and asking all our supporters to click on some link. This link may run a Java application on the client computer, generating requests to the victim server every few seconds. If enough people activate this application, then we have a successful DDoS attack against our victim.

 

 

 

3.3 Reflectors

 

Usually, when someone would attack an Internet host, he would like to maintain anonymity in order to avoid prosecution or just to avoid being exposed.

One of the means of keeping anonymity is using zombie hosts to do the dirty work for you. This way is quite secure, but it is possible to backtrack to the original attacker via the zombie hosts or software [3]. Also the zombies themselves are exposed, and could be fixed quite quickly because the victim of the attack can report their real IP addresses to the owner or service provider.

Another way for the attacker to keep his anonymity is to use spoofed IP addresses, forging the source IP address of the attacks.

 

Using reflectors is another way to accomplish anonymity. The attacker would reflect the attack from other Internet hosts, and by that make it seem to the victim as if they are the attackers.

Reflectors can be used by many attack types. A single attacker host as well as many zombie hosts can use reflectors in order to hide the attack sources (see figure 8).

 

The main problem with reflectors is that they can be very secure Internet hosts, and still be used as reflectors. Almost any host which offers services to the Internet can be used as a reflector, simply because it follows the IP standards. The basic idea is to exploit standard protocols which have a request-response sequence built into them. The request would be sent from the attacker to the reflector, with the source IP set to the victim’s address. The reflector would send the response to the victim, effectively reflecting the attack.

 

Figure 8 – Reflector attack

 

The simplest reflector attack uses the ICMP echo (ping) protocol. If an attacker creates a list of a large number of hosts replying to ping requests (almost any Internet host would do that), then it is possible to flood some victim host just by sending ICMP echo requests to these hosts, with the victim’s IP address as the source address. All the hosts receiving this traffic would reply with ICMP echo responses to the victim host, flooding it with traffic.

The volume of the attack would not be amplified in any way, and actually some uses of reflectors would even attenuate the attack, sending less traffic to the actual victim of the attack [13].

 

Other types of reflected attacks can include any kind of Internet host accepting TCP sessions. Any such host, receiving a SYN request would naturally respond with a SYN ACK packet. If, again, the source IP of the SYN request is the victim, then the SYN ACK would be sent over to it.

If the attacker would choose a large number of reflectors, and keep the SYN rate to each of them at a sane rate, it would not generate a SYN flood on the reflector, and it is most likely that the traffic would be missed by the server’s operators.

If enough reflectors are used, and the attack is generated by enough zombie hosts, a massive DDoS flood can be generated in such a way.

Another type of such an attack can take advantage of the DNS protocol. The reflectors would be DNS servers, which naturally reply to DNS queries. If the replies are sent to the victim, we would generate an effective DDoS attack.

 

When dealing with such attacks from the victim’s point of view, it might be a bit confusing. If we look into an attack using web servers as reflectors, we would see a large number of SYN ACK packets arriving from the Internet, with TCP source port 80. Such packets are quite normal for enterprise traffic, where clients connect all the time to public web servers. Because this kind of traffic is normal on the Internet, it might be missed, and categorized as normal traffic.

 

Another problematic aspect of blocking such an attack is that if we block this kind of traffic, the regular TCP SYN ACK packets would not come through, and we would virtually block all TCP port 80 (HTTP) traffic to the Internet. Such a block, even if it is successful in blocking the flood traffic, would eventually create a DoS attack, because clients would be denied access to services.

 

On the other hand, if the victim is another web server, blocking such traffic may be very effective. If we analyze the traffic such a server is producing, we find that it has to be able to receive TCP SYN packets, and transmit TCP SYN ACK packets. There is no need to receive any TCP SYN ACK packets, and such traffic can be blocked without any real effect (such a block might be a good precaution for web servers in general, even if there is no active attack).

 

Almost all reflector attacks can be blocked in such a way, but special care has to be put into planning the policy. As we have seen, some configurations or applications would allow blocking the offending traffic without any effect at all, but other applications may suffer because of filtering done to block a reflector attack.
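The two situations can be contrasted in a short sketch. The server rule follows the text directly. The client-side rule, which accepts a SYN ACK only when it answers a SYN the network actually sent, is an assumption added here to show one way of planning the policy; all names are hypothetical.

```python
SYN = 0x02  # TCP SYN flag bit
ACK = 0x10  # TCP ACK flag bit


def is_syn_ack(flags):
    """True when both the SYN and ACK bits are set."""
    return (flags & (SYN | ACK)) == (SYN | ACK)


def server_accepts(flags):
    """Pure web server: it never opens connections, so any SYN ACK is dropped."""
    return not is_syn_ack(flags)


class ClientNetworkPolicy:
    """Client network: accept a SYN ACK only if it answers one of our SYNs."""

    def __init__(self):
        self.pending = set()  # (local_ip, local_port, remote_ip, remote_port)

    def outbound_syn(self, local_ip, local_port, remote_ip, remote_port):
        self.pending.add((local_ip, local_port, remote_ip, remote_port))

    def accepts(self, flags, remote_ip, remote_port, local_ip, local_port):
        if not is_syn_ack(flags):
            return True
        key = (local_ip, local_port, remote_ip, remote_port)
        if key in self.pending:
            self.pending.discard(key)  # handshake answered, forget the SYN
            return True
        return False  # unsolicited SYN ACK, likely reflected traffic


policy = ClientNetworkPolicy()
policy.outbound_syn("192.0.2.10", 40000, "198.51.100.5", 80)
print(policy.accepts(SYN | ACK, "198.51.100.5", 80, "192.0.2.10", 40000))  # True
print(policy.accepts(SYN | ACK, "203.0.113.9", 80, "192.0.2.10", 40001))   # False
```

Keeping this state costs memory on the filtering device, and the filter still sits behind the victim's link, so it protects the hosts but not the link itself.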

 

One of the most dangerous reflector attacks can be generated by exploiting the Gnutella protocol for P2P traffic.

Gnutella implements a “push” directive which enables a user to request a file to be sent back to him, while specifying in the request body the destination host to which the file should be sent. The request might propagate through the Gnutella network until it reaches a host which is able to provide the requested service. While it is propagating, all knowledge about the origin of the request and its path is lost, so there is no way of tracking it back.

If an attacker requests a large file to be sent to the victim, some host on the Internet would open a connection with the victim, and start sending the file over. This is an example of how a reflector attack can also act as a massive attack amplifier.

 


Part 3 – Protecting against DoS attacks

 

Because DoS and DDoS attacks have become a real threat to public and private services on the Internet, there is a constant demand to fight back against the attacks, and try to block them.

We will explore the means available to block attacks, or at least to make it much harder for an attacker to bring services down.

 

This is a continuing effort, and attack and defense methods change and evolve with time and experience.

 

1. Patching up the kernel

 

 

As we have seen, some DoS attacks are taking advantage of known weaknesses in operating systems, protocols and other mechanisms.

Many of these weaknesses have been fixed or made more robust in order to make systems more secure.

Also, we have seen that many DDoS attacks are made possible because many Internet hosts are prone to be exploited and broken into. Such hosts can become infected with trojan software and turn into DDoS zombie hosts.

In many cases fixes to these exploits would be released by the vendor of the software, or by some other third party.

 

The general idea is that systems become more secure with time, as new updates and fixes come out. But the system would not be secure by itself, even if the vendor would provide all fixes quickly. Still the process of updating the systems has to be carried out by system administrators all over the world.

In order to make a system really secure, it has to be maintained and updated all the time. The system administrator has to follow announcements about new releases and updates and install them as quickly as possible.

 

In order to see how important system updates are for both types of risks we would explore an example for each of the types.

 

One of the first DoS attacks found on the Internet was the SYN flood attack. We have explored this type of attack in Part 2. As we have seen, there are some software solutions to this kind of attack that can be installed on existing systems, making them more robust.

Still, the availability of the patch or fix has not made existing systems more secure. The system administrator had to install the patch and update the kernel of the system.

 

The best example of why system updating is important is the Code Red worm. As we have seen, the Code Red worm propagated through the Internet using a known vulnerability in the Microsoft IIS software. The reason the worm was able to spread at such a quick rate over the whole Internet is that many system administrators had default Windows NT systems installed, with a default IIS installation. None of the existing service packs, fixes and patches had been applied to these systems.

 

As we can see, system management is a critical aspect of DDoS attacks on the Internet. Systems which have not been updated with all required updates remain a threat to the whole Internet, because they have real potential to become zombie hosts, attack originators, amplifiers etc.

 

2. Finding the source of the attack

 

One of the issues with DoS attacks is finding out who the attacker is, so that we are able to stop the attack at its source, and maybe even prosecute the offender.

The problem with finding the source of the attack is that it is quite easy to hide the real origin of the attack.

It is very common to see attacks coming in with a large number of source IP addresses. Many times these addresses are actually being spoofed and generated randomly. This makes it quite hard to know where the attack really came from, because we do not have any information about the actual source of the attack.

 

We will explore a few methods which help in tracing back the origin of a traffic flood. Some of these methods can be, and are being, used by network operators. Other methods have been suggested, but have not been deployed for various reasons.

2.1 Router by Router, Hop by Hop

When an attack flood is arriving at a victim host, it is quite easy to find out which router is handing off the traffic to the victim. It would usually be the gateway of the segment where the victim is connected.

 

This router can be used as the first clue for tracing back the attack. The idea is to look where the flood is entering this router, in order to find the previous hop which is passing the flow. If we locate the interface which is bringing the flow into the router, we can look up the router connected at the other side of this link.

This process can be repeated on any router along the path. The last router in this chain would reveal the real originator of the attack’s flow.

 

Many routers offer tools that enable network administrators to accomplish this task.

For example, Cisco routers support a feature called NetFlow [14]. This feature enables collection of real time information about all flows passing through a router.

The NetFlow feature enables network administrators to analyze the collected information in order to try and identify attack flows, and try to trace them back by following the origin of the flow, at each router.
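The hop-by-hop process can be sketched as a loop over per-router flow records. The record layout (input interface, destination, byte count), the router names and the topology table are all hypothetical; they stand in for whatever flow export and topology data the operator really has, and the sketch assumes access to every router on the path.

```python
from collections import Counter


def busiest_input_interface(flow_records, victim_ip):
    """Return the interface carrying the most bytes towards the victim, or None."""
    totals = Counter()
    for record in flow_records:
        if record["dst"] == victim_ip:
            totals[record["in_if"]] += record["bytes"]
    if not totals:
        return None
    return totals.most_common(1)[0][0]


def trace_back(start_router, victim_ip, flows_by_router, neighbor_on):
    """Follow the flood upstream, one router at a time, starting at the victim's gateway."""
    path = [start_router]
    router = start_router
    while True:
        in_if = busiest_input_interface(flows_by_router.get(router, []), victim_ip)
        if in_if is None:
            break  # no flow data for this router
        upstream = neighbor_on.get((router, in_if))
        if upstream is None or upstream in path:
            break  # reached an access port (or a loop in the data)
        path.append(upstream)
        router = upstream
    return path


victim = "192.0.2.10"
flows_by_router = {
    "R1": [{"in_if": "s0", "dst": victim, "bytes": 9_000_000},
           {"in_if": "s1", "dst": victim, "bytes": 20_000}],
    "R2": [{"in_if": "s3", "dst": victim, "bytes": 8_900_000}],
    "R3": [{"in_if": "e0", "dst": victim, "bytes": 8_900_000}],
}
# Which router sits at the far end of each interface; R3's e0 is an access port.
neighbor_on = {("R1", "s0"): "R2", ("R1", "s1"): "R4", ("R2", "s3"): "R3"}

print(trace_back("R1", victim, flows_by_router, neighbor_on))  # ['R1', 'R2', 'R3']
```

The trace ends at R3, whose busiest input interface has no router behind it; that interface is where the flood enters the network.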

 

 

Tracing attack flows in such a way has a major weak point. In order to trace the attack’s origin all the way back to the attacker, the network administrator has to gain access to all the routers in the flow’s path.

In most cases this would be virtually impossible. The reason for that is that almost any attack would traverse a few separate autonomous systems, each belonging to different providers. The chances for getting remote access to such routers are quite slim.

In order to complete such a trace, the victim would have to manually ask for cooperation from a number of service providers. This process might become quite complicated and time consuming, and in many cases, service providers are not happy to help with such operations.

Another limitation with tracing back in such a way is that the attack flow must be active throughout the backtracking process. If the attack stops, no backtracking is possible anymore.

 

Because of these limitations, in most cases no real tracing is performed, and the origin of the attack stays hidden.

 

2.2 Packet marking

In order to trace back the origin of an attack, routers in the path of the attack flow can generate additional information which gives the destination of the flow (the attack victim) enough data to reconstruct the flow’s path, and eventually find the actual source of the attack.

 

Several methods have been offered which provide the infrastructure for this network feature. None has been commercially deployed.

We will examine two of the suggested algorithms in order to explore the possibilities they offer.

 

Savage, Wetherall, Karlin & Anderson have devised a method called Traceback or edge marking [15].

This method requires routers in the path of a flow to add information into the packets, as they are being forwarded towards the victim. A probability p is defined at all routers. A packet would be marked with additional information using this probability, making sure that for large flows with many packets, we would have enough packets marked by any router in the path.

In order to avoid adding more overhead to the IP traffic, the Identification field in the IP packet is used for carrying the path information (see figure 6). This is a 16-bit field which normally has no real use except with fragmentation, so reusing it can create backward-compatibility issues for fragmented traffic.

The idea is to inject information about edges (or links) in the packet’s path. An edge is composed of the IP addresses of 2 adjacent routers and a distance counter, which would tell the victim the distance to this edge.

Because all the information is longer than 16 bits (we need enough room for two IP addresses and an 8-bit counter, which requires a total of at least 72 bits), the information is compressed and split in such a way that the whole information block is delivered to the victim over a few separate packets.

The compression also adds a hash value used to authenticate the information passed on by the routers, making it hard for the attacker to falsify the information.

 

The whole concept is based on the assumption that when a host is being attacked by a DoS attack that has enough volume to do any harm, there would be enough packets in the flow to enable the victim to collect all the information from all the routers in the path.

From tests done with the model, about 2500 packets are needed in a flow in order to construct a path to an attacker 20 hops away. Almost any worthwhile DoS attack transmits this much traffic in mere seconds.
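
To get a feeling for why a few thousand packets are enough, we can estimate how rarely the farthest router’s mark survives all the way to the victim. The sketch below is a rough, pessimistic estimate and not the model used in [15]: the marking probability of 0.04 is an assumed value, and the splitting of each edge over several packets is ignored, so the result is much smaller than the figure quoted above.

```python
def rarest_mark_probability(p, d):
    """Chance that a packet reaches the victim still carrying the mark
    written by the router d hops away: it is marked there (p) and then
    left alone by the d - 1 routers that follow."""
    return p * (1 - p) ** (d - 1)

def rough_packets_needed(p, d):
    """Coupon-collector style estimate: pretend every one of the d marks is
    as rare as the farthest one, so collecting all d of them takes about
    H_d / q packets, where H_d is the d-th harmonic number."""
    q = rarest_mark_probability(p, d)
    harmonic = sum(1 / k for k in range(1, d + 1))
    return harmonic / q

# Assumed values: marking probability 0.04, attacker 20 hops away.
estimate = rough_packets_needed(p=0.04, d=20)
```

With these assumed values the estimate comes out at roughly two hundred packets for whole, uncompressed edges, which shows that the number of packets needed is small compared to the volume of a flood.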

 

After collecting all the information, the victim host has to analyze all the information it got from the routers, and run the algorithm in order to reconstruct the path back to the attacker. This process would provide the victim’s operator an ordered set of routers through which the attack has flowed. Of course all routers in the path have to support this feature, or else some hops would not appear in the path.
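
The marking and reconstruction steps can be illustrated with a small simulation. This is a simplified sketch under several assumptions: a single attack path, whole edges carried in every packet (no compression into the 16-bit field, no hash), a marking probability of 0.04, and hypothetical router names R1 to R20.

```python
import random

def send_packet(path, p, rng):
    """Forward one packet along `path` (attacker side first) and return the
    (start, end, distance) mark it carries on arrival at the victim."""
    start, end, distance = None, None, 0
    for router in path:
        if rng.random() < p:
            # This router starts a new edge and resets the distance counter.
            start, end, distance = router, None, 0
        else:
            if start is not None and distance == 0:
                # The previous router started an edge; this router closes it.
                end = router
            distance += 1
    return start, end, distance

def reconstruct(marks):
    """Order the collected edges by distance, nearest router first."""
    by_distance = {}
    for start, end, distance in marks:
        if start is not None:          # ignore packets that no router marked
            by_distance[distance] = (start, end)
    path = []
    d = 0
    while d in by_distance:
        path.append(by_distance[d][0])
        d += 1
    return path

rng = random.Random(1)                        # fixed seed, illustration only
true_path = [f"R{i}" for i in range(1, 21)]   # R1 is next to the attacker
marks = [send_packet(true_path, 0.04, rng) for _ in range(5000)]
recovered = reconstruct(marks)                # starts with R20, ends with R1
```

The victim never sees the path directly. It only sees one edge per marked packet, and the distance value is what lets it put the edges in order.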

 

The main advantage of this system is that it is backward compatible with routers that do not support it, and can be deployed in stages. If a router does not support this feature it would just forward the packet unchanged, without breaking the whole process. If enough routers support this feature (especially routers near the edges of the path), traceback would be possible.

 

Another path reconstruction method was offered as an IETF draft [16]. This method defines a new ICMP message which is generated by routers in the path of the flow (ICMP Traceback message). Each router should generate these ICMP messages with a low probability (the draft offers a probability of 1/20,000) for every flow. The ICMP message would include information about the routers one hop away from the originating router at both directions. In such a way, the victim host would receive information about edges in the path of the packets in the flow. Also, some of the original packet would be placed into the ICMP message so that the victim host would be able to correlate all the information.

In order to make the whole system as secure as possible some authentication data is suggested as well.

 

After receiving traceback packets from enough routers in the flow’s path, the path can be reconstructed, and the origin of the flow located.

This information can be used to locate the origin of a DoS attack, but can be used for other implementations as well.

 

Again, this method is based on the assumption that DoS attacks would generate flows with many packets. Such flows would have a high probability for generating traceback messages.

Because normal traffic would not generally have such high volume flows, normal traffic should not trigger many traceback messages, so no real overhead is introduced into the network.
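
The effect of the low probability can be checked with a short calculation. The sketch below uses the 1/20,000 probability mentioned above; the two flow sizes are arbitrary values chosen for illustration.

```python
def chance_of_traceback(packets, p=1 / 20000):
    """Probability that one router emits at least one traceback message
    for a flow of `packets` packets."""
    return 1 - (1 - p) ** packets

def expected_messages(packets, p=1 / 20000):
    """Average number of traceback messages one router emits for the flow."""
    return packets * p

small_flow = chance_of_traceback(100)        # e.g. a short, normal transfer
flood = chance_of_traceback(1_000_000)       # e.g. a sustained flood
```

A flow of a hundred packets will almost never trigger a message, while a flood of a million packets will almost certainly trigger one at every supporting router, at an average cost of about 50 extra packets per router.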

 

As it appears, locating the source of DoS attacks is theoretically feasible, but has not been implemented yet in any commercial system. The main difficulty with such implementations is that a large number of vendors would have to support these features in order to deploy such services in the Internet.

Another problem is that the additional processing required might burden core routers on the Internet, which have to forward millions of packets per second. Such traffic volumes might require high CPU processing overheads in order to implement the above features.

 

2.3 Pushback

 

Another active mechanism that has been offered to allow automatic filtering of DoS attacks is pushback. Pushback does not offer a real solution for finding the source of a DoS attack, but lays an infrastructure for blocking such attacks as close as possible to its source or sources.

Pushback has been offered by Ratul Mahajan, Steven M. Bellovin, Sally Floyd, John Ioannidis, Vern Paxson, and Scott Shenker as a method of detecting and controlling DoS traffic on the Internet [17].

 

Pushback is divided into 3 main functions. The first stage in the process is to identify which flows are generating most of the traffic, and try to classify these flows into some manageable groups.

The next stage, after classification has been made, is to start rate limiting the offending flows on the local router, while monitoring the effect of the rate limit locally.

The last stage is the actual pushback mechanism. The local router would advertise the information about the offending flows to the upstream routers, requesting them to rate limit the offending flows closer to their origin.

 

This algorithm can be activated recursively, propagating through the upstream links, till it reaches the routers closest to the source or sources of the DoS attack, effectively blocking the attack (see figure 9).

Figure 9 - Pushback

Detection of offending flows is a critical stage in the whole process, because we have to be able to classify traffic correctly, or else we might rate limit legitimate traffic.

The detection process is based on trying to identify flows that take a considerable amount of bandwidth compared to the rest of the normal traffic. This process includes looking for source or destination IP ranges which create too much traffic compared to the other ranges.

The detection process has to take into account that not all high-rate flows are DoS attacks. Some might be caused by flash crowds or under-provisioned networks, where congestion is caused by not enough allocated resources.

The result of the detection process should produce an aggregate which is some definition of the offending flows. Usually it would include source and destination IP addresses, and may include protocol and port numbers.
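
A very simple form of this detection can be sketched as follows. It is not the detection algorithm of [17]: the grouping by destination /24 prefix, the 50% share threshold and the sample traffic are all assumptions made for illustration.

```python
import ipaddress
from collections import Counter

def find_offending_aggregates(packets, prefix_len=24, share=0.5):
    """Group traffic by destination prefix and return the prefixes that
    account for more than `share` of all bytes seen.

    `packets` is an iterable of (destination IP string, byte count).
    """
    per_prefix = Counter()
    total = 0
    for dst, nbytes in packets:
        net = ipaddress.ip_network(f"{dst}/{prefix_len}", strict=False)
        per_prefix[net] += nbytes
        total += nbytes
    if total == 0:
        return []
    return [net for net, nbytes in per_prefix.items() if nbytes / total > share]

sample = [
    ("203.0.113.10", 9000), ("203.0.113.11", 8000),
    ("198.51.100.7", 1500), ("192.0.2.33", 1200),
]
offenders = find_offending_aggregates(sample)
```

A fixed share threshold like this cannot tell a flash crowd from an attack, which is exactly the classification problem described above.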

 

After an offending aggregate has been defined, the local router can begin with rate-limiting it locally using some rate-limit policy (ACC - Aggregate-based Congestion Control). The local ACC process enables the router to examine the decision that has been made to rate limit the flow, and check if it has any effect on the overall congestion of the links. This can be done by examining the drop rate on the output queue that has been assigned for the rate limiter.

The policy of the rate limiter has to be set by the router to such a level that the offending flow would not create congestion on the downstream links. This policy has to be created dynamically, by monitoring the traffic, and adjusting the rate-limiter according to congestion.

 

After it has been established that a certain flow or aggregate is generating malicious traffic, or the local router is overloaded by this flow, pushback can be used in order to contain the attack in the upstream routers, closer to the origin of the flow.

The local router has to try and estimate how much of the offending aggregate is arriving on each upstream link. On each such link the router would send a pushback message to the upstream router, telling it which flows have to be rate-limited.

The upstream router should examine the messages, and create a local rate limiting policy on the output queues, blocking the requested flow.
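
The pushback step can be sketched as below. The proportional split of the limit between the upstream links is a simplifying assumption and not necessarily the policy defined in [17]; the message fields, link names and rates are hypothetical as well.

```python
def split_limit(limit_bps, arrival_bps_by_link):
    """Divide `limit_bps` among upstream links in proportion to how much of
    the offending aggregate each link currently delivers."""
    total = sum(arrival_bps_by_link.values())
    if total == 0:
        return {}
    return {
        link: limit_bps * rate / total
        for link, rate in arrival_bps_by_link.items()
        if rate > 0
    }

def build_pushback_messages(aggregate, limit_bps, arrival_bps_by_link):
    """Build one hypothetical pushback request per contributing upstream link."""
    shares = split_limit(limit_bps, arrival_bps_by_link)
    return [
        {"to": link, "aggregate": aggregate, "limit_bps": share}
        for link, share in shares.items()
    ]

messages = build_pushback_messages(
    aggregate="dst 203.0.113.0/24",
    limit_bps=10_000_000,
    arrival_bps_by_link={
        "upstream-A": 60_000_000,
        "upstream-B": 20_000_000,
        "upstream-C": 0,          # delivers none of the aggregate
    },
)
```

Links that do not carry any of the offending aggregate receive no message, so routers that are not part of the attack path are not involved.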

 

The whole process can be repeated recursively. If the upstream router, receiving the pushback message, cannot contain the flow by itself and is still being overloaded by the flow, it can repeat the pushback stage, locating the upstream routers which are forwarding the offending flow, and send new pushback messages to these upstream routers.

Using this algorithm the policy would propagate upwards in the network topology tree, where the root of the tree is the origin of the attack. The process should stop at the first router which is able to contain the flow by itself, or, in the worst case, at the next-hop router of the origin of the attack.

We have to bear in mind that if we experience a DDoS attack, it would have multiple sources, all aggregating to a single point, at the victim. In such cases, it is reasonable that the attack would not be blocked at the real attack source, but closer to the victim.

 

After the attack has been blocked on the upstream routers, the downstream routers still have to get some feedback, so that they would be able to detect when the attack stops. Because the downstream routers stop receiving the offending data, or at least its rate would be low enough so it would not be classified as an attack anymore, we would need some feedback mechanism. This can be accomplished by sending periodic feedback updates from upstream routers to the downstream routers which have triggered the pushback. The updates can include information about the drop statistics of the rate limiting queues, providing information about the existence of the attack.

If drop rates drop below a certain level, downstream routers should send updated pushback messages, changing the rate limiting policy gradually, in order to make sure that the attack has really stopped.

 

Pushback has not been deployed on the Internet because it has some implementation complexities. Pushback is based on the ability of routing equipment to create rate limit policy on a relatively large number of flows or aggregates. It might even create a new opportunity for attackers to exploit pushback, and force it, using a new attack type, to create numerous rate limiting policies. Routing equipment, especially larger routers placed at the core of the Internet cannot cope with such complex quality of service policies, and cannot support this requirement for large scale deployments.

Another major issue with pushback is that upstream routers must trust the downstream router’s pushback updates. If an attacker gains access to a downstream router, it can be used to propagate false pushback messages blocking valid traffic, again, creating a new type of attack.

Such trust between routers may prove difficult when crossing autonomous system boundaries, limiting pushback deployment, and making it less effective.

 

Detection of attack flows or aggregates is another challenge because there are numerous types of attacks, and many more are being developed all the time. Many detection schemes might generate many false positive alerts, blocking valid user traffic. Other detection algorithms may miss actual DoS attacks. There is no one foolproof algorithm for detecting DoS attacks automatically.

 

3. What can a router do for us?

 

As we have seen, DoS attacks are a reality on the Internet. If we examine the Internet’s components, we would find that the most common components (maybe except computers…) are routers. Router devices are the real core of the Internet, passing IP traffic all over it.

Routers, being critical components for the normal operation of the Internet, are also a critical component for DoS attacks. All DoS attacks pass through at least a few routers, while the packets travel from the source of that attack towards the victim. Recognizing this fact, we can try and use the routers as a tool against DoS attacks by carefully using existing tools inside the core and edge routers constructing the Internet.

 

In this part we would explore a few functions a router can perform in the fight against DoS attacks.

 

3.1 Ingress filtering and good policy

Any network connected to the Internet, has to use some kind of routing equipment in order to forward packets from and to the Internet. This router should be able to provide basic services such as filtering of traffic according to a pre-defined policy [20].

If we examine the traffic normally passing through the Internet, we can map a large set of rules that would describe types of traffic that is not expected to be routed on the Internet, and filtering it would not affect the normal user applications. This can be done by analyzing the applications that are being used on the Internet and the kind of traffic they require.

Another way of creating the filtering policy is to examine the kinds of traffic used in DoS and DDoS attacks, and try to generate some filtering rules, in order to explicitly block this kind of traffic.

 

After constructing such a set of rules, we can activate this policy on the ingress of all networks connected to the Internet, and by that blocking as much malicious traffic as possible from being propagated through the Internet.

This policy is called Ingress Filtering, and is suggested in RFC 2827 [18].

 

Examining the applications used by users on the Internet, we can find a set of rules which would describe traffic that is not expected to be received on the edge of any network.

Such traffic includes the following rules:

Rob Thomas has compiled a Cisco IOS configuration template which helps network administrators to configure Cisco routers in a secure way [23]. One of the critical things this template contains is a pre-configured access list which should be installed on any ingress port of any router. This access list contains all the bogus, unallocated and special IP addresses, and provides a good ingress filtering policy.

 

In many cases we can predict the types of traffic we expect on our network. If this prediction is possible, other types of traffic can be filtered without fear of dropping normal traffic. This policy is usually relevant for hosting providers, enterprises or other networks which know exactly what applications are going to be used on their network. If this distinction is possible, all other types of traffic can be blocked at the network’s edge.
This kind of filtering can reduce the risk of many types of DoS attacks, such as reflector attacks, fragment flooding etc.

 

Another aspect of ingress filtering is a policy which should be used by Internet Service Providers at their access layer, where clients connect. The idea is that any client connected to the ISP’s network has been assigned an IP address (or an IP address range). The ISP can configure an access list, filtering all traffic which is using other IP addresses as source, and which is being sent by the client to the ISP. This policy is quite logical, because ISPs would normally not expect clients to use spoofed addresses, and there is no real need in doing so.

Such filtering, if deployed widely enough would enable ISPs to filter all DoS attacks which require spoofing IP addresses. Such filtering on Remote Access devices for dialup access or on Broadband access networks would make it harder for large zombie networks to launch many kinds of attacks.

Cisco has provided a special feature, enabling network administrators to configure such filters easily, by entering the “ip verify unicast reverse-path” command at any edge interface [21]. The router would use its normal unicast routing table, executing reverse path queries, and make sure that all source IP addresses being used by a client, are really routed back to him. If a packet is found with a source IP address which is not routed to the interface it came over, it would be dropped.

The main problem with this feature arises when clients use asymmetric routing for load balancing or redundancy, which may cause legitimate packets to be dropped. This problem has been addressed by Cisco with some enhancements to the RPF feature, enabling less restrictive filters to be activated on the router.
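
The strict form of the reverse path check can be sketched as follows. This only illustrates the idea and is not Cisco’s implementation; the routing table, interface names and addresses are hypothetical.

```python
import ipaddress

# Hypothetical routing table: prefix -> interface the prefix is routed to.
ROUTES = {
    ipaddress.ip_network("198.51.100.0/24"): "Serial0/0",   # customer A
    ipaddress.ip_network("203.0.113.0/25"): "Serial0/1",    # customer B
    ipaddress.ip_network("0.0.0.0/0"): "POS1/0",            # upstream
}

def lookup(address):
    """Longest-prefix match: return the interface `address` is routed to."""
    addr = ipaddress.ip_address(address)
    matches = [net for net in ROUTES if addr in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return ROUTES[best]

def strict_rpf_accepts(source, arrival_interface):
    """Accept a packet only if its source address is routed back out of the
    interface the packet arrived on."""
    return lookup(source) == arrival_interface

genuine = strict_rpf_accepts("198.51.100.9", "Serial0/0")   # customer's own address
spoofed = strict_rpf_accepts("10.9.9.9", "Serial0/0")       # address routed elsewhere
```

The same check also shows the asymmetric routing problem: a legitimate packet that arrives on an interface other than the one its source is routed to fails the comparison and is dropped.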

 

Using ingress filtering on the edge of a network will not block all kinds of DoS attacks, but may provide a good policy which can in some cases reduce the amount of traffic generated by DoS attacks.

Ingress filtering is one of the best ways of reducing spoofed traffic flowing through the Internet. If service providers and network administrators would follow this policy, and deploy such filtering schemes, many kinds of attacks which rely on the ability to spoof IP addresses would become obsolete.

 

 

3.2 Null routing

One of the most effective ways of blocking DoS and DDoS attacks is by blocking them on core and edge routers, as close as possible to the origin of the attack.

When an attack is found, we would like to be able to filter all its traffic, so that downstream routers would be able to operate normally.

 

Examining the tools available on current routers, we can see that almost all routers offer some kind of filtering tool based on a set of rules, such as access lists.

Using these tools may provide a good solution for low-volume DoS attacks, but may become ineffective when the volume of an attack is very high. Usually such filtering policies are implemented in software, and require that all traffic would traverse a relatively slow and CPU intensive path inside the router. When dealing with high-volume attacks this slow forwarding path may present a real burden for the router, and even might totally crash it.

 

In order to avoid this problem, we would like to block such attacks without requiring any special treatment by the router of the attack’s traffic. This can be accomplished by modifying the forwarding table of the router, which is being used for normal packet switching. Because the packet forwarding process is highly optimized in the router, we would not add any overhead by applying a blocking policy.

The way we change the routing table is crucial for blocking an attack. Many routers include some kind of a bit-bucket interface (such as the Null0 interface on Cisco routers). All packets which are destined to this interface would be dropped by the router.

 

For example, if we enter the following command on a Cisco router, all packets sent to the 192.168.1.0 prefix would not be routed:

            router(config)#ip route 192.168.1.0 255.255.255.0 null 0

 

The problem with this blocking method is that it operates on the network layer, and would not allow the network administrator to block packets based on TCP port numbers or other transport or application layer information. This may cause a problem if a web server is being flooded with UDP packets. We would have to block all traffic to it, including packets with HTTP traffic, effectively blocking the attack, but also blocking the service itself.

Another issue with this blocking scheme is that it can be used to block traffic only based on destination IP addresses, because routers perform routing based only on this field.

Some extensions do provide the same ability for filtering based on source IP addresses, but they are less commonly used [21].

 

This blocking method has been made even more useful by a solution devised by UUNet engineers, and which is being used by UUNet on their North American network (AS701). This method is described by Barry Raveendran Greene at [24].

This solution provides a way for network administrators to create null routing policy on many routers (such as the whole service provider’s network) using a single central node.

This is done by advertising the prefix we wish to block within the network using the BGP routing protocol, while setting the next-hop attribute of the prefix to a pre-defined ip address. This ip address should be permanently null routed (as we have shown in the previous example) on all routers in the network.

When a router receives the BGP information, it performs a recursive lookup for the next-hop IP address, which is nailed to the local null interface. This configuration installs a null route for the desired prefix in the router’s forwarding table.
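
The recursive lookup can be illustrated with a small model of the tables involved. This is a sketch only: the discard address 192.0.2.1, the prefixes and the interface names are assumptions chosen for the example and are not taken from [24].

```python
import ipaddress

DISCARD_NEXT_HOP = "192.0.2.1"   # assumed pre-defined address, illustration only

# Static route present on every router: the discard address points at Null0.
static_routes = {ipaddress.ip_network("192.0.2.1/32"): "Null0"}

# Interfaces through which ordinary next hops are reached.
connected = {ipaddress.ip_network("10.0.0.0/30"): "Serial0/0"}

# Routes learned over BGP: prefix -> next-hop IP address.
bgp_routes = {
    ipaddress.ip_network("198.51.100.0/24"): "10.0.0.2",        # normal route
    ipaddress.ip_network("203.0.113.10/32"): DISCARD_NEXT_HOP,  # advertised victim
}

def longest_match(table, address):
    """Return the most specific prefix in `table` containing `address`."""
    addr = ipaddress.ip_address(address)
    matches = [net for net in table if addr in net]
    return max(matches, key=lambda net: net.prefixlen) if matches else None

def outgoing_interface(destination):
    """Resolve a destination to an interface by following the BGP next hop."""
    bgp_net = longest_match(bgp_routes, destination)
    if bgp_net is None:
        return None
    next_hop = bgp_routes[bgp_net]
    for table in (static_routes, connected):
        net = longest_match(table, next_hop)
        if net is not None:
            return table[net]
    return None
```

Only the BGP advertisement changes when a new prefix has to be blocked. The static route to the null interface is configured once, in advance, on every router.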

 

This method can even span more than a single autonomous system, by allowing advertisement of such prefixes using eBGP routing protocol to neighboring ASs. By marking the prefix with a pre-defined BGP community attribute, the other AS can repeat the same process, matching this community value on all routers.

If this is implemented, a downstream autonomous system can detect a DoS attack against one of its hosts, and automatically pass the information to the upstream provider, blocking the attack as close as possible to the attack’s source. Of course this would require some level of trust between the two neighboring ASs.

This way of operation is possible today for peers peering with UUNet’s AS701, by using the 701:9999 BGP community.

 

4. Why firewalls are not enough

It is a general perception that firewalls offer a good defense against all the threats the Internet presents. Many network administrators, if asked about the defenses they use for their Internet connection, would answer that their firewall protects them from everything, and that their network is safe.

 

In order to understand the relation of firewalls and DoS attacks, we have first to examine what a firewall does.

Most modern firewalls are network elements designed to enforce some security policy on a network path. A security policy would usually contain a set of static rules classifying traffic into 2 groups – allowed traffic, which is forwarded to the other side of the firewall, and forbidden traffic which is dropped.

 

Most firewalls have evolved beyond the simple stateless rules machine, and provide more sophisticated stateful inspection capabilities for traffic passing through the firewall. Such features enable firewalls to follow certain types of traffic patterns in order to decide if the traffic should be dropped or not. Many firewalls are aware of the application layer of the sessions flowing through it, enabling even more complex policies.

 

Because firewalls are such complex machines, many firewalls have to deal with performance issues. In order to be able to keep state for many sessions in real time, and still be able to enforce such complex policies, firewalls would have to be built with high processing power, or else be limited to lower bandwidths or a low number of concurrent sessions.

Because of this limitation, most firewalls are installed on the edge of smaller links, such as enterprise connections to the Internet, or at least as close as possible to the host being protected, in order to distribute the protection effort as much as possible between many firewalls. It is quite rare to find firewalls deployed at the core of service providers’ networks.

 

As we have seen, many DoS attacks are based on the flooding concept, where an attacker floods the victim with as many packets as possible in order to overwhelm it. It is quite obvious that firewalls have no chance of helping with defeating such attacks. If an organization has deployed a firewall at the edge of their T1 connection to the Internet, and a flood is overflowing this link, no algorithm, however complex would be able to help with restoring service.

 

Another issue with firewalls is that they present a target for new attack types. Because most firewalls rely on standard operating systems for basic services, they inherit some of the original weaknesses of the OS. Also, firewalls need to examine all the traffic flowing through them in order to make correct decisions about what action to take with each packet.
If we combine both factors, we can try and devise attacks that can hurt the firewall, and effectively create a DoS attack on the protected services.

For example, some software firewalls rely on the OS to provide IP fragmentation services to the firewall software. The OS would collect all fragments of a single packet, assemble the original packet, and pass it to the firewall for examination.

Because the firewall has to see the whole packet, it would wait for all fragments to arrive.

If an attacker would send a large number of small fragments, each belonging to a new packet, the firewall would allocate resources to the process trying to assemble the packets. Because we would never complete any packet, but create many false fragments, such an attack could exhaust the firewall’s CPU and memory resources.

 

Many firewalls are maintained in order to protect networks with many hosts. While examining the way DDoS attacks are created, we have seen that many such attacks are generated by many zombie hosts. Firewall policy has a very strong connection to the number of hosts that can potentially become zombies.

The most common way to configure a firewall is “block all incoming sessions, allow all outgoing sessions”. Usually this policy would have some holes in it, such as allowing email sessions to enter the network. If such an enterprise, with many hosts, would be infected by an email-carried worm – all its hosts may become one large zombie army. Because the firewall allows all outgoing connections, the attack would be allowed to get out, and the firewall has failed in protecting the network behind it.

 

Even though firewalls are not a foolproof solution against DoS attacks, they do offer some protection against some attacks.

Firewalls can offer limited protection against SYN flood attacks, by monitoring the TCP traffic flowing through the firewall, and blocking illegal requests. Another advantage a firewall can provide is filtering of many types of traffic we do not expect on our network, and which may be used as tools for DoS attacks (such as ICMP traffic).

 


5. Modern attack filtering techniques

Because DoS attacks are still a real threat to many services provided on the Internet, it is still desirable to find a foolproof solution for blocking or avoiding DoS attacks, while being able to keep the service up.

 

Many companies work on solutions for the DoS problem. All solutions offer some kind of protection, at least against some of the attack types. If we examine some of the solutions, we would find that they provide the same basic services, although they may be implemented in a different way, and provide different levels and quality of protection.

 

The basic services these solutions provide are DoS attack detection, and DoS attack blocking or filtering.

All devices offer some kind of DoS detection and classification. Detection is often performed by tapping into the traffic flow and performing some analysis of the traffic. Many algorithms exist for detecting DoS attacks, most of them are proprietary, and are kept as a commercial secret. Many IDS (Intrusion Detection System) devices offer only the detection service.

Generally speaking, all detection processes can be categorized into 2 different groups:

The first group includes algorithms that perform some statistical analysis of the traffic, looking for flows that stand out. Such flows may have a very high packets-per-second rate, show some strange packet distribution or any other non-standard traffic pattern. Such statistical models have an advantage of being able to perform correctly without really knowing what to look for, or what the expected traffic pattern is.

The weak point of such algorithms is that they may miss low-profile DoS attacks which, as we have seen, may actually do some real damage in some cases.
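
A minimal statistical detector can be sketched as follows. It is not any vendor’s algorithm: comparing each flow to a multiple of the median rate, the factor of 20 and the sample rates are all assumptions made for illustration.

```python
from statistics import median

def flag_outlier_flows(pps_by_flow, factor=20):
    """Return flows whose packets-per-second rate exceeds `factor` times the
    median rate of all observed flows."""
    if not pps_by_flow:
        return []
    threshold = factor * median(pps_by_flow.values())
    return sorted(flow for flow, pps in pps_by_flow.items() if pps > threshold)

observed = {
    "10.0.0.5 -> web": 40,
    "10.0.0.6 -> web": 55,
    "10.0.0.7 -> web": 35,
    "10.0.0.8 -> web": 60,
    "172.16.9.9 -> web": 45_000,   # stands out by orders of magnitude
}
suspects = flag_outlier_flows(observed)
```

The detector needs no knowledge of what an attack looks like, but a flow sending a few hundred packets per second would pass below the threshold, which is the weak point described above.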

 

The second group of detection algorithms is based on the assumption that we can predict what a DoS attack flow would look like. The system would have a pre-programmed DoS signature database, and it would continuously try to match these signatures with the actual traffic flows found in the system.

The main problem with using such algorithms is that they would not recognize any traffic pattern that they have not been pre-programmed to detect. If a new type of DoS attack is used, chances are that it would not be detected. Updating the system would usually require some manual upgrade procedure, which has to wait for the vendor to produce an updated signature database.

On the other hand, this kind of a detection system would usually detect even low-profile attacks, as long as they conform to one of the known attack signatures.
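
Signature matching can be sketched in a few lines. The two signatures below are deliberately simplified and hypothetical, as is the packet summary format; real signature databases are far larger and more detailed.

```python
# Each hypothetical signature is a name plus a test over a packet summary.
SIGNATURES = [
    # Source and destination address are identical.
    ("same-source-and-destination", lambda pkt: pkt["src"] == pkt["dst"]),
    # An ICMP packet larger than an IP packet can legally be.
    ("oversized-icmp", lambda pkt: pkt["proto"] == "icmp" and pkt["length"] > 65_535),
]

def match_signatures(pkt):
    """Return the names of all signatures the packet summary matches."""
    return [name for name, test in SIGNATURES if test(pkt)]

known = match_signatures(
    {"src": "203.0.113.10", "dst": "203.0.113.10", "proto": "tcp", "length": 40}
)
novel = match_signatures(
    {"src": "10.1.1.1", "dst": "203.0.113.10", "proto": "udp", "length": 512}
)
```

A single packet is enough to match a known signature, regardless of volume, while traffic that fits no entry in the database passes unnoticed.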

 

Many systems today employ some kind of a hybrid between both systems, allowing for a much better and safer detection process, which would work for almost any kind of attack, even if its signature is not yet known.

 

After an attack has been detected, we would like to try and filter it, or at least provide some kind of a solution in order to keep services up and running.

 

Many vendors offer products which provide some kind of protection against DoS attacks.

All products rely on the assumption that bandwidth is not the issue, and that the links between the victim and the Internet are not congested. Naturally, if the link is congested, there is no real filtering scheme that can help, because we would have many dropped packets even before we can do anything about this.

In order to make sure this assumption is true some vendors build their product so it would fit into the ISP’s network, where bandwidth should be plenty.

 

The first thing we have to accomplish in order to be able to filter DoS attacks is to be able to receive the attack’s packets, and be able to make forwarding or blocking decisions.

This can be accomplished in 2 different ways.

One solution to this requirement is to place the filtering device on the main path of the traffic. While it solves the problem, this approach has some weak points. First of all, any device placed on the critical path of data might become a new point of failure. This issue creates a real problem when placing equipment on the core networks of service providers, and adds some complexity to the devices by requiring support of high availability solutions. Another issue with devices placed on the critical data path in large service providers’ networks is the amount of data they have to analyze. Because all the traffic has to pass through the device, it has to support high throughputs so it would not create a new bottleneck in the core network. This problem is often solved by placing the filtering device closer to the victim, where we would expect to have less traffic. This approach may bring us back to the problem that our uplink connection is already too overloaded to be able to filter the attack.

 

The other approach is to take the filtering device out of the critical path of data, and create a mechanism to divert traffic to it only when filtering is required. This approach solves both problems we have seen earlier. Data does not flow through the device at normal times (when there is no active DoS attack). In such a way the device does not present any new points of failure in the network, and its throughput capabilities are not an issue for the core network’s designer.

In order to be able to filter DoS attack traffic, we have to be able to divert the traffic to the filtering device when an attack is detected. It would be desirable to redirect only the interesting packets to the device, and keep the normal traffic flowing in the normal path (see figure 10).

 

Figure 10 – Traffic redirection

 

We would present 2 techniques to a