SEMINAR REPORT
ON
BILLY GOAT SYSTEM
BY
VENKATESH RAJU
T.E. Information Technology
Roll No. 352
2006-2007
GUIDED BY
Prof Ms S.B.BALRAWAT
BRACT’S
Vishwakarma Institute of Information Technology, Pune-48
Department of Information Technology, Survey No. 2/3/4
Kondhwa (BK), Pune-411048
CERTIFICATE
This is to certify that Mr. Raju Venkatesh Satyanarayan has successfully completed his Seminar on Billy Goat System in partial fulfillment of third year of degree course in Information Technology in the academic year 2006-2007.
Date: /02/2007
Prof. N.P. Pathak, Head, Information Technology, VIIT, Pune
Prof. Ms S.B. Balrawat, Seminar Guide, VIIT, Pune
Prof.Dr.A.S. Tavildar
Principal
VIIT, Pune
ACKNOWLEDGMENT
I feel great pleasure in submitting this seminar report on “BILLY GOAT SYSTEM”. I wish to express my true sense of gratitude towards my seminar guide, Prof Ms S.B.Balrawat, who at every discrete step in the study of this seminar contributed her valuable guidance and helped to solve every problem that arose.
I would wish to thank our H.O.D. Prof. Mr. Pathak for opening the doors of the department towards the realization of the seminar report.
Most importantly, I would like to express my sincere gratitude towards my family for always being there when I needed them the most. With all respect and gratitude, I would like to thank all the people who have helped me directly or indirectly. I owe all my success to them.
Venkatesh.S.Raju Roll No: -352
T.E (I.T) VIIT, Pune
ABSTRACT
BILLY GOAT SYSTEM
What is Billy Goat System?
Billy Goat is a sensor system built for the specific purpose of detecting and identifying network service worms. Because of this narrow focus, Billy Goat can exploit worm-specific properties, which a general-purpose intrusion-detection system cannot rely on, to achieve more efficient and accurate detection.
What are the characteristics of Billy Goat System?
The requirements on a worm-detection system (WDS) influence its desired characteristics, particularly in the following aspects:
1. Accuracy: The goal of a WDS is the identification of worm-infected machines. To offer real utility, it must be able to perform this task with a high level of accuracy, so that its reports can be trusted by system and network administrators as the basis for containment and remediation actions. A WDS can use highly specialized techniques to detect worm-infected machines. This enables increased accuracy, at the expense of the ability to detect a wider range of attacks.
2. Speed: Given the explosive nature of modern worms, a WDS should be able to detect an infected machine as quickly as possible, to provide its users a chance to contain the damage, or even to function as the basis for an automated response system.
3. Manageability: New worms and worm variants appear almost every day, so the components of a WDS need to be updated regularly. At a systems level, this process must be automated as much as possible, to be able to deal with the monitoring of very large networks. At middleware and architecture levels, this means the base infrastructure must offer sufficient flexibility to enable the rapid creation of new detection capabilities.
4. Interoperability: Many organizations suffer from the proliferation of security tools, each with their own control, monitoring and reporting mechanisms. Furthermore, many places already have some form of monitoring console, virus-response policies and procedures, etc. A WDS should integrate as much as possible with the existing tools and processes.
5. Resilience: A WDS must operate under extreme conditions in terms of network and processing load, particularly during worm outbreaks. These conditions are more likely to induce failures than other environments. However, a WDS has a specific advantage that is not enjoyed by most other IDSs: because of the repetitive nature of worm activity, the WDS can afford to lose some data without reducing its utility. In practice, this means it is satisfactory to build a system that can “forcefully” recover from failure (for example, by automatically rebooting or even reinstalling itself) rather than trying to resist it.
6. Graceful degradation: While WDS’s may benefit from a distributed architecture, most worm outbreaks have the effect of overloading network links. It is therefore necessary for all sensors to be able to operate on their own (for example, reporting only local data). Given this condition, while the global system may be impeded, its individual sensors can still be useful during a worm outbreak.
A Final Word
Billy Goat has been designed to be scalable, to operate gracefully in a large distributed environment, and to provide extremely accurate detection of worm-infected machines. This paper describes a number of interesting or useful techniques and components identified during the process of developing “Billy Goat”.
TABLE OF CONTENTS
1. INTRODUCTION
1.1 Typical Worm Spreading Logic
2. BILLY GOAT ARCHITECTURE
2.1 High Level Characteristics
2.2 Basic Architecture and Implementation
3. ENGINEERING DECISIONS AND IMPLEMENTATIONS
3.1 Database Tables
3.2 Feigning Servers
3.3 Address Virtualization
4. BILLY GOAT DEPLOYMENT
4.1 Modes of Deployment
4.2 Data Centralization Mechanism
5. DATA ANALYSIS
5.1 Alarm Redistribution
6. BILLY GOAT-EFFECTS AND EXTENSIONS
6.1 Focus on attacker-centric monitoring
6.2 Environmental Effects
6.3 Pattern Identification
7. FUTURE WORKS
CONCLUSION
FREQUENTLY ASKED QUESTIONS
REFERENCES
CHAPTER 1:
INTRODUCTION
Recent years have brought a continued increase in both the importance of security in networked systems and the difficulty of securing them. The Internet has continued to expand, its connections have become nearly pervasive, and its protocols and services have grown more complex. Beyond the basic need for integrity, confidentiality and privacy, security has become essential to providing reliability, safety, and freedom from liability. One of the greatest threats to security has come from automatic self-propagating attacks. These attacks include both viruses and worms. While these attacks are by no means new, the damage that they can inflict and the speed with which they propagate have become major concerns. Further increases in connectivity and complexity only threaten to increase their virulence.
A computer worm is a self-replicating computer program, similar to a computer virus. A virus attaches itself to, and becomes part of, another executable program; a worm, however, is self-contained and does not need to be part of another program to propagate itself. Worms are often designed to exploit the file transmission capabilities found on many computers. The main difference between a computer virus and a worm is that a virus cannot propagate by itself, whereas a worm can. A worm uses a network to send copies of itself to other systems and does so without any intervention. In general, worms harm the network and consume bandwidth, whereas viruses infect or corrupt files on a targeted computer. Viruses generally do not affect network performance, as their malicious activities are mostly confined within the target computer itself.
In addition to replication, a worm may be designed to do any number of things, such as delete files on a host system or send documents via e-mail. More recent worms may be multi-headed and carry other executables as a payload. However, even in the absence of such a payload, a worm can wreak havoc just with the network traffic generated by its reproduction. Mydoom, for example, caused a noticeable worldwide Internet slowdown at the peak of its spread. A common payload is for a worm to install a backdoor in the infected computer, as was done by Sobig and Mydoom. These zombie computers are used by spam senders for sending junk email or to cloak their website's address. Spammers are thought to pay for the creation of such worms, and worm writers have been caught selling lists of IP addresses of infected machines. Others try to blackmail companies with threatened DoS attacks. The backdoors can also be exploited by other worms, such as Doomjuice, which spreads using the backdoor opened by Mydoom.
1.1 Typical worm spreading logic:
Most worms use random IP address generation to spread to different computers. A worm that targets web servers sends its code as an HTTP request to the target computer and, depending on the specific worm, exploits a known vulnerability in it. For example, the CodeRed worm sends an HTTP request that exploits a buffer-overflow vulnerability, which allows the worm to run on that computer. The malicious code is not saved as a file, but is inserted into memory and run directly from there. There is no single intrusion strategy shared by all worms. One typical strategy, used by the W32-Blaster worm, is given below.
It generates an IP address and attempts to infect the computer that has that address. The IP address is generated according to the following algorithms:
· With a probability of 40%, the generated IP address is of the form A.B.C.0, where A and B are equal to the first two parts of the infected computer's IP address. Once the IP address is calculated, the worm attempts to find and exploit a computer with the IP address A.B.C.0. The worm then increments the last part of the IP address by 1, attempting to find and exploit other computers at each new address, until it reaches 254.
· With a probability of 60%, the generated IP address is completely random.
· To avoid looping back to infect the source computer, the worm will not make HTTP requests to the IP addresses 127.*.*.*.
· Some fixed characteristics of the TCP and IP headers are:
1. IP identification = 256
2. Time to Live = 128
3. Source IP address = a.b.x.y, where a.b are taken from the host IP address and x.y are random. In some cases, a.b is random.
4. Destination IP address = DNS resolution of "windowsupdate.com"
5. TCP Source port is between 1000 and 1999
6. TCP Destination port = 80
7. TCP Sequence number always has the two low bytes set to 0; the 2 high bytes are random.
8. TCP Window size = 16384
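The address selection described above can be illustrated with a minimal Python sketch. It is not the worm's actual code. The function name `scan_targets` and the parameter `local_probability` are hypothetical, and two points are assumptions because the text does not specify them: the third octet C is taken from the infected host's own address, and the random branch yields a single address.

```python
import random


def scan_targets(host_ip, rng, local_probability=0.4):
    """Return the list of addresses probed in one round of scanning.

    Assumption: in the local branch, C is copied from the host address.
    """
    a, b, c, _ = (int(part) for part in host_ip.split("."))
    if rng.random() < local_probability:
        # Local branch: start at A.B.C.0 and increment the last part up to 254.
        return [f"{a}.{b}.{c}.{last}" for last in range(0, 255)]
    # Random branch: a completely random address, skipping loopback 127.*.*.*
    while True:
        octets = [rng.randint(0, 255) for _ in range(4)]
        if octets[0] != 127:
            return [".".join(str(o) for o in octets)]
```

Because a large share of the probed addresses are chosen without any knowledge of which hosts exist, many probes land on unused addresses, which is the property Billy Goat relies on.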
CHAPTER 2:
BILLY GOAT ARCHITECTURE
Billy Goat is a sensor system built for the specific purpose of detecting and identifying network service worms. Because of this narrow focus, Billy Goat can exploit worm-specific properties, which a general-purpose intrusion-detection system cannot rely on, to achieve more efficient and accurate detection.
2.1. High Level Characteristics –
The requirements on a worm-detection system (WDS) are different from those of a general-purpose network-based intrusion-detection system (IDS). While the latter needs to detect a wide and unpredictable variety of attacks, the former can focus on specific propagation and attack strategies used by worms. Moreover, the main purpose of a WDS is to detect infected machines in the network, whereas a general-purpose IDS must detect the attacks themselves.
These requirements influence the desired characteristics of such a system, particularly in the following aspects:
1. Accuracy: The goal of a WDS is the identification of worm-infected machines. To offer real utility, it must be able to perform this task with a high level of accuracy, so that its reports can be trusted by system and network administrators as the basis for containment and remediation actions. A WDS can use highly specialized techniques to detect worm-infected machines. This enables increased accuracy, at the expense of the ability to detect a wider range of attacks.
2. Speed: Given the explosive nature of modern worms, a WDS should be able to detect an infected machine as quickly as possible, to provide its users a chance to contain the damage, or even to function as the basis for an automated response system.
3. Manageability: New worms and worm variants appear almost every day, so the components of a WDS need to be updated regularly. At a systems level, this process must be automated as much as possible, to be able to deal with the monitoring of very large networks. At middleware and architecture levels, this means the base infrastructure must offer sufficient flexibility to enable the rapid creation of new detection capabilities.
4. Interoperability: Many organizations suffer from the proliferation of security tools, each with their own control, monitoring and reporting mechanisms. Furthermore, many places already have some form of monitoring console, virus-response policies and procedures, etc. A WDS should integrate as much as possible with the existing tools and processes.
5. Resilience: A WDS must operate under extreme conditions in terms of network and processing load, particularly during worm outbreaks. These conditions are more likely to induce failures than other environments. However, a WDS has a specific advantage that is not enjoyed by most other IDSs: because of the repetitive nature of worm activity, the WDS can afford to lose some data without reducing its utility. In practice, this means it is satisfactory to build a system that can “forcefully” recover from failure (for example, by automatically rebooting or even reinstalling itself) rather than trying to resist it.
6. Graceful degradation: While WDS’s may benefit from a distributed architecture, most worm outbreaks have the effect of overloading network links. It is therefore necessary for all sensors to be able to operate on their own (for example, reporting only local data). Given this condition, while the global system may be impeded, its individual sensors can still be useful during a worm outbreak.
2.2. Basic Architecture and Implementation –
Billy Goat is a worm detection system that possesses the characteristics described earlier. Billy Goat is designed to take advantage of the propagation strategies of worms. As explained earlier, most worms try to connect to IP addresses selected at random or scan entire ranges of addresses. By doing so, they can find most of the machines in a network, but they also try to connect to a large number of unused addresses. Billy Goat functions by responding to requests sent to these unused addresses, thereby feigning the existence of a large number of machines and services.
This approach has three immediate consequences:
1. The fact that the addresses are otherwise unused and not advertised means that all traffic destined to these addresses is a priori suspicious.
2. Active feigning of services, rather than the mere recording of connection attempts, enables a greatly improved understanding of the nature of the connection. Billy Goat is a first-person participant in the protocols, rather than a third-person eavesdropper.
3. The large number of addresses used gives Billy Goat an extensive view of the network. This enables on-box correlation of events from a seemingly diverse collection of sensors.
Instead of directly “guarding the valuables,” as traditional intrusion detection deployments do, Billy Goat guards vast ranges of “nothingness” in order to understand what goes there and why. This is similar to a honeypot. This approach, permitted by the clear focus on detecting worms and coupled with the analysis performed on the data, frees Billy Goat from the high rate of false positives produced by most general-purpose IDSs. For the same reason, it is not a replacement for other IDSs but a complement to them. In particular, Billy Goat never sees the traffic directed to existing machines and services, so it is unable to detect attacks against them.
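The first consequence listed above, that any traffic to unused addresses is suspicious by construction, can be sketched in a few lines of Python. This is an illustration only: the function name `suspicious_sources` and the representation of packets as `(source, destination)` pairs are assumptions, not part of Billy Goat.

```python
import ipaddress


def suspicious_sources(packets, unused_ranges):
    """Count, per source address, the packets sent to unused address ranges.

    packets: iterable of (src, dst) address strings (hypothetical format).
    unused_ranges: CIDR strings for ranges known to contain no real hosts.
    """
    networks = [ipaddress.ip_network(r) for r in unused_ranges]
    hits = {}
    for src, dst in packets:
        destination = ipaddress.ip_address(dst)
        # Traffic to a real host is invisible to the sensor and is skipped here.
        if any(destination in net for net in networks):
            hits[src] = hits.get(src, 0) + 1
    return hits
```

No signature matching is needed at this stage: membership of the destination in an unused range is already enough to single out a source for closer inspection.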
Fig 1: Billy Goat internal architecture.
As shown in Fig 1, at the core of Billy Goat are the virtualization mechanism and a data repository. The virtualization mechanism allows individual services to be written using standard programming models and interfaces, and to respond to multiple IP addresses transparently.
This reduces the difficulty of creating new feigning services and of integrating existing ones. The data repository provides storage for IP header information and for details of the application-level information generated by the feigning servers. The feigned services offered by Billy Goat include those commonly exploited by worms. Each endeavors to offer sufficient functionality to determine accurately the nature of an attack. All the sensors except for SMB (Windows file sharing) are implemented using a specialized framework written in Java that makes it easy to create new services, and which is carefully audited for security to reduce the possibility that a Billy Goat machine could be compromised or affected by a security problem.
A particular advantage of Billy Goat is that it allows us to implement feigned services “preemptively.” For example, when a new vulnerability is announced, it is possible to predict, based on some of its characteristics, that a worm will be written based on it. In these cases, we can create a new feigning server for the protocol affected by the vulnerability, and deploy it on all the existing Billy Goats. If and when the new worm appears, it will be immediately visible to the Billy Goat application-layer sensors. This capability is particularly important now that the window of time between the announcement of a vulnerability and the appearance of code that exploits it has shrunk dramatically, to an average of 5.8 days (as seen in the first half of 2004).
To satisfy the requirement of continued function in times of heavy worm activity, when the performance of the network may be dramatically diminished, WDSs require distributed architectures. Each Billy Goat offers the ability to analyze and report events detected locally, thus providing graceful degradation of the detection service. At the same time, the data of all Billy Goats on an intranet is centralized to assemble a more complete view, as shown in Figure 2.
Fig 2: Distributed Billy Goat architecture.
The nature of the monitoring allows detection of infected machines even on network segments that do not have a Billy Goat sensor installed. Billy Goat includes extensive self-monitoring and recovery mechanisms. When a problem cannot be solved satisfactorily, the machine reboots itself. This provides increased resilience by enabling individual machines to automatically recover from failure. To support the distributed architecture, Billy Goat includes an automatic update mechanism. This ensures that each sensor is always current with respect to both signatures and software versions, and makes it easier to manage a large distributed infrastructure.
CHAPTER 3:
ENGINEERING DECISIONS AND IMPLEMENTATIONS
Billy Goat has been implemented with a view to using standard tools, formats, and APIs. The implementation of the system focuses on providing simple, well-documented interfaces by which Billy Goat may be integrated with existing tools and processes. Many open-source components have been thoughtfully used throughout Billy Goat, and its construction would not be possible without them.
3.1. Database Tables –
The data collected by Billy Goat at the iptables and application layers is split into four database tables to accommodate the repetitive, and often verbose, nature of the attacks used by worms. These tables have the form shown in Figure 3 below, where solid lines indicate external keys (references across tables) and dashed lines indicate temporal proximity (the times are generated in different layers and hence may have slightly different values).
Fig 3: Database table structure used.
Event time is recorded by TIME, an SQL timestamp, together with TIME OFFSET, which indicates that the event is the nth one within a given TIME. The pair thus forms a unique timestamp for each event.
REPORTER is the IP address of the Billy Goat sensor that observed the event. The presence of this field is important in a distributed system.
SRC, PROTOCOL, DST, SPT, DPT and FLAGS apply to the IP layer, mapping to the three covered protocols (TCP, UDP, and ICMP).
FLAGS are void for UDP and ICMP, and the type of ICMP message is stored in both the SPT and DPT fields. Storing the three types of traffic in a single database table makes it easier to extract all the information with a single SQL query.
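The benefit of keeping TCP, UDP and ICMP events in one table can be shown with a small sketch using Python's built-in `sqlite3` module. The table name `ip_events`, the lower-case column names, the column types and the helper functions are assumptions made for illustration; only the list of fields comes from the description above, and the database actually used by Billy Goat is not stated here.

```python
import sqlite3


def create_ip_table(conn):
    """Create one table that holds TCP, UDP and ICMP events together."""
    conn.execute(
        """
        CREATE TABLE ip_events (
            time        TEXT    NOT NULL,  -- SQL timestamp
            time_offset INTEGER NOT NULL,  -- nth event within that timestamp
            reporter    TEXT    NOT NULL,  -- sensor that observed the event
            src         TEXT    NOT NULL,
            protocol    TEXT    NOT NULL,  -- 'tcp', 'udp' or 'icmp'
            dst         TEXT    NOT NULL,
            spt         INTEGER,           -- source port, or ICMP type
            dpt         INTEGER,           -- destination port, or ICMP type
            flags       TEXT               -- NULL for UDP and ICMP
        )
        """
    )


def activity_by_protocol(conn, src):
    """Summarize all activity of one source address with a single query."""
    rows = conn.execute(
        "SELECT protocol, COUNT(*) FROM ip_events "
        "WHERE src = ? GROUP BY protocol ORDER BY protocol",
        (src,),
    )
    return rows.fetchall()
```

With separate tables per protocol, the same summary would need one query per table, or a union of them.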
The full descriptions of the application layer activity, REQUEST and HOST, are expressed in XML. The hierarchical and extensible nature of XML allows us to meaningfully encode descriptions of the numerous application sensors that we use. For example, a simple UDP listener provides a greatly different data model than an extremely complicated protocol like SMB. XML allows us to keep simple descriptions simple while still allowing complex descriptions.
Cryptographic checksums (MD5) of REQUEST and HOST are used as the corresponding indices, rather than native database references (external keys). This offers the significant advantage that references depend only on the content of the database record to which they refer, rather than on the order in which records were inserted (as would be the case with traditional database external keys). This technique greatly eases data centralization in a distributed system.
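The following Python sketch illustrates why checksum-based references ease centralization. It is a simplified model, not the Billy Goat schema: the class `SensorStore` and its methods are hypothetical, and records are kept in in-memory dictionaries rather than database tables.

```python
import hashlib


def content_key(xml_text):
    """Return the MD5 hex digest of a record's XML text, used as its key."""
    return hashlib.md5(xml_text.encode("utf-8")).hexdigest()


class SensorStore:
    """Hypothetical per-sensor store: requests keyed by checksum, plus events."""

    def __init__(self):
        self.requests = {}  # checksum -> XML description
        self.events = []    # (time, source address, request checksum)

    def record(self, time, src, request_xml):
        key = content_key(request_xml)
        self.requests.setdefault(key, request_xml)  # identical requests stored once
        self.events.append((time, src, key))
        return key

    def merge_from(self, other):
        """Centralize another store; no key needs to be renumbered."""
        self.requests.update(other.requests)
        self.events.extend(other.events)
```

Two sensors that see the same request derive the same key independently, whatever the insertion order, so a central server can merge their data without rewriting references. With auto-incremented keys, the same request would usually receive different identifiers on different sensors.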
SEQID is an automatically incremented value used to keep track of which events have been processed by different components. Finally, SENSOR is a short string identifying the feigning server that produced the record (some of the existing values are http, smb and dcom).
3.2. Feigning servers –
The key observation mechanism of Billy Goat is a collection of feigning servers, each covering an infection vector used by worms to propagate. Each server is equipped with adequate logic to accurately diagnose the nature of the connection. Real servers may have vulnerabilities in different layers, and this often requires the feigning servers to be written in a way that can detect attacks in different layers. For example, it may be necessary to write a feigning server that can detect both low-level buffer overflow vulnerabilities and application layer vulnerabilities. In general, the servers follow the corresponding protocol up to a point that allows accurate identification of the activity, but no more.
For example:
· The HTTP feigning server accepts and records a single HTTP request, and always responds with a “page not found” error, before closing the connection.
· The MS/RPC feigning server accepts and records the first 3000 bytes (configurable) transmitted by the client, before closing the connection. This initial payload generally contains either the full code of the worm or an exploit particular to the worm.
· The SMB/Lure server is a special configuration of Samba that appears to be a badly configured machine (open shares, weak passwords, etc.). Because it is a full implementation of the protocol, SMB/Lure can often capture the full code of the worm, as they upload themselves to Billy Goat.
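The behavior of the HTTP feigning server described above can be sketched as follows. This is a minimal Python illustration, not the actual implementation, which the report says is written in Java. The function names, the log format, the 8192-byte cap and the 2-second timeout are all assumptions.

```python
import socket
import threading


def serve_one_http_request(listener, log, max_bytes=8192):
    """Accept one connection, record one request, answer 404, then close."""
    conn, peer = listener.accept()
    with conn:
        conn.settimeout(2.0)  # do not wait forever on a silent client
        data = b""
        try:
            # Read until the end of the HTTP headers or until the size cap.
            while b"\r\n\r\n" not in data and len(data) < max_bytes:
                chunk = conn.recv(1024)
                if not chunk:
                    break
                data += chunk
        except socket.timeout:
            pass
        # Record what was seen before replying.
        log.append({"src": peer[0], "request": data.decode("latin-1")})
        body = b"Not Found"
        header = (
            "HTTP/1.0 404 Not Found\r\n"
            "Content-Type: text/plain\r\n"
            f"Content-Length: {len(body)}\r\n"
            "Connection: close\r\n\r\n"
        ).encode("ascii")
        conn.sendall(header + body)


def demo():
    """Run the server once on a loopback port and probe it with one request."""
    log = []
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        worker = threading.Thread(target=serve_one_http_request, args=(listener, log))
        worker.start()
        with socket.create_connection(listener.getsockname(), timeout=5.0) as client:
            client.sendall(b"GET /default.ida?XXXX HTTP/1.0\r\n\r\n")
            reply = b""
            while True:
                chunk = client.recv(1024)
                if not chunk:
                    break
                reply += chunk
        worker.join()
    return reply, log
```

The server follows the protocol only as far as needed to capture the request. It never serves content, which keeps the attack surface of the sensor itself small.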
The majority of the servers are written in Java and produce descriptions of individual interactions. The specific syntax of each record (i.e. the tree and object structure) is left to each individual server. The fact that JOX assembles the components at runtime makes the creation, debugging, testing, and deployment of new services quite easy.
3.3. Address Virtualization –
Address virtualization transparently maps the large ranges of IP addresses covered by a Billy Goat to the single “real” address used by the machine. This virtualization allows the Billy Goat feigning servers to be ordinary server software, written with no special consideration for the large number of IP addresses that a Billy Goat machine monitors. Address virtualization is handled by the operating system, in particular by the iptables mechanism in the 2.6 release of the Linux kernel.
One of the mechanisms built into iptables is Network Address Translation (NAT). Ordinarily, NAT is used to allow several machines inside a network to share a single external address. For Billy Goat, we need to do the reverse: allow a single machine to respond to a large number of external addresses. This is called “reverse NAT” (as shown in Fig 4).
Fig 4: Traditional Network Address Translation (NAT) and reverse NAT.
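The idea of reverse NAT can be modeled in Python as a pair of address rewrites around a connection table. This is a conceptual sketch only: in Billy Goat the translation is done by the kernel, and the class `ReverseNat`, its method names and the tuple layout `(src, sport, dst, dport)` are hypothetical.

```python
import ipaddress


class ReverseNat:
    """Map many monitored addresses onto one real address, and back."""

    def __init__(self, real_address, monitored_ranges):
        self.real = real_address
        self.networks = [ipaddress.ip_network(r) for r in monitored_ranges]
        # (client address, client port, server port) -> address the client targeted.
        # Simplification: a client reusing one source port for two monitored
        # addresses would collide here; a real NAT tracks flows more carefully.
        self.flows = {}

    def inbound(self, src, sport, dst, dport):
        """Rewrite a packet sent to a monitored address; None if not monitored."""
        if not any(ipaddress.ip_address(dst) in net for net in self.networks):
            return None
        self.flows[(src, sport, dport)] = dst
        return (src, sport, self.real, dport)

    def outbound(self, src, sport, dst, dport):
        """Restore the address the client originally targeted as reply source."""
        original = self.flows.get((dst, dport, sport))
        if original is None:
            return (src, sport, dst, dport)
        return (original, sport, dst, dport)
```

The feigning server only ever sees its own real address, while the client sees a reply from the address it tried to contact.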
CHAPTER 4:
BILLY GOAT DEPLOYMENT
During the design, implementation and deployment of Billy Goat, a number of properties, constructions and conventions have to be considered. They are as follows:
· The importance of homogeneity – To operate a large network of distributed sensors, while keeping it manageable, it is imperative that all the sensors are as homogeneous as possible: no special cases, no distinct configurations. This enables automatic update of all the components and configurations. Special adjustments to some of the sensors make them lag behind in terms of updates and maintenance, either because the updates fail, or because they have been disabled to prevent them from overwriting the specialized configuration.
· Centralized configuration – Even in the presence of homogeneity, it is necessary to have some configuration that is different for each sensor. This includes its network and deployment mode configuration information and local addresses that should be ignored or trusted for management purposes.
A related issue is maintaining the configuration for all the distributed sensors in a central place. This offers two advantages:
· If a sensor is completely destroyed (for example, by a catastrophic disk failure), it is trivial to restore its configuration to reinstall the sensor on a different machine.
· It becomes possible to centrally control configuration of machines, similar to network configuration schemes like BOOTP and DHCP. Based on a unique identifier, the central server can provide each sensor with its configuration information, to automate even the initial installation.
This doubly-centralized configuration offers the best of both worlds: it keeps all the sensors as homogeneous as possible, while offering the possibility of having per-sensor configuration in an automated and manageable fashion.
4.1. Modes of Deployment
The fundamental premise of Billy Goat is responding to traffic directed to unused IP addresses, as described in Section 2. However, different deployment modes can be used and combined to direct such traffic to Billy Goat.
4.1.1) Specific network ranges with static routes –
This is the “standard” Billy Goat deployment mode. A specific set of IP address ranges (which should not be in use) is designated for Billy Goat, and the appropriate routers are reconfigured to forward traffic destined for those ranges to a Billy Goat sensor. The amount of traffic seen by the sensor depends directly on the size of the network range assigned. Addresses within non-routed address ranges that are not used locally may also be routed to the Billy Goat.
Advantages:
A known set of IP addresses is assigned to Billy Goat, which helps in controlling the amount of traffic it has to process. This mode of operation is well understood, and only simple configuration changes need to be made to the routers.
Disadvantages:
Large-enough groups of network addresses must be available, and assigned by the network administrator. If the assigned range is too small, the functionality of Billy Goat is limited because it cannot observe much of the network traffic.
4.1.2) ARP spoofing –
In a local network, the machine that has a particular IP address is found by using ARP (the Address Resolution Protocol). Using this protocol, machines and routers in the local network that need to send traffic to an address X broadcast the question “who has address X?” and wait for a response. If no response is received in a certain period of time, the address is considered nonexistent. This mode of operation allows a malicious host to “hijack” IP addresses in an attack known as ARP spoofing. Incidentally, this same technique can be used by a Billy Goat device to automatically grab unused IP addresses, using the following method:
· Observe ARP requests on the local network.
· If a response is not observed within a short period of time the Billy Goat sensor sends a response, effectively assigning the requested address to itself.
· Future traffic (for a certain period of time) to the spoofed address will be sent by the local router and machines to the Billy Goat sensor.
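The three steps above amount to a small state machine, sketched here in Python. It is a simulation driven by explicit timestamps, not code that touches a real network. The class `ArpClaimer`, its method names and the 2-second wait are assumptions for illustration.

```python
class ArpClaimer:
    """Decide which addresses a sensor may claim, based on observed ARP traffic."""

    def __init__(self, wait_seconds=2.0):
        self.wait = wait_seconds
        self.pending = {}     # address -> time of the first unanswered request
        self.claimed = set()  # addresses currently answered by the sensor

    def on_arp_request(self, ip, now):
        """Somebody asked 'who has ip?'; start the clock if we do not hold it."""
        if ip not in self.claimed:
            self.pending.setdefault(ip, now)

    def on_arp_reply(self, ip, now):
        """A real device answered: forget the request and release any claim."""
        self.pending.pop(ip, None)
        self.claimed.discard(ip)

    def tick(self, now):
        """Return the addresses whose requests stayed unanswered long enough."""
        due = [ip for ip, asked in self.pending.items() if now - asked >= self.wait]
        for ip in due:
            del self.pending[ip]
            self.claimed.add(ip)
        return sorted(due)
```

Releasing a claim as soon as a genuine reply is seen matters in practice, because a device that comes online later must get its address back.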
Advantages:
No previous assignment of IP addresses is needed, so the deployment effort can be very low (simply connect the Billy Goat machine to the network).
Disadvantages:
ARP spoofing is potentially very dangerous, and can cause trouble if Billy Goat attempts to spoof the IP address of an existing device. Additionally, ARP spoofing only works in a local-area network, so in this mode a Billy Goat sensor can only spoof addresses in the LAN to which it is connected. The implementation of this scheme needs to take into account the potential appearance of new devices (Billy Goat must stop spoofing an address immediately when another device with the same address appears on the network).
4.1.3) Billy Goat as default route –
Instead of having specific network ranges assigned to the Billy Goat sensor, the router can be configured to forward everything to Billy Goat, except for the ranges that are being used. This essentially makes Billy Goat the default route for the network, and traffic to all the unused network segments will be sent to it. This scheme can be implemented statically (when the router has a static routing table, and the route to the Billy Goat sensor is added as the default route) or dynamically in conjunction with a routing protocol like BGP. In this second mode, Billy Goat will automatically receive all the traffic that is not covered by the current dynamic routing tables.
Advantages:
Large network coverage, and ease of configuration (the Billy Goat sensor can be configured to “spoof everything,” and it will respond to any traffic it receives).
Disadvantages:
It is potentially dangerous, particularly in conjunction with dynamic routing. In a large network, it is common that certain network segments go offline for short periods of time. If Billy Goat automatically starts responding for them, it may disturb services or automated monitoring systems in the network.
4.1.4) ICMP-based Billy Goat –
One of the most recent developments is a mode of deployment in which Billy Goat operates in conjunction with a router to provide automatic utilization of all the unused addresses outside the local network. This is how it works (see Figure 5):
· When an infected machine in the local network tries to contact a remote non-existing address, an ICMP “network unreachable” or “host unreachable” message will be sent back.
· The ICMP message is intercepted by the router local to the infected machine, which sets up a temporary route for that destination address, with the Billy Goat sensor as its next hop.
· When the infected machine, after not receiving a response, retransmits its packet, it will be sent to the Billy Goat sensor, which will respond to it.
Fig 5: ICMP-based Billy Goat.
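The router-side logic of this mode can be modeled with a short Python sketch. It is a simulation under stated assumptions, not router configuration: the class `IcmpRedirectingRouter`, the 60-second route lifetime and the `"upstream"` default next hop are hypothetical.

```python
class IcmpRedirectingRouter:
    """Install temporary routes to the sensor for unreachable destinations."""

    def __init__(self, billy_goat, route_lifetime=60.0):
        self.billy_goat = billy_goat
        self.lifetime = route_lifetime
        self.temp_routes = {}  # destination -> expiry time of the temporary route

    def on_icmp_unreachable(self, unreachable_dst, now):
        """An ICMP 'unreachable' came back for this destination: redirect it."""
        self.temp_routes[unreachable_dst] = now + self.lifetime

    def next_hop(self, dst, now, default="upstream"):
        """Pick the next hop for a packet; expired routes are dropped."""
        expiry = self.temp_routes.get(dst)
        if expiry is not None and now < expiry:
            return self.billy_goat
        self.temp_routes.pop(dst, None)
        return default
```

The first packet to a non-existing address still leaves the network normally. Only the retransmission, sent after the temporary route is in place, reaches the Billy Goat sensor.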
Advantages:
This mode of operation allows for automatic spoofing of every unused address outside the LAN. This provides Billy Goat with a truly expansive view and allows it to quickly identify local infected machines.
Disadvantages:
Router support is needed to implement this scheme. The implementation also needs to be careful about removing the routes for hosts once they become active.
4.1.5) Shunning mode –
One problem faced by a Billy Goat sensor, particularly when it is spoofing a very large network range, is that it can suffer from effects similar to those caused by a distributed denial-of-service attack, from the sheer amount of traffic that it needs to monitor and respond to. To limit this problem, the following technique can be used in combination with any other deployment mode: once an IP address is identified as infected, it is added to a “shun list” that causes its traffic to be ignored for a certain period of time. This reduces the overall load on the Billy Goat sensor, particularly in times of heavy worm activity, while still allowing it a complete view of the network.
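A shun list of the kind described above is essentially a set of addresses with expiry times. The Python sketch below shows one possible shape. The class name `ShunList` and the 300-second default are assumptions, since the text only says "a certain period of time".

```python
class ShunList:
    """Ignore traffic from already-identified infected hosts for a while."""

    def __init__(self, shun_seconds=300.0):
        self.shun_seconds = shun_seconds
        self._until = {}  # address -> time at which shunning ends

    def shun(self, ip, now):
        """Start (or restart) ignoring this address."""
        self._until[ip] = now + self.shun_seconds

    def is_shunned(self, ip, now):
        """True while the address is still being ignored."""
        until = self._until.get(ip)
        if until is None:
            return False
        if now >= until:
            del self._until[ip]  # expired: the host is observed again
            return False
        return True
```

Because entries expire, a machine that has been cleaned, or that is reinfected by a different worm, becomes visible to the sensor again without manual intervention.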
4.2. Data Centralization Mechanisms –
One of the recurrent problems in intrusion detection is the transfer of data to a central location. This is technically problematic for a number of reasons:
· The common need to transfer data across firewalls. The usual unsatisfactory solution to this problem is to open additional ports in the firewall to allow for the necessary communication channels.
· With a widely distributed system, reliable transfer of data to a central server may be prone to failure, potentially causing data loss or duplication.
One tool that can be used for data transfer is “BEEPLite”, an implementation of BEEP, a modular, extensible protocol that allows flexible establishment and multiplexing of communication channels. One of its salient features, which makes it particularly suitable for transferring intrusion detection data, is that it decouples the role of connection initiator from that of client. This means that either the server or the client can initiate a connection, which simplifies configuration across firewalls.
BEEP additionally simplifies the addition of encryption, authentication and compression to the data flows. BEEP has gained substantial popularity in recent years, and within the intrusion detection community, is also used by the IETF Intrusion Detection Working Group in the IDXP specification.
CHAPTER 5:
DATA ANALYSIS
The data analysis is an iterative process that attempts to determine the types of activities that have been seen by Billy Goat from different source addresses in the network. This process is most valuable when done in the central server to which all Billy Goat machines send their data, because it allows discovery of global behavior that may not be visible at the individual sensors.
· In the first analysis step, a precise description of the behavior of each source IP address is constructed in a data model, based on all the data gathered during the specified time period. This includes the list of destinations contacted, the protocols and port numbers used, and all the requests sent to the Billy Goat sensors.
· The second step consists of summarizing the data: individual destination addresses are replaced by the number of destinations, and numbers of contacts are replaced by their binary orders of magnitude. This eases later identification of similarities between behavior patterns.
· The third step identifies known worms, attacks, and behaviors; thereby creating a high-level description of the data. This is done using a combination of the following methods:
– Capture of the worm itself (for example, SMB worms that upload themselves to Billy Goat). In this case, the MD5 checksum of its code is used to identify the worm with 100% accuracy.
– Observation of the exploits used by the worm (for example, an HTTP request containing a buffer overflow). Because Billy Goat is a first-person observer, it can accurately collect the full set of exploits used by a worm, and during the analysis phase, these sets can be matched with the characteristics of known worms.
– Observation of other behaviors indicative of worm activity (for example, horizontal scanning or account guessing). These are weaker indicators of worm activity in the sense that they do not make it possible to precisely identify the worm.
When precise worm identification is possible (that is, when we can give a name to the worm), the findings are labeled as “alarm” and the worm name is given. Clearly suspicious but unidentifiable findings (for example, a large horizontal scan or a single exploit that does not fully identify a worm) are labeled as “warning” and a description is given. All other data is labeled “unknown” and is available via direct query of the database.
Additional analysis steps may be introduced to add location information or other relevant information concerning a host (for example, DNS lookup, asset or vulnerability information).
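The summarizing and labeling steps can be illustrated with a short sketch. It rests on several assumptions: the table of known worm checksums is hypothetical, the "binary order of magnitude" is taken to be the bit length of the count (one possible definition), and matching a full set of exploits against known worm characteristics is left out for brevity.

```python
import hashlib

# Hypothetical table of known worm samples: MD5 hex digest -> worm name.
KNOWN_WORM_MD5 = {hashlib.md5(b"example worm body").hexdigest(): "ExampleWorm"}


def binary_magnitude(count):
    """Assumed definition: 0 for 0, otherwise the bit length of count."""
    return count.bit_length()


def summarize(destinations):
    """Replace the destination list by a count and a contact magnitude."""
    return {
        "num_destinations": len(set(destinations)),
        "contact_magnitude": binary_magnitude(len(destinations)),
    }


def label(captured_code=None, exploits=(), horizontal_scan=False):
    """Return ('alarm', name), ('warning', description) or ('unknown', None)."""
    if captured_code is not None:
        name = KNOWN_WORM_MD5.get(hashlib.md5(captured_code).hexdigest())
        if name:
            return ("alarm", name)          # worm identified by checksum
    if exploits:
        return ("warning", "exploit(s) observed: " + ", ".join(sorted(exploits)))
    if horizontal_scan:
        return ("warning", "horizontal scan")
    return ("unknown", None)
```

Only a checksum match produces an "alarm" here; weaker indicators produce a "warning" with a description, mirroring the labeling scheme described above.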
5.1. Alarm Redistribution –
One important feature of Billy Goat is that each sensor detects infected machines throughout the network. By performing a centralized analysis of the data, we can build an even more complete view of the network and detect otherwise undetected phenomena. Consequently, it is possible to detect and diagnose problems in machines at locations without any IDS installed. While this is a nice property of the Billy Goat infrastructure, it only becomes valuable if the results can be disseminated to the appropriate people throughout the world in a timely fashion.
The subscription service for Billy Goat produces alerts based on the centralized Billy Goat data, so as to provide the most complete coverage. It allows individual systems administrators to self-register to receive alerts pertinent to their own network ranges, thereby ensuring that alerts are delivered to someone who can actually fix the detected problems. Open registration allows the creation of a “living” mapping between network and owner. Access control is done via social mechanisms: the subscriber’s manager is notified on registration. The subscription service also permits flexible configurations based on a user-defined policy. The policy focuses on two aspects of the alarm redistribution:
· It can be used to set filters to restrict the alarms to some particular networks of interest, or to select the type of alarms to send.
· It controls the rate at which alarms should be sent, thereby preventing accidental denial of service effects against the subscribers. For example, systems administrators may choose to be notified immediately up to a specifiable limit whereas people concerned with global metrics may opt to receive daily summarized reports.
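The two policy aspects above, filtering and rate control, can be sketched as follows. This is an illustrative model only: the class, its parameters and the per-hour limit are assumptions, not the interface of the actual subscription service.

```python
import ipaddress


class Subscription:
    """One subscriber's policy: which alarms to receive, and how fast."""

    def __init__(self, networks, alarm_types=None, max_per_hour=10):
        self.networks = [ipaddress.ip_network(n) for n in networks]
        self.alarm_types = alarm_types      # None means "all types"
        self.max_per_hour = max_per_hour    # assumed rate limit
        self._sent = []                     # timestamps of delivered alarms

    def matches(self, source_ip, alarm_type):
        """Filter: is the alarm about one of the subscriber's networks and types?"""
        addr = ipaddress.ip_address(source_ip)
        in_range = any(addr in net for net in self.networks)
        wanted = self.alarm_types is None or alarm_type in self.alarm_types
        return in_range and wanted

    def should_send(self, source_ip, alarm_type, now):
        """Apply the filter, then the rate limit over the last hour."""
        if not self.matches(source_ip, alarm_type):
            return False
        self._sent = [t for t in self._sent if now - t < 3600]
        if len(self._sent) >= self.max_per_hour:
            return False    # over the limit: leave it for a summary report
        self._sent.append(now)
        return True
```

An administrator would typically set a small limit and receive alarms immediately, while a subscriber interested in global metrics could set the limit to zero and rely on summarized reports.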
CHAPTER 6:
BILLY GOAT – EFFECTS AND EXTENSIONS
Architecturally, Billy Goat is somewhat similar to intrusion detection systems. The construction and deployment of Billy Goat give some insight into fundamental issues, behaviors and properties of worm detection systems and of large distributed systems. This section explores the similarities and differences between Billy Goat and other protection mechanisms.
6.1. Focus on attacker-centric monitoring –
The focus of traditional network-based IDSs (NIDS) is on the ability to detect attacks against valuable systems, such as critical servers. This “target-centric” approach has the following implications:
· Although identifying and diagnosing the attacker may be possible, the priority is detecting the attacks.
· NIDS only need to observe traffic going to the machines they protect, and for performance reasons, they are often prevented from seeing anything else. This limits the view of the IDS.
By contrast, Billy Goat has an “attacker-centric” approach: we are more interested in identifying and diagnosing attackers (infected machines) than in identifying which victims have been attacked. This approach also has implications for the deployment of the sensors: they should be placed so that they can receive as much traffic as possible. A target-centric sensor will have detailed information about the local target machines, but limited information about the attackers. An attacker-centric sensor may have limited information about the targets, but will have detailed information about the attackers. For example, a Billy Goat sensor can detect infected machines anywhere in the network, as long as they try to connect to one of the addresses spoofed by that sensor.
In a distributed Billy Goat deployment, each sensor has a global view of the network, limited only by traffic filtering between different network segments. By centralizing this information as described earlier, it is possible to have an unimpeded, expansive view of infected machines anywhere in the network. Global aggregation of data allows detection of behaviors and patterns that may not be noticeable at the local level, like very stealthy or slow scans. For example, we have seen some worms that scan the network by contacting a single address per class-C network. This would not be detected by a Billy Goat sensor spoofing a single class-C network, but is easily discernible at the global level.
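The class-C example can be made concrete with a small sketch of global aggregation. The record format, the function names and the threshold are assumptions chosen for illustration; addresses are treated as dotted IPv4 strings.

```python
from collections import defaultdict


def destinations_per_source(observations):
    """Group the distinct destinations contacted by each source address.

    `observations` is an iterable of (sensor_id, source_ip, destination_ip).
    """
    seen = defaultdict(set)
    for _sensor, source, destination in observations:
        seen[source].add(destination)
    return seen


def slow_scanners(observations, threshold=3):
    """Sources that touched at least `threshold` distinct class-C networks."""
    result = []
    for source, dests in destinations_per_source(observations).items():
        networks = {d.rsplit(".", 1)[0] for d in dests}   # class-C prefix as text
        if len(networks) >= threshold:
            result.append(source)
    return sorted(result)
```

A source that contacts one address in each of three class-C networks, each seen by a different sensor, is flagged when the observations are pooled, but not when any single sensor's observations are analyzed alone.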
6.2. Environmental Effects –
In addition to detecting worms and curbing their spread, Billy Goat may also have side effects, some favorable and some unfavorable. One of the key areas is the effect of Billy Goat on other systems in the networking environment.
6.2.1) Network discovery –
The first aspect encountered is the interaction of Billy Goat with devices and software that scan the network (for example, asset, vulnerability or network discovery tools). Normally, Billy Goat responds to these scans for each one of the IP addresses it is spoofing, producing wildly inaccurate results for the scanner. This can be improved by adding a mechanism that makes Billy Goat respond “truthfully” (only respond to traffic directed to its real IP address, and for the very limited real services it offers) to a fixed set of IP addresses. This makes Billy Goat appear like a regular machine to authorized scanning devices.
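The "truthful" response mechanism amounts to a simple decision rule, sketched below. All addresses and ports are hypothetical placeholders, and the function assumes it is only called for traffic that has already reached the sensor.

```python
AUTHORIZED_SCANNERS = {"10.0.0.250"}   # hypothetical scanner address
REAL_ADDRESS = "10.0.5.1"              # hypothetical real address of the sensor
REAL_SERVICES = {22}                   # hypothetical real services (ports)


def should_respond(source, destination, port):
    """Decide whether the sensor answers a connection attempt."""
    if source in AUTHORIZED_SCANNERS:
        # Truthful mode: behave like an ordinary host.
        return destination == REAL_ADDRESS and port in REAL_SERVICES
    # Normal mode: answer for every address the sensor receives traffic for.
    return True
```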
Fig 6: Possible NIDS placements.
6.2.2) Network intrusion detection systems –
Another area under consideration is the sharp increase in the number of alarms coming from already deployed NIDS that can observe the stream of traffic flowing to Billy Goat. This stems from the fact that Billy Goat, in allowing the completion of illegitimate connections, increases the number of real, albeit harmless, attacks seen on the network. The size of the increase depends on the relative sizes of the networks, the saturation level of the network connections, the root cause of the alarm, and the signatures and placement of the NIDS.
This effect was first noticed when the Nimda worm was active in a well-connected network whose Billy Goat address space was approximately 100 times larger than the number of actual hosts, and with NIDS at position 1 (see Figure 6). The result was roughly a corresponding 100-fold increase in the number of Nimda-related alarms of attacks from the Intranet against the LAN. A NIDS at position 2 would see an increase in the number of Nimda-related alarms of attacks originating from machines on the LAN (those against the Intranet and those against Billy Goat).
In the case of Nimda, and many other modern worms, the effect on NIDS 2 is augmented by the optimized propagation strategy that probabilistically favors machines with similar addresses. A NIDS at position 3 would see attacks from both the LAN and the Intranet and would have greatly increased fidelity stemming from the fact that it does not see any legitimate traffic.
6.2.3) Failure Modes –
The final area deals with the default route and router/ICMP modes of deployment. The problem occurs when a machine or network goes down and Billy Goat automatically starts responding for it. In this case, liveness-checking mechanisms such as ICMP echo (ping) yield deceptive results. This is especially problematic when these liveness checks feed into other systems. This failure mode, induced by a relatively passive system, serves as an interesting warning for automatic intrusion response.
6.3. Pattern identification –
The data gathered by “Billy Goat” can be collated for creating clusters of hosts corresponding to active worms in the network. There are interesting applications of the classification of suspicious hosts by behavior types. Emergence of new clusters can be indicative of new worm outbreaks (also of network misconfigurations and malfunctions), and can be used as an early-warning system. The clusters contain detailed information about worm behavior, including infection vectors, scanning algorithms, and exploits used.
Traditional data-mining tools, such as CLARAty, can be applied to this data. In order to apply classical data-mining algorithms to the collected descriptions, the data model is simplified by extracting essential features. This is done by –
· Extracting descriptions of the ports targeted, along with descriptions of the application-level activity (including identification of exploits when possible).
· Adding features computed from the available data, including the order of magnitude of the number of hosts contacted, the efficiency of the scanning algorithm (whether infection attempts are directed at hosts already contacted or not), and scan density/intensity (defined as the average number of contacts per destination class-C network).
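The computed features listed above can be sketched for a single source address. The exact definitions are assumptions made for the example: magnitude is the bit length of the host count, efficiency is the fraction of contacts that reached a host not contacted before, and a class-C network is taken as the first three octets of a dotted IPv4 address.

```python
def extract_features(contacts):
    """Compute clustering features from one source's contact list.

    `contacts` is a list of (destination_ip, port) pairs in arrival order.
    """
    destinations = [dest for dest, _port in contacts]
    distinct = set(destinations)
    networks = {d.rsplit(".", 1)[0] for d in distinct}   # class-C prefixes
    return {
        "ports": sorted({port for _dest, port in contacts}),
        # Binary order of magnitude of the number of hosts contacted.
        "host_magnitude": len(distinct).bit_length(),
        # Efficiency: fraction of contacts that reached a not-yet-contacted host.
        "efficiency": len(distinct) / len(destinations) if destinations else 0.0,
        # Density: average number of contacts per destination class-C network.
        "density": len(destinations) / len(networks) if networks else 0.0,
    }
```

The resulting dictionaries, one per source address, are the kind of fixed-size feature records that a classical clustering algorithm can consume.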
CHAPTER 7:
FUTURE WORK
The following areas offer prospects for future Billy Goat development.
1. Billy Goat currently provides immediate notification to network administrators about infected machines in their domains. It also provides summarized information about infected machines and suspicious behavior, useful for higher-level management. However, all the information is presently provided in text and XML format, which requires some level of expertise to interpret. To increase the usefulness of Billy Goat data, its reporting and visualization capabilities could be improved. The following improvements could be made:
– Graphic visualization of low-level activity (IP traffic and alarms generated by the feigning servers) both globally and in each Billy Goat sensor. A visual representation of levels of traffic, for example, is very useful in quickly detecting suspicious behavior.
– Graphic visualization of summarized activity. For example, charts showing trends and statistics in number of infected machines and per-region aggregated infection data.
– High-level reports of numbers of infected machines, emerging behaviors, and most common types of worm infections.
2. It is evident that analysis of the data using clustering can produce interesting results. The use of data-mining techniques can be studied further, and the possibility of automatically generating signatures and metasignatures (in which multiple signatures are taken as evidence of another, possibly distributed, attack) based on the results can be explored.
3. Another interesting development based on the results of anomaly analysis of Billy Goat data is the automatic creation and deployment of feigning servers. For example, if the IP traffic data shows a marked increase in connections to a certain port where no server currently exists, a generic listener for that port could be automatically instantiated and distributed to all the Billy Goat sensors, to capture the payload being sent to that port. This could greatly reduce the reaction time in the face of a new worm outbreak.
4. To ensure accurate identification, the ideal would be to capture the actual worm code. Currently, this is done only by some of the feigning servers (e.g. SMB/Lure, MS/RPC and MS/SQL). This capability could also be incorporated in other servers. For this to happen, the feigning server needs to provide the appropriate responses so that the worm believes its exploit has succeeded and proceeds to upload its code. In the general case this is impossible to do in a feigning server (as it would need to emulate the full behavior and all the vulnerabilities of the service being attacked). One possible approach is to use virtual machines: when a Billy Goat sensor receives a connection on a port for which no feigning server exists, it would pass the connection to a virtual machine, and observe both the network traffic and the disk of the virtual machine after the connection has ended. This would provide valuable information about new worms, making it possible to capture any type of new worm and easier to construct new feigning servers.
5. Finally, having a sensor that produces no false positives for a certain class of attacks might make possible the long-standing dream of intrusion detection: an automated response system. Most such systems to date have been marred by false positives, which often result in the response system doing more harm than good. The aim is to build a system that accurately and efficiently isolates misbehaving machines while allowing critical technical and business processes to continue unimpeded, with an extreme focus on potential failure modes and how they might be eliminated or mitigated.
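The trigger described in item 3, a marked increase in connections to a port with no feigning server, can be sketched as a comparison of two traffic periods. The function name and both thresholds are assumptions for illustration; the report does not specify how "marked increase" would be measured.

```python
def ports_needing_listener(baseline, current, existing_servers, factor=5, minimum=20):
    """Ports with a marked increase in connections and no feigning server.

    `baseline` and `current` map port -> connection count for two periods.
    `factor` and `minimum` are assumed thresholds, not values from the report.
    """
    flagged = []
    for port, count in current.items():
        if port in existing_servers or count < minimum:
            continue
        if count >= factor * max(baseline.get(port, 0), 1):
            flagged.append(port)
    return sorted(flagged)
```

Each flagged port would be a candidate for a generic listener that records the payload sent to it.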
CONCLUSION
Billy Goat has been designed to be scalable, to operate gracefully in a large distributed environment, and to provide extremely accurate detection of worm-infected machines. This paper describes a number of interesting or useful techniques and components identified during the process of developing Billy Goat. It can be used as a reference model by other practitioners faced with similar problems. The paper also sheds light on a number of related design choices, such as the use of cryptographic checksums as external keys to ease distributed deployment, and the use of social structures to control access to information and to determine who needs to receive alerts.
Billy Goat is useful both for accurately detecting machines infected with known worms, through its signature-based analysis, and for detecting emerging behavior and new worms, through its comprehensive view of the network and its anomaly analysis capabilities. These two capabilities serve different audiences:
· The former is immediately useful to system and network administrators, who can be notified of specific infections in machines under their control.
· The latter is useful to security and network analysts interested in large-scale and new behavior of the network.
A single Billy Goat sensor deployed in a network can provide useful information about infected machines, but its real value shows in a multi-sensor environment, which provides better coverage of large networks, and where the data can be centralized and analyzed to detect emerging trends and global suspicious behavior. Billy Goat thus has the potential to become an integral part of network security systems.
FREQUENTLY ASKED QUESTIONS
Q1) What is Billy Goat System?
Q2) What are the various characteristics of Billy Goat Systems?
Q3) Who designed Billy Goat System?
Q4) What are its Future Works?
Q5) How does Billy Goat System detect Worms?
REFERENCES
[1] James Riordan, Andreas Wespi and Diego Zamboni, "How to hook worms", IEEE Spectrum, volume 42, number 5, May 2005.
[2] "Lessons learned from Billy Goat - an accurate worm detection system", www.wormblog.com/detection/.
[3] James Riordan, Andreas Wespi and Diego Zamboni, "Lessons learned from Billy Goat - an accurate worm detection system", RZ-3609 (#99619).
Tuesday, January 6, 2009