Wednesday, December 15, 2010

Thoughts on the Internet Security Unit

Over the past few weeks, we have researched Internet security: recent threats, as well as countermeasures against them. The problems we looked at ranged from Botnets to DDoS to DNS attacks to spam. The solutions included anomaly detection using a variety of approaches, reputation systems, blacklisting, and cheating detection.

One of the thoughts I've had is that network security is not so much a problem of bad system design as a perpetual game between the "bad guy" and security researchers. Either party is rewarded for staying one step ahead of the other. The "bad guy" always has plenty of incentive to come up with a new threat, and that gives security researchers a good reason to come up with a countermeasure against it. In a sense, the good and bad guys create a competitive market for each other.

Also, I was previously unfamiliar with how Botnets worked. It was pretty impressive (albeit quite insidious) that a system can be designed to take command of tens of thousands of computers and use them to perform an arbitrary task. Robust distributed systems are hard to design. (Well, who knows if Botnets are designed robustly?)

We discussed several reputation systems. Collusion seems to be a persistent problem in reputation system design. It is a hard problem because colluding peers can game the system by taking advantage of good peers and rating other colluders well, damaging the accuracy of the rating system. I have read papers on reputation systems, but none of them applies the study of evolutionary dynamics and evolutionarily stable strategies. I read an excellent book this semester that sheds light on the design of a robust social structure; it discusses the conditions that allow or prohibit invasion into a society of agents. Translating collusion into the language of the book: if a reputation system can be designed in which peers have an evolutionarily stable strategy, the peers will be able to resist invasion by a colluding group of attackers.
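
As a toy illustration (my own, not from any of the papers or the book), here is how a small colluding clique can skew a reputation system that naively averages ratings:

    # Toy illustration: colluders inflate each other's scores in a reputation
    # system that naively averages all ratings. Names and numbers are made up.
    from statistics import mean

    def naive_reputation(ratings):
        """ratings: dict peer -> list of (rater, score) pairs, scores in [0, 1]."""
        return {peer: round(mean(score for _, score in received), 2)
                for peer, received in ratings.items()}

    honest = ["h1", "h2", "h3"]
    colluders = ["c1", "c2", "c3"]
    ratings = {p: [] for p in honest + colluders}

    # Honest peers rate based on observed behavior: colluders behave badly.
    for rater in honest:
        for peer in honest:
            if peer != rater:
                ratings[peer].append((rater, 0.9))
        for peer in colluders:
            ratings[peer].append((rater, 0.1))

    # Colluders give each other perfect scores and bad-mouth honest peers.
    for rater in colluders:
        for peer in colluders:
            if peer != rater:
                ratings[peer].append((rater, 1.0))
        for peer in honest:
            ratings[peer].append((rater, 0.0))

    print(naive_reputation(ratings))
    # The colluders end up with scores at least as good as the honest peers,
    # even though every honest observer rated them poorly -- exactly the kind
    # of "invasion" an evolutionarily stable design would need to resist.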

Kalman: Another Anomaly Detection System

My previous post dealt with ASTUTE, a traffic anomaly detection system that uses an equilibrium property of flow volumes to detect anomalies of a distributed nature. This post deals with a different anomaly detection system, known as KALMAN. The developers of this approach presented their system at IMC 2005. It uses the Kalman filter to extract the normal traffic from a network's measurements, leaving the rest (known as the residual) for anomaly analysis.

The Kalman filter is a technique that allows for predicting future values based on past measurements that may be mingled with noise, or random variations in measurement. This filter is useful for other things, such as predicting the position of a target, given previous measurements that may be slightly inaccurate.

In the KALMAN anomaly detection system, the Kalman filter compares a prediction of the traffic state with the traffic state actually observed in the most recent measurements. ROC curves are then used to evaluate the accuracy of the detection. In short, there is a traffic matrix (showing traffic at each intermediate node) and a path matrix from which origin-destination flows can be constructed. The system identifies the individual flows that are anomalous so that they can be analyzed for the type of anomaly they contain.
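
To make the idea concrete, here is a minimal one-dimensional sketch of residual-based detection. It is my own simplification of the technique, not the authors' implementation, which operates on whole traffic matrices:

    # Minimal 1-D sketch: predict the next traffic measurement with a Kalman
    # filter, then flag points whose residual (prediction error) is unusually
    # large. My own simplification, not the IMC 2005 authors' implementation.
    import random

    def kalman_residuals(measurements, q=1.0, r=4.0):
        """q: process noise variance, r: measurement noise variance."""
        x, p = measurements[0], 1.0        # state estimate and its variance
        residuals = []
        for z in measurements[1:]:
            x_pred, p_pred = x, p + q      # predict (random-walk traffic model)
            residuals.append(z - x_pred)   # innovation: new data vs. prediction
            k = p_pred / (p_pred + r)      # Kalman gain
            x = x_pred + k * (z - x_pred)  # update with the new measurement
            p = (1 - k) * p_pred
        return residuals

    random.seed(0)
    traffic = [100 + random.gauss(0, 2) for _ in range(200)]
    traffic[120] += 40                     # inject a volume anomaly

    res = kalman_residuals(traffic)
    threshold = 3 * (sum(v * v for v in res) / len(res)) ** 0.5
    print([i + 1 for i, v in enumerate(res) if abs(v) > threshold])
    # Only the injected spike (and the filter's recovery right after it)
    # exceeds the threshold; everything else is absorbed as normal traffic.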

A "Wise" Way to Detect Traffic Anomalies

Since the early days of the Internet, its security has been a matter of growing importance. Because the Internet is managed at many different levels (user, ISP, content provider, telephony, etc.), there have been different approaches to minimizing threats. For example, users and content providers are primarily concerned with keeping their computers free of malware, to avoid compromise of personal information or resources.

ISPs take a different stance. They are concerned with keeping their clients as a whole free of malware, and free from becoming part of (or the target of) attacks, such as DDoS and spam. That is what traffic anomaly detection is concerned with. Anomaly detection is a challenging task, as there are a variety of anomalies that can occur within an ISP. Consequently there are a variety of systems that have been developed to pinpoint anomalies.

ASTUTE is a fairly recent system. Its detection is based on the assumption that, when router queues are not saturated, the total volume aggregated across different flows does not change much over time. Anomalies are detected as deviations from this equilibrium. ASTUTE does well at finding cases where groups of flows simultaneously cause an increase in traffic, even when the increase is small; this makes it possible to flag traffic such as Botnet activity. However, it does not do well at detecting large volume increases caused by one or a few flows. Other detection methods, such as Wavelet or Kalman, can do this. In practice, ASTUTE would work well in concert with one of these other methods, as each has its own strengths.
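
Here is a rough sketch in the spirit of ASTUTE's equilibrium test (my paraphrase of the idea, not the authors' exact statistic):

    # Sketch of an ASTUTE-style equilibrium check: if many flows shift volume
    # in the same direction between two time bins, the equilibrium assumption
    # breaks. This is my paraphrase of the idea, not the paper's statistic.
    import random
    from statistics import mean, stdev

    def equilibrium_score(bin_a, bin_b):
        """bin_a, bin_b: dict flow_id -> byte count in consecutive time bins."""
        flows = set(bin_a) | set(bin_b)
        deltas = [bin_b.get(f, 0) - bin_a.get(f, 0) for f in flows]
        if len(deltas) < 2 or stdev(deltas) == 0:
            return 0.0
        # Under equilibrium the per-flow changes should average out near zero;
        # a coordinated shift across many flows produces a large score.
        return mean(deltas) * len(deltas) ** 0.5 / stdev(deltas)

    random.seed(1)
    before = {f"flow{i}": 1000 + random.randint(-50, 50) for i in range(200)}

    # Normal bin: flows fluctuate independently around their old volumes.
    normal = {f: v + random.randint(-50, 50) for f, v in before.items()}
    # Anomalous bin: every flow adds a little traffic (e.g. a Botnet waking up),
    # even though no single flow is large.
    anomalous = {f: v + random.randint(0, 100) for f, v in before.items()}

    print(equilibrium_score(before, normal))      # small: equilibrium holds
    print(equilibrium_score(before, anomalous))   # large: distributed anomaly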

Looking at Wireless Simulations in OMNeT++

We were curious to know how well OMNeT++ (v4) simulates a wireless setting. There are several simulators out there, such as ns-2 and OPNET. We have done experiments with ns-2 in a wireless setting, but it performed significantly below our expectations. Perhaps that is because we hold wireless simulation to a high standard.

The benchmark we use is a 2007 IMC performance study by Dragos Niculescu, titled "Interference map for 802.11 networks". A major point in his study is that there are three different ranges in an 802.11 network. Communication range is the range within which two hosts can communicate with adequate SNR in the absence of other wireless traffic. Carrier-sensing range is the distance at which two hosts can sense whether the other is transmitting, even though they cannot exchange data. Beyond carrier-sensing range is the interference range, in which one host may transmit while the other is transmitting (since it cannot sense the other), yet its signal is still corrupted by the other's signal.
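
To get a feel for why the three ranges are nested, here is a back-of-the-envelope sketch using a simple log-distance path-loss model. The model and all threshold values are my own illustrative assumptions, not numbers from Niculescu's study or from the simulators we tested:

    # Toy model of the three 802.11 ranges, using a log-distance path-loss
    # model P_rx(d) = P_tx - L0 - 10*n*log10(d). All constants are my own
    # illustrative assumptions, not values from Niculescu's study or OMNeT++.

    TX_POWER_DBM = 20.0         # transmit power P_tx
    LOSS_AT_1M_DB = 40.0        # reference path loss L0 at 1 meter
    PATH_LOSS_EXP = 3.0         # path-loss exponent n

    RX_SENSITIVITY_DBM = -85.0  # enough signal to decode frames
    CARRIER_SENSE_DBM = -95.0   # enough energy to sense a busy medium
    INTERFERENCE_DBM = -105.0   # too weak to sense, still enough to corrupt

    def range_for_threshold(threshold_dbm):
        # Invert the path-loss formula: the largest distance (in meters) at
        # which the received power is still above the given threshold.
        return 10 ** ((TX_POWER_DBM - LOSS_AT_1M_DB - threshold_dbm) / (10 * PATH_LOSS_EXP))

    print("communication range: %4.0f m" % range_for_threshold(RX_SENSITIVITY_DBM))
    print("carrier-sense range: %4.0f m" % range_for_threshold(CARRIER_SENSE_DBM))
    print("interference range:  %4.0f m" % range_for_threshold(INTERFERENCE_DBM))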

So, we set out to test whether simulated 802.11 hosts in fact model those three ranges, and if so, to find the distances to which the ranges correspond. We used two different OMNeT++ packages: the most common networking one (INET), and Mixnet. Mixnet combines the upper protocol stack of INET with a more sophisticated network interface card model derived from MiXiM.

Our methodology and results are included in our report. The conclusions are twofold. First, OMNeT++ wireless simulation captures many aspects of an 802.11 host, but it is not entirely faithful to the results of Niculescu's study. Second, these simulation packages are relatively young; both are still in need of debugging, development, and documentation.

Tuesday, November 30, 2010

The Botnet Battle

It wasn't until I did some research on Botnets that I realized what a powerful thing a distributed army of bots can be. Botnets typically consist of thousands of hosts: PCs and servers that have been compromised by malware. Each host in a Botnet takes orders from a Command & Control server, with inter-bot communication often carried over encrypted IRC. Further, the malware may accomplish other evil designs, e.g., intercepting keystrokes or infecting other computers.

Why do Botnets exist? The way I understand it, they are a source of power in the "dark computing underworld." Suppose Joe is the administrator of a Botnet with 10,000 computers, and somebody wants an evil deed done (a DDoS attack, stealing credit card numbers, etc.). That person would pay Joe to use his Botnet to accomplish the deed.

These complex systems pose a significant concern for network security: Botnets are often hard to identify, and even harder to stop. Botnets often accomplish their purposes by sending spam, where email links point to malware or to websites targeted for DoS attacks. Addressing the Botnet spam problem, Microsoft Research published a paper discussing how to recognize spam sent by Botnets. Their publication provides a measurement study that captures important characteristics of a Botnet, and it gives an architecture that discovers these characteristics in spam.

The way I see it, there is a perpetual battle between Botnets and email systems, and any defensive measures on the part of email systems will be opposed by additional offensive measures from the Botnet. In my opinion, the best way to approach the Botnet battle is to expose as much measurement information about Botnets as possible, and hide information about email systems.

Wireless Mesh Networking Summary

As our class has concluded the section on Wireless Mesh Networks, I will summarize what has been discussed and point out a potential area for research. WMNs have attracted significant attention from research groups because of 1) the challenging nature of networking over a wireless medium and 2) the advantages offered by ad-hoc networks.

Wireless networking poses only a few significant problems, but they adversely affect every layer of the network, and many solutions have been suggested at each layer. First, because nodes share the medium, those communicating must take turns for traffic to flow. Second, because of the hidden node problem, there is inevitably loss due to interference from nodes outside carrier-sensing range. Third, because of the exposed node problem, nodes withhold transmissions unnecessarily.

At the link layer, we have discussed techniques such as Lagrangian relaxation that give nodes a fair share of bandwidth within the clique of nodes with which they share the medium. We have looked at how network coding can compensate for lost packets, so that the receiver is indifferent to which particular packets it missed. We have also looked at making use of corrupted (partially correct) packets in some cases.

At higher layers, we have discussed how opportunistic routing can serve as a more reliable means of packet delivery, since routes are often broken under traditional routing on a WMN. At the transport layer, adaptive pacing can be used to allow a sender to stay silent while its neighbors relay the message.

In short, the nature of wireless networking requires design adaptations from traditional (wired) networking. Many of the aforementioned techniques address the problems caused by 802.11 MAC unfairness and interference due to the hidden node problem. I'm curious to know whether any work has been done to measure the severity of the loss (lack of potential transmission) due to the exposed node problem. And if this loss is significant, I wonder what could be done to remedy it.

Friday, November 19, 2010

EEC: A Brilliant Approach

In a recent SIGCOMM publication on improving wireless communication, someone had a brilliant idea. Usually, when a packet is received by a wireless device, the packet is checked for errors. If the packet has errors that can't be recovered, it is discarded. The whole packet. The paper gives the following graph showing the fraction of bits that are typically corrupted in a wireless mesh network setting:

[Figure omitted: distribution of the fraction of corrupted bits per packet in a wireless mesh setting]

This shows that about 50% of packets have some corruption, in the range of 0 to 2% of their bits. The authors claim that corrupted packets still have value, provided the fraction of corrupted bits can be estimated. For example, error-tolerant applications like video streaming could leverage corrupted packets to increase throughput when the error rate is below a threshold. Other applications could use the estimated error rate to drive decisions such as route selection or WiFi rate adaptation.

As a result, they propose a concept known as error estimation codes (EEC). The idea is to add a small amount of extra data to a packet and use it for error estimation. This is similar to error correction codes, which also introduce redundancy but use it to correct errors. The problem with error correction is that correcting a more heavily corrupted packet requires much more redundancy. EEC's additional redundancy guarantees a fixed estimation precision and stays the same regardless of the error rate.
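
A toy sketch of the underlying idea (my own simplification, not the EEC construction from the paper): sprinkle a few parity bits over random groups of payload bits, then estimate the bit-error rate from the fraction of parity checks that fail.

    # Toy error-estimation sketch (my simplification, not the paper's EEC
    # construction): parity bits over random bit groups let the receiver
    # estimate how corrupted a packet is without being able to correct it.
    import random

    random.seed(2)
    PAYLOAD_BITS = 8000
    GROUP_SIZE = 8        # payload bits covered by each parity bit
    NUM_GROUPS = 200      # overhead: 200 parity bits for 8000 payload bits

    payload = [random.randint(0, 1) for _ in range(PAYLOAD_BITS)]
    groups = [random.sample(range(PAYLOAD_BITS), GROUP_SIZE) for _ in range(NUM_GROUPS)]
    parities = [sum(payload[i] for i in g) % 2 for g in groups]

    TRUE_BER = 0.02
    def noisy(bits):
        # Channel: flip each bit independently with probability TRUE_BER.
        return [b ^ (random.random() < TRUE_BER) for b in bits]

    rx_payload, rx_parities = noisy(payload), noisy(parities)
    failed = sum((sum(rx_payload[i] for i in g) % 2) != rp
                 for g, rp in zip(groups, rx_parities))
    fail_rate = failed / NUM_GROUPS

    # A parity check over k+1 bits fails iff an odd number of them flipped:
    #   P(fail) = (1 - (1 - 2p)^(k+1)) / 2, which we invert to estimate p.
    est_ber = (1 - (1 - 2 * fail_rate) ** (1 / (GROUP_SIZE + 1))) / 2
    print("true BER %.3f, estimated BER %.3f" % (TRUE_BER, est_ber))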

Friday, November 12, 2010

Network Coding

Recently I read the paper "XORs in the Air: Practical Wireless Network Coding" and was impressed with the use of network coding. The idea behind network coding is simple. First, there is the principle that if A xor B = C, then B = A xor C and A = B xor C. Applied to a wireless setting, suppose there are two hosts and a router. Host 1 has A but wants B, and host 2 has B but wants A. The router can satisfy the hosts with two transmissions by sending each host the packet it needs. Or, it can satisfy both hosts with a single transmission by broadcasting C to both simultaneously.
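
A minimal sketch of the XOR trick on byte strings (illustrative only, not COPE's actual implementation):

    # The router broadcasts A xor B once; each host recovers the packet it is
    # missing by XORing with the packet it already has. Illustrative only --
    # real systems must handle packets of different lengths (e.g. by padding).
    def xor_packets(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    packet_a = b"data held by host 1, wanted by host 2"
    packet_b = b"data held by host 2, wanted by host 1"

    coded = xor_packets(packet_a, packet_b)            # one broadcast, not two unicasts

    assert xor_packets(coded, packet_a) == packet_b    # host 1 recovers B
    assert xor_packets(coded, packet_b) == packet_a    # host 2 recovers A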

Network coding has large potential benefits for wireless communication, because the bottleneck is often the wireless medium itself. By reducing the number of transmissions each node must make, more data can be sent. The paper names several topologies and classifies their coding gain, the factor by which network coding reduces the number of transmissions required. Some topologies provide a coding gain between 1x and 2x, and a special "wheel-like" topology provides an unbounded coding gain. In a wheel topology, N hosts surround one router; given a sending host, every other host except the one directly across from it can hear the sender. The authors mention that determining the coding gain of an arbitrary topology is a hard problem, though. I wonder what has been researched in that area.

Wednesday, November 10, 2010

Evaluating Wireless Mesh Networks

Wireless Mesh Networks (WMNs) have been the topic of much research over the past few years. Challenges inherent in transmitting data over a wireless medium have caused researchers to rethink many commonly held assumptions about how to implement a network. For example, the hidden and exposed terminal problems account for much of the collision-based loss in a wireless network. Also, the design of the 802.11 MAC is not optimal when multiple nodes send concurrently. At the network layer, route selection is different because it depends on the signal quality between any two nodes and on the ability of nodes to back off while others transmit. Because most loss in WMNs does not come from queue overflow, important assumptions behind TCP break down.

One paper evaluates Roofnet, a testbed in Cambridge, MA. It shows that, unlike in wired networks, throughput often increases as wireless nodes become denser. It also shows that a few nodes are responsible for most of the network's success. Another paper uses a simulation of a wireless backhaul network to show that there is significant unfairness when multi-hop flows compete against single-hop flows. One problem is that the 802.11 MAC's backoff mechanism makes it difficult for a node to transmit when another node is already transmitting. Another is that the hidden terminal problem causes a lot of loss from collisions.

In short, WMNs are a new and interesting, unsolved research topic. I'm interested to see what conclusions will be made after several more years of research in Wireless Mesh Networks.

Tuesday, November 9, 2010

Simulating TCP with OMNeT++

For the past several years, ns-2 has been the predominant packet-level simulator used by academics in the US. Because TCP was also developed in the West by academics, ns-2 has become a reputable simulator of TCP; in fact, it has become a sort of paragon for how TCP should behave. A 1996 paper by Kevin Fall and Sally Floyd, published in ACM's Computer Communication Review, uses ns-2 to demonstrate the behavior of four variants of TCP when packets are dropped. They use graphs of the simulator's output to show how Tahoe, Reno, New-Reno, and SACK respond to one, two, three, and four induced packet drops. Fall and Floyd's paper is not only helpful in showing how TCP behaves under packet loss, but can also be used as a benchmark for a correct TCP implementation.

We decided to try to replicate the results of Fall and Floyd using OMNeT++, an open-source simulator developed primarily in Europe and the East. OMNeT++ is a growing simulator, and it is not restricted to network simulations. Since OMNeT++ is less common in the West, we were interested in how its TCP implementations compare with those of ns-2. We used the network modules from the INET package and ran simulations of Tahoe, Reno, New-Reno, and SACK TCP with one, two, three, and four drops. A thorough description of our simulations and findings is given in this paper.

We learned that INET's Reno behaves identically to ns-2's, and that INET's NewReno and SACK behave similarly to ns-2's, but INET's Tahoe implementation has some oddities. With two or more packet drops, INET's Tahoe tends to time out rather than perform a typical slow start as it should. As for OMNeT++'s ability to scale, we were able to run 1,000 simultaneous TCP transfers with a simtime-to-real-time ratio of approximately 1:170. We conclude that OMNeT++ can simulate TCP reasonably in most cases and can scale reasonably, and that INET's Tahoe implementation should be improved.

Tuesday, November 2, 2010

Summary of Network-Layer Discussion

The last few weeks have been instructive as our 660 class has discussed a dozen or so significant publications dealing with the network layer of the Internet's architecture. From BGP replacements to efficient queueing to network neutrality to multicast, we have covered a large set of the current problems at the network layer.

One recurring theme was that many people and politics influence the policies implemented in the Internet. For example, the network neutrality debate involves a number of parties: ISPs, customers, content providers, etc. Multicast is not implemented at the network layer just because it works well; router designers must be convinced it is worth the effort to modify their products, ISPs must be willing to deploy it, and there must be demand from end users.

Another theme I noticed is that ideas often are not deployed immediately because of the risk and consequences involved in switching to an entirely new architecture. HLP is a protocol designed to overcome many of the problems inherent in BGP, but there is considerable risk in switching inter-domain routing protocols: any problems would affect the whole Internet. It is for that reason that researchers look for accurate ways to model and simulate a protocol's performance. The IPv6 upgrade, which always seems to be on the brink of arriving, will carry with it negative consequences for those who cannot upgrade. Because of the difficulties involved in adopting new architectures and protocols, many of these designs serve as building blocks and ideas for future protocols.

While it is not easy to get a network protocol deployed in the Internet today, there is considerable insight in the ideas suggested in the papers we have read. I was particularly interested in the new routing methodologies. Routing table size is growing rapidly, so research is looking at routing designs that can scale indefinitely. I was impressed with the ideas that came from ROFL and Compact Routing.

Wednesday, October 27, 2010

Scribe as an Application-Layer Multicast Architecture

The majority of multicast today is done at the application layer, because routers do not support multicast at the network layer. Scribe is a P2P architecture that does application-layer multicast. It runs on top of Pastry, a P2P overlay similar to Chord that provides efficient host lookup and routing. Scribe's architecture resembles other multicast architectures: there are nodes that forward data, groups are organized into a tree-like forwarding hierarchy, and reverse path forwarding is used to create the tree. Scribe and Pastry came from Rice University and Microsoft Research. Based on a paper on Scribe, it is apparent that it has certain advantages and disadvantages compared to IP-layer multicast.

In any higher-layer multicast system, there is a tradeoff between data duplication and path length. That is, there are two ways a host can send multicast data. One option is to duplicate the data early and send a copy via the Internet's best-path route to each destination; this generally decreases transmission delay but increases transmission overhead. The other option is to send the data toward hosts near the destinations, not duplicating it until necessary; this reduces transmission overhead but increases the length of the path the data must travel to reach its destinations. Scribe chooses the latter option, but fortunately Pastry provides an efficient routing scheme: on average, Pastry routes are between 1.5 and 2.2 times the length of the Internet's best route. To recap, Scribe's paths are 1.5 to 2.2 times as long as IP multicast's, but it largely avoids duplicate transmissions.

A significant advantage of Scribe over IP-layer multicast is that it balances load very well, allowing a large number of groups and permitting large group sizes. Pastry provides this advantage: the root of the tree is effectively chosen at random, and route selection is randomized so that data traverses more nodes than it would if the optimal path were always used.

Another advantage is that, although Scribe only provides best-effort service, it could reasonably be extended to provide reliable, ordered end-to-end delivery, since each link in a tree is a TCP connection. Reliability would most likely be easier to implement in this architecture than in one at the network layer, although I'm not familiar with every network-layer approach.

My only question about Scribe is: Does it work well enough for video conferencing?

Monday, October 25, 2010

Network Neutrality

From a class discussion/debate on network neutrality, I was surprised at how many different perspectives there can be on an issue as simple as whether to regulate traffic based on its type and source. Yet it is not too surprising when you consider how many factors influence network service providers.

ISPs dislike applications that hog bandwidth and dislike it when they need to provide service to those that don't pay for it. They favor customers that pay more for service and favor partners that make them profitable. All of these factors give ISPs reasons to bias their service.

On the other hand, the Internet thrives when restrictions are minimal. New applications may require more bandwidth and do cool things other apps cannot. But when their traffic is regulated, the Internet's growth is also, in a sense, regulated, and there is less incentive to make it more powerful for future applications.

What is my position on network neutrality? Part of me says that the Internet is application-centric, that ISPs should never discriminate by traffic type, and that ideally there should be no need for differentiated services. But at the same time, ISPs need to be profitable, and traffic needs to be regulated to some extent. I'm okay with them doing this as long as they make their policies clear. That way, application designers can adapt to the changing needs of ISPs.

Saturday, October 23, 2010

Multicast

Multicast has been a large research area since the 90s, and it has produced some useful insights into how to get a subset of hosts in the Internet to communicate effectively. Multicast has traditionally been done at the network layer. The motivation for network-layer multicast is twofold. First, it reduces the amount of work done at the application and transport layers. Second, because packets are not duplicated until their paths diverge, duplicate data transmission is avoided and unnecessary congestion is reduced.

Because of the difficulties involved in the Internet's expansion and in integrating multicast into routers, multicast has not been widely deployed at the network layer. Instead, it is often implemented at the application layer. While this involves more overhead, it at least allows multicast to be deployed. Perhaps our experience deploying it at the application layer will give us further insights that make it possible to deploy multicast at the network layer in the future as an optimization.

Saturday, October 16, 2010

Improving BGP

BGP may have been considered adequate for inter-domain routing in the past, but it seems to be becoming less and less suited to the growing Internet. One paper asserts that inter-domain routing tables grew six-fold in size from 1997 to 2005. About 25% of routes continually flap, and other routes take between two and five minutes to converge. A single router misconfiguration can also have a global impact on the Internet's performance.

HLP addresses route flapping and convergence problems by hiding information in route advertisements, which also improves the isolation of problems. Its authors claim that it reduces the number of route advertisements by a factor of 400.

HLP demonstrates that certain techniques can greatly improve on the existing protocol. Yet ISPs are reluctant to deploy an entirely new protocol. Maybe these contributions would be more powerful if they were packaged not as a new protocol, but as incremental changes that lead to increased performance.

Enforcing Internet Traffic Policies

Floyd and Fall's "Promoting the Use of End-to-End Congestion Control in the Internet" discusses the problems caused by flows that are not regulated by congestion control and are not TCP-friendly. It also shows how such flows can be controlled: routers can examine traffic, determine whether a flow is TCP-unfriendly, and then refuse to route that flow's packets.
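
The paper's TCP-friendly bandwidth bound makes this test concrete. Here is a sketch of the check, using the equation from the paper with example numbers of my own:

    # A flow is suspect if its arrival rate exceeds the bandwidth a conformant
    # TCP would get with the same packet size B, round-trip time R, and drop
    # rate p:  T <= 1.5 * sqrt(2/3) * B / (R * sqrt(p))  ~=  1.22 * B / (R * sqrt(p)).
    # The example numbers below are mine, not the paper's.
    import math

    def tcp_friendly_rate_bps(packet_size_bytes, rtt_s, drop_rate):
        return 1.22 * packet_size_bytes * 8 / (rtt_s * math.sqrt(drop_rate))

    def is_unfriendly(measured_rate_bps, packet_size_bytes, rtt_s, drop_rate):
        return measured_rate_bps > tcp_friendly_rate_bps(packet_size_bytes, rtt_s, drop_rate)

    limit = tcp_friendly_rate_bps(1500, 0.1, 0.01)
    print("TCP-friendly limit: %.2f Mbps" % (limit / 1e6))   # about 1.46 Mbps
    print(is_unfriendly(5e6, 1500, 0.1, 0.01))               # a 5 Mbps flow gets flagged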

A search using Google Scholar shows that over 1700 publications have referenced this article. It is reasonable to assume that ISPs are aware of techniques to block greedy flows. If these techniques are used today, I imagine that most traffic regulation is done in edge networks rather than at the core of the Internet, since the overhead of examining traffic grows as more flows traverse the same router.

It is known that ISPs throttle certain types of traffic, but not all of their policies are advertised. Knowing policies influences how applications are engineered. For example, ISP-friendly versions of BitTorrent have been made that make both ISPs and end-users happy. The more we know about ISP policies, the more we are able to make applications that are ISP-friendly, and that also work well for end users.

Friday, October 8, 2010

Westwood+ TCP

Recently I was curious about what TCP settings I could change in Ubuntu, so I looked through the documentation. One option enables the 'Westwood+' algorithm. That looked interesting, so I wanted to find out why one would use Westwood+ instead of the default algorithm.

The man page says that Westwood+ has two advantages: 1) significantly increased fairness when competing with Reno in wired networks, and 2) increased throughput over wireless links. With Westwood+, the sender dynamically sets SSTHRESH and CWND based on an end-to-end bandwidth estimate.

Fairness is better because, when a sender realizes that its sending rate is decreasing, it reacts by decreasing CWND more severely, allowing faster convergence of rates.

In wireless networks, loss is often due to interference, yet Reno treats every loss as congestion. When a Westwood+ connection receives duplicate ACKs, it takes its steady-state bandwidth estimate into account when reducing CWND, and as a result throughput does not fall the way Reno's would.
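
Roughly, as I understand it from the man page and the Westwood+ papers (a sketch, not the Linux kernel's implementation), the reaction to duplicate ACKs looks like this:

    # Sketch of the Westwood+ reaction to loss, as I understand it -- not the
    # actual kernel code. Instead of blindly halving cwnd like Reno, it sets
    # ssthresh to the pipe size implied by the end-to-end bandwidth estimate.
    def westwood_on_dup_acks(cwnd_segs, bw_estimate_bps, rtt_min_s, seg_size_bytes=1460):
        ssthresh = max(2, int(bw_estimate_bps * rtt_min_s / (8 * seg_size_bytes)))
        cwnd = min(cwnd_segs, ssthresh)
        return cwnd, ssthresh

    # Example: a wireless link where the duplicate ACKs were caused by
    # interference, not congestion. The bandwidth estimate is still ~8 Mbps,
    # so the window barely shrinks, whereas Reno would cut it in half.
    cwnd, ssthresh = westwood_on_dup_acks(cwnd_segs=60,
                                          bw_estimate_bps=8e6,
                                          rtt_min_s=0.1)
    print(cwnd, ssthresh)    # 60 68, versus Reno's cwnd of 30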

Wednesday, October 6, 2010

Deployment Considerations for CUBIC TCP

CUBIC is a variant of TCP that was presented in ACM SIGOPS Operating Systems Review in 2008. CUBIC extends and improves on the BIC variant of TCP. Like BIC, CUBIC is intended for long, high-bandwidth links where the bandwidth-delay product is high. The standard TCP that most computers run today cannot fully utilize such paths at speeds beyond roughly 100 Mbps, because its transmission rate increases linearly, with a low coefficient.

BIC solves this problem by growing the window with a large coefficient at first, with growth smoothing out over time. But a problem with BIC is that it takes more than its fair share of bandwidth when standard (NewReno, SACK) TCP must compete with it. CUBIC keeps the advantages of BIC but improves TCP-friendliness on connections with low delay (~5ms). So why don't we make CUBIC the default variant of TCP in our operating systems? There are several points to consider; a sketch of CUBIC's window-growth function appears at the end of this post.

Incentives:
  • Better RTT-fairness. Because the rate of window increase (in congestion avoidance) depends on the RTT in standard TCP, flows with a large delay perform worse than flows with a low delay. CUBIC (and BIC) improve RTT-fairness; in CUBIC's case, window growth depends on the time elapsed since the last loss event rather than on the RTT.
  • Better link utilization. As explained above, CUBIC will occupy a link's bandwidth much faster.
  • Friendliness with standard TCP in low speed (~10Mbps), high delay (100ms) links.

Drawbacks:
  • Unfriendliness in large-delay (~40ms), high-bandwidth paths. In one case, CUBIC takes 80% of the available bandwidth, and leaves standard TCP with 20%.
  • Large bandwidth-convergence time among competing CUBIC flows. With a 400Mbps link, it can require several hundred seconds for an existing connection to lower its rate while the new connection increases its rate. However, convergence time is much lower when there is less available bandwidth, e.g. 100Mbps.
These points are perhaps a good start for deciding whether to deploy CUBIC on the Internet. While much remains unconsidered, they give a general picture of how CUBIC performs.
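
For reference, here is the cubic window-growth function that drives these properties, using the commonly cited default constants (a particular implementation may tune them differently):

    # The cubic window-growth function from the CUBIC paper:
    #   W(t) = C * (t - K)^3 + W_max,  with  K = cbrt(W_max * beta / C),
    # where t is the time since the last loss event. Constants below are the
    # commonly cited defaults; implementations may differ.
    C = 0.4       # scaling factor
    BETA = 0.2    # multiplicative window decrease on loss

    def cubic_window(t_since_loss_s, w_max_segs):
        k = (w_max_segs * BETA / C) ** (1.0 / 3.0)
        return C * (t_since_loss_s - k) ** 3 + w_max_segs

    # The window climbs steeply at first, plateaus near the old maximum W_max
    # (gently probing around the previous congestion point), then grows steeply
    # again -- and the growth depends on elapsed time, not on RTT, which is
    # where the improved RTT-fairness comes from.
    for t in (0, 2, 4, 6, 8):
        print(t, round(cubic_window(t, w_max_segs=100)))   # 80, 98, 100, 105, 132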

Saturday, October 2, 2010

Motivation for a Revised Transport Protocol

Recently I wrote about TCP Vegas, an interesting transport protocol that came close to being used as a replacement for Reno. Challenges come up when one tries to get a "better" protocol out to replace the existing one. A new transport protocol will thrive most when it offers something that existing protocols don't provide.

Implicit in a transport protocol's design is the way it fulfills an application's requirements. For example, TCP allows a file to be transferred in its entirety, and with the guarantee that it will arrive in the same form it is sent. Multimedia streaming transport protocols are built to best accommodate the needs of a multimedia application.

A transport-layer protocol also needs to account for the underlying architecture. TCP assumes that loss is primarily due to packets being dropped from queues at the network layer. Wireless transport protocols, such as ATP and Hop-by-Hop, are built to maximize throughput in a wireless network, where loss is due primarily to the hidden- and exposed-node problems. The key to a successful transport protocol is how well it meets the needs of a prevalent network and how well it serves the application above it.

Will TCP Vegas ever be Used?

TCP Vegas is a flavor of TCP that uses round-trip time estimates, rather than loss detection, to adjust its sending rate. Vegas came out when TCP Reno was prevalent in the Internet, and it offered performance advantages over Reno: its inventors reported a 40-70% throughput increase, as well as at most half the loss of TCP Reno.
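
Roughly, the Vegas congestion-avoidance rule works as follows (my paraphrase of the published algorithm, not any particular implementation):

    # Sketch of Vegas congestion avoidance: compare the expected and actual
    # sending rates and keep a small number of extra packets queued in the
    # network, adjusting cwnd before any loss occurs. Threshold values are
    # typical ones; implementations vary.
    ALPHA = 2   # lower bound on extra packets in flight
    BETA = 4    # upper bound on extra packets in flight

    def vegas_adjust(cwnd, base_rtt, current_rtt):
        expected = cwnd / base_rtt            # rate if no queuing anywhere
        actual = cwnd / current_rtt           # rate actually achieved
        extra_packets = (expected - actual) * base_rtt
        if extra_packets < ALPHA:
            return cwnd + 1                   # path looks underused: speed up
        if extra_packets > BETA:
            return cwnd - 1                   # queues are building: back off
        return cwnd

    print(vegas_adjust(cwnd=20, base_rtt=0.100, current_rtt=0.105))  # 21: grow
    print(vegas_adjust(cwnd=20, base_rtt=0.100, current_rtt=0.135))  # 19: back off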

While the Vegas algorithm seemed like a promising alternative, it received criticism from experts, including Van Jacobson, the inventor of TCP's original congestion-control mechanism. Jacobson argued that Vegas violates the flow model and that the Internet could not support it. Vegas was also thought to perform worse when competing with Reno flows. And since Vegas was invented, newer versions such as NewReno and SACK TCP have emerged that offer higher throughput and less loss.

It makes me wonder whether Vegas outperforms the newer versions of TCP. If it works better than SACK TCP, and can be built to follow the flow model, then it may still be advantageous to use TCP Vegas.

Saturday, September 25, 2010

BitTorrent and its Usage

BitTorrent is a file-sharing mechanism that allows clients to download a file by exchanging pieces of it with each other. In that way, when demand for a file is high, scalability is not an issue, because BitTorrent does not depend on a central server. BitTorrent has been popular since its invention because of its ability to provide users with the files they want, and as such it makes up a significant portion of Internet traffic.

BitTorrent over the Internet is an extremely complex system, and people have tried to understand its performance for several reasons. First, quality of service matters to users: they want to get a file in a reasonable amount of time, preferably without having to upload too much of it in return. BitTorrent's performance varies across clients depending on network conditions. It requires the initial seeder to upload a large portion of the file, which may deter people from seeding in the first place. Also, clients can be engineered to download a file without contributing uploads of their own. For these reasons, people have tried to improve it.

ISPs have been concerned about BitTorrent's bandwidth consumption. Because a BitTorrent client typically uses 4 TCP connections at once, it tends to take bandwidth away from other connections. As a result, ISPs have throttled BitTorrent traffic. Work has been done to improve spatial locality by matching peers with other peers within the same ISP. This has greatly improved download speeds for users and has benefited ISPs by sparing them from relaying external BitTorrent traffic.

The more reliable BitTorrent becomes, the more likely it is to be used for things beyond file sharing. For example, research has been done on how to use BitTorrent for video streaming. Since video streaming is not tolerant of delay and jitter, improvements to BitTorrent can be valuable. It makes me wonder what BitTorrent will be used for as it becomes more and more reliable.

Friday, September 24, 2010

Towards Optimal Internet Video

Streaming multimedia over the Internet has been a work in progress for more than a decade. In 2001, audio streaming had been well-investigated. But with the advent of high-speed technology such as ATM and Ethernet, interest turned toward streaming video. With barely enough network speed to do it at the time, people considered how to adapt video to application layer streaming.

One survey addresses solutions to these challenges at three different levels: encoding, application streaming, and operating system support. It discusses how the data can be compressed at different quality levels and how applications can choose which quality level to stream at. It does not assume that video will be streamed over TCP, but shows how UDP can be used with error correction schemes. It even shows how an operating system could be changed to better accommodate a video application.

While many of the same techniques discussed in the survey are still widely used, video streaming has changed and improved since then. Due to the improvements in Internet architecture and the increasing demand for Internet video (and I will add multi-core processors), a variety of other strategies are used to effectively stream video.

Because of high demand for video from certain sites, P2P technologies are used to improve scalability. P2P methods such as those used by BitTorrent are being used to offload the stress on servers to clients and other network hosts, allowing a larger number of clients to receive video.

Most Internet video today is streamed over TCP. This is because network capacity is now sufficient to transfer normal-quality video in a timely manner despite the overhead required to guarantee in-order, loss-free delivery. TCP is also used because it provides congestion control; in other words, it is friendly to other data streams in the network.

But video is loss-tolerant to a certain extent, and overhead could be saved when streaming high-density video. Could a video transport protocol be helpful if it omitted the overhead required for full data integrity but kept the network-friendliness of congestion control?

Saturday, September 18, 2010

Thoughts on Chord

A peer-to-peer system departs from a client-server system in that each peer has equal privileges and similar responsibilities. Most such systems lack a centralized source of control. Peer-to-peer systems have been built to accomplish a variety of tasks: job partitioning, content distribution, content lookup, file systems, and media streaming.

In 2001, a significant peer-to-peer system known as Chord was introduced for the purpose of scalable file lookup. With Chord, a number of peers are organized into a large ring. Each peer stores a number of files. A peer does not choose which files it stores; that is determined by its position in the ring and by the names of the files.

When someone wants to access a particular file, they use Chord's lookup service to identify where the file is stored. Because of Chord's design, this lookup takes a number of steps logarithmic in the number of peers in the ring. This was a great improvement over contemporary file lookup services, which usually required querying a significant fraction of the group. With Chord, the group of peers could easily scale.
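
Here is a tiny sketch of the idea: node IDs and file keys share one identifier ring, a key lives at its successor node, and finger tables let each hop cover a large chunk of the remaining distance. It is greatly simplified (static membership, no stabilization or failure handling):

    # Greatly simplified Chord-style lookup on an identifier ring.
    import hashlib
    from bisect import bisect_left

    M = 32                                    # identifier bits
    RING = 2 ** M

    def chord_id(name: str) -> int:
        return int(hashlib.sha1(name.encode()).hexdigest(), 16) % RING

    nodes = sorted(chord_id(f"node{i}") for i in range(50))

    def successor(ident):
        i = bisect_left(nodes, ident % RING)
        return nodes[i % len(nodes)]          # wrap around the ring

    def finger_table(n):
        return [successor(n + 2 ** k) for k in range(M)]

    def lookup(start, key_id):
        """Greedy routing via finger tables; returns (responsible node, hops)."""
        dist = lambda a, b: (b - a) % RING
        target = successor(key_id)
        n, hops = start, 0
        while n != target:
            # Take the finger that gets closest to the target without overshooting.
            n = max((f for f in finger_table(n) if dist(n, f) <= dist(n, target)),
                    key=lambda f: dist(n, f))
            hops += 1
        return n, hops

    node, hops = lookup(nodes[0], chord_id("some_file.mp3"))
    print(f"key stored at node {node}, found in {hops} hops")   # a handful of hops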

Chord's architecture is well designed, but there are several things it does not address. First, one peer in the ring may have fewer resources (e.g., bandwidth, disk space) than another; that peer will slow down transfers for some files, or may not be able to store a file at all. While Chord distributes files evenly among peers, it would still be preferable to place most of the load on the most capable peers. It is not likely that a simple change to Chord could accomplish this.

Also, peers may want to minimize the shuffling of files among themselves. That is, a peer may want to maximize the number of its own files that it stores and minimize the number of files it must store that it previously did not have. To make this possible, Chord could be modified to store references to files instead of the files themselves. With this modification, it remains to figure out how peers announce which files they have to begin with, and what to do when all peers holding a particular file leave the group.

Thursday, September 16, 2010

Metrics and Measurements

In 1997, Vern Paxson published a seminal paper on Internet measurement. His publication, titled "End-to-End Internet Packet Dynamics", attempted to characterize the Internet's behavior in terms of concise metrics: out-of-order delivery, packet corruption, bottleneck bandwidth, and packet loss.

Internet measurement effectively "bridges the gap" between theory and practice. Paxson's measurements demonstrated how common assumptions in hardware and protocols were often violated. That gave protocol and hardware designers the information needed to optimize existing protocols and hardware. As the Internet is a dynamically changing system, it must be continually measured in order to evaluate whether its architecture is still sufficient for its load and topology.

Saturday, September 11, 2010

Thoughts on an End-Middle-End Architecture

Many see good reasons to allow functionality to reside in the core of the Internet, and some have suggested modifying the Internet to be friendlier toward middleboxes. In their SIGCOMM '07 publication "An End-Middle-End Approach to Connection Establishment," Saikat Guha and Paul Francis of Cornell University point out several reasons why middleboxes are not only an unavoidable part of the Internet but also desirable.

Middleboxes are useful for several reasons. First, they provide a way to block unwanted packets before they reach an endpoint, which helps defend against DoS attacks. Second, by adding functionality to the core, middleboxes allow third parties (ISPs, corporate organizations) to gain partial control of, and information about, the connections that pass through their networks. Also, with more functionality in middleboxes, firewalls can be more accurate about which packets are blocked and which are not.

In their paper, Guha and Francis identify five services that should be provided on a connection. Most of these services are already provided to some extent in the current Internet:
  1. User-friendly host naming
  2. Network-level identification of all hosts, and best-effort delivery
  3. A way for a host to know which packet should be delivered to which applications
  4. Blocking unwanted packets
  5. Negotiation of middlebox usage between endpoints and networks in between
The first two are provided through the Domain Name System and IP addresses. Ports provide the third. The last two services are where middleboxes come in: blocking unwanted packets is done by firewalls, and the fifth service is provided by the architecture they propose, known as NUTSS.

NUTSS divides a connection into two phases. In the first, access control is negotiated and the location of the end hosts is determined. The second is the actual transfer of data. Negotiation takes place through policy boxes (P-boxes), which provide authentication for the actual data flows.

In a data flow, the end hosts must present the aforementioned authentication to middleboxes (M-boxes) along the way. To enhance M-boxes' awareness of the data, a fifth element is added to TCP: the service identifier, a globally unique identifier of the type of application the hosts are using to communicate.

Based on what I understand of the NUTSS architecture, I am impressed with how it separates addressing from naming more than DNS does. Addresses are provided by P-boxes under the assumption that they may change over time; Mobile IP could easily be implemented in this architecture.

I agree that firewalls should be more intelligent about how they block packets. However, I don't understand how adding a service identifier to a connection helps a firewall filter packets beyond knowing which application a given packet maps to. To block packets more accurately, a firewall might have to assemble an entire response, which would be difficult.

A third impression I have from reading about this architecture is that P-boxes will attract a lot of attacks. If a policy box is compromised, it could be made to grant connection access to hosts it would otherwise deny. If this architecture is to be incrementally deployed, more should be said about how to secure P-boxes.

Friday, September 10, 2010

Should Middleboxes be Allowed?

One of the major design aspects of the Internet's architecture is the end-to-end principle: that complexity be kept at the endpoints and that the core remain simple. The basis of this principle is that the core of the Internet must stay simple so it can focus on moving data as fast as possible. The endpoints (servers, clients, etc.) are then responsible for ensuring that all data arrives intact and in the right order.

While the end-to-end principle has been in force since the Internet's beginnings, it has been increasingly violated by middleboxes. A middlebox is any host that sits between two communicating endpoints, i.e., somewhere in the core. What kinds of middleboxes are prevalent in the Internet? NATs, firewalls, proxies, web caches, traffic shapers, and protocol translators are all examples.

Many have looked down on the use of middleboxes in the Internet for a variety of reasons. According to one RFC, NATs cause the following problems:
  1. They create a single point where fate-sharing does not work
  2. They make multi-homing difficult
  3. They inhibit the use of IPSec
  4. They enable casual use of private addresses, causing name space collisions.
  5. They facilitate concatenating existing private name spaces with the public DNS.
In addition, any type of middlebox other than a cache could hamper the performance of a connection. For these reasons and others, Internet architects have looked down on end-to-end principle violations.

Why then are middleboxes used? A NAT, or Network Address Translator, mitigates the address shortage in IPv4. Firewalls and proxies block unwanted traffic. Caches improve the locality of content, potentially reducing load on the core of the Internet. Traffic shaping improves service for certain classes of content, and protocol translators are necessary for the incremental deployment of IPv6. While the Internet could perhaps be redesigned to obviate the need for them, middleboxes are necessary given today's architecture.

Saturday, September 4, 2010

DONA's Implications

A recent publication proposed a new architecture called DONA (Data-Oriented Network Architecture). The publication argues that users have several expectations that the Internet does not efficiently fulfill.

First, users expect resource names to be persistent. They don't like a disappointing 404 indicating that a resource has moved or that content has been re-hosted elsewhere. HTTP has redirects to remedy this problem, but that may not be the most efficient solution.

Users also rely on the availability of data: it must be there, and it must be quickly accessible. A single server is not always adequate to supply an extremely popular resource, which is why CDNs and self-scaling file-sharing protocols like BitTorrent have been created. The Internet, however, was designed to associate data with a location, which is not consistent with such systems.

Third, users want their data to be authentic: unmodified and from a reliable source. Today this is achieved by securing a channel, allowing two hosts to communicate securely.

DONA improves how the Internet achieves these goals. It introduces what it calls "flat, self-certifying names," which means that resource names are no longer tied to addresses. Instead, names are based on principals, which hold public/private key pairs and thus allow data to be authenticated. Using these names allows a resource to be queried independently of its location, so content can move and can be hosted in multiple places. In that way, data is persistent, available, and authentic.
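
A toy illustration of what "self-certifying" buys you (my own sketch of the principal:label naming idea, not DONA's implementation):

    # Toy sketch of a flat, self-certifying name P:L, where P is the hash of
    # the principal's public key and L is a label the principal chose. Anyone
    # holding the name can check that returned data belongs to that principal,
    # no matter which host or cache served it. (Signature checking is stubbed
    # out here; a real system would verify it cryptographically.)
    import hashlib

    def make_name(public_key: bytes, label: str) -> str:
        return hashlib.sha256(public_key).hexdigest() + ":" + label

    def verify_response(name: str, claimed_public_key: bytes, signature_ok: bool) -> bool:
        principal, _, _ = name.partition(":")
        # 1. The supplied public key must hash to the principal part of the name.
        if hashlib.sha256(claimed_public_key).hexdigest() != principal:
            return False
        # 2. The data must carry a valid signature under that key (stubbed).
        return signature_ok

    publisher_key = b"publisher public key bytes (placeholder)"
    name = make_name(publisher_key, "vacation-video")

    print(verify_response(name, publisher_key, signature_ok=True))    # True
    print(verify_response(name, b"attacker key", signature_ok=True))  # False: wrong principal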

Requests for data are routed by name, rather than by host. To facilitate this, there are a number of resolution handlers (RHs), entities that keep track of where data should be routed. This system obviates the need for DNS, speeding up the initial communication process.

With DONA explained, I am left with several questions. First is scalability: the Internet has a huge number of resource names, and I imagine the RHs will fill up and take a long time to look up where to route a request. Second, there may be downsides to decoupling a resource from its provider; in the current system, a resource's name only needs to be unique within its provider's namespace, which reduces the number of names the Internet as a whole must account for. Third, if DONA is to replace the current system, can it be deployed incrementally and seamlessly without affecting the rest of the Internet?

Internet Design Principles: Anything Left Out?

In a graduate networking course at BYU, we review the design philosophy of the Internet. By the "Internet", I refer to the architecture and protocols under which today's applications and services (web browsing, file sharing, media streaming, communication, etc.) are run. The success of the Internet is evaluated in terms of how well its design goals are met.

So what are its design goals? That has depended on what people have said they should be over the course of the Internet's history. The Internet has been around since the '70s, so its goals have shifted over time with advances in hardware and with the new applications that are invented and can be supported at any given time.

Many of the design principles still present in the Internet today were identified by David Clark in 1988. The primary goal he mentioned was to allow communication between different types of networks. Other goals included fault tolerance, support for multiple types of service, accommodation of a variety of networks, distributed resource management, cost effectiveness, ease of attaching computers to the Internet, and accountability for resources.

An additional design consideration since then has been security. The Internet has since been opened to worldwide use, exposing many vulnerabilities relating to privacy and authenticity. Much security research has resulted from this need, and security continues to improve today.

So we have Clark's design goals plus security. Are there additional design goals for the Internet today? By and large, the goals above are still the goals of today's Internet. Perhaps we should examine some recent applications: television and video streaming are becoming a big deal, so perhaps the Internet should make quality of service a bigger priority.