Wednesday, October 27, 2010

Scribe as an Application-Layer Multicast Architecture

The majority of multicast today is done at the application layer, because most routers do not support multicast at the network layer. Scribe is a P2P architecture that does application-layer multicast. It runs on top of Pastry, a P2P overlay similar to Chord that provides efficient host lookup and routing. Scribe's architecture resembles other multicast architectures: there are nodes that forward data, groups are organized as tree-like forwarding hierarchies, and reverse path forwarding is used to create each tree. Scribe and Pastry came from Rice University and Microsoft Research. Based on a paper on Scribe, it is apparent that it has certain advantages and disadvantages compared to IP-layer multicast.
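To make the tree-building idea concrete, here is a minimal sketch of how a join message routed toward a group's root builds a reverse-path forwarding tree. This is my own toy code, not Scribe's implementation: the ToyOverlay is a crude stand-in for Pastry's prefix routing, and all names and ids are made up.

```python
# Toy sketch of Scribe-style tree building: a JOIN is routed hop by hop toward the
# group's root, and every node the JOIN passes through records the previous hop as
# a child. Data then flows down the tree, duplicated only once per child.

class Node:
    def __init__(self, node_id, overlay):
        self.node_id = node_id          # an id string, e.g. "101"
        self.overlay = overlay          # stand-in for the Pastry routing layer
        self.children = {}              # group_id -> set of child node ids

    def join(self, group_id, from_id=None):
        already_forwarding = group_id in self.children
        if from_id is not None:
            self.children.setdefault(group_id, set()).add(from_id)
        if already_forwarding or self.overlay.is_root(self.node_id, group_id):
            return                      # the path from here to the root already exists
        self.overlay.next_hop(self.node_id, group_id).join(group_id, self.node_id)

    def deliver(self, group_id, payload):
        # forward a multicast message down the tree, one copy per child
        for child in self.children.get(group_id, ()):
            self.overlay.node(child).deliver(group_id, payload)


class ToyOverlay:
    """Crude stand-in for Pastry: route by improving the prefix match with the
    group id one step at a time; the root is the node matching it best."""
    def __init__(self, node_ids):
        self.nodes = {nid: Node(nid, self) for nid in node_ids}

    def _prefix_len(self, a, b):
        n = 0
        while n < min(len(a), len(b)) and a[n] == b[n]:
            n += 1
        return n

    def root_id(self, group_id):
        return max(self.nodes, key=lambda nid: self._prefix_len(nid, group_id))

    def is_root(self, node_id, group_id):
        return node_id == self.root_id(group_id)

    def next_hop(self, node_id, group_id):
        mine = self._prefix_len(node_id, group_id)
        better = [n for n in self.nodes if self._prefix_len(n, group_id) > mine]
        # take the smallest improvement, to mimic multi-hop routing toward the root
        return self.nodes[min(better, key=lambda n: self._prefix_len(n, group_id))]

    def node(self, node_id):
        return self.nodes[node_id]


overlay = ToyOverlay(["000", "011", "101", "110", "111"])
overlay.node("000").join("111")     # two members join group "111"
overlay.node("011").join("111")     # their joins merge at an interior node
overlay.node("111").deliver("111", b"hello group")
```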

In any higher-layer multicast system, there is a tradeoff between data duplication and path length. That is, there are two ways a host can send multicast data. One option is to duplicate the data immediately and send a copy along the Internet's best-path route to each destination; this generally decreases transmission delay but increases transmission overhead. The other option is to send the data toward hosts near the destinations, not duplicating it until necessary; this reduces transmission overhead but increases the length of the path the data must travel to reach its destinations. Scribe chooses the latter option, and fortunately Pastry provides an efficient routing scheme: on average, Pastry routes are between 1.5 and 2.2 times the length of the Internet's best route. To recap, Scribe's transmissions travel 1.5 to 2.2 times farther than IP multicast's, but it largely avoids the duplicate-transmission overhead.
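As a toy back-of-the-envelope illustration of the tradeoff (the 2.2 stretch figure is from the paper; every other number below is made up):

```python
# Toy comparison of the two options: unicasting a copy directly to every receiver
# versus forwarding one copy along an overlay tree with some path stretch.
# All numbers except the stretch factor are hypothetical.

receivers = 50          # group size (hypothetical)
direct_hops = 10        # average IP hops from sender to a receiver (hypothetical)
stretch = 2.2           # Pastry's worst-case average route stretch cited in the paper
fanout = 16             # copies any single overlay node has to make (hypothetical)

# Option 1: sender unicasts to everyone -> low delay, high sender overhead.
unicast_copies_at_sender = receivers                 # 50 copies leave the sender
unicast_hops_per_packet = direct_hops                # each copy takes the best IP path

# Option 2: forward along the overlay tree -> low per-node overhead, longer paths.
tree_copies_at_sender = min(fanout, receivers)       # at most `fanout` copies per node
tree_hops_per_packet = direct_hops * stretch         # paths are 1.5-2.2x longer

print(f"unicast: {unicast_copies_at_sender} copies at sender, {unicast_hops_per_packet} hops each")
print(f"tree:    {tree_copies_at_sender} copies at sender, {tree_hops_per_packet:.0f} hops each")
```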

A significant advantage of Scribe over IP-layer multicast is that it balances load very well, allowing both a large number of groups and large group sizes. Pastry provides this advantage. The root of each tree is effectively chosen at random, and route selection is randomized so that data traverses more nodes than it would if a single optimal path were always used.
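Concretely, the randomness of the root comes from hashing the group's name into the same id space as the nodes and picking the node whose nodeId is numerically closest. A minimal sketch of that selection is below; the node table is made up, and real Pastry ids are 128-bit values assigned when a node joins.

```python
# Minimal sketch of randomized root selection: hash the group name into the node id
# space, then pick the node whose id is numerically closest. Node ids are hypothetical.

import hashlib

def group_id(group_name: str, bits: int = 32) -> int:
    digest = hashlib.sha1(group_name.encode()).digest()
    return int.from_bytes(digest, "big") % (1 << bits)

def root_for(group_name: str, node_ids: list[int]) -> int:
    gid = group_id(group_name)
    return min(node_ids, key=lambda nid: abs(nid - gid))

nodes = [0x1A2B3C4D, 0x52F00A11, 0x9E8D7C6B, 0xDEADBEEF]   # hypothetical nodeIds
print(hex(root_for("soccer-scores", nodes)))               # the root "chosen at random" by the hash
```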

Another advantage is that, although Scribe only provides best-effort service, it could reasonably be extended to provide reliable, ordered, end-to-end delivery. Each link in a tree is a TCP connection. Reliability would most likely be easier to add in this architecture than in one at the network layer, although I'm not familiar with every network-layer approach.

My only question about Scribe is: Does it work well enough for video conferencing?

Monday, October 25, 2010

Network Neutrality

From a class discussion/debate on network neutrality, I was surprised at how many different perspectives there can be on an issue as simple as whether or not to regulate traffic based on its type and source. And yet it is not too surprising when you consider how many factors influence network service providers.

ISPs dislike applications that hog bandwidth, and they dislike providing service to those who don't pay for it. They favor customers that pay more for service and partners that make them profitable. All of these factors give ISPs reasons to bias their service.

On the other hand, the Internet thrives when restrictions are minimal. New applications may require more bandwidth and do cool things other applications cannot. But when their traffic is regulated, the Internet's growth is effectively regulated too, and there is less incentive to make the network more capable for future applications.

What is my position on network neutrality? Part of me says that the Internet is application-centric, that ISPs should never discriminate by traffic type, and that ideally there should be no need for differentiated services. But at the same time, ISPs need to be profitable, and traffic needs to be regulated to some extent. I'm okay with them doing this as long as they make their policies clear. That way, application designers can adapt to the changing needs of ISPs.

Saturday, October 23, 2010

Multicast

Multicast has been a large research area since the 90s, and it has produced some useful insights into how to get a subset of hosts on the Internet to communicate effectively. Multicast has traditionally been done at the network layer. The motivation for network-layer multicast is twofold. First, it reduces the amount of work done at the application and transport layers. Second, because packets are not duplicated until the delivery tree branches, duplicate data transmission is avoided and unnecessary congestion is reduced.
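For illustration, the "duplicate only where the tree branches" idea at a single router might look like the sketch below; the forwarding-table layout and function names are mine, not any real router's interface.

```python
# Sketch of the network-layer idea: a router copies a multicast packet only onto the
# outgoing interfaces registered for that group, so duplication happens only where
# the delivery tree branches. Table layout and names are illustrative only.

def forward_multicast(packet: bytes, group: str, forwarding_table: dict[str, list[str]]):
    """Send one copy of `packet` per outgoing interface registered for `group`."""
    for interface in forwarding_table.get(group, []):
        send_on_interface(interface, packet)   # one copy per branch, none before this point

def send_on_interface(interface: str, packet: bytes):
    print(f"sending {len(packet)} bytes out {interface}")   # stand-in for the real transmit path

forward_multicast(b"frame-0001", "239.1.2.3", {"239.1.2.3": ["eth1", "eth3"]})
```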

Because of the difficulties involved with the Internet's growth and with integrating multicast into routers, multicast has often not been deployed at the network level. Instead, it is frequently implemented at the application layer. While this involves more overhead, multicast can still be deployed. Perhaps our experience deploying it at the application layer will give us further insights that make it possible to deploy multicast at the network layer in the future as an optimization.

Saturday, October 16, 2010

Improving BGP

It may have been considered adequate for inter-domain routing in the past, but BGP seems to be becoming less and less suited to the growing Internet. One paper asserts that domain routing tables grew to six times their size between 1997 and 2005. About 25% of routes continually flap, and other routes take two to five minutes to converge. A single router misconfiguration can also have a global impact on the Internet's performance.

HLP addresses the route-flapping and slow-convergence problems by hiding information in route advertisements, which also isolates the effects of local problems. Its authors claim it reduces the number of route advertisements by a factor of 400.
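I have not dug into HLP's exact update rules, but the general flavor of information hiding can be sketched as suppressing advertisements for changes that do not matter beyond the local region. Everything below (field names, the threshold value) is hypothetical and is not HLP's actual mechanism.

```python
# Generic sketch of the information-hiding idea (not HLP's actual rules): only
# re-advertise a route to neighbors when the change is visible from outside,
# e.g. when the path changes or the cost moves by more than some threshold.

COST_HIDING_THRESHOLD = 5   # hypothetical: hide cost jitter smaller than this

def should_advertise(old_route, new_route) -> bool:
    if old_route is None or new_route is None:
        return True                                   # route appeared or disappeared
    if old_route["next_as"] != new_route["next_as"]:
        return True                                   # the path itself changed
    return abs(old_route["cost"] - new_route["cost"]) > COST_HIDING_THRESHOLD

old = {"next_as": 7018, "cost": 20}
new = {"next_as": 7018, "cost": 22}
print(should_advertise(old, new))   # False: a small cost wobble stays hidden, so no flap propagates
```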

This protocol demonstrates that certain techniques can greatly improve on the existing protocol. Yet ISPs are reluctant to deploy an entirely new protocol. Maybe these contributions would be more influential if they came not as a whole new protocol, but as incremental changes that yield better performance.

Enforcing Internet Traffic Policies

Floyd and Fall's "Promoting the Use of End-to-End Congestion Control in the Internet" discusses the problems caused by flows that are not regulated by congestion control and that are not TCP-friendly. It also shows how such flows can be controlled: routers can examine traffic, determine whether a flow is TCP-unfriendly, and then refuse to route that flow's packets.
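A rough sketch of that kind of test follows, using the standard TCP throughput approximation (T ≈ 1.22·B / (R·√p)) that the paper builds on; the flow numbers in the example are made up.

```python
# Sketch of a TCP-friendliness check: compare a flow's measured rate against the
# bandwidth a conformant TCP would get under the same packet size, RTT, and drop
# rate, using T ~= 1.22 * B / (R * sqrt(p)). The example flow is hypothetical.

from math import sqrt

def tcp_friendly_rate(packet_size_bytes: float, rtt_s: float, drop_rate: float) -> float:
    """Upper bound (bytes/s) on the rate of a conformant TCP flow."""
    return 1.22 * packet_size_bytes / (rtt_s * sqrt(drop_rate))

def is_tcp_unfriendly(measured_rate: float, packet_size: float, rtt: float, drop_rate: float) -> bool:
    return measured_rate > tcp_friendly_rate(packet_size, rtt, drop_rate)

# Hypothetical flow: 1500-byte packets, 100 ms RTT, 1% drops, sending 1 MB/s.
print(is_tcp_unfriendly(1_000_000, 1500, 0.1, 0.01))   # True -> a candidate for being penalized
```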

A search using Google Scholar shows that over 1700 publications have referenced this article, so it is reasonable to assume that ISPs are aware of techniques for blocking greedy flows. If these techniques are used today, I imagine most traffic regulation is done at edge networks rather than in the core of the Internet, because the overhead of examining flows grows as more flows traverse the same router.

It is known that ISPs throttle certain types of traffic, but not all of their policies are advertised. Knowing those policies influences how applications are engineered. For example, ISP-friendly versions of BitTorrent have been built that make both ISPs and end users happy. The more we know about ISP policies, the better we can make applications that are ISP-friendly and that also work well for end users.

Friday, October 8, 2010

Westwood+ TCP

Recently I was curious about what settings I could change in Ubuntu's TCP implementation, so I looked through the documentation. One option enables the 'Westwood+' congestion-control algorithm. That looked interesting, so I wanted to find out why one would use Westwood+ instead of the default algorithm.

The man page says that Westwood+ has two advantages: 1) significantly increased fairness with Reno in wired networks, and 2) increased throughput over wireless links. With Westwood+, the sender dynamically sets SSTHRESH and CWND based on an end-to-end bandwidth estimate.

Fairness is better because when a sender detects that its sending rate is decreasing, it reacts by decreasing CWND more severely, which allows rates to converge faster.

In wireless networks, loss is often due to interference, yet Reno acts as if every loss is due to congestion. When a Westwood+ connection receives duplicate ACKs, it takes the estimated steady-state bandwidth into account when it reduces CWND, so throughput does not fall as far as Reno's would.
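Roughly, the adaptive decrease looks like the sketch below. This is my reading of the mechanism, not the Linux tcp_westwood code; the bandwidth estimator itself is assumed to exist, and the example numbers are hypothetical.

```python
# Sketch of Westwood+'s adaptive decrease: on three duplicate ACKs, instead of
# blindly halving, the sender sets ssthresh to the window that the estimated
# end-to-end bandwidth could sustain at the minimum observed RTT.

def on_triple_dupack(cwnd: float, bw_estimate_bps: float, rtt_min_s: float, mss_bytes: int):
    """Return (new_cwnd, new_ssthresh) in segments after a dup-ACK loss indication."""
    ssthresh = max(2.0, (bw_estimate_bps / 8 * rtt_min_s) / mss_bytes)  # BWE * RTTmin, in segments
    new_cwnd = min(cwnd, ssthresh)      # shrink cwnd only if it exceeds the sustainable window
    return new_cwnd, ssthresh

# Hypothetical wireless path: 6 Mbit/s estimated bandwidth, 60 ms minimum RTT, 1460-byte MSS.
# The window drops to ~31 segments instead of Reno's 20, so throughput falls less.
print(on_triple_dupack(cwnd=40, bw_estimate_bps=6e6, rtt_min_s=0.06, mss_bytes=1460))
```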

Wednesday, October 6, 2010

Deployment Considerations for CUBIC TCP

CUBIC is a variant of TCP that was published in ACM SIGOPS Operating Systems Review in 2008. CUBIC extends and improves on the BIC variant of TCP. Like BIC, CUBIC is useful on long, high-bandwidth links, where the bandwidth-delay product is high. The TCP most computers run today struggles to reach speeds much higher than 100Mbps because the transmission rate is increased linearly, with a low coefficient.

BIC solves this problem by using a high coefficient for growing the window, and the coefficient then smooths out over time. A problem with BIC, however, is that it takes more than its fair share of bandwidth when standard (NewReno, SACK) TCP must compete with it. CUBIC keeps the advantages of BIC but improves TCP-friendliness on connections with low delay (~5ms). So why don't we make CUBIC the default TCP variant in our operating systems? A sketch of CUBIC's window growth function is below, followed by several points to consider.
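This is the growth function from the CUBIC paper, with the paper's default values of C and beta; the loop and the w_max value are just for illustration.

```python
# Sketch of CUBIC's window growth curve: W(t) = C*(t - K)^3 + W_max, where W_max is
# the window at the last loss event, t is time since that loss, and
# K = cbrt(W_max * beta / C) is when the curve returns to W_max.

C = 0.4      # scaling constant (paper default)
BETA = 0.2   # multiplicative decrease factor: window shrinks by 20% on loss (paper default)

def cubic_window(t: float, w_max: float) -> float:
    k = (w_max * BETA / C) ** (1 / 3)
    return C * (t - k) ** 3 + w_max

# The window climbs steeply right after a loss, flattens out near w_max for stability,
# then probes aggressively again, all as a function of elapsed time rather than RTT.
for t in (0, 1, 2, 4, 8):
    print(t, round(cubic_window(t, w_max=100.0), 1))
```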

Incentives:
  • Better RTT-fairness. Because the rate of window increase (in congestion avoidance) depends on the RTT in standard TCP, flows with large delay perform worse than flows with low delay. CUBIC (and BIC) improve RTT-fairness; in CUBIC's case, window growth is driven by the time elapsed since the last loss rather than by the RTT.
  • Better link utilization. As explained above, CUBIC occupies a link's available bandwidth much faster.
  • Friendliness with standard TCP on low-speed (~10Mbps), high-delay (100ms) links.

Drawbacks:
  • Unfriendliness on large-delay (~40ms), high-bandwidth paths. In one case, CUBIC takes 80% of the available bandwidth and leaves standard TCP with 20%.
  • Long bandwidth-convergence time among competing CUBIC flows. On a 400Mbps link, it can take several hundred seconds for an existing connection to lower its rate while a new connection raises its own. Convergence is much faster when less bandwidth is available, e.g. 100Mbps.

These points are perhaps a good start for deciding whether to deploy CUBIC on the Internet. While much has gone unconsidered here, they give a general picture of how CUBIC performs.

Saturday, October 2, 2010

Motivation for a Revised Transport Protocol

Recently I wrote about TCP Vegas, an interesting transport protocol that came close to being used as a replacement for Reno. Challenges arise when one tries to get a "better" protocol adopted in place of an existing one. A new transport protocol will thrive most when it offers something existing protocols don't provide.

Implicit in a transport protocol's design is the way it fulfills an application's requirements. For example, TCP allows a file to be transferred in its entirety, with the guarantee that it arrives in the same form it was sent. Multimedia streaming transport protocols are built to best accommodate the needs of multimedia applications.

A transport-layer protocol also needs to account for the underlying architecture. TCP assumes that loss is primarily due to congestion, that is, packets dropped from router queues at the network layer. Wireless transport protocols, such as ATP and Hop-by-Hop, are built to maximize throughput in a wireless network, where loss is due primarily to the hidden- and exposed-node problems. The key to a successful transport protocol is how well it meets the needs of the prevalent network and how well it serves the applications above it.

Will TCP Vegas ever be Used?

TCP Vegas is a flavor of TCP that uses round-trip-time estimates, rather than loss detection, to adjust its sending rate. Vegas came out at a time when TCP Reno was prevalent on the Internet, and it offered performance advantages over Reno: its inventors reported a 40-70% throughput increase, as well as up to one-half the losses of TCP Reno.
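For reference, the core of Vegas's RTT-based adjustment looks roughly like this; the alpha/beta thresholds are the commonly cited defaults, and the function name and RTT samples are mine.

```python
# Sketch of Vegas's congestion-avoidance rule: compare the expected rate (no queuing)
# with the actual rate, estimate how many of this flow's packets are sitting in
# queues, and nudge the window up or down to stay between alpha and beta.

ALPHA = 1   # if fewer than this many "extra" packets are queued, grow the window
BETA = 3    # if more than this many are queued, shrink it

def vegas_adjust(cwnd: float, base_rtt: float, current_rtt: float) -> float:
    expected = cwnd / base_rtt                 # rate if there were no queuing at all
    actual = cwnd / current_rtt                # rate actually being achieved
    queued = (expected - actual) * base_rtt    # estimate of this flow's packets in queues
    if queued < ALPHA:
        return cwnd + 1                        # the path has spare room: speed up
    if queued > BETA:
        return cwnd - 1                        # a queue is building: back off before losses occur
    return cwnd                                # in the sweet spot: hold steady

print(vegas_adjust(cwnd=20, base_rtt=0.100, current_rtt=0.125))   # hypothetical RTT samples
```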

While the Vegas algorithm seemed like a promising alternative, it received criticism from experts, including Van Jacobson, the inventor of TCP's original congestion-control mechanism. Jacobson argued that Vegas violates the flow model and that the Internet could not support it. Vegas was also originally thought to perform worse when competing with Reno flows. And since Vegas was invented, newer versions such as NewReno and SACK TCP have emerged that offer higher throughput and less loss.

It makes me wonder whether Vegas outperforms the newer versions of TCP. If it works better than SACK TCP, and can be adapted to follow the flow model, then it may still be advantageous to use TCP Vegas.