Oct 24 2005
A Second Look at P2P Technology
Mention the words “University” and “P2P” in the same sentence, and chances are (unless you’re a student in the dorms) the phrases “RIAA Cease and Desist Letters” or “FBI Raids” will come to mind. However, P2P file sharing programs (mainly BitTorrent, but also applications such as eMule) are increasingly being used for legitimate purposes: distributing legal data while saving money. BitTorrent (BT) is the most popular solution today, so we’ll take a brief look at it before showing some of the ways the technology is being used.
About BitTorrent
A brief overview of how it works, courtesy of Wikipedia:
“With BitTorrent, files are broken into smaller fragments, typically a quarter of a megabyte each. As the fragments are distributed to the peers in a random order, they can be reassembled on a requesting machine. Each peer takes advantage of the best connections to the missing pieces while providing an upload connection to the pieces it already has. This scheme has proven particularly useful in trading large files such as videos and software. In conventional downloading, high demand leads to bottlenecks as demand surges for bandwidth from the host server. With BitTorrent, high demand can actually increase throughput as more bandwidth and additional “seeds” of the completed file become available to the group. Cohen claims that for very popular files, BitTorrent can support about a thousand times as many downloads as HTTP.”
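The fragment scheme described above can be sketched in a few lines of Python. This is a minimal illustration rather than a real client: the function names are made up for this example, the piece size is taken from the “quarter of a megabyte” figure in the quote, and the per-piece SHA-1 check follows the original BitTorrent design, in which the .torrent file lists a hash for every piece.

```python
import hashlib
from typing import Dict, List

# Assumption: 256 KiB pieces, matching "a quarter of a megabyte" in the quote.
PIECE_SIZE = 256 * 1024


def split_into_pieces(data: bytes, piece_size: int = PIECE_SIZE) -> List[bytes]:
    """Cut the file contents into fixed-size pieces (the last one may be shorter)."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]


def piece_hashes(pieces: List[bytes]) -> List[bytes]:
    """Compute the SHA-1 digest of each piece, in order."""
    return [hashlib.sha1(piece).digest() for piece in pieces]


def reassemble(received: Dict[int, bytes], expected_hashes: List[bytes]) -> bytes:
    """Rebuild the file from pieces that arrived in any order.

    `received` maps piece index -> piece bytes. Every piece is checked
    against its expected hash before it is accepted.
    """
    ordered = []
    for index, expected in enumerate(expected_hashes):
        piece = received[index]
        if hashlib.sha1(piece).digest() != expected:
            raise ValueError(f"piece {index} failed its hash check")
        ordered.append(piece)
    return b"".join(ordered)
```

Because every piece carries its own index and hash, a downloader can accept pieces from any peer in any order and still end up with a verified, byte-for-byte copy of the file.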
The BT FAQ up at dessent.net uses an analogy to explain how BT works:
“One analogy to describe this process might be to visualize a group of people sitting at a table. Each person at the table can both talk and listen to any other person at the table. These people are each trying to get a complete copy of a book. Person A announces that he has pages 1-10, 23, 42-50, and 75. Persons C, D, and E are each missing some of those pages that A has, and so they coordinate such that A gives them each copies of the pages he has that they are missing. Person B then announces that she has pages 11-22, 31-37, and 63-70. Persons A, D, and E tell B they would like some of her pages, so she gives them copies of the pages that she has. The process continues around the table until everyone has announced what they have (and hence what they are missing.) The people at the table coordinate to swap parts of this book until everyone has everything. There is also another person at the table, who we’ll call ‘S’. This person has a complete copy of the book, and so doesn’t need anything sent to him. He responds with pages that no one else in the group has. At first, when everyone has just arrived, they all must talk to him to get their first set of pages. However, the people are smart enough to not all get the same pages from him. After a short while they all have most of the book amongst themselves, even if no one person has the whole thing. In this manner, this one person can share a book that he has with many other people, without having to give a full copy to everyone that’s interested. He can instead give out different parts to different people, and they will be able to share it amongst themselves. This person who we’ve referred to as ‘S’ is called a seed in the terminology of BitTorrent.”
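The table analogy can be turned into a toy simulation. The sketch below is a deliberate simplification and not how a real BitTorrent client schedules requests: everyone picks up exactly one page per round, a person prefers to copy a page from anyone at the table who already had it when the round began, and only asks the seed ‘S’ when nobody does. The function name and the round-based rules are assumptions made purely for illustration.

```python
def simulate_table(num_pages, num_people):
    """Return (pages handed out by the seed, pages copied between people)."""
    people = [set() for _ in range(num_people)]  # everyone starts empty-handed
    seed_uploads = 0
    peer_uploads = 0

    while any(len(pages) < num_pages for pages in people):
        # What each person held when the round began.
        snapshot = [set(pages) for pages in people]
        for pages in people:
            missing = [p for p in range(num_pages) if p not in pages]
            if not missing:
                continue
            # Pages some other person could already hand over this round.
            from_peers = [p for p in missing
                          if any(p in held for held in snapshot)]
            if from_peers:
                pages.add(from_peers[0])
                peer_uploads += 1
            else:
                # Ask the seed, preferring a page nobody at the table has yet,
                # so that people do not all fetch the same page from 'S'.
                unseen = [p for p in missing
                          if not any(p in held for held in people)]
                pages.add(unseen[0] if unseen else missing[0])
                seed_uploads += 1
    return seed_uploads, peer_uploads


print(simulate_table(num_pages=8, num_people=4))  # prints (8, 24)
```

In this run the seed hands out each of the 8 pages only once, while the 4 people at the table copy the remaining 24 pages among themselves. Serving the same 4 full copies from a single central server would take 32 page transfers, all from the server.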
Real World Applications of P2P Technology
Blizzard Entertainment, a well-known video game company, created its own downloader program based on MIT-licensed BitTorrent code to distribute large files such as movies to visitors of its website. This lets the company deliver high-quality (DivX) movies without having to worry about slowing down its servers. Under the classic centralized model, it would have to choose between downgrading its offerings and spending a massive amount of money on increased bandwidth. With a custom downloader, Blizzard reaps the benefits of P2P technology while still maintaining control over how the downloads are used, and avoids supporting a client that could be used for downloading illegal material (such as image files of its games!).
Open Office 2.0 has received a lot of press recently as governments (mainly European?) consider switching over because it uses an open source document format, allowing for cleaner interoperability between different programs and configurations. On its main page there is a green download window with a link to the P2P (BitTorrent) downloads page above the traditional “we’ll mail you CDs” page. Linux distributions such as Slackware have been distributed over P2P frameworks for years. Altnet makes a business out of providing P2P solutions for clients and briefly outlines the benefits of doing so. These include reduced bandwidth consumption, load balancing, security (hash checks when files are completed), scalability, low latency, and increased download speeds.
Even Azureus, a popular, full-featured, open-source, cross-platform Java BitTorrent client, offers the option of downloading it via a torrent. ;)
There are also popular programs that rely on P2P frameworks in less obvious ways, including Skype, the Global P2P Telephony Company™ (created by the makers of the infamous Kazaa), and iTunes (since it started cataloging podcasts in version 4.9). P2P file sharing can also be used to distribute educational content: a web campaign called Download for Democracy uses various popular P2P services to share government documents that citizens may not otherwise know of or may have a hard time acquiring.
Some major music download sites have survived for years by adhering to strong self-imposed ethical standards (for fear of being shut down, if nothing else). They often have automated artist/release checks (e.g. an audio rip of an out-of-print, LaserDisc-only live release by an obscure artist was automatically banned because a listing matching its description and date was found on eil.com) as well as moderators who make sure content follows the guidelines laid out by artists and record labels.
Possible Uses at UCLA
- The obvious application would be to save bandwidth and money by offering a torrent option (or a custom downloader à la Blizzard) for frequently downloaded files of a meaningful size. Pages that would benefit include the BOL software downloads, various Computer Science application downloads, etc.
- Professors who distribute readings as .pdf files (your students salute you!) could make a torrent of either the entire quarter’s files or one for each week. Likewise, instead of having students copy instructional CDs for language classes or go to the library to use them, the CDs could (license permitting) be ripped into mp3 files and posted on the class website.
- Large install files or updates to be used internally. While they cannot be “pushed” out this way, having several servers “seed” large files that many users may attempt to download concurrently not only provides some out-of-the-box load management but also lets each workstation contribute to distribution, helping to bypass any central bottlenecks.
- I’m sure there are more creative ways to use this technology!
Troubleshooting Resources
There is a massive amount of online documentation, and FAQs can be found everywhere from music trading sites to various personal pages, as well as at the locations linked to earlier in this article. These could either be linked to directly or easily adapted to fit usage at UCLA.
