Internet Archaeology Article
Monday, 30 July 2007
Welcome interested people, 2600 Readers, and everyone else! I couldn't wait, so here is the article!

Published in 2600 - The Hacker Quarterly, Summer 2007

By ilikenwf (A.K.A. Matt Parnell) 

     Archaeology is the practice of unearthing artifacts that are old, long lost, or forgotten. The internet is no different from the real world in that it, too, holds artifacts of media from days gone by. You just have to know where to look. The best place to start is the Internet Archive's "Wayback Machine," which houses over 8 petabytes of old information gleaned from the earliest days of the Internet up to now. Just put in an address, and you can view a site, provided it was indexed, all the way back to 1996.
               
Beginning Methodology

     I wanted to find as much "lost" TechTV and ZDTV media as possible, for nostalgia's sake. Starting out, I simply viewed the sites one archive date at a time. This was far too tedious and time consuming to be worthwhile, and it didn't give me much to work with. Digging around on the archive's information pages, I discovered that searching sites with wildcards (*) is supported. To give it a shot, I typed in "http://www.techtv.com/*", as well as "http://www.zdtv.com/*". These searches yielded long lists (45,000+) of pages from the two domains. At first it was really slow to sift through the information, until I found a way to speed it up - go to the bottom of the search page, and set the number of results displayed to 30. Then, when the page reloads, the url will look like this: "http://web.archive.org/web/*sr_1nr_30/http://url.com/*". Change the 30 to a larger number that won't cause your browser to crash, and load the page from your edited url. Each page of the list will be much longer, so you don't have to click "next" over and over again. Then, scroll or page down through the content looking for interestingly named files, and files with uncommon extensions, like pdf, psd, zip, etc. Find one, click the link, and if there is only one copy of that file in the archive, it will pop right up unless it was indexed incorrectly. Otherwise, you will get a choice of dates the file was archived on. Choose the first one, and keep working through the dates until you find a good, uncorrupted copy of the file (see the Tips and Tricks section for an explanation).
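The url editing and file hunting described above can be sketched in a few lines of Python. This is only an illustration: the url pattern is the one quoted in this article, and nothing here checks whether the archive still accepts it. The function names and the extension list are made up for the example, and the list of page urls is assumed to have been copied out of a results page by hand.

```python
from urllib.parse import urlparse

# Extensions the article calls out as worth a closer look; extend as needed.
INTERESTING_EXTENSIONS = {".pdf", ".psd", ".zip"}


def wildcard_listing_url(site, results_per_page=30):
    """Build a listing url in the form quoted in the article.

    `site` may be given with or without the trailing "/*".
    """
    return "http://web.archive.org/web/*sr_1nr_{}/{}/*".format(
        results_per_page, site.rstrip("/*")
    )


def interesting_files(urls, extensions=INTERESTING_EXTENSIONS):
    """Keep only the urls whose path ends in one of the given extensions."""
    keep = []
    for url in urls:
        # Compare in lower case so "LOGO.PSD" and "logo.psd" both match.
        path = urlparse(url).path.lower()
        if any(path.endswith(ext) for ext in extensions):
            keep.append(url)
    return keep


if __name__ == "__main__":
    # Ask for 500 results per page instead of the default 30.
    print(wildcard_listing_url("http://www.techtv.com/*", 500))
```

Pick a `results_per_page` value your browser can handle; the point is only to cut down on clicking "next".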


Subdomains

    The problem with this method is that it doesn't search the subdomains of the domain you enter. To find those, either use a whois search or look at the sources of the web pages (html, php, xml, etc.) and check the paths they link to. Using a combination of these methods, as well as my memory of the sites, I stumbled across subdomains like cache.techtv.com, chat.techtv.com, and more. You can see a list of the domains I found by clicking here.
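Reading page sources for subdomains can be partly automated. The sketch below is a rough illustration, not the method actually used for this article: it pulls href and src values out of a saved page and lists the hosts that sit under a given domain. It assumes archived pages may carry links rewritten in the form "http://web.archive.org/web/<timestamp>/<original url>", so it keeps only the last url embedded in each link. All names here are hypothetical.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse

_SCHEME = re.compile(r"https?://")


class LinkCollector(HTMLParser):
    """Collect the values of href and src attributes from a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def original_url(link):
    """Strip an archive prefix by keeping the last url embedded in the link."""
    matches = list(_SCHEME.finditer(link))
    return link[matches[-1].start():] if matches else link


def find_subdomains(html, domain):
    """Return the sorted hosts under `domain` that the page source mentions."""
    collector = LinkCollector()
    collector.feed(html)
    hosts = set()
    for link in collector.links:
        host = (urlparse(original_url(link)).hostname or "").lower()
        if host.endswith("." + domain):
            hosts.add(host)
    return sorted(hosts)
```

Feed it the source of any archived page you have saved, then run each host it reports through the wildcard search.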

See The Findings
    Using the above methods, I searched other domains and found all sorts of stuff - a font of Cat's handwriting, psd and eps source images for many of the show's logos, lots of wallpapers, and avatars from the old ZDTV chat palace, among other things. I also found many video and sound clips from the old "Fox Kids" television network on the archived copies of "foxkids.com". All in all, I was very successful, and very pleased. You can grab a copy of my discoveries from the links at the bottom of the page.
 
Practical Uses

    These methods can all be used for good or evil - you can see the inner workings of sites that have, since archiving, locked down areas that were once publicly open. Sometimes, you can even find media that was free, but is now charged for, thus saving you money. In truth, the sky's the limit! Have fun!

Some Tips and Tricks:

1. These methods WILL give you files other than "web only" files, such as executables, zip files, and video files.

2. One problem is that some of the zip files and exe files get garbled and corrupted during transfer to the archive (especially on older pages) and don't always work. You can sometimes repair the zip files, but many times the repair doesn't work. Try finding another archive date with the same file. If you can't, it is best to move on.

3. Take note that you aren't really supposed to download from the archive. People do it anyway, but you really should make sure that you don't sell the material you find, and that you use it for "educational" and "archival" purposes only.
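For the corrupted zip files mentioned in tip 2, a quick integrity check saves opening every copy by hand. The sketch below uses Python's standard zipfile module and assumes you have already saved the copies from several archive dates to disk. The function names are made up for the example, and the check covers zip files only, not exe files.

```python
import zipfile
import zlib


def zip_is_intact(path):
    """Return True if the file opens as a zip and every member passes its CRC check."""
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() returns the name of the first bad member, or None.
            return archive.testzip() is None
    except (zipfile.BadZipFile, zlib.error, EOFError, OSError):
        # Not a zip at all, truncated, or unreadable: treat it as corrupt.
        return False


def first_intact_copy(paths):
    """Return the first path that passes the check, or None if none do."""
    for path in paths:
        if zip_is_intact(path):
            return path
    return None
```

Pass `first_intact_copy` the saved copies in archive-date order; if it returns None, it is probably time to move on.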

Findings:

These are hosted on Megaupload so that my site doesn't crash from bandwidth overuse. Links are now working again. These links will be updated as needed, both here and on my Downloads page.

TechTV Archaeological Findings (RAR)

Fox Kids.com (and other related domains) Archaeological Findings (RAR)

Shoutz:

For what it's worth, shoutz Adrian Lamo at 2600, as well as Greg, Hevnsnt, CodedChaos, Surbo, and all the other guys at I-Hacked, and the Edge. Have a good time at Def Con, you lucky jerks!

© Matt Parnell's Brain: Plugged In!