Five days ago, Dropbox launched its first collegiate Space Race, and gave us MIT folk a chance at eternal glory. To nobody’s surprise, it was only a matter of hours before we had achieved a landslide victory. With numbers equivalent to 75% of our student body quickly joining the fun, there was no way other universities could come close to our Dropbox dedication. Students proudly wore their fresh-pressed, Career Fair-issued Dropbox shirts all across campus, and walked with a new swagger in their step. MIT students aren’t ones to brag, but for once our unity had truly accomplished something great, while other campuses were probably off making another Gangnam Style parody.
But then disaster struck. Just one day later, international universities with larger student bodies pulled ahead in the rankings and left us in the dust. Dropbox’s founder and MIT Alum, Drew Houston, tried to lessen our emotional damage by creating a “United States Leaderboard” where we still held the #1 position, but the damage was already done. Our swagger was replaced with a limp, and some students even started creating an MIT Gangnam Style parody.
All hope was lost, but a number of student groups refused to give up. With unparalleled determination, these students worked through the night as they slowly pieced together a solution. Tonight, one of those solutions was put into action, and MIT once again became #1 in the world. Here’s how it worked:
Post Mortem Note: The blog post below was written with a cheerful sense of optimism before we ran everything together. Although everything below has now worked, the issues we hit during deployment are interesting and warrant another blog post. Disclaimer: please don’t try this at home! It’s left one of us without access to MIT’s network.
First things first, we needed a way to automatically generate lots of email addresses. They didn’t actually need to be MIT email addresses for us to earn points, but what fun would it be if they weren’t? We’re fortunate that MIT keeps a relatively open network, so students are able to create their own @mit.edu mailing lists without any approval. The obvious solution was to create mailing lists directly in the terminal with blanche, but unfortunately only administrators can create lists through the command line.
That meant we were stuck doing it the old-fashioned way, through a GUI web interface. Most of MIT’s utility websites (checking registration, grades, billpay, etc.) require special MIT certificates to access. Usually when a web developer wants to secure their website, they will purchase an SSL Certificate from a Certificate Authority, and the common Certificate Authority signatures (GoDaddy, Verisign, etc.) are built directly into the browser. Then, the developer will create a login system to determine identity. MIT does things completely differently. MIT is its own Certificate Authority, which isn’t included in browsers, so students must first download the MIT Certificate Authority. Then, instead of using a login system for identity, students download an MIT X.509 Personal Certificate to prove their identity to MIT web services.
While it’s relatively easy to implement a web crawler that uses standard SSL, creating one that uses MIT X.509 certificates is notably more difficult. We spent a couple of hours trying to figure it out until someone came up with a better solution. In general, using certificates is the easiest way to access all MIT web services, so we all assumed it was the only way. Then we remembered that some services, including mailing list administration, are also available with our student login information through Touchstone (http://ist.mit.edu/touchstone-detail).
This meant we could programmatically step through mailing list creation using standard SSL authentication. We used a headless browser called Mechanize to automate the creation of mailing lists. But before actually creating the lists, we needed to come up with names for them.
We decided that, in total, we’d like to register about 30,000 Dropbox accounts. Our names needed to look realistic, but also be random enough that they couldn’t stop us with simple pattern recognition. We decided to take the list of all currently registered email accounts at MIT and add some random characters to them.
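The naming scheme can be sketched roughly like this. It is a minimal illustration, not the script that was actually run: the function name `make_usernames`, the three-character suffix, and the lowercase-plus-digits alphabet are all assumptions.

```python
import random
import string


def make_usernames(existing, count, suffix_len=3, seed=None):
    """Build `count` unique names by appending random characters to real ones.

    `existing` is a list of real-looking account names. The suffix length and
    alphabet are illustrative assumptions, not values from the original post.
    """
    rng = random.Random(seed)
    alphabet = string.ascii_lowercase + string.digits
    taken = set(existing)  # never collide with a real account name
    result = []
    while len(result) < count:
        base = rng.choice(existing)
        suffix = "".join(rng.choice(alphabet) for _ in range(suffix_len))
        candidate = base + suffix
        if candidate not in taken:
            taken.add(candidate)
            result.append(candidate)
    return result
```

Passing a `seed` makes a run reproducible, which helps when the same list of names has to be regenerated later.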
The next issue was that we had asked MIT’s network administrators how we could programmatically create lists, and they told us we couldn’t. Making 30,000 mailing lists the next day probably wouldn’t look too good, so we came up with an alternative. Instead of creating 30,000 mailing lists, we’d only create 1,000, then rename each one after its Dropbox account was registered. When the process was complete, we’d delete the 1,000 lists to effectively leave no trace.
Now that we had a way of getting our mailing lists, we needed a way to actually register those emails with Dropbox. Submitting the registration form was easy: we just used Mechanize again to automate the browser interaction. The next step was to automate clicking the verification link that Dropbox emails to new users.
We’re using mailing lists, and there isn’t any way to directly check a mailing list’s email. Instead, we added the same Gmail account to each mailing list. We also made sure that each mailing list was private, so no rogue MIT interns at Dropbox could tell what email account we put on the list, or which MIT account created the list.
We created a process that would poll the Gmail account every 30 seconds using Python’s imaplib. Then, we used Mechanize again to click the link. At this step, Dropbox also requires that the new user re-enter their password. We created each account with the same password, so we just enter that password with Mechanize and the account is verified. This step brought up my favorite conversation through this whole process:
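A polling loop along these lines could look like the sketch below. It uses only `imaplib` and `email` from the standard library. The helper names, the `imap.gmail.com` host, and the rule "any https link on dropbox.com" are assumptions, since the exact format of the verification email isn't given here; `handle_link` stands in for whatever opens the link.

```python
import email
import imaplib
import re
import time

LINK_RE = re.compile(r"https://[^\s\"'<>]+")


def extract_links(text, domain="dropbox.com"):
    """Return https links in `text` whose host is `domain` or a subdomain of it."""
    links = []
    for url in LINK_RE.findall(text):
        host = url.split("/")[2]
        if host == domain or host.endswith("." + domain):
            links.append(url)
    return links


def message_text(msg):
    """Concatenate the decoded text parts of a parsed email message."""
    chunks = []
    for part in msg.walk():
        if part.get_content_maintype() == "text":
            payload = part.get_payload(decode=True)
            if payload:
                charset = part.get_content_charset() or "utf-8"
                chunks.append(payload.decode(charset, errors="replace"))
    return "\n".join(chunks)


def poll_inbox(user, password, handle_link, interval=30, host="imap.gmail.com"):
    """Every `interval` seconds, fetch unseen mail and pass matching links on."""
    while True:
        conn = imaplib.IMAP4_SSL(host)
        try:
            conn.login(user, password)
            conn.select("INBOX")
            _, data = conn.search(None, "UNSEEN")
            for num in data[0].split():
                # Fetching the full message also marks it as seen.
                _, fetched = conn.fetch(num, "(RFC822)")
                msg = email.message_from_bytes(fetched[0][1])
                for link in extract_links(message_text(msg)):
                    handle_link(link)
        finally:
            conn.logout()
        time.sleep(interval)
```

Keeping the link extraction separate from the IMAP loop means it can be checked against a saved email without touching the network.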
“Crap, they’ll be able to tell that we’re using the same password for each account, and stop us that way.”
“Wait no, they won’t be able to tell if they’re hashing their passwords properly.”
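To see why the second speaker has a point, here is a minimal sketch of salted password hashing. It assumes PBKDF2 from Python's `hashlib` with a random per-account salt; how Dropbox actually stores passwords isn't stated here, and the function names are made up for illustration.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=100_000):
    """Return (salt, digest) for `password`, generating a fresh salt if needed."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_password(password, salt, digest, iterations=100_000):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, candidate = hash_password(password, salt, iterations)
    return hmac.compare_digest(candidate, digest)


# Two accounts, same password, different salts: the stored digests differ.
salt_a, digest_a = hash_password("same-password")
salt_b, digest_b = hash_password("same-password")
```

Because each account gets its own salt, identical passwords produce unrelated stored values, so comparing rows in the database reveals nothing about password reuse.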
Unfortunately, just verifying an account isn’t enough to actually get us any points. Dropbox requires that the new user install the Dropbox client before awarding any referral points. This is where things started to get more tricky.
This had to be completely automated, so we decided to use Linux, where we figured it would be easiest to automate installation. Our first test was to uninstall Dropbox, then reinstall it with the new user. The installation worked fine, but when we tried to get the referral points we were told, “This referral looks sketchy.” Busted. They must have put a flag somewhere on our computer to indicate that we had installed Dropbox before.
So we set up two fresh Linux boxes. On the first, we installed Dropbox but never logged in with a user. On the second, we installed Dropbox and logged in with a user. Then we diff’ed the two installations in hopes of finding the flag.
Meanwhile, someone else went to Google and discovered that all we needed to do was change our MAC Address. That was easier.
After doing some research on how we could spoof our MAC Address, we came across a utility that we could use while installing Dropbox from the command line. Great, now we were making some real progress.
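The utility itself isn't shown here, but generating a plausible spoofed address is simple. The sketch below is an assumption about what such a tool needs, not its actual behavior: it picks six random octets, then sets the locally-administered bit and clears the multicast bit in the first octet so the result is a valid unicast address.

```python
import random


def random_mac(rng=random):
    """Return a random, locally administered, unicast MAC address string."""
    octets = [rng.randrange(256) for _ in range(6)]
    # Bit 1 set: locally administered. Bit 0 clear: unicast, not multicast.
    octets[0] = (octets[0] | 0x02) & 0xFE
    return ":".join(f"{octet:02x}" for octet in octets)
```

Each fresh installation would get its own address from a call like `random_mac()`.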
We needed to run the Daemon without any environment variables to prevent a GUI from opening. In the command line, Dropbox outputs a URL that we’ll use to link our registered account to this installation of the client. We open that URL with Mechanize to link the account, and our referral is complete.
But we didn’t stop there. A simple referral only gets us one point; for more points, the new account actually needs to start “using” Dropbox. For the purposes of getting referral points, this means completing Dropbox’s “Getting Started” wizard, and sharing a file with someone else. This involved getting the CSRF token from the page and watching the AJAX requests that were made to complete the steps online. The token was then taken from the cookie of the Mechanize browser that was used to log the user in.
We used Mechanize again to walk through the online Getting Started wizard. Then, we ‘touch’ed a file into a new folder within Dropbox, and used Mechanize to share the folder through Dropbox’s web interface.
Now we had all of the individual components running, but we still needed to figure out how they would work together. The process of creating a single account actually takes a few minutes because of delays while we rename mailing lists and wait for emails from Dropbox. We needed to have a number of queues, and processes adding and removing from the queues. We couldn’t bear to run all this “space race” complication from a generically named server, so we decided to call ours “Houston.” Pronounced like the city in Texas, not the street in New York, and certainly not Drew Houston’s last name. Although, we do get a chuckle from the idea of someone calling Drew and saying “Houston, we have a problem.”
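Pulling a token out of a browser's cookie jar and attaching it to a replayed request might look something like this. It is a sketch against the standard library's `http.cookiejar` interface (iterating a jar yields cookies with `.name` and `.value`); the cookie name `csrf_token` and the form field name are placeholders, not the names Dropbox used.

```python
from urllib.parse import urlencode


def cookie_value(jar, name):
    """Return the value of the first cookie called `name`, or None if absent."""
    for cookie in jar:
        if cookie.name == name:
            return cookie.value
    return None


def with_csrf(jar, fields, cookie_name="csrf_token", field_name="csrf_token"):
    """URL-encode `fields` plus the CSRF token read from the cookie jar.

    Both default names are hypothetical placeholders for illustration.
    """
    token = cookie_value(jar, cookie_name)
    if token is None:
        raise LookupError(f"no cookie named {cookie_name!r} in jar")
    payload = dict(fields)
    payload[field_name] = token
    return urlencode(payload)
```

The encoded string would then be sent as the body of the same POST request the browser makes for each wizard step.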
Our application effectively used 6 different queues and 4 different processes.
The 6 queues are:
- unused_usernames – Contains a list of all the usernames that we will register on Dropbox
- available_mailing_lists – Contains a list of the mailing list names that aren’t currently undergoing registration. It starts with the 1,000 mailing lists initially created which are randomly named.
- awaiting_registration – Contains a list of functioning mailing lists that will be registered
- registered – Contains a list of functioning mailing lists that have been registered
- verified – Contains a list of functioning mailing lists that have been verified
- winners – Contains a list of all registered, verified, and installed mailing lists.
The four processes are:
renamer takes a mailing list from available_mailing_lists, and renames it to one of the unused_usernames. It then removes the entries from both available_mailing_lists and unused_usernames. Now we have to wait a few minutes for the changes to take effect, then we add the new mailing list name to awaiting_registration.
registrar takes emails from the awaiting_registration list and uses Mechanize to walk through Dropbox registration. Once complete, it moves the name from awaiting_registration to registered.
verifier constantly polls the Gmail account to look for verification emails. When it sees one come in, it completes the verification with Mechanize, then finds the name within registered and moves it to verified. Because we no longer need the email to function at this point, we re-add the name to available_mailing_lists to be renamed again.
installer takes names from verified and runs a Dropbox installation for that name. It does this with fakemac on a Virtual Machine in AWS. After installation, it walks through Getting Started and Shares a folder for additional referral points. Finally, it moves the name from verified to winners.
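The flow between the six queues can be sketched as a single-threaded simulation. This is only a model of the bookkeeping described above: the real processes ran concurrently, waited on mailing list renames and emails, and did actual work at each stage, all of which is omitted here. The `Pipeline` class and its `run` method are illustrative names.

```python
from collections import deque


class Pipeline:
    """Toy model of the six queues; each stage method moves at most one name."""

    def __init__(self, usernames, mailing_lists):
        self.unused_usernames = deque(usernames)
        self.available_mailing_lists = deque(mailing_lists)
        self.awaiting_registration = deque()
        self.registered = deque()
        self.verified = deque()
        self.winners = deque()

    def renamer(self):
        # Needs both a free mailing list and an unused username.
        if not (self.unused_usernames and self.available_mailing_lists):
            return False
        self.available_mailing_lists.popleft()  # the list takes on the new name
        name = self.unused_usernames.popleft()
        self.awaiting_registration.append(name)
        return True

    def registrar(self):
        if not self.awaiting_registration:
            return False
        self.registered.append(self.awaiting_registration.popleft())
        return True

    def verifier(self):
        if not self.registered:
            return False
        name = self.registered.popleft()
        self.verified.append(name)
        # The email no longer needs to work, so the list can be renamed again.
        self.available_mailing_lists.append(name)
        return True

    def installer(self):
        if not self.verified:
            return False
        self.winners.append(self.verified.popleft())
        return True

    def run(self):
        """Keep ticking every stage until none of them makes progress."""
        progressed = True
        while progressed:
            steps = [self.renamer(), self.registrar(), self.verifier(), self.installer()]
            progressed = any(steps)
```

Running it with more usernames than mailing lists shows the recycling at work: the pool of lists never grows, yet every username ends up in `winners`.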
a btb “So I got to South Station, pulled out my laptop, and…” comm.prod