Now At A Billion Queries A Day, Twitter Quietly Launched A New Search Backend Weeks Ago

Thursday, 07 October 2010 01:20

While everyone was busy trying out New Twitter or tweeting about how they want New Twitter, Twitter itself was doing something secret behind the scenes. The startup quietly flipped the switch on an entirely new backend for their search, they reveal in a blog post today.

“One of our main goals, but also biggest challenges, was a smooth switch from the old architecture to the new one, without any downtime or inconsistencies in search results,” they write in the post. Mission accomplished, it seems, as no one outside of Twitter even seemed to be aware that they switched anything.

Twitter notes that they had to build this new backend because they were still using the search technology that they acquired in the Summize deal. That tech was great at the time, but Twitter was much smaller then and has grown massively since. “Scaling the old MySQL-based system had become increasingly challenging,” they note.

So what is this new search? “Since we love Open Source here at Twitter we chose Lucene, a search engine library written in Java, as a starting point,” Twitter notes in the post. But they note that they had to modify it given their demands for real-time search. What type of demands? These types of demands:

“Our demands on the new system are immense: With over 1,000 TPS (Tweets/sec) and 12,000 QPS (queries/sec) = over 1 billion queries per day (!) we already put a very high load on our machines. As we want the new system to last for several years, the goal was to support at least an order of magnitude more load.”
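For anyone who wants to check the math behind that billion-a-day figure, here is a quick back-of-the-envelope sketch. It assumes the quoted per-second rates hold steady across a full 24-hour day, which the post does not actually state, and it reads “an order of magnitude” as a flat 10x. The function and variable names are ours, not Twitter's.

```python
# Back-of-the-envelope check of the figures quoted in Twitter's post.
# Assumption (ours): the per-second rates are sustained for a full day.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def per_day(rate_per_second: int) -> int:
    """Convert a steady per-second rate into a daily total."""
    return rate_per_second * SECONDS_PER_DAY


CURRENT_TPS = 1_000    # tweets per second, as quoted
CURRENT_QPS = 12_000   # queries per second, as quoted
GROWTH_FACTOR = 10     # "an order of magnitude", read as 10x (assumption)

tweets_per_day = per_day(CURRENT_TPS)
queries_per_day = per_day(CURRENT_QPS)

target_tps = CURRENT_TPS * GROWTH_FACTOR
target_qps = CURRENT_QPS * GROWTH_FACTOR
target_queries_per_day = per_day(target_qps)

print(f"Tweets per day:         {tweets_per_day:,}")
print(f"Queries per day:        {queries_per_day:,}")
print(f"Target QPS:             {target_qps:,}")
print(f"Target queries per day: {target_queries_per_day:,}")
```

Under those assumptions, 12,000 queries per second works out to a little over 1.03 billion queries a day, which lines up with the number in the post, and the design target would be in the neighborhood of 120,000 queries per second.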

Twitter says that any custom work they did on Lucene will be contributed back to the open source project.