A distributed index setup for Sphinx
Posted: February 9th, 2010 | Author: Thijs Oppermann | Filed under: Sphinx search | Tags: search, setup, sphinx
Sphinx search is a powerful search engine. We recently put it into production (version 0.9.9-rc2) as the backend for most of the searches on one of our high-volume websites. The site gets about 360,000 visitors a day, who generate about 4,500 search queries per minute for the Sphinx backend on average, peaking at nearly 9,000 per minute when the site gets busy. To handle that many requests we currently run Sphinx on four dedicated servers.
A problem with having more than one Sphinx server is that you need to make sure the results from the different servers are the same, or very nearly so. Two consecutive searches can be handled by different servers (and on the site in question a "search" can also be a browsing action, for example moving from one page of results to the next), so it would be very confusing if the search results differed between servers.
With Sphinx there are a number of ways to solve this problem. The most commonly used solutions are:
- run the indexer on one server and make the resulting index files available to all the other servers (through scp, rsync, or a shared filesystem)
- use a distributed index setup
The first should work, but is actually not recommended by the makers of Sphinx. We went for the second solution: a distributed index setup.
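For orientation, a distributed index is declared in sphinx.conf as an index of type "distributed" that lists local indexes and remote agents; searchd sends the query to each of them and merges the results. The fragment below is only a minimal sketch of that mechanism, not the configuration used on our site: the index names, host names, and timeout values are hypothetical, and the port assumes searchd is listening on 9312.

```
# Hypothetical example: index names, hosts and timeouts are placeholders.
index articles_dist
{
    type  = distributed

    # Part of the data that is indexed and served on this machine.
    local = articles_part0

    # Parts served by searchd on other machines (host:port:index).
    agent = sphinx2.example.com:9312:articles_part1
    agent = sphinx3.example.com:9312:articles_part2

    # Time limits in milliseconds for reaching and querying an agent.
    agent_connect_timeout = 1000
    agent_query_timeout   = 3000
}
```

Clients then query articles_dist like any other index, and every server that carries the same definition answers from the same underlying parts.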