Our client is a digital content provider. The content types include ebooks, audiobooks, emagazines, software, music albums and music tracks, and each content type has millions of items available. Search is the most convenient way for users to find the content they want on the website. Sphinx Search was already in place, but the client's team lacked experience with Sphinx and could not index some content types because they contained so many items. High load on the MySQL server during indexing was another issue. The search queries in use were basic and did not produce the required results.

Once we were hired, we performed a configuration and performance audit of the Sphinx Search implementation, which let us quickly identify the issues and bottlenecks. We rewrote some source and index definitions. We introduced indexing by ranged queries, so that each indexing query fetches a limited range of rows and completes quickly. We split the larger indices into multiple smaller ones and indexed them in parallel. We also implemented delta indexing for the indices that needed it. The previous implementation was running an older Sphinx version, so we upgraded to the latest version for better performance. We also tuned relevancy to improve search results, and introduced expression-based sorting, which let us factor signals such as publication date into the overall score.

The result of these efforts was more accurate and reliable search across all content types. This in turn led to a better conversion rate from search pages. The search can be seen in action here.
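
To illustrate the ranged-query idea mentioned above, here is a minimal sketch in Python of how a large ID space can be split into fixed-size windows so that no single query has to scan the whole table. In Sphinx itself this is handled by the `sql_query_range` and `sql_range_step` source directives together with the `$start` and `$end` macros; the code below only mimics that splitting and is not the client's configuration. The table name `ebooks`, the column names and the step size are hypothetical.

```python
from typing import Iterator, List, Tuple


def id_ranges(min_id: int, max_id: int, step: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) ID windows that cover [min_id, max_id]."""
    if step <= 0:
        raise ValueError("step must be positive")
    start = min_id
    while start <= max_id:
        # The last window is clipped so it never runs past max_id.
        end = min(start + step - 1, max_id)
        yield start, end
        start = end + 1


def ranged_queries(table: str, min_id: int, max_id: int, step: int) -> List[str]:
    """Build one small SELECT per window (table and columns are hypothetical)."""
    return [
        f"SELECT id, title FROM {table} WHERE id >= {start} AND id <= {end}"
        for start, end in id_ranges(min_id, max_id, step)
    ]


if __name__ == "__main__":
    for query in ranged_queries("ebooks", 1, 10, 4):
        print(query)
```

Each window maps to one short query, which keeps the load on the database server bounded during indexing. The step size is a trade-off: smaller steps mean lighter individual queries but more round trips.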


Waseem is a consultant for the Elastic Stack and an Elastic Certified Engineer. He has years of experience with Elasticsearch, Solr, Wazuh, Sphinx Search, Manticore Search, OpenSearch and full-text search.