Abstract
Powerful search capabilities are fundamentally important for popular micro-blog platforms such as Twitter. Although some recent work has aimed at enhancing the scalability of search, very few existing techniques incorporate personalization into their search and ranking process. Personalization, however, raises new scalability challenges because it reduces the effectiveness of caching and other performance enhancement techniques. We present REPLETE: a scalable, real-time search framework for micro-blogs that incorporates personalization by analyzing the follower relationships in Twitter. Three unique aspects characterize the REPLETE framework. First, our technique takes into account both the search parameters and the distances of the follower relationships when ranking the search results. Second, we design an in-memory temporal index that preserves the temporal significance of tweets and helps speed up query evaluation. Third, to ensure the overall quality and timeliness of search results, we identify important tweets and prioritize their indexing.
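As a rough illustration of the first aspect, the sketch below combines a term-match score with follower-graph distance and a recency decay. It is not the REPLETE ranking function, which the abstract does not specify: the names `follower_distance` and `personalized_score`, the `max_hops` and `half_life` parameters, the 0.1 fallback weight, and the multiplicative scoring formula are all assumptions made for this example.

```python
from collections import deque


def follower_distance(follows, source, target, max_hops=3):
    """Hop count from source to target along 'follows' edges (BFS).

    Returns None if target is not reachable within max_hops.
    """
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        user, dist = queue.popleft()
        if dist == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in follows.get(user, ()):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def personalized_score(query_terms, tweet, searcher, follows, now,
                       half_life=3600.0):
    """Hypothetical score: term matches x social closeness x recency."""
    words = tweet["text"].lower().split()
    relevance = sum(words.count(term) for term in query_terms)
    if relevance == 0:
        return 0.0
    dist = follower_distance(follows, searcher, tweet["author"])
    # Assumed weighting: closer authors count more; unreachable ones get 0.1.
    social = 0.1 if dist is None else 1.0 / (1 + dist)
    # Assumed exponential decay with the given half-life in seconds.
    recency = 0.5 ** ((now - tweet["time"]) / half_life)
    return relevance * social * recency
```

With `follows = {"alice": ["bob"], "bob": ["carol"]}`, two otherwise identical tweets by `bob` and `carol` would be ranked for `alice` with the tweet from `bob` first, since `bob` is one hop away and `carol` is two.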