
‘In this paper, we focus on the problem of searching sorted, in-memory datasets. This is a key data operation, and Binary Search is the de facto algorithm that is used in practice. We consider an alternative, namely Interpolation Search, which can take advantage of hardware trends by using complex calculations to save memory accesses. Historically, Interpolation Search was found to underperform compared to other search algorithms in this setting, despite its superior asymptotic complexity. Also, Interpolation Search is known to perform poorly on non-uniform data. To address these issues, we introduce SIP (Slope reuse Interpolation), an optimized implementation of Interpolation Search, and TIP (Three point Interpolation), a new search algorithm that uses linear fractions to interpolate on non-uniform distributions. We evaluate these two algorithms against a similarly optimized Binary Search method using a variety of real and synthetic datasets. We show that SIP is up to 4 times faster on uniformly distributed data and TIP is 2-3 times faster on non-uniformly distributed data in some cases. We also design a meta-algorithm to switch between these different methods to automate picking the higher performing search algorithm, which depends on factors like data distribution.’
(tags: papers pdf algorithms search interpolation binarysearch sorteddata coding optimization performance)
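For anyone who hasn't met it before, here is a minimal sketch of the classic textbook Interpolation Search the abstract takes as its starting point. To be clear, this is not the paper's SIP or TIP, and the function name and the integer-keys assumption are mine, purely for illustration.

```python
def interpolation_search(arr, target):
    """Return an index of target in the sorted list arr, or -1 if absent.

    Assumes integer keys (an illustrative simplification).
    """
    lo, hi = 0, len(arr) - 1
    while lo <= hi and arr[lo] <= target <= arr[hi]:
        if arr[lo] == arr[hi]:
            # All remaining keys are equal; avoid dividing by zero.
            return lo if arr[lo] == target else -1
        # Guess the position by linear interpolation between the endpoints,
        # instead of always probing the midpoint as Binary Search does.
        pos = lo + (target - arr[lo]) * (hi - lo) // (arr[hi] - arr[lo])
        if arr[pos] == target:
            return pos
        if arr[pos] < target:
            lo = pos + 1
        else:
            hi = pos - 1
    return -1


uniform = list(range(0, 1000, 10))      # evenly spaced keys
skewed = [2 ** i for i in range(11)]    # 1, 2, 4, ..., 1024
print(interpolation_search(uniform, 370))
print(interpolation_search(skewed, 64))
```

On the evenly spaced list the first guess lands directly on the key. On the powers-of-two list each guess falls at the low end of the remaining range, so the search creeps forward one slot at a time, which is the non-uniform weakness the abstract mentions.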
Links for 2019-05-15