Hello HN!
I became frustrated with the unpredictable/poor match quality and the opacity of "relevance scores" in existing fuzzy and fulltext search libs, so I tried something different and this is the result. The main selling point is the result quality/ordering, with best-in-class memory overhead and excellent performance being bonuses. The API is pretty stable at this point, but I'm looking for feedback before committing to 1.0.
TL;DR
The test corpus is a 4MB JSON file with 162k words/phrases, so give it a second for the initial download. You can also drag/drop your own text/JSON corpus into the UI to try it against your own dataset.
Live demo/comparison with a few other libs (many more are in the codebase, in various states of completion):
https://leeoniya.github.io/uFuzzy/demos/compare.html?libs=uF...
In isolation for perf assessment:
https://leeoniya.github.io/uFuzzy/demos/compare.html?libs=uF...
To increase fuzziness and get broader results, try setting intraMax=1 (core) and enabling outOfOrder (userland):
https://leeoniya.github.io/uFuzzy/demos/compare.html?libs=uF...
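For a rough idea of what those two knobs do, here is a minimal sketch. It is not uFuzzy's actual code: `fuzzyFilter`, `termPattern` and `permutations` are hypothetical helpers, the `.{0,n}` gap is a simplification of "allow up to intraMax extra chars between letters", and trying every ordering of the needle terms is just one assumed way out-of-order matching could be done in userland.

```javascript
// Simplified stand-in, NOT uFuzzy's implementation. All names are hypothetical.
const escapeRe = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Pattern for one term that tolerates up to `intraMax` extra characters
// between each pair of adjacent letters.
function termPattern(term, intraMax) {
  return [...term].map(escapeRe).join(`.{0,${intraMax}}`);
}

// All orderings of the needle terms (fine for the few terms of a typical query).
function permutations(items) {
  if (items.length <= 1) return [items];
  return items.flatMap((item, i) =>
    permutations([...items.slice(0, i), ...items.slice(i + 1)]).map((rest) => [item, ...rest])
  );
}

function fuzzyFilter(haystack, needle, { intraMax = 0, outOfOrder = false } = {}) {
  const terms = needle.trim().split(/\s+/).filter(Boolean);
  if (terms.length === 0) return [];
  // In-order matching tests one regex; out-of-order tests one per term ordering.
  const orders = outOfOrder ? permutations(terms) : [terms];
  const regexes = orders.map(
    (order) => new RegExp(order.map((t) => termPattern(t, intraMax)).join(".*?"), "i")
  );
  return haystack.filter((item) => regexes.some((re) => re.test(item)));
}

const haystack = ["super mario kart", "mario kart super circuit", "marlo cart"];

console.log(fuzzyFilter(haystack, "kart mario"));
console.log(fuzzyFilter(haystack, "kart mario", { outOfOrder: true }));
console.log(fuzzyFilter(haystack, "maro", { intraMax: 1 }));
```

With the defaults, "kart mario" finds nothing because the terms only match in the order typed; allowing other orderings or one extra character per gap widens the result set.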
Also play with the sortPreset selector to swap out the default Array.sort() for one in userland that prioritizes typeahead-ness (the result set remains identical).
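A userland sort that favors typeahead-ness could look roughly like the sketch below. The post does not spell out the preset's actual criteria, so the match record shape (`text`, `start`) and the tie-breakers are assumptions for illustration only.

```javascript
// Hypothetical match record: the matched string plus the index where the match starts.
function typeaheadCompare(a, b) {
  return (
    a.start - b.start ||              // matches nearer the start of the string first
    a.text.length - b.text.length ||  // then shorter entries (closer to an exact match)
    a.text.localeCompare(b.text)      // then alphabetical, for deterministic output
  );
}

// Returns a re-ordered copy; the set of matches is left untouched.
function sortForTypeahead(matches) {
  return [...matches].sort(typeaheadCompare);
}

const matches = [
  { text: "super mario", start: 6 },
  { text: "mario kart", start: 0 },
  { text: "mario", start: 0 },
];

console.log(sortForTypeahead(matches).map((m) => m.text));
```

Only the ordering changes here: the same three entries come back, with the ones that begin with the needle on top.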
Still TODO:
- Example of stripping diacritics
- Example of using non-latin charsets
- Example of prefix-caching to improve typeahead perf even further
- Example of poor man's document search (matching multiple object properties)
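Until those examples exist, three of the items above can be roughed out in plain JavaScript. This is a sketch under assumptions, not the library's approach: `matches` is a stand-in substring matcher where the real search call would go, and `stripDiacritics`, `toSearchText` and `makeCachedSearch` are hypothetical helper names. The prefix cache is only valid when extending the needle can never add new matches, which holds for the substring stand-in used here.

```javascript
// Stand-in matcher (case-insensitive substring); swap in the real search call.
const matches = (text, needle) => text.toLowerCase().includes(needle.toLowerCase());

// 1. Strip diacritics: decompose to NFD, then drop the combining marks.
const stripDiacritics = (s) => s.normalize("NFD").replace(/\p{M}/gu, "");

// 2. Poor man's document search: join the searchable properties into one string.
const toSearchText = (doc, props) => props.map((p) => doc[p] ?? "").join(" ");

// 3. Prefix caching: when the new needle extends an earlier one, only
//    re-check the indices that already matched the shorter needle.
function makeCachedSearch(haystack) {
  const cache = new Map(); // needle -> array of matching indices
  return function search(needle) {
    if (cache.has(needle)) return cache.get(needle);
    let candidates = haystack.map((_, i) => i);
    for (let len = needle.length - 1; len > 0; len--) {
      const hit = cache.get(needle.slice(0, len));
      if (hit) { candidates = hit; break; } // longest cached prefix wins
    }
    const idxs = candidates.filter((i) => matches(haystack[i], needle));
    cache.set(needle, idxs);
    return idxs;
  };
}

const docs = [
  { title: "Crème brûlée", author: "José" },
  { title: "Apple pie", author: "Zoë" },
];
const haystack = docs.map((d) => stripDiacritics(toSearchText(d, ["title", "author"])));
const search = makeCachedSearch(haystack);

console.log(haystack);
console.log(search("bru"), search("brul"), search("zo"));
```

The returned indices point back into `docs`, so the original objects (diacritics intact) can be shown in the UI while the normalized strings are used for matching.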
That's all, thanks!
Comments URL: https://news.ycombinator.com/item?id=33035580
Points: 39
# Comments: 5