The queries shown so far are not the only queries that can cause issues. Just imagine searching for a name. Maybe, you are not aware of how to spell it precisely. You issue a query and the result is just an empty list. Empty lists can be pretty frustrating for end users. Frustration has to be avoided at any cost, so a solution is needed.
The solution comes in the form of the so-called distance operator (<->
). It works like this:
test=# SELECT 'PostgreSQL' <-> 'PostgresSQL'; ?column? ---------- 0.230769 (1 row)
The distance between PostgreSQL
and PostgresSQL
is 0.23, so it is quite close. The distance can be between 0 (meaning identical) and 1 (totally different).
However, the question is not so much about the actual distance between two words but more about what is closest. Let's imagine we are looking for a village called Gramatneusiedl
. Now an application designer cannot expect that somebody is able to spell this tricky name. So maybe, an end user might go and ask for Kramatneusiedel
. A precise match would not return even a single row, which is somewhat frustrating to the end users.
Nearest neighbor search (KNN) can come to the rescue and return those four villages that are closest to our search string:
test=# SELECT *, name <-> 'Kramatneusiedel' FROM t_location ORDER BY name <-> 'Kramatneusiedel' LIMIT 4; name | ?column? ----------------+---------- Gramatneusiedl | 0.52381 Klein-Neusiedl | 0.76 Potzneusiedl | 0.791667 Kramsach | 0.809524 (4 rows)
The results are pretty good. Most importantly, the village the end user has really looked for is also among those results.
Just imagine an application trying to solve this without the ability to use KNN. Performance would go down the drain instantly.
KNN works also for numbers and the like. To use this functionality with numeric values, it is necessary to install the btree_gist
extension.
3.144.255.87