I'm searching for an algorithm that takes two strings: a query and a string to be searched for that query. The algorithm should report 'found' when the string contains the characters of the query in the right order, with any number of other characters in between.

Example: query:'abc', string to search: 'xaxbxc' results in 'found', because 'xaxbxc' contains the characters 'a', 'b' and 'c' in that given order.

Furthermore the algorithm should be able to highlight the found query in the searched string.

The IntelliJ IDE uses such an algorithm for searching files. Given a file 'settings.ts', the IntelliJ file search (Ctrl+Shift+N) gives the following results for the given search input.

[Screenshot: IntelliJ file search results for 'settings.ts']

A naive solution would be to start with the first character in the query and loop through the string to be searched until the first query character is found. Then go to the next query character and search for it in the searched string starting from where the first query character was found and so on. If all query characters were found yield true.
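The naive approach described above can be sketched in Python roughly as follows. The function names `naive_match` and `carets` are made up for illustration; the sketch returns the matched indices (so they can be used for highlighting) rather than just true/false.

```python
from typing import List, Optional


def naive_match(query: str, text: str) -> Optional[List[int]]:
    """Return the index of each matched query character, or None if not found."""
    positions: List[int] = []
    start = 0
    for ch in query:
        # Leftmost occurrence of ch at or after the previous match.
        idx = text.find(ch, start)
        if idx == -1:
            return None
        positions.append(idx)
        start = idx + 1
    return positions


def carets(positions: List[int]) -> str:
    """Render a marker line with '^' under each matched index."""
    if not positions:
        return ""
    line = [" "] * (max(positions) + 1)
    for p in positions:
        line[p] = "^"
    return "".join(line)


if __name__ == "__main__":
    text = "xaxbxc"
    match = naive_match("abc", text)
    if match is not None:
        print(text)
        print(carets(match))
```

Because each character is taken at its leftmost possible position, the match is found in a single pass over the string, but the chosen positions are not necessarily the ones a user would expect to see highlighted.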

This results in two problems:

  1. It should be very slow compared to Boyer-Moore search, but I can limit the set of strings to be searched using triplet indexing, as IntelliJ does.
  2. The highlighting is "wrong": the stated algorithm would correctly match 'settings.ts' for the query 'sets', but it would highlight different characters than IntelliJ does. The stated algorithm highlights 'set' and the last 's' of 'settings', whereas IntelliJ highlights as seen above.

In other scenarios the difference is much more dramatic. Given the query 'hours' and the search string 'The sun rises above the mountain in the early morning hours', the naive algorithm would highlight:

The sun rises above the mountain in the early morning hours
 ^              ^         ^               ^               ^

instead of:

The sun rises above the mountain in the early morning hours
                                                      ^^^^^

I suspect some kind of Levenshtein-esque algorithm gets the highlighting correct by assigning a higher cost to isolated single-character highlights.
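To make that idea concrete, here is a minimal dynamic-programming sketch in Python. It is not IntelliJ's algorithm; it only assumes one possible cost model, in which every contiguous run of highlighted characters costs 1, so scattered single characters are expensive and one long run is cheap. The name `best_match` is hypothetical.

```python
from typing import List, Optional

INF = float("inf")


def best_match(query: str, text: str) -> Optional[List[int]]:
    """Return match positions that use the fewest contiguous runs, or None."""
    m, n = len(query), len(text)
    if m == 0:
        return []
    # cost[i][j]: fewest runs matching query[:i+1] with query[i] placed at text[j].
    cost = [[INF] * n for _ in range(m)]
    prev = [[-1] * n for _ in range(m)]
    for j in range(n):
        if text[j] == query[0]:
            cost[0][j] = 1
    for i in range(1, m):
        best_k = -1  # index k < j with the smallest cost[i-1][k] so far
        for j in range(n):
            if j > 0:
                best = cost[i - 1][best_k] if best_k >= 0 else INF
                if cost[i - 1][j - 1] < best:
                    best_k = j - 1
            if text[j] != query[i]:
                continue
            # Option 1: extend the current run (previous char sits at j-1).
            if j > 0 and cost[i - 1][j - 1] < INF:
                cost[i][j] = cost[i - 1][j - 1]
                prev[i][j] = j - 1
            # Option 2: start a new run after a gap, which costs one more run.
            if best_k >= 0 and cost[i - 1][best_k] + 1 < cost[i][j]:
                cost[i][j] = cost[i - 1][best_k] + 1
                prev[i][j] = best_k
    if n == 0:
        return None
    end = min(range(n), key=lambda j: cost[m - 1][j])
    if cost[m - 1][end] == INF:
        return None
    # Walk the back-pointers from the last query character to the first.
    positions = [end]
    for i in range(m - 1, 0, -1):
        positions.append(prev[i][positions[-1]])
    return positions[::-1]
```

With this cost model the query 'hours' matches the trailing word of the example sentence as a single run. The sketch takes O(len(query) * len(text)) time and memory, and ties between equally cheap matches (as with 'sets' in 'settings.ts') are broken arbitrarily, so a real implementation would likely need extra scoring rules, for example a preference for matches at word starts.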

Does anyone know what algorithm IntelliJ uses for search highlighting? Thanks in advance!
