Topic: Autocomplete search algorithm

Hi, I'm looking for the best search algorithm for autocomplete.

I am weak at SQL and I'm hoping someone will give a hand.
I need autocomplete to suggest best matches for artists.
Thedatabase may look like this:

|Artists|
--------
John Smith
Smithers Burger
James Alsmit
Corpse,The
The corporation
Corporate Kidz

Searching for "cor" I would expect it to find "The corporation", "Corporate Kidz"
and "Corpse,The"

Searching for "the cor" It should find "The corporation", "Corporate Kidz" and "Corpse,The"
with preference to the ones containing "the"

Also If I start typing "smi" it should display "John Smith" and "Smithers Burger" and James Alsmit with preference given to the ones beginning with "Smi"

I have constructing various MySQL queries but none seem to get it right.


I would expect an algorithm like this to be almost standard on my SQL search queries.

Re: Autocomplete search algorithm

"%" is the SQL wildcard character. So something like this should do it:

Model.find .... :conditions => ['name LIKE ?',"%" + params[:search] + "%"] ....

Last edited by Duplex (2007-09-02 09:15:27)

Re: Autocomplete search algorithm

is there any way to force exact matches to display with autocomplete.

For example there are several types of rum in the database, but there is also just plain rum.

So if someone types in "rum" it shows the first 10 matches like bacardi rum, 151 proof rum etc... but it doesn't display just plain "rum"

Re: Autocomplete search algorithm

You would probably have to sort the records manually to do that ... although not sure how MySQL matches on % ... run some queries and see.

An alternative is to use a full-text index on the data - then you can use MySQL's built in text search.

Toby Hede
===================================================
FiniteStateMachine - Software Development for Social Networks
===================================================

Re: Autocomplete search algorithm

Here is what I've thought up so far.

Search: [ phrase]
  if "smi" entered. Search '* smi*',  'smi*',  'the smi*',  'smi*, the'

Search: [the]
  if "the" entered. Search 'the*'
  if "the " entered. No search
  if "the c" entered. Search 'c*',  'the c*',  'c*the'

Another thought is to only start autocompleting after say two characters are entered.

"c" : do nothing
"co : search
"the c" : do nothing
"the co" : search

It would be great if someone a little better than myself at MySQL could create a query from this.

Re: Autocomplete search algorithm

Getting further...

    # set character minimum before searching
    charmin = '2'
   
    # split phrase and place each word into an array
    @splitq = @query.split(" ")
   
    if @splitq.length > 1 # if the phrase contains more than one word
            if @splitq[0] == "the" and @splitq[1] > charmin
              [ 'artist REGEXP ?', ' ' + @splitq[1] ]
              [ 'artist REGEXP ?', 'the ' + @splitq[1] ]
              [ 'artist LIKE ?', @splitq[1] + '%, the' ] 
            # else exit
    else
           if @query.length > charmin
             [ 'artist REGEXP ?', ' ' + @query ]
             [ 'artist REGEXP ?', '^' + @query ]
             [ 'artist REGEXP ?', 'the ' + @query ]
             [ 'artist LIKE ?', @query + '%' ]