Topic: Creating a custom search index?

Basically I have a directory of images that have tags in their comments metadata field.  I need to be able to search through these tags, but opening and parsing the tags for every image is too slow.  So I thought I could create a database table as a sort of index with just two fields, tag and image location.  This seems like it would work fine, but then I realized that for it to be fast, I would probably have to add a database index on that table.  This seems a little clunky, and it would basically duplicate the entire table.

Does anyone have a better idea?  Would it be worth it to research hashing algorithms and come up with my own solution?

Re: Creating a custom search index?

How many images are you thinking about? I like the idea of storing the metadata in a table; it will be much faster than reading it from each individual file. You may not even need an index on the table.

Re: Creating a custom search index?

Probably about 1000 images to start.  I guess I could do some benchmarking to see how much adding a database index to the index table would help.  I guess I was hoping for something really slick.
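
For a rough idea of what such a benchmark could look like, here is a minimal sketch using Ruby's standard Benchmark module. It compares scanning unindexed (tag, image location) rows against a hash lookup. The tag names, file paths, and row counts are made up for illustration, and an in-memory array only approximates what a real database table would do.

```ruby
require "benchmark"

# Hypothetical data: 1000 images with two made-up tags each,
# stored as [tag, image location] rows like the proposed table.
rows = []
1000.times do |i|
  rows << ["tag#{i % 100}", "images/img#{i}.jpg"]
  rows << ["tag#{(i + 1) % 100}", "images/img#{i}.jpg"]
end

# Hash index: tag => array of image locations.
index = {}
rows.each { |tag, path| (index[tag] ||= []) << path }

# Both approaches should find the same images for a tag.
scan_hits = rows.select { |tag, _| tag == "tag42" }.map { |_, path| path }
hash_hits = index.fetch("tag42", [])

# Repeat each lookup 1000 times so the difference is measurable.
Benchmark.bm(12) do |bm|
  bm.report("linear scan") { 1000.times { rows.select { |tag, _| tag == "tag42" } } }
  bm.report("hash lookup") { 1000.times { index.fetch("tag42", []) } }
end
```

The absolute timings will vary by machine; the point is only to see whether the unindexed scan is already fast enough at this size.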

I came up with the idea of serializing a Ruby hash to a flat file right after I posted.  Here's what I came up with.

class HashSearch
  # index maps each tag to an array of pointers (image locations).
  def initialize(index)
    @index = index
  end

  # Returns the pointers matching any term in q, most matches first.
  def search(q)
    terms = q.split(/[\s,]+/)
    results = Hash.new(0)
    terms.each do |term|
      pointers = @index[term]
      next unless pointers
      pointers.each { |pointer| results[pointer] += 1 }
    end
    results.sort_by { |_, count| -count }.map(&:first)
  end
end

# Test data: 1000 tags, each listing the same six pointers.
index = {}
1000.times { |i| index["tag#{i}"] = ["pointer1", "pointer2", "pointer4", "pointer3", "pointer5", "pointer6"] }
hs = HashSearch.new(index)
hs.search("tag1")


I added weighting, so the more of the search terms a pointer shows up under, the higher it ranks.  It seems to run pretty fast.
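
To make the weighting concrete, here is a small worked example. The `rank` helper and the file names are hypothetical stand-ins rather than the class above, and ties are broken alphabetically here only so the result is predictable.

```ruby
# Hypothetical index: tag => image locations carrying that tag.
index = {
  "beach"  => ["a.jpg", "b.jpg"],
  "sunset" => ["b.jpg", "c.jpg"]
}

# Count how many query terms each pointer appears under,
# then sort by that count, highest first.
def rank(index, query)
  counts = Hash.new(0)
  query.split(/[\s,]+/).each do |term|
    index.fetch(term, []).each { |pointer| counts[pointer] += 1 }
  end
  # Tie-break on the pointer name so equal counts come out in a stable order.
  counts.sort_by { |pointer, count| [-count, pointer] }.map(&:first)
end

ranked = rank(index, "beach, sunset")
p ranked  # => ["b.jpg", "a.jpg", "c.jpg"]
```

b.jpg matches both terms, so it comes first; a.jpg and c.jpg match one term each.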

I'll play around some more and see what's fastest.
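
The flat-file part is not shown above, so here is one way it might look, as a sketch only. It assumes the tags have already been read out of each image's comment field (the `tags_by_image` hash is made-up sample data) and uses Ruby's built-in Marshal to write and reload the hash. The file name is hypothetical too.

```ruby
require "tmpdir"

# Hypothetical input: image location => tags already parsed from its comment field.
tags_by_image = {
  "images/beach.jpg"  => ["beach", "sunset"],
  "images/harbor.jpg" => ["boat", "sunset"]
}

# Invert it into the search index: tag => [image locations].
index = {}
tags_by_image.each do |path, tags|
  tags.each { |tag| (index[tag] ||= []) << path }
end

# Serialize the hash to a flat file (a temp directory is used here for the demo).
index_file = File.join(Dir.mktmpdir, "tag_index.bin")
File.binwrite(index_file, Marshal.dump(index))

# Later, load it back instead of re-reading every image.
loaded = Marshal.load(File.binread(index_file))
p loaded["sunset"]  # => ["images/beach.jpg", "images/harbor.jpg"]
```

Marshal is compact and fast, but it should only be used to load files you wrote yourself, and the file has to be rebuilt whenever image tags change.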