Topic: Database update/new for large text file *really* slow

I have a problem that I'm really struggling to get to the bottom of. I have a script which you feed a fixed width formatted file (I have no choice on this format - it's from an external source) and updates a large database table accordingly. If it finds an item it updates it and if it doesn't it creates a new record. Fairly routine stuff. Here's the code for the action:

  def update_foo
    @string = "UPDATE PROCESSED:<br/>"

    if params[:foo] == "5"
      if File.exist?(params[:filepath])
        File.open(params[:filepath]) { |handle|
          while line = handle.gets do

              @columns = Hash.new
             
              @columns["foo1"] = line[8..25]  # string
              @columns["foo2"] = line[26..28]  # string
              @columns["foo3"] = (line[46..56].to_f)/100  #number(float)
              @columns["foo4"] = (line[57..67].to_f)/100  #number(float)
              @columns["foo5"] = line[80..119]  # string
              @columns["foo6"] = line[120..137]  # string
              ..etc.
             
              foo = Foo.find(:first, :conditions => ["foo = ? AND bar_id = ?", @columns["foo1"], params[:bar].to_i])
             
              if !foo.nil?
                foo_update = Foo.update(foo.id, @columns)               
              else
                p = Foo.new
                for column in Foo.columns
                  if @columns[column.name] != nil
                    p[column.name] = @columns[column.name]
                  end
                end
                p.bar_id = params[:bar]
                p.save
              end
          end
        }

      else
        @string += "File does not exist!"
      end
    end 
  end


On the whole very few new records are created, so largely it's updating the table all the time. It's taking forever. I think I timed it at 555 secs for 2000 records. That's not much help whne I'm chewing on a 90,000 record file!

Last edited by svpirate (2007-07-18 10:58:00)

Re: Database update/new for large text file *really* slow

Unfortunately, you are coming to the grim realization that ActiveRecord is not optimized for high volume DB activity.

However, there are some things that you can do:

1) You may be able to use the AR::Base update_all method to combine the find and update that you're doing into one action.  Something like:

  Foo.update_all(<built set string>, <built condition string>)

2) You could possibly use some sort of bulk insert facility to handle the inserts (akin to the BULK INSERT command in SQL Server).  Unfortunately, this will bind you directly to the DB platform.  But, it will probably be fast.

3) However, you stated that most of your action are UPDATEs, so #2 might not help so much.  You could also drop down into the DBI level of things and use prepared statements to handle your UPDATEs and INSERTs.  When you're talking about 90000 executions, parsing the SQL once instead of 90000 times is a big deal.  One of my biggest complaints about AR is the lack of support for prepared statements. 

Wes

Re: Database update/new for large text file *really* slow

weyus wrote:

When you're talking about 90000 executions, parsing the SQL once instead of 90000 times is a big deal.  One of my biggest complaints about AR is the lack of support for prepared statements. 

Wes

Ahmen!!!