Topic: Arrays vs Queries in loops?

I am a newbie to RoR and am porting a web app I wrote in Delphi. One of the main program functions requires many loops over subsets of data. Delphi offers a range method on a client dataset. Essentially, I can restrict the range of records viewed from a query, resetting the range parameters on each pass. Each pass progressively reduces the number of visible records, but the range is often reset to the full view and the loop begins again.

My question is, would it be faster to use select on an array of hashes

(from games = Game.where(:league_id => league_id))

{"games"=>[{:game_date=>"7/15/2011", :round_number=>"1", :team_1=>"Aces", :team_2=>"Blues"},
           {:game_date=>"7/22/2011", :round_number=>"2", :team_1=>"Aces", :team_2=>"My Team"}]}

games_subset = infoHash["games"].select {|k| k[:round_number].to_i >= a_past_round_number && k[:round_number].to_i <= a_future_round_number }

Alternatively, I could query the db for the games with the desired round numbers. This could occur hundreds of times under some circumstances before I break the loop or it completes.

Last edited by markhorrocks (2011-07-20 02:28:00)

Re: Arrays vs Queries in loops?

I cannot say for sure, because it depends on many factors. Generally, however, in-memory operations are much faster than RDBMS ones. Built-in Ruby methods (such as select) are usually well optimized and run fast; a few hundred of those should be well within bounds.

Be aware, though, that one of Ruby's weakest points speed-wise is garbage collection; if you do run tons of these (creating arrays and then dropping them), you may want to consider REE (Ruby Enterprise Edition), which is somewhat optimized in this regard.
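
As a rough illustration of the in-memory approach, here is a minimal sketch that filters an array of hashes by a round range. The sample data and the helper name games_in_rounds are made up for the example, and it assumes the round numbers are stored as strings with symbol keys, as in the question.

```ruby
# Hypothetical sample data, shaped like the hashes in the question.
GAMES = [
  { :game_date => "7/15/2011", :round_number => "1", :team_1 => "Aces",  :team_2 => "Blues" },
  { :game_date => "7/22/2011", :round_number => "2", :team_1 => "Aces",  :team_2 => "My Team" },
  { :game_date => "7/29/2011", :round_number => "3", :team_1 => "Blues", :team_2 => "My Team" }
]

# games_in_rounds is a hypothetical helper, not part of Rails or Ruby.
# It returns a new array and leaves the full list untouched, so the
# "range" can be reset on the next pass simply by calling it again.
def games_in_rounds(games, first_round, last_round)
  games.select do |game|
    round = game[:round_number].to_i # stored as a string, so convert first
    round >= first_round && round <= last_round
  end
end

subset = games_in_rounds(GAMES, 2, 3)
puts subset.map { |game| game[:game_date] }.inspect
```

Each call allocates a fresh array, which is the create-and-drop pattern mentioned above; with a few hundred passes over a league's worth of games that is unlikely to matter much.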

Re: Arrays vs Queries in loops?

Thanks for the tip. I also need to get an array of teams and then calculate some data and update 2 keys in the array, then sort the array by the keys. Is there any preference for sorted = teams.sort_by instead of teams.sort_by! ?

Re: Arrays vs Queries in loops?

I'm not sure about the internals of Ruby. The !-versions work in place (that is, they modify the receiver instead of returning a copy) and can therefore, in theory, be implemented without allocating more memory. I don't know, though, how this is actually coded internally.
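
To make the behavioural difference concrete, here is a small sketch. The team data and the :points and :goal_diff keys are invented for the example; it only shows what each call returns and says nothing about memory use.

```ruby
# Hypothetical team records; the key names are assumptions for this example.
teams = [
  { :name => "Aces",    :points => 4, :goal_diff => 3 },
  { :name => "Blues",   :points => 6, :goal_diff => 1 },
  { :name => "My Team", :points => 4, :goal_diff => 5 }
]

# sort_by returns a new array; teams keeps its original order.
# Negating the values sorts highest points first, then highest goal difference.
ranked = teams.sort_by { |team| [-team[:points], -team[:goal_diff]] }
original_order = teams.map { |team| team[:name] }

# sort_by! reorders teams itself and returns that same array object.
same_object = teams.sort_by! { |team| [-team[:points], -team[:goal_diff]] }.equal?(teams)

puts ranked.map { |team| team[:name] }.inspect
puts same_object
```

If the unsorted order is never needed again, the !-version saves keeping a second variable around; otherwise the plain version is the safer default.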

A good general rule is not to optimize early on. Do the thing that feels natural, and when you're done, measure and go after the bottlenecks if performance is not satisfactory.
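
One way to check is Ruby's standard Benchmark module. The sketch below times 500 in-memory selects over made-up data; the array size, pass count, and round range are arbitrary assumptions, and the numbers will vary by machine and Ruby version.

```ruby
require "benchmark"

# Hypothetical data set: 1000 games spread over 30 rounds.
games = (1..1000).map { |i| { :round_number => (i % 30 + 1).to_s } }

# Time 500 passes of an in-memory range filter.
elapsed = Benchmark.realtime do
  500.times do
    games.select do |game|
      round = game[:round_number].to_i
      round >= 5 && round <= 10
    end
  end
end

puts "500 in-memory selects took #{elapsed} seconds"
```

Wrapping the equivalent database query in the same kind of block gives a like-for-like comparison on your own data.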