Topic: Help with some RegExp's

I'm having trouble with two regular expressions and was wondering if anyone here can help:

This one works, but is more brittle than I'd like:

  def get_name_from_label(label)
    regexp = /<label(.*?)for="(.*?)"(.*?)>#{label}<\/label>/
    md = regexp.match(response.body)
    assert false, "No label found with for=\"#{label}\"."  if md.nil?
    regexp = /<input(.*?)id="#{md[2]}"(.*?)name="(.*?)"(.*?) \/>/
    md = regexp.match(response.body)
    assert false, "No input found linked to visual label:\"#{label}\"."  if md.nil?
    md[3]  # group 3 is the name attribute; group 2 is only the text between id and name
  end

Line 5 (the input regexp) works, but relies on the fact that all the rails helpers output the id before the name.  Is there a good way to allow either order and still be able to pull out the name field?

The more confusing one to me:

  def get_destination_for_button(button_text)
    regexp = /<form(.*?)action="(.*?)"(.*?)>(.*?)<input(.*?)type="submit"(.*?)value="#{button_text}"(.*?)>/
    md = regexp.match(response.body)
    assert false, "No button found with button text:\"#{button_text}\"."  if md.nil?
    md[2]
  end

I thought that the regexp here should match input of this sort, but it fails.  I've tried making the 4th group greedy in case it was stopping on the first input field, but that still failed.
<form action="/account/new_account_step_one" method="post">
<!--[snip out form contents]-->
<input name="commit" type="submit" value="Next Step" />
</form>

Any advice?

My RoR journey  -- thoughts on learning RoR and lessons learned in applying TDD and agile practices.

Re: Help with some RegExp's

I know you were having trouble with assert_select. Is that the reason you aren't using it? Regular expressions aren't the best tool for searching HTML like this.

NielsenE wrote:

regexp = /<input(.*?)id="#{md[2]}"(.*?)name="(.*?)"(.*?) \/>/

Line 5 (the input regexp) works, but relies on the fact that all the rails helpers output the id before the name.  Is there a good way to allow either order and still be able to pull out the name field?

You may want to split this into multiple regular expressions. Something like this:

response.body.scan(/<input.*?>/) do |input_tag|
  assert(input_tag =~ /\bname=".*?"/, "...")
  assert(input_tag =~ /\bid=".*?"/, "...")
end

NielsenE wrote:

  def get_destination_for_button(button_text)
    regexp = /<form(.*?)action="(.*?)"(.*?)>(.*?)<input(.*?)type="submit"(.*?)value="#{button_text}"(.*?)>/
    md = regexp.match(response.body)
    assert false, "No button found with button text:\"#{button_text}\"."  if md.nil?
    md[2]
  end

I thought that the regexp here should match input of this sort, but it fails.  I've tried making the 4th group greedy in case it was stopping on the first input field, but that still failed.

Try adding the multiline option to the regexp so the period matches across newlines:

regexp = /<form(.*?)action="(.*?)"(.*?)>(.*?)<input(.*?)type="submit"(.*?)value="#{button_text}"(.*?)>/m

I think "m" is the right one.
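
For what it's worth, here is a small sketch of why the flag matters. It uses the form snippet from the original post as a stand-in for response.body (the `html` variable and the other local names are made up for illustration). In Ruby, `.` does not match a newline unless the regexp has the `m` option, so the version without it cannot get from the form tag to the submit button on a later line:

```ruby
# Stand-in for response.body, based on the snippet in the original post.
html = <<~HTML
  <form action="/account/new_account_step_one" method="post">
  <!--[snip out form contents]-->
  <input name="commit" type="submit" value="Next Step" />
  </form>
HTML

button_text = "Next Step"

# Same pattern as in the thread, without the multiline option.
without_m = /<form(.*?)action="(.*?)"(.*?)>(.*?)<input(.*?)type="submit"(.*?)value="#{button_text}"(.*?)>/

# Identical pattern, but "." is now allowed to match newlines.
with_m = Regexp.new(without_m.source, Regexp::MULTILINE)

no_match = without_m.match(html)  # nil: "." stops at the end of the first line
md = with_m.match(html)
destination = md && md[2]         # the captured action attribute

puts destination
```

If button_text can contain characters that are special in a regexp, wrapping it in Regexp.escape before interpolating is probably a good idea.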

Railscasts - Free Ruby on Rails Screencasts

Re: Help with some RegExp's

ryanb wrote:

I know you were having trouble with assert_select. Is that the reason you aren't using it? Regular expressions aren't the best tool for searching HTML like this.

Well, the asserts here are secondary to the rest of the behavior... The more important part is pulling out one parameter of a tag based on another parameter.  (See my thread on Acceptance Testing.)  (Hmm, this might turn into a Rails question, but before it seemed pure Ruby....)


ryanb wrote:
NielsenE wrote:

regexp = /<input(.*?)id="#{md[2]}"(.*?)name="(.*?)"(.*?) \/>/

Line 5 (the input regexp) works, but relies on the fact that all the rails helpers output the id before the name.  Is there a good way to allow either order and still be able to pull out the name field?

You may want to split this into multiple regular expressions. Something like this:

response.body.scan(/<input.*?>/) do |input_tag|
  assert(input_tag =~ /\bname=".*?"/, "...")
  assert(input_tag =~ /\bid=".*?"/, "...")
end

In this case, I have the displayed text between a pair of "label" tags, and I need to get the name of the input.  So I first need to find the label with the listed text, grab its "for", and then use that to find the input by id and pull out the name.  This name is used to construct the parameter hash for the form.  The asserts are only needed to protect later steps if the required label/input can't be found.

Re: Help with some RegExp's

Oh, okay, I thought this was part of a view/functional test.

This might work:

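def get_name_from_label(label)
  # scan yields an array of captures when the pattern has groups,
  # so |(input_id)| unpacks the single "for" value into a string.
  response.body.scan(/<label.*?for="(.*?)".*?>#{Regexp.escape(label)}<\/label>/) do |(input_id)|
    response.body.scan(/<input[^>]+?id="#{Regexp.escape(input_id)}".*?>/) do |input_tag|
      # Look for name anywhere inside the tag, so attribute order doesn't matter.
      name = input_tag[/\bname="(.*?)"/, 1]
      return name if name
    end
  end
  raise "No input found for label \"#{label}\"."
end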

There's probably a better way but that's all I can come up with at the moment.
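
One possible variation, as a rough sketch only: pull each tag's attributes into a hash first, so the order of id and name stops mattering at all. The helper names `attributes_of` and `name_for_label` are made up here, `html` stands in for response.body, and the sketch assumes double-quoted attribute values like the ones the Rails helpers emit.

```ruby
# Hypothetical helper: turn the attributes of one tag into a hash.
# Assumes double-quoted values, e.g. <input id="a" name="b" />.
def attributes_of(tag)
  tag.scan(/([\w:-]+)="(.*?)"/).to_h
end

# Hypothetical helper: label text -> name of the input the label points at.
# Returns nil when the label or a matching input cannot be found.
def name_for_label(html, label)
  label_tag = html[/<label[^>]*>#{Regexp.escape(label)}<\/label>/]
  return nil unless label_tag

  input_id = attributes_of(label_tag)["for"]
  return nil unless input_id

  html.scan(/<input[^>]*>/) do |input_tag|
    attrs = attributes_of(input_tag)
    return attrs["name"] if attrs["id"] == input_id
  end
  nil
end

# Sample markup with name before id, the order the original regexp can't handle.
html = <<~HTML
  <label for="user_login">Login</label>
  <input name="user[login]" id="user_login" type="text" />
HTML

puts name_for_label(html, "Login")  # prints user[login]
```

This still breaks on markup a regexp can't reasonably handle (single-quoted values, a ">" inside an attribute), so a real HTML parser would be the safer route if the pages get more complicated.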
