Re: Parsing data with ruby

So, you need to identify similar crontab entries across all the log files.

i.e.

if server1.log has:

10,25,40,55 * * * * /usr/local/cpanel/whostmgr/bin/dnsqueue > /dev/null 2>&1

and server2.log has:

5,20,35,50 * * * * /usr/local/cpanel/whostmgr/bin/dnsqueue > /dev/null 2>&1

You have to determine that these are the same crontab entry, based on the fact that

/usr/local/cpanel/whostmgr/bin/dnsqueue > /dev/null 2>&1

is common between BOTH entries?

Joe got a job, on the day shift, at the Utility Muffin Research Kitchen, arrogantly twisting the sterile canvas snout of a fully charged icing anointment utensil.

Re: Parsing data with ruby

Yes, that's correct. :)

This part is really where I'm lost, as I'm not certain how I would go about pulling this off. I know we can grab the entire line, but how do we actually grab just the command being executed and then compile a sum of how many times said line occurs, while still retaining the previous form of the script where it gives the Y and N entries on times?

Re: Parsing data with ruby

OK, you have to parse the line and treat everything after the fifth space as one thing: the command.
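The "everything after the fifth space" idea can be tried in isolation by giving split a limit. This is only a sketch, separate from the script in this post: the names time_fields and cmd are made up for the example, the sample line is the dnsqueue entry from earlier in the thread, and the splat assignment assumes Ruby 1.9 or newer.

```ruby
line = "10,25,40,55 * * * * /usr/local/cpanel/whostmgr/bin/dnsqueue > /dev/null 2>&1\n"

# With a limit of 6, split stops after five separators, so the sixth
# element is the rest of the line: the command, spaces and all.
*time_fields, cmd = line.strip.split(' ', 6)

p time_fields
puts cmd
```

Lines with fewer than six fields (blank lines, comments) are not handled here and would need a length check first.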

Before you do that, you have to iterate over all the files found in the base directory:

cronentries = []  # create an array to hold all cron entries found
croncount = [] # create an array to count how many entries
# these two arrays are in sync,  cronentries[0] contains a cron command
# croncount[0] contains the number of times that entry has been found
basedir = "/allcrontabs"
Dir.new(basedir).entries.each do |logfile|
  # in unix, you have to skip . and .. (current dir and parent dir),  they'll be directories
  # so skip any file you find that is a directory
  # ANYTHING else found is assumed to be a crontab log
  unless File.directory? logfile  
    file = File.new(logfile, "r")
    while (line = file.gets)
      parts = line.split(' ')
      # you now have an array,  parts[0] through parts[4] are the time entries,  
      # parts[5] through parts[?] are the
      # cron command.  You don't know how many parts after part[5] there are,
      # it depends on how many spaces are in the command.  
      # so just merge all entries 5 through ? into one string
      cmd = parts[5,parts.length-5].join(' ')
      # this says: take the elements of parts from index 5 through the last index, and join them back into a string
      # separated by a space; cmd now contains the cron command
      #
      # see if the command has been found before, add it if no,  increment counter if yes
      if cronentries.index(cmd) > 0
         croncount[cronentries.index(cmd)] += 1
      else
         cronentries << cmd
         croncount[cronentries.index(cmd)] = 1
      end
    end
    file.close
  end
end
# now let's see what we've found
cronentries.each do |c|
  puts "#{c} was found #{croncount[cronentries.index(c)]} times"
end

Last edited by BradHodges (2011-11-09 02:32:37)

Re: Parsing data with ruby

Hmm, that's quite a big jump. Thank you for the explanations in the code, though, as that really helps.

Unfortunately, it's bugging out and I'm not quite sure what it wants. If I run the script in the same directory as the files I get this:

[/logs]# ruby cron_parse.rb
cron_parse.rb:26: undefined method `>' for nil:NilClass (NoMethodError)
    from cron_parse.rb:8:in `each'
    from cron_parse.rb:8

If I run it from / I get:

[/]# ruby cron_parse.rb
cron_parse.rb:13:in `initialize': No such file or directory - thoroughbred.log (Errno::ENOENT)
    from cron_parse.rb:13:in `new'
    from cron_parse.rb:13
    from cron_parse.rb:8:in `each'
    from cron_parse.rb:8

If i remove the file it complains about, it will just error out on another one. The only modification I've made is to change the basedir to

basedir = "/logs"

Last edited by Striketh (2011-11-09 03:38:28)

Re: Parsing data with ruby

To be expected, I'm just typing code, I'm not testing it :<

change:

     if cronentries.index(cmd) > 0

to:

     if cronentries.index(cmd)
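Why the change works, as a small sketch (the entries array and its contents are made up for the illustration): Array#index returns nil when nothing is found, and nil has no > method, which is the NoMethodError from the traceback. In Ruby 0 is truthy, so testing the result directly also covers a match at position 0, which "> 0" would have missed.

```ruby
entries = ["upcp", "cpbackup"]

p entries.index("upcp")      # 0, which counts as true in Ruby
p entries.index("missing")   # nil

# nil has no > method, so comparing the result with > 0 raises NoMethodError.
begin
  entries.index("missing") > 0
rescue NoMethodError => e
  puts "NoMethodError: #{e.message}"
end

# Testing the result directly works for every position, including index 0.
found_first = entries.index("upcp") ? true : false
found_none  = entries.index("missing") ? true : false
```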

Last edited by BradHodges (2011-11-09 10:48:53)

Re: Parsing data with ruby

Here is a new and improved version, without comments,  but with the two previous programs combined into the final solution.

crons = []
counts = []
cparts = []
basedir = "/allcrontabs"
def YorN(part)
  if part == "*"
    "N"
  else
    "Y"
  end
end
Dir.new(basedir).entries.each do |logfile|
  unless File.directory? logfile  
    file = File.new(logfile, "r")
    while (line = file.gets)
      parts = line.split(' ')
      cmd = parts[5,parts.length-5].join(' ')
      idx = crons.index(cmd)
      if idx
         counts[idx] += 1
      else
         crons << cmd
         idx = crons.index(cmd)
         counts[idx] = 1
         cparts[idx] = parts[0,5] # an Array element containing another Array !
      end
    end
    file.close
  end
end
# OUTPUT results
puts "Cronjob  # Servers  Min  Hour  DOM  Month DOW"
crons.each do |c|
  idx = crons.index(c)
  puts "#{crons[idx]} #{counts[idx]} #{YorN(cparts[idx][0])} #{YorN(cparts[idx][1])} #{YorN(cparts[idx][2])} #{YorN(cparts[idx][3])} #{cparts[idx][4]} "
end

There is lots of room for improvement: keeping three separate arrays (crons, counts, cparts) in sync with an external index (idx) is rather cheesy, but easier to understand at first.
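One possible improvement, sketched here as an assumption rather than as the next step of the script: a single Hash keyed by the command can hold both the count and the time fields, so no index has to be kept in sync. The jobs hash, the :count and :times keys, and the sample lines are invented for the example; the real script would read the lines from the log files.

```ruby
# Sample lines standing in for the contents of the .log files.
lines = [
  "10,25,40,55 * * * * /usr/local/cpanel/whostmgr/bin/dnsqueue > /dev/null 2>&1",
  "5,20,35,50 * * * * /usr/local/cpanel/whostmgr/bin/dnsqueue > /dev/null 2>&1",
  "0 3 * * * /usr/local/cpanel/scripts/cpbackup",
]

jobs = {}
lines.each do |line|
  parts = line.split(' ')
  next if parts.length < 6            # skip blank or malformed lines
  cmd = parts[5..-1].join(' ')
  # The first time a command is seen, remember its time fields.
  jobs[cmd] ||= { :count => 0, :times => parts[0, 5] }
  jobs[cmd][:count] += 1
end

jobs.each do |cmd, info|
  flags = info[:times].map { |t| t == "*" ? "N" : "Y" }.join(' ')
  puts "#{info[:count]} #{flags} #{cmd}"
end
```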

Last edited by BradHodges (2011-11-09 11:17:45)

Re: Parsing data with ruby

Hey, thanks. I'm getting the following error now for this piece of code:

cmd = parts[5,parts.length-5].join(' ')
cron_parse.rb:19: undefined method `join' for nil:NilClass (NoMethodError)
  from cron_parse.rb:14

I've been trying to figure it out, but the error doesn't tell me a lot and I haven't been able to find an actual explanation of what that error specifically means, or why that wouldn't work. So, I'm in the dark as to what it actually wants to fix it.

Re: Parsing data with ruby

gotta drive son to school, back in 30 min

Re: Parsing data with ruby

The method join is being called on nil,  which means

parts[5,parts.length-5]

is evaluating to nil.

Now we have to debug:

      if parts[5,parts,length-5]
         cmd = parts[5,parts.length-5].join(' ')
      else
         puts "Error on :#{line}"
      end
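For reference, a small sketch of when that slice comes back nil. The sample strings are made up; the behaviour shown is that of Array#[] with a start and a length: a start index past the end of the array gives nil, a start index equal to the array length gives an empty array.

```ruby
short = "# comment".split(' ')             # 2 fields
five  = "0 3 * * *".split(' ')             # exactly 5 fields, no command
full  = "0 3 * * * /bin/true".split(' ')   # 6 fields

p short[5, short.length - 5]   # nil: start index 5 is past the end of the array
p five[5, five.length - 5]     # []: start index equals the array length
p full[5, full.length - 5]     # ["/bin/true"]
```

So any line with fewer than five space-separated fields (a blank line, a comment, a line from a file that is not a crontab at all) would trigger the error.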

Re: Parsing data with ruby

Looks like the new code generates an error too. I tried defining it in the code, but if I define length then it gives an error for - in the same line, which probably means I'm doing it wrong :P

cron_parse.rb:19: undefined local variable or method `length' for main:Object (NameError)
  from cron_parse.rb:14

And that's this piece of code, for reference:

if parts[5,parts,length-5]

Re: Parsing data with ruby

you put a comma instead of a period

if parts[5,parts.length-5]

Re: Parsing data with ruby

Ahh!

So, the results were rather crazy after changing that. It resulted in mostly gibberish and my SSH client freaked out. Here's how it looks:

Error on :#!/usr/bin/ruby
Error on :
Error on :crons = []
Error on :counts = []
Error on :cparts = []
Error on :basedir = "/home/rcraft/server_crons"
Error on :def YorN(part)
Error on :  if part == "*"
Error on :    "N"
Error on :  else
Error on :    "Y"
Error on :  end
Error on :end
Error on :Dir.new(basedir).entries.each do |logfile|
Error on :  unless File.directory? logfile
Error on :    file = File.new(logfile, "r")
Error on :    while (line = file.gets)
Error on :      parts = line.split(' ')
Error on :       if parts[5,parts.length-5]
Error on :         cmd = parts[5,parts.length-5].join(' ')
Error on :      else
Error on :         puts "Error on :#{line}"
Error on :      end
Error on :      idx = crons.index(cmd)
Error on :      if idx
Error on :         counts[idx] += 1
Error on :      else
Error on :         crons << cmd
Error on :         idx = crons.index(cmd)
Error on :         counts[idx] = 1
Error on :      end
Error on :    end
Error on :    file.close
Error on :  end
Error on :end
Error on :# OUTPUT results
Error on :crons.each do |c|
Error on :  idx = crons.index(c)
Error on :end
Error on :PK... [binary data snipped; it appears to be a zip archive containing adonis.log]

And so on. Is it supposed to do that?

Re: Parsing data with ruby

Haha,  your program is in the same directory as the logs!!

OK, so we'll change the program to ONLY open files with a '.log' extension:

crons = []
counts = []
cparts = []
basedir = "/allcrontabs"
def YorN(part)
  if part == "*"
    "N"
  else
    "Y"
  end
end
Dir.new(basedir).entries.each do |logfile|
  unless File.directory? logfile
    if logfile.split('.')[1] == 'log'
      file = File.new(logfile, "r")
      while (line = file.gets)
        parts = line.split(' ')
        if parts[5,parts.length-5]
          cmd = parts[5,parts.length-5].join(' ')
          idx = crons.index(cmd)
          if idx
             counts[idx] += 1
          else
            crons << cmd
            idx = crons.index(cmd)
            counts[idx] = 1
            cparts[idx] = parts[0,5] # an Array containing another Array !
        else
          puts "Error on: #{line} in file #{logfile}"
        end
      end
      file.close
    end
  end
end
# OUTPUT results
puts "Cronjob  # Servers  Min  Hour  DOM  Month DOW"
crons.each do |c|
  idx = crons.index(c)
  puts "#{crons[idx]} #{counts[idx]} #{YorN(cparts[idx][0])} #{YorN(cparts[idx][1])} #{YorN(cparts[idx][2])} #{YorN(cparts[idx][3])} #{cparts[idx][4]} "
end
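Another way to pick up only the .log files, shown as a sketch and not as what the script in this post does: Dir.glob with a pattern built by File.join returns full paths, so the result does not depend on which directory the script is run from. The temporary directory and the two sample files are stand-ins created only for the example, and File.write assumes Ruby 1.9.3 or newer.

```ruby
require 'tmpdir'

log_names = []
Dir.mktmpdir do |basedir|   # temporary stand-in for the real log directory
  File.write(File.join(basedir, "server1.log"), "0 3 * * * /bin/true\n")
  File.write(File.join(basedir, "cron_parse.rb"), "puts 'not a log'\n")

  # Only paths ending in .log match; each path already includes basedir.
  Dir.glob(File.join(basedir, "*.log")).each do |path|
    next if File.directory?(path)
    log_names << File.basename(path)
  end
end

p log_names
```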

Re: Parsing data with ruby

I thought I was supposed to run it from that directory D: Looks like we're getting closer. I removed the "puts "Error on:" line as it was generating an error about expecting $Kend and re-ran it with the following results.

$ ruby cron_parse.rb 
Cronjob  # Servers  Min  Hour  DOM  Month DOW
/usr/local/cpanel/scripts/upcp --cron 180 Y Y N N * 
/usr/local/cpanel/scripts/cpbackup 180 Y Y N N * 
/usr/bin/test -x /usr/local/cpanel/bin/tail-check && /usr/local/cpanel/bin/tail-check 180 Y N N N * 
/usr/local/cpanel/bin/mysqluserstore >/dev/null 2>&1 180 Y Y N N * 

I snipped some off as it was pretty long.

Last edited by Striketh (2011-11-09 13:04:36)

Re: Parsing data with ruby

Hmmm,  all the counts are 180,  does that seem right?

Re: Parsing data with ruby

I fixed the problem with the puts "Error...

And I changed the format of the output, easier to read I think.

I also added the YorN on the last time element, it was missing

crons = []
counts = []
cparts = []
basedir = "/allcrontabs"
def YorN(part)
  if part == "*"
    "N"
  else
    "Y"
  end
end
Dir.new(basedir).entries.each do |logfile|
  path = File.join(basedir, logfile) # resolve against basedir, not the current directory
  unless File.directory? path
    if logfile.split('.')[1] == 'log'
      file = File.new(path, "r")
      while (line = file.gets)
        parts = line.split(' ')
        if parts[5,parts.length-5]
          cmd = parts[5,parts.length-5].join(' ')
          idx = crons.index(cmd)
          if idx
             counts[idx] += 1
          else
            crons << cmd
            idx = crons.index(cmd)
            counts[idx] = 1
            cparts[idx] = parts[0,5] # an Array containing another Array !
          end
        else
          puts "Error on: #{line} in file #{logfile}"
        end
      end
      file.close
    end
  end
end
# OUTPUT results
puts "# Servers  Min  Hour  DOM  Month DOW  Cronjob"
crons.each do |c|
  idx = crons.index(c)
  puts "#{counts[idx]} #{YorN(cparts[idx][0])} #{YorN(cparts[idx][1])} #{YorN(cparts[idx][2])} #{YorN(cparts[idx][3])} #{YorN(cparts[idx][4])} #{crons[idx]}"
end

Last edited by BradHodges (2011-11-09 13:20:27)

Re: Parsing data with ruby

Ahh, that's much easier to read. It looks like the count is correct, from what I can see (there are 189 log files in total). One shows 334 for the count, but that's because that particular cron job is duplicated on a lot of servers somehow, so there are 2 entries in the crontab.

Output looks like this:

$ ruby cron_parse.rb 
# Servers  Min  Hour  DOM  Month DOW  Cronjob
4 N N N N N /root/staydown.sh &>/dev/null
180 Y Y N N N /usr/local/cpanel/scripts/upcp --cron
180 Y Y N N N /usr/local/cpanel/scripts/cpbackup
180 Y N N N N /usr/bin/test -x /usr/local/cpanel/bin/tail-check && /usr/local/cpanel/bin/tail-check

Also, there was a small typo in the 2nd to last line.

#{crons[ids]

Should have been:

#{crons[idx]

So, hey, at least I can find typos :D

This looks awesome, though. I've really learned a lot doing this - A LOT more than I would have by just reading a book about ruby.

My goal is to code something entirely by myself now (not as elaborate as this, though, to start) and go from there! So far I'm really liking ruby and how, for the most part, easy it is to understand.

For example, the "def YorN(part)" line is pretty straightforward. Even if I hadn't the slightest knowledge of programming, I could ascertain what that meant.

And then "Dir.new" and "unless File.directory? logfile" are both very straightforward.

I really can't thank you enough for all your help in this. I feel like I should be paying you for the ruby lesson :P

Re: Parsing data with ruby

Great.

No thanks needed,  doing this keeps me involved, I've been programming for 31 years, and enjoy it.  I don't have anything to program right now,  so this is my way of finding something to code!

If you want to learn more,  I'd start by improving on the YorN function,  make it parse the time element and figure out more about what the element is telling cron, i.e.

# servers     Minutes           Hours            Day of Month          Month         Day of Week     CronJob
180           Every 10 minutes  N                N                     N             N               whatever
180           N                 10 AM and 5 PM   N                     N             N               whatever
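A possible starting point for that exercise, offered as a rough sketch only. The helper name describe_field, its unit argument, and the wording of the descriptions are all invented for the example; it covers just "*", "*/n", comma lists, simple ranges, and single numbers, and passes anything else through unchanged.

```ruby
# Hypothetical replacement for YorN: describe one crontab time field.
# unit is a singular word such as "minute" or "hour".
def describe_field(field, unit)
  case field
  when "*"
    "N"                                          # no restriction, as before
  when %r{\A\*/(\d+)\z}
    "Every #{$1} #{unit}s"                       # step value, e.g. */10
  when /\A\d+(,\d+)+\z/
    "At #{unit}s #{field.split(',').join(', ')}" # list, e.g. 10,17
  when /\A(\d+)-(\d+)\z/
    "#{unit.capitalize}s #{$1} through #{$2}"    # range, e.g. 1-5
  when /\A\d+\z/
    "At #{unit} #{field}"                        # single value
  else
    field                                        # anything not recognised
  end
end

puts describe_field("*/10", "minute")
puts describe_field("10,17", "hour")
puts describe_field("*", "month")
```

Turning hours into "10 AM and 5 PM" or handling month and weekday names would be further steps on top of this.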

Last edited by BradHodges (2011-11-09 13:34:58)

Re: Parsing data with ruby

Wow, 31 years! No wonder this comes as second nature to you :D

I've struggled trying to learn PHP/Perl in the past and just couldn't get them, but then I latched onto ruby and it's been something that I've become genuinely interested in learning ever since. So, you should be seeing me around this part of the forums fairly often :)

I'm going to keep hitting the books and once I feel confident enough to come back to this script, I'll continue to work on it and let you know how it turns out!