I wonder whether the following is possible:

echo -e "0@1 1@1 0@0\n0@0 1@1 0@1" | awk '{print gensub(/([01])@([01])/, "\\1" + "\\2", "g")}'

It doesn't work the way it is; is that because the evaluation of "+" happens before the substitutions of "\1" and "\2"?

As output, I would expect 1, the result of arithmetic on \1 and \2, so for \1=0 and \2=1, the output should be 1.

Also, as per answer below, I am not looking for a solution on how to add 1 and 0 in "1@0"; this is just an example, I just wondered whether it is possible to do arithmetic on \1 and \2, since this works: gensub(/blah blah/, 0 + 1, "g") gives 1.

It would be better to state the expected output. Is it 0+1 as a string, or 1 as a number? – George Vasiliou
It would help if you could show us the output you are looking for. Please add the expected output, in code tags, to your post. – RavinderSingh13

3 Answers

You can't use gensub() for this, because it inserts the captured groups into its result as literal strings; it never evaluates them as an expression.

For such a simple requirement, use @ as the field separator and do the arithmetic directly:

echo "0@1" | awk -F@ '{print ($1 + $2)}'

Or if you are worried about string values in the input string, force the numeric conversion using int() casting, or just add +0 to each of the operands, i.e. use (int($1) + int($2)) or (($1+0) + ($2+0))
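As a small sketch of the +0 coercion described above (the function name add_fields is made up for illustration and is not part of the original answer): a field that does not look like a number is treated as 0 in the sum.

```shell
# Hypothetical helper: sum the two @-separated fields of each input line.
# Adding 0 forces awk to treat each field as a number.
add_fields() {
  awk -F@ '{ print ($1 + 0) + ($2 + 0) }'
}

echo "0@1"   | add_fields   # both fields numeric: prints 1
echo "3@abc" | add_fields   # "abc" converts to 0: prints 3
```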

As per the updated question and the comments on the answer below: doing constant numeric arithmetic is not what gensub() is intended for. It is meant for regexp-based search and replacement, and in most cases the replacement part deals with the groups captured from the search string and applies some modification to them.

up vote 1 down vote accepted

I think I understand what you want, and you can do it in Perl using the e modifier on a substitution, which makes Perl evaluate the replacement as an expression. Here's an example:

echo "7@302" | perl -nle 's/(\d+)@(\d+)/$1+$2/e && print'
309

Or, slightly more fun:

echo "The 200@109 cats sat on the 7@302 mats" | perl -nle 's/(\d+)@(\d+)/$1+$2/ge && print'
The 309 cats sat on the 309 mats
I don't know any perl, but I thought anything perl can do, awk can do as well? No? – A. Blizzard
Nice! Can't up vote no privilege. – A. Blizzard
I may get shot down in flames for this, but my perceived order of capability going from lowest to highest is: sed, awk, Perl. – Mark Setchell
@A.Blizzard Mark is correct. awk is a language for manipulating text, and that is all. perl can also manipulate text, but additionally it can do the things you'd use other tools/languages for, e.g. manipulating files and processes as you would with a shell. The result is that awk is a very small, simple tool/language that does one thing and does it well, while perl is something quite different (see zoitz.com/archives/13 :-) ). – Ed Morton

When you write foo(bar()), bar() is evaluated first, whether it's a function call or any other expression. So gensub(..., "\\1" + "\\2", ...) calls gensub() with the result of adding the 2 strings. Neither string looks like a number, so each converts to 0 and the sum is 0, i.e. you are calling gensub(..., 0, ...).
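
A quick way to see this, as a sketch: evaluate the replacement expression on its own, outside gensub(). The variable name replacement is only for illustration, and the second command is skipped when gawk is not installed.

```shell
# The replacement expression by itself: the strings \1 and \2 are not
# numeric, so each converts to 0 and awk prints 0.
replacement=$(awk 'BEGIN { print "\\1" + "\\2" }')
echo "$replacement"

# What gensub() therefore receives: a constant 0 as the replacement.
# gensub() is a GNU awk extension, hence the guard.
if command -v gawk >/dev/null 2>&1; then
  echo "0@1 1@1" | gawk '{ print gensub(/([01])@([01])/, 0, "g") }'   # prints 0 0
fi
```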

This isn't semantically identical to the code you wrote but the approach to do what you want is to use the 3rd arg to match():

$ echo "0@1" | awk 'match($0,/([01])@([01])/,a){print a[1] + a[2]}'
1

The above uses GNU awk for that 3rd arg to match() but you were already using that for gensub() anyway. If it's not clear how to use that on your real data then post a followup question that includes an example of your real data.
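
For lines that hold several pairs which should be replaced in place, one possible sketch is a loop over match(). This is not the answer's own method: it swaps the GNU-specific 3rd argument for the standard RSTART and RLENGTH variables so that it runs in any POSIX awk, and the function name sum_pairs_awk is made up for illustration.

```shell
# Hypothetical helper: replace every [01]@[01] pair on each line with its sum.
sum_pairs_awk() {
  awk '{
    out = ""      # text processed so far
    rest = $0     # text still to be scanned
    while (match(rest, /[01]@[01]/)) {
      pair = substr(rest, RSTART, RLENGTH)   # e.g. "0@1"
      split(pair, a, "@")                    # a[1] = "0", a[2] = "1"
      out = out substr(rest, 1, RSTART - 1) (a[1] + a[2])
      rest = substr(rest, RSTART + RLENGTH)  # continue after the match
    }
    print out rest
  }'
}

printf '0@1 1@1 0@0\n0@0 1@1 0@1\n' | sum_pairs_awk
# prints:
# 1 2 0
# 0 2 1
```

Text around the pairs is copied through unchanged, so the helper only rewrites the matched pairs.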

Thanks! This looks like it will get pretty cumbersome for more than one set of "0@1" in $0. The above Perl version with "ge" looks very neat though. – A. Blizzard
You're welcome. Why would you think it'd get cumbersome though? – Ed Morton
I've never really used match, I know awk only superficially for basic things; say it was "0@1 1@1 0@0", wouldn't I need to write at least a loop over a? – A. Blizzard
$ echo "0@1 1@1 0@0" | awk -v RS='\\s+' 'match($0,/([01])@([01])/,a){print a[1] + a[2]}' outputs 1, 2 and 0, one per line. Is that your desired output? If you want to know how to do whatever it is you're trying to do in awk and/or perl, then take a few minutes to come up with truly representative sample input and expected output for your real data, and post a question using that so we can best help you. Going one brief snippet of isolated text at a time isn't useful. – Ed Morton
Yes, I really should not have been lazy and put up a proper example. I have modified the sample code in the question. The numbers should replace the "\d@\d" strings. – A. Blizzard
You didn't provide the exact expected output though, and you've already accepted an answer, so the number of people who'll look at your question now is limited, which is why I suggested posting a new question. This time around, provide sample input as a file along with the associated output. Also consider whether your input is really always just space-separated number-@-number pairs: what if those were embedded in text that included email addresses like bill0@1way.com? And is it really just 1s and 0s? Just come up with realistic input and the associated output. – Ed Morton
Mark Setchell's answer (the second line, with the "ge" option) really does the exact job! I won't post another question, as I am sure there are other, better problems to look at. It was just something that occurred to me late last night as I was converting genotype data of the form "0|1", "1|0", "0|0", "1|1" into the counts 1, 1, 0, 2. I had always used either condition statements or three gsub statements in awk. As I was trying to read up more on awk, I thought gensub could be a neat solution, although I am sort of aware that it would probably have been slower than the three gsubs. – A. Blizzard
The files are about 12 million rows and 250 columns, the above perl solution ploughed through it fairly fast, and I am happy with it. I modified the separators etc. – A. Blizzard
That's fine, if you ever do want to see how to do the job in awk, just post a question. – Ed Morton
