Bash: awk to extract Nth match from file based on line separator

If you need to extract the Nth occurrence of a match in a file that has well-defined block separators, awk provides a convenient way to express the extraction.

For example, a chained PEM certificate file contains multiple certificate definitions, each with distinct starting and ending marker lines. Here is how you would extract the second certificate.

awk '/-----BEGIN CERTIFICATE-----/&&++k==2,/-----END CERTIFICATE-----/' my-chained.pem
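
If you want to choose N without editing the pattern each time, the count can be passed in as an awk variable. The following is only a sketch of that idea: the function name nth_cert and its argument order are hypothetical and are not part of the example script referenced later in this post.

```shell
# nth_cert is a hypothetical helper name used only for illustration.
# Usage: nth_cert N FILE
nth_cert() {
  awk -v n="$1" '
    /-----BEGIN CERTIFICATE-----/ { k++ }           # count each starting marker
    k == n                                          # print every line while inside the Nth block
    /-----END CERTIFICATE-----/ && k == n { exit }  # stop reading after the Nth ending marker
  ' "$2"
}
```

Calling nth_cert 2 my-chained.pem should print the same lines as the one-liner above, and the exit avoids reading the rest of the file once the block has been printed.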

As another example, in a multi-document YAML file, a line containing only three dashes marks the start of a new document. Because the beginning and ending markers are the same, a single awk range cannot isolate one document; to isolate the second match, use the command below.

# second match isolated
awk '/---/&&++k==2,/FOOBAR/' my-multidoc.yaml | tail -n+2 | awk '//&&++k==1,/---/'

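‘FOOBAR’ above does not hold any special meaning; it is simply line content that is unlikely to be found, so the first awk captures all the way to the end of the file. The tail -n+2 then drops the leading separator line, and the final awk prints from the first remaining line up to and including the next separator.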
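
The same result can also be approximated in a single awk pass by counting separator lines instead of chaining ranges. This is a sketch under stated assumptions, not the method used in the example script: the function name nth_yaml_doc is hypothetical, and it assumes a separator is a line consisting of exactly three dashes. Unlike the pipeline above, it does not print the trailing separator line.

```shell
# nth_yaml_doc is a hypothetical helper name used only for illustration.
# Usage: nth_yaml_doc N FILE
# Assumption: a separator is a line that is exactly "---" with nothing else on it.
nth_yaml_doc() {
  awk -v n="$1" '
    /^---$/ { k++; if (k > n) exit; next }  # count the separator, skip printing it, stop once past N
    k == n                                  # print lines that follow the Nth separator
  ' "$2"
}
```

Calling nth_yaml_doc 2 my-multidoc.yaml prints the lines that follow the second separator. Anchoring the pattern with ^ and $ also avoids matching lines that merely contain three dashes somewhere in their content.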

Example Script

I’ve posted an example script on GitHub, awk_nth_match.sh.

wget https://raw.githubusercontent.com/fabianlee/blogcode/master/bash/awk_nth_match.sh
chmod +x awk_nth_match.sh

# illustrate use of awk to pull Nth occurrence
./awk_nth_match.sh

REFERENCES

yaml.org, multi-document YAML

GitHub fabianlee, example using awk to pull Nth match