Bash: Using BASH_REMATCH to pull capture groups from a regex

Inside [[ ]], the =~ binary operator matches a string against a POSIX extended regular expression. When the match succeeds, the text captured by each parenthesized group is stored in the BASH_REMATCH array.

Take note: the right-hand side regex cannot be surrounded by quotes, or it will be treated as a literal string. It also cannot contain unescaped spaces, and it must conform to POSIX regex rules, so use character classes such as [:space:] instead of Perl-style shorthand such as “\s”, which POSIX does not define. A simple example would be:

if [[ "The quick brown fox" =~ ^The.*(fox)$ ]]; then 
  echo "The animal is a ${BASH_REMATCH[1]}"
fi

The animal is a fox
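The quoting rule above can be sketched as follows. This is a minimal illustration, assuming a default Bash 3.2 or later (no compat31 option); the variable names are made up for the example. Holding the regex in a variable is a common way to allow spaces in the pattern, as long as the variable is expanded without quotes.

```shell
#!/usr/bin/env bash
# Sketch: keep the regex in a variable so it can contain spaces.
sentence="The quick brown fox"
pattern='^The quick (brown) fox$'   # quotes here only protect the assignment

# Unquoted expansion: the variable's contents are used as a regex.
if [[ $sentence =~ $pattern ]]; then
  unquoted_result="matched, color is ${BASH_REMATCH[1]}"
fi

# Quoted expansion: the contents are compared as a literal string, so the
# anchors and parentheses lose their special meaning and nothing matches.
if [[ $sentence =~ "$pattern" ]]; then
  quoted_result="matched"
else
  quoted_result="no match"
fi

echo "$unquoted_result"
echo "$quoted_result"
```

matched, color is brown
no match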

A more complex sample uses the character classes [:space:] and [:alpha:]. Notice the double brackets: a class name such as [:space:] is only valid inside a bracket expression, so it must be written as [[:space:]].

if [[ "The quick brown fox" =~ ^The[[:space:]]([[:alpha:]]*)[[:space:]].*fox$ ]]; then 
  echo "The second word in the sentence was '${BASH_REMATCH[1]}'"
fi

The second word in the sentence was 'quick'
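To see how the capture groups line up with array indexes, here is a small sketch that prints every element of BASH_REMATCH after a match. The function name show_groups is hypothetical, invented for this example. Index 0 holds the whole matched text, and indexes 1 and up hold the groups in the order their opening parentheses appear.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each element of BASH_REMATCH after a match.
# Returns non-zero when the text does not match the regex.
show_groups() {
  local text=$1 regex=$2 i
  if [[ $text =~ $regex ]]; then
    for i in "${!BASH_REMATCH[@]}"; do
      printf '%d: %s\n' "$i" "${BASH_REMATCH[$i]}"
    done
  else
    return 1
  fi
}

show_groups "The quick brown fox" '^The[[:space:]]([[:alpha:]]*)[[:space:]].*(fox)$'
```

0: The quick brown fox
1: quick
2: fox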

Another example is pulling cache sizes out of /proc/cpuinfo with multiple capture groups and character classes.

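# IFS= and -r keep each line intact (no whitespace trimming, no backslash processing).
# Note that IFS='\n' would set IFS to a backslash and the letter n, not a newline.
while IFS= read -r line; do
  if [[ "$line" =~ ^cache[[:blank:]]size[[:blank:]]*:[[:blank:]]([[:digit:]]*)[[:space:]]([[:alpha:]]*) ]]; then
    echo "cache set to ${BASH_REMATCH[1]} ${BASH_REMATCH[2]}"
  fi
done < /proc/cpuinfo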

cache set to 16384 KB
cache set to 16384 KB
cache set to 16384 KB
cache set to 16384 KB
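The same idea can be tried on systems without /proc/cpuinfo by wrapping the loop in a function that reads standard input. This is only a sketch: the function name print_cache_sizes and the sample input are made up for illustration, and the regex is a slightly more tolerant variant of the one above (it allows any amount of blank space around the colon and requires at least one digit).

```shell
#!/usr/bin/env bash
# Hypothetical helper: read cpuinfo-style text on stdin and print cache sizes.
print_cache_sizes() {
  local line
  local regex='^cache[[:blank:]]size[[:blank:]]*:[[:blank:]]*([[:digit:]]+)[[:blank:]]*([[:alpha:]]+)'
  while IFS= read -r line; do
    if [[ $line =~ $regex ]]; then
      echo "cache set to ${BASH_REMATCH[1]} ${BASH_REMATCH[2]}"
    fi
  done
}

# Made-up sample input in the style of /proc/cpuinfo (tab before the colon).
printf 'model name\t: Example CPU\ncache size\t: 16384 KB\n' | print_cache_sizes
```

On a Linux machine, the real file can be fed in with: print_cache_sizes < /proc/cpuinfo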
