Bash: Difference between two arrays

Whether you are comparing filenames, installed packages, or similar lists, it can be useful to calculate the difference between two Bash arrays.

SiegeX on Stack Overflow offered the following function using awk, and I have built a full example that is available on GitHub.

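# Prints the words that are not matched between the two arrays.
# Call as: arraydiff first[@] second[@]  (elements must not contain spaces)
function arraydiff() {
  awk 'BEGIN{RS=ORS=" "}               # read and write space-separated words
       {NR==FNR?a[$0]++:a[$0]--}       # count up for the first array, down for the second
       END{for(k in a)if(a[k])print k} # a nonzero count means the word is unmatched
      ' <(echo -n "${!1}") <(echo -n "${!2}")
}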

Simple usage looks like this:

animals=( "fox" "tiger" "alligator" "snake" "bear" )
mammals=( "fox" "tiger" "bear" )
nonmammals=($(arraydiff animals[@] mammals[@]))

echo "total set of animals:" "${animals[@]}"
echo "mammals:" "${mammals[@]}"
echo "non-mammals:" "${nonmammals[@]}"
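
One detail worth knowing is that the awk program counts in both directions, so the result is a symmetric difference: a word that appears only in the second array is reported as well. The following is a minimal sketch of that behavior; the array names left and right are made up for illustration, and the output is sorted only because awk does not guarantee the order of for-in loops.

```bash
#!/usr/bin/env bash
# Same arraydiff function as above, repeated so the snippet runs on its own.
arraydiff() {
  awk 'BEGIN{RS=ORS=" "}
       {NR==FNR?a[$0]++:a[$0]--}
       END{for(k in a)if(a[k])print k}' <(echo -n "${!1}") <(echo -n "${!2}")
}

# Illustrative arrays: "tiger" is only in left, "snake" is only in right.
left=( "fox" "tiger" "bear" )
right=( "fox" "bear" "snake" )

# Turn the space-separated output into lines and sort for a stable order.
mapfile -t result < <(arraydiff left[@] right[@] | tr ' ' '\n' | sort)

echo "symmetric difference:" "${result[@]}"
```

Both tiger and snake are reported here. In the animals example above this makes no difference, because every mammal is also in the animals array.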

A more advanced example takes a list of Ubuntu package names, uses dpkg to determine which of them are already installed, and then uses the difference to work out which are not installed.

The full script is available on GitHub.

function show_ubuntu_package_not_installed() {
  candidatepkgs="$1"
  echo "candidate package list: $candidatepkgs"

  # use dpkg to query for packages already installed
  # (compare the state column exactly, so "deinstall" is not counted as installed)
  alreadyinstalled=$(dpkg --get-selections $candidatepkgs 2>/dev/null | awk '$2 == "install" { print $1 }' | tr "\n" " ")
  echo "packages already installed: $alreadyinstalled"

  IFS=' ' read -r -a candidateArray <<< "$candidatepkgs"
  IFS=' ' read -r -a installedArray <<< "$alreadyinstalled"

  notInstalledArray=($(arraydiff candidateArray[@] installedArray[@]))
  echo "packages not yet installed:" "${notInstalledArray[@]}"
}
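
For the package case only one direction really matters: candidates that are not installed. As an alternative to the awk function, a one-directional difference could be sketched in pure Bash with an associative array. This is not part of SiegeX's answer; array_minus is a hypothetical helper name, the package names are example data rather than real dpkg output, and the sketch assumes Bash 4 or later.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the original answer): prints the elements of
# the first array that do not appear in the second, one per line.
# Arguments use the same "name[@]" convention as arraydiff. Requires Bash 4+.
array_minus() {
  local -A seen=()
  local item

  # Remember every non-empty element of the second array.
  for item in "${!2}"; do
    if [[ -n $item ]]; then
      seen["$item"]=1
    fi
  done

  # Print the non-empty elements of the first array that were not seen.
  for item in "${!1}"; do
    if [[ -n $item && -z ${seen[$item]+x} ]]; then
      printf '%s\n' "$item"
    fi
  done
}

# Example data only; in the real function these come from the caller and dpkg.
candidateArray=( "curl" "git" "jq" "vim" )
installedArray=( "git" "vim" "bash" )

mapfile -t notInstalledArray < <(array_minus candidateArray[@] installedArray[@])
echo "packages not yet installed:" "${notInstalledArray[@]}"
```

With this example data, bash is in the installed list but not in the candidate list, and it is left out of the result. The order of the first array is preserved, and because elements are printed one per line, elements containing spaces would survive as well.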


REFERENCES

Stack Overflow: compare/diff of two arrays, answer by SiegeX