Mac: Llama 2 model on Apple Silicon and GPU using llama.cpp
It is relatively easy to experiment with a base Llama 2 model on M-series Apple Silicon, thanks to llama.cpp, written by Georgi Gerganov. The llama.cpp project provides a C/C++ implementation for running Llama 2 models, and it uses Apple's integrated GPU (through Metal) to deliver good inference performance (see M family performance specs).