Gemma 2 is a new generation of open language models available in 2B, 9B and 27B parameter sizes. The models are built on a redesigned architecture optimised for inference efficiency.
The 27B variant can run inference at full precision on a single NVIDIA H100 GPU, NVIDIA A100 80GB GPU, or Google Cloud TPU host. The 9B model delivers leading performance in its size class, outperforming models such as Llama 3 8B.
Gemma 2 is available under the commercially-friendly Gemma license.
Gemma 2 is compatible with major AI frameworks, including Hugging Face Transformers, as well as JAX, PyTorch, and TensorFlow through Keras 3.0. The models also work with vLLM, Gemma.cpp, Llama.cpp, and Ollama, and are optimized for NVIDIA TensorRT-LLM.
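As a minimal sketch of the Hugging Face Transformers path, the snippet below loads the instruction-tuned 9B checkpoint through the `text-generation` pipeline. The model id `google/gemma-2-9b-it` refers to the gated checkpoint on the Hugging Face Hub (access requires accepting the Gemma license there), and `device_map="auto"` assumes the `accelerate` package is installed; adjust both to your setup.

```python
MODEL_ID = "google/gemma-2-9b-it"  # instruction-tuned 9B checkpoint on the HF Hub

def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a short completion with Gemma 2 via the Transformers pipeline.

    Note: downloads the model weights on first call and assumes a
    license-accepted Hugging Face account plus enough GPU/CPU memory.
    """
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    pipe = pipeline(
        "text-generation",
        model=MODEL_ID,
        device_map="auto",  # requires the accelerate package
    )
    # Chat-style input; the pipeline applies Gemma's chat template.
    messages = [{"role": "user", "content": prompt}]
    out = pipe(messages, max_new_tokens=max_new_tokens, return_full_text=False)
    return out[0]["generated_text"]
```

The same model can be served without Python at all, for example via `ollama run gemma2` or a vLLM endpoint, which is often the simpler route for local experimentation.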
The models are available through Google AI Studio, Kaggle, Hugging Face, and Vertex AI.