TensorFlow reposted this
📣 🧠 Exciting news for researchers pushing the boundaries of efficient deep learning! We've scaled RecurrentGemma to 9 billion parameters! Dive into the details and access the model on:
📘 Kaggle → https://goo.gle/3xd1IYs
🤗 Hugging Face → https://goo.gle/3KDkknt
🚀 This new model achieves performance comparable to the largest Gemma 1 model, but with significantly greater efficiency: lower memory requirements and faster sampling speeds, especially for long sequences or large batch sizes. For example, on a single TPU-v4, it delivers 80x higher throughput when sampling 1k tokens from a 2k-token prompt.
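The 80x figure above is a ratio of sampling throughputs (tokens generated per second of wall-clock time). A minimal sketch of how such a comparison is computed, using purely hypothetical timings for illustration (the post does not give the baseline's absolute numbers):

```python
def sampling_throughput(num_tokens: int, seconds: float) -> float:
    """Tokens generated per second of wall-clock sampling time."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return num_tokens / seconds

# Hypothetical example matching the post's setup: 1,000 tokens sampled
# from a 2k-token prompt. The timings below are invented for illustration;
# only the resulting ratio mirrors the quoted 80x speedup.
baseline_tps = sampling_throughput(1000, 80.0)   # baseline transformer
recurrent_tps = sampling_throughput(1000, 1.0)   # RecurrentGemma
speedup = recurrent_tps / baseline_tps
print(f"speedup: {speedup:.0f}x")  # → speedup: 80x
```

The ratio is independent of batch size as long as both models are timed on the same workload; the post's claim is specifically for long-sequence sampling, where a fixed-size recurrent state avoids the growing KV-cache cost of attention.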
Fantastic news! We are ready to play with the latest Gemma model on Kaggle!
Looks great!
Congratulations on scaling RecurrentGemma to 9 billion parameters! This achievement is a testament to your team's dedication and expertise in pushing the boundaries of efficient deep learning. The improved efficiency, lower memory requirements, and faster sampling speeds will undoubtedly benefit researchers in their work. Keep innovating and inspiring!
Absolutely bang on!
Something has to change in it.
This is great and important. 🧬✔️
Fantastic news!!
Lawyer Turned Developer
I was trying to train a PokéDex neural network. I succeeded in building a Pokémon MNIST by scraping free databases and combining them with the Pokémon API. But just when I got to a 27% success rate in classifying Pokémon types from pictures, Google for Developers cut off my free GPU! 🤯 🔴 If someone wants to catch 'em all with "9B" parameters, check out my Pokémon-Net prototype on Google Colab: https://colab.research.google.com/drive/1wq32CSDF7wWEK08Eq2T1WacPB-JDrOtY?usp=sharing