Updated production-ready Gemini models, reduced 1.5 Pro pricing, increased rate limits, and more
Today, we’re releasing two updated production-ready Gemini models: Gemini-1.5-Pro-002 and Gemini-1.5-Flash-002, along with the latest improvements we’ve made to them, available through Google AI Studio and the Gemini API. The updated Gemini 1.5 series is designed for general performance across a wide range of text, code, and multimodal tasks. For larger organizations and Google Cloud customers, the models are also available on Vertex AI.
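As a minimal sketch of trying the updated models, the snippet below calls one of them through the `google-generativeai` Python SDK (assumes `pip install google-generativeai` and an API key in the `GEMINI_API_KEY` environment variable):

```python
import os

# Full identifiers for the two updated production-ready models.
MODELS = {
    "pro": "gemini-1.5-pro-002",
    "flash": "gemini-1.5-flash-002",
}

def model_name(tier: str) -> str:
    """Map a short tier label to its full -002 model identifier."""
    return MODELS[tier]

# The API call only runs when a key is configured.
if os.environ.get("GEMINI_API_KEY"):
    import google.generativeai as genai

    genai.configure(api_key=os.environ["GEMINI_API_KEY"])
    model = genai.GenerativeModel(model_name("flash"))
    response = model.generate_content("Summarize the Gemini 1.5 model family in two sentences.")
    print(response.text)
```

The same code works for both models; swapping `"flash"` for `"pro"` selects Gemini-1.5-Pro-002 instead.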
The latest updates bring an improvement on MMLU-Pro, a more challenging version of the popular MMLU benchmark. On the MATH and HiddenMath (an internal holdout set of competition math problems) benchmarks, both models show a considerable ~20% improvement, and both also perform better across vision and code use cases (with gains ranging from ~2-7%). These gains come alongside improvements to the overall helpfulness of model responses, while we continue to uphold our content safety policies and standards.
The models also have a more concise style in response to developer feedback, which is intended to make them easier to use and reduce costs. For use cases like summarization, question answering, and extraction, the default output length is now roughly 5-20% shorter than previous models. This lowers costs in downstream production pipelines while still maintaining high-quality results for these use cases.
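As a back-of-the-envelope illustration of why shorter default outputs translate directly into cost savings, the sketch below multiplies output tokens by a per-million-token price. The dollar figure is purely illustrative, not the actual rate card:

```python
def output_cost(output_tokens: int, usd_per_million_tokens: float) -> float:
    """Cost of a single response, given its output tokens and a per-1M-token price."""
    return output_tokens / 1_000_000 * usd_per_million_tokens

# Illustrative numbers only: a 1,000-token response vs. one ~15% shorter,
# at a hypothetical $5 per 1M output tokens.
baseline = output_cost(1_000, 5.0)  # 0.005 USD
shorter = output_cost(850, 5.0)     # 0.00425 USD
savings = 1 - shorter / baseline    # 0.15, i.e. 15% cheaper per response
```

Because output tokens are billed linearly, a given percentage reduction in response length yields the same percentage reduction in output-token spend.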
For chat-based products where users might prefer longer responses by default, developers can read our prompting strategies guide to learn how to make the models more verbose and conversational. Alongside notable improvements in output speed, we are also increasing the paid tier rate limits for these models: to 2,000 RPM for 1.5 Flash and 1,000 RPM for 1.5 Pro, up from 1,000 and 360 RPM, respectively.
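Even with higher rate limits, production traffic can still hit the RPM cap. A common pattern is to retry rate-limited calls with exponential backoff and jitter; the sketch below shows the idea with a stand-in exception class rather than the SDK's actual error type:

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for the API's 429 / resource-exhausted error."""

def call_with_backoff(fn, max_retries=5, base_delay=1.0):
    """Retry a zero-argument callable on rate-limit errors.

    Waits base_delay * 2**attempt seconds plus random jitter between
    attempts, and re-raises after the final retry is exhausted.
    """
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            time.sleep(delay)
```

In real code, `fn` would wrap the model call (e.g. a `generate_content` invocation) and the `except` clause would catch the SDK's rate-limit exception instead of the placeholder above.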
With these improvements, along with the latest pricing updates, developers can now build new use cases with our most powerful models more efficiently and cost-effectively. We anticipate that the increased paid tier rate limits will drive even more development, and we expect growing interest in building new applications with these models.
Overall, we believe this set of updates improves the performance and helpfulness of our models, making them accessible to an even wider range of developers. We’re excited about the updated Gemini models and can’t wait to see what you build with them!