Off-Peak Computing by EXXA is the most energy- and cost-efficient batch inference service on the market. By consolidating requests over a 24-hour window, we process them at the lowest possible cost while prioritizing computation in low-emissions countries and during off-peak hours.
We are excited to announce the launch of the EXXA Off-Peak Computing inference service, starting with Meta's impressive Llama 3.1 70B model. At just $0.34 per million tokens, we offer the lowest batch-processing price on the market, combined with the most sustainable approach in the industry.
We believe that using generative AI technology the right way should be extremely easy and affordable. By focusing first on tasks that do not require instantaneous responses, we can deliver impressive results both financially and environmentally.
The environmental impact of generative AI is massive: it could raise data centers' share of global CO2 emissions from 2% to 4% by 2030, significantly increasing the pressure on power grids.
EXXA addresses this challenge with "Off-Peak Computing", an innovative inference API that reduces the carbon footprint of generative AI by consolidating requests over a 24-hour window and prioritizing computation in low-emissions regions and during off-peak hours.
Our mission is to provide the most efficient Gen-AI processing. We have developed proprietary solutions that enhance flexibility and efficiency, making EXXA the most cost-effective choice in the market.
At just $0.34 per million tokens for Llama 3.1 70B, EXXA offers the most affordable batch-processing service available.
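To make the pricing concrete, here is a minimal sketch of the cost arithmetic at the announced rate. The function name and the 500-million-token batch size are illustrative assumptions; only the $0.34-per-million-token rate comes from the announcement.

```python
# Announced batch rate for Llama 3.1 70B (USD per million tokens).
PRICE_PER_MILLION_TOKENS = 0.34

def batch_cost_usd(total_tokens: int) -> float:
    """Estimated cost in USD for a batch totaling `total_tokens` tokens.

    Hypothetical helper for illustration; real billing details may differ.
    """
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS

# Example: a nightly batch of 500 million tokens.
print(round(batch_cost_usd(500_000_000), 2))  # 170.0
```

So even a very large overnight workload of half a billion tokens costs on the order of $170 at this rate.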
EXXA LLM inference is ideal for applications that benefit from large, powerful language models without needing instantaneous responses. Key use cases include:
We are happy to launch EXXA Off-Peak Computing with Llama 3.1-70b-Instruct by Meta, featuring: