Operational Strategies

How to operate without losing money?

This article addresses a question many users ask: I have some resources and want to deploy a public-facing service. How should I price computing power? In other words, what price keeps me from losing money?

Here is a simple estimation method for your reference. It is not exact, but it keeps the error within a manageable range.

1. Computing Power Cost

First, you need to calculate your computing power cost. We will use a relay station as an example (https://api.aiggr.com), referred to as Relay A.

The exchange rate for Relay A is 3:1, meaning 3 RMB can buy 1 USD worth of computing power (if you use OpenAI's official API directly, the exchange rate is 7.4:1).

Relay A's multiplier is 1, meaning the computing power price is 1:1. Typically, relay multipliers are higher than the official ones, which can usually be seen in the price documentation provided by the relay.

Let's take the gpt-4o model as an example. Below is the official price table:

gpt-4o Price Table

Below is Relay A's price table:

Relay A Price Table

As you can see, Relay A's computing power price is indeed the same as the official one, as $5.00 / 1M tokens and $0.005 / 1K tokens are equivalent.

So, if you purchase computing power from Relay A, your cost per 1K tokens is:

# Input cost
0.005 * 3 = 0.015 RMB
# Output cost
0.015 * 3 = 0.045 RMB
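The conversion above can be sketched in a few lines of Python. This is only an illustration: the function name cost_per_1k_rmb and its parameters are hypothetical, and Decimal is used so the results are not blurred by floating-point rounding.

```python
from decimal import Decimal


def cost_per_1k_rmb(usd_per_1k: str, rmb_per_usd: str, multiplier: str = "1") -> Decimal:
    """RMB cost of 1K tokens = official USD price * relay multiplier * exchange rate."""
    return Decimal(usd_per_1k) * Decimal(multiplier) * Decimal(rmb_per_usd)


# Relay A: exchange rate 3:1, multiplier 1, gpt-4o official prices per 1K tokens
input_cost = cost_per_1k_rmb("0.005", "3")
output_cost = cost_per_1k_rmb("0.015", "3")
print(input_cost, output_cost)
```

If your relay uses a multiplier other than 1, pass it as the third argument.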

2. Conversation Computing Power

Again, we use the gpt-4o model as an example. This model has a maximum context length of 128K (see above). Since our conversations carry context, the context length will increase with each turn, and the input length will far exceed the output length. Assuming the input accounts for 70% and the output for 30%, the maximum cost per conversation for the gpt-4o model is:

## Input cost
0.015 * 128 * 0.7 = 1.34 RMB
## Output cost
0.045 * 128 * 0.3 = 1.73 RMB
## Total cost
1.34 + 1.73 = 3.07 RMB
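The same worst-case estimate can be written as a small function. This is a sketch under the assumptions stated above (the whole context is used, split 70% input and 30% output); the name max_conversation_cost is made up for illustration.

```python
from decimal import Decimal


def max_conversation_cost(context_k: str, input_rmb_per_1k: str,
                          output_rmb_per_1k: str, input_share: str = "0.7") -> Decimal:
    """Worst-case RMB cost of one conversation that fills the whole context."""
    ctx = Decimal(context_k)
    share = Decimal(input_share)
    input_cost = Decimal(input_rmb_per_1k) * ctx * share
    output_cost = Decimal(output_rmb_per_1k) * ctx * (1 - share)
    return input_cost + output_cost


# gpt-4o through Relay A with the full 128K context
worst_case = max_conversation_cost("128", "0.015", "0.045")
print(round(worst_case, 2))
```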

This is the most extreme case, meaning the user fully utilizes the model's context every time. In reality, the cost will be much lower because conversations with AI are typically question-and-answer exchanges unless you dump a 10,000-word essay for analysis. So, we can roughly estimate the median cost per gpt-4o conversation to be around 1.5 RMB.

In GeekAI, if you sell 100 units of computing power for 10 RMB (0.1 RMB per unit), the computing power charged per gpt-4o conversation should be:

100/(10/1.5) = 15 units
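As a rough sketch, the conversion from an estimated cost to a number of units can be automated. The helper units_per_call is hypothetical; it rounds up so that the units charged never fall below the estimated cost.

```python
from decimal import Decimal, ROUND_CEILING


def units_per_call(cost_rmb: str, pack_price_rmb: str, pack_units: str) -> int:
    """Units to charge per call so that revenue covers the estimated cost."""
    unit_price = Decimal(pack_price_rmb) / Decimal(pack_units)  # RMB per unit
    units = Decimal(cost_rmb) / unit_price
    # Round up: charging a fraction of a unit less than the cost means a loss.
    return int(units.to_integral_value(rounding=ROUND_CEILING))


# 1.5 RMB estimated cost, 100 units sold for 10 RMB
gpt4o_units = units_per_call("1.5", "10", "100")
print(gpt4o_units)
```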

This cost may seem high, but it's necessary to ensure you don't lose money. If you think users will find this price hard to accept, you can further reduce costs by adjusting the maximum context length for the gpt-4o model in the GeekAI admin panel.

Set Model Max Context Length

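If you set the maximum context length to 8K, the cost drops by a factor of 16, and the maximum cost per conversation becomes about 0.19 RMB.

## Input cost
0.015 * 8 * 0.7 = 0.084 RMB
## Output cost
0.045 * 8 * 0.3 = 0.108 RMB
## Total cost
0.084 + 0.108 = 0.192 RMB

Typical conversations use far less than the full context, so by the same rough halving as before, the typical cost is around 0.1 RMB. In this case, at 0.1 RMB per unit you can set the gpt-4o model's computing power to 1 to roughly break even (2 units covers the worst case), or set it to 2 and lower the price, e.g., 10 RMB for 200 units of computing power.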
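To see how the context cap drives the worst-case cost, the sketch below compares a few caps using the Relay A prices from earlier. The 70/30 input/output split is the same assumption as above, and the chosen caps (128K, 32K, 8K) are only examples.

```python
from decimal import Decimal

INPUT_RMB_PER_1K = Decimal("0.015")
OUTPUT_RMB_PER_1K = Decimal("0.045")
INPUT_SHARE = Decimal("0.7")  # assumed share of the context used by input


def worst_case_cost(context_k: int) -> Decimal:
    """Worst-case RMB cost when the whole context of context_k thousand tokens is used."""
    blended = INPUT_RMB_PER_1K * INPUT_SHARE + OUTPUT_RMB_PER_1K * (1 - INPUT_SHARE)
    return blended * context_k


costs = {k: worst_case_cost(k) for k in (128, 32, 8)}
for k, cost in costs.items():
    print(f"{k}K context: at most {cost:.3f} RMB per conversation")
```

The cost is linear in the context length, so halving the cap halves the worst case.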

Similarly, you can estimate the computing power cost for other models. However, it's recommended to use gpt-3.5, the cheapest model, as the baseline (1 unit) and scale other models from it. Otherwise, if gpt-4o is set to 1 unit, gpt-3.5 would need a fractional value, which is not allowed. According to OpenAI's official price table, gpt-4o is 10 times more expensive than gpt-3.5, so if gpt-3.5 is set to 1 unit, gpt-4o should be 10 units, and so on for other models.
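The baseline approach can be sketched as follows. The function relative_units is hypothetical, and the gpt-3.5 price of $0.50 per 1M input tokens is an assumption inferred from the "10 times" ratio above; check the official price table for current numbers.

```python
from decimal import Decimal, ROUND_CEILING


def relative_units(usd_prices: dict[str, str]) -> dict[str, int]:
    """Give the cheapest model 1 unit and scale the others by price ratio, rounded up."""
    parsed = {model: Decimal(price) for model, price in usd_prices.items()}
    baseline = min(parsed.values())
    return {
        model: int((price / baseline).to_integral_value(rounding=ROUND_CEILING))
        for model, price in parsed.items()
    }


# USD per 1M input tokens; the gpt-3.5 figure is an assumption for illustration
units = relative_units({"gpt-3.5": "0.50", "gpt-4o": "5.00"})
print(units)
```

Rounding up keeps every value a whole number, at the price of slightly overcharging models whose ratio is not exact.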

3. Drawing Computing Power

The calculation for drawing computing power is similar to that for conversations. Let's take MJ (Midjourney) as an example. The cost per MJ call is $0.145.

MJ Price Table

If you set the price per unit of computing power in GeekAI to 0.1 RMB, the computing power consumed per MJ drawing should be:

0.145 * 3 / 0.1 = 4.35 units

So, setting it to 5 units ensures you break even. If you want a 20% profit, set it to 6 units, and so on.
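A minimal sketch of this calculation, including the profit margin, is shown below. The function drawing_units and its margin parameter are made up for illustration; the margin is applied as a markup on cost, and the result is rounded up to a whole unit.

```python
from decimal import Decimal, ROUND_CEILING


def drawing_units(cost_usd: str, rmb_per_usd: str, unit_price_rmb: str,
                  margin: str = "0") -> int:
    """Whole units to charge per drawing so revenue covers cost plus a markup."""
    cost_rmb = Decimal(cost_usd) * Decimal(rmb_per_usd)
    target_rmb = cost_rmb * (1 + Decimal(margin))
    units = target_rmb / Decimal(unit_price_rmb)
    return int(units.to_integral_value(rounding=ROUND_CEILING))


break_even = drawing_units("0.145", "3", "0.1")
with_margin = drawing_units("0.145", "3", "0.1", margin="0.2")
print(break_even, with_margin)
```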