How to measure the credits used in each call to the Parse, Extract, and Classify services? #20260
Replies: 2 comments
Hi @PedroRodriguezx! I'm Dosu and I'm helping the llama_index team.

There isn't currently a built-in way to measure credits used per call for all three services. For Parse, the API exposes some usage fields, but as you noted they are deprecated and slated for removal. The recommended approach is to implement your own tracking by accessing raw LLM responses (where available) and extracting token usage or cost information. This requires custom logic and may not be possible for every service or provider, since reporting varies widely (source). LlamaIndex provides a callback/event system that you can extend for observability, but it doesn't natively track credits for these services.

If you need per-call credit tracking, you'll need to rely on upstream API responses (if available) or instrument your workflow to capture token usage at the LLM level. There's an open feature request to improve this for Parse, but it depends on changes in the upstream API.

To reply, just mention @dosu.
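For illustration, here is a minimal sketch of that LLM-level instrumentation using LlamaIndex's `TokenCountingHandler`. Note that this counts tokens for LLM calls made by your own pipeline; it does not report LlamaCloud credits for Parse/Extract/Classify jobs, and the tokenizer choice (`cl100k_base`) is just an assumption for an OpenAI-style model:

```python
# A minimal sketch (not a LlamaCloud credit meter): count tokens for your own
# LLM calls via LlamaIndex's callback system.
import tiktoken
from llama_index.core import Settings
from llama_index.core.callbacks import CallbackManager, TokenCountingHandler

# Assumption: an OpenAI-style model, so cl100k_base is used purely as an example.
token_counter = TokenCountingHandler(
    tokenizer=tiktoken.get_encoding("cl100k_base").encode,
)
Settings.callback_manager = CallbackManager([token_counter])

# ... run your query / extraction logic here ...

print("Prompt tokens:    ", token_counter.prompt_llm_token_count)
print("Completion tokens:", token_counter.completion_llm_token_count)
print("Total LLM tokens: ", token_counter.total_llm_token_count)
```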
Hi @PedroRodriguezx,

You are correct regarding the API limitations: relying on the API response metadata is not a dependable option (the relevant fields are deprecated for Parse and absent for Extract and Classify). Since LlamaCloud billing is strictly usage-based per page, the best practice is to calculate the credits client-side. Because the cost is deterministic, here is the breakdown of how to calculate this programmatically:

**1. The Formula**

The total cost is deterministic based on the number of pages and the specific mode you selected, i.e. `total_credits = num_pages * parse_rate + num_pages * extract_rate + classify_credits`.

**2. Service Rates (Estimates)**

You should map these to the specific configuration strings you are passing in your code.
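For example, one way to encode that mapping is a small lookup table keyed by the mode strings you pass to each service. The mode names and per-page rates below are illustrative and simply mirror the snippet in the next section; they are not an official price list, so verify them against your LlamaCloud plan:

```python
# Illustrative per-page credit rates keyed by mode string.
# These values mirror the example below and are NOT official pricing.
PARSE_RATES = {"fast": 3}            # other parse modes fall back to 15 below
EXTRACT_RATES = {"multimodal": 20}   # other extract modes fall back to 5 below
CLASSIFY_RATE = 0                    # Classify is currently free per this thread

def per_page_rate(table: dict, mode: str, default: int) -> int:
    """Look up a per-page rate, falling back to a default for unlisted modes."""
    return table.get(mode, default)
```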
**3. Implementation Strategy**

Since the job responses do not report credit usage reliably, derive it from the page count in the Parse result:

```python
# Python Logic Example
def log_credit_usage(parse_result, parse_mode, extract_mode):
    # 1. Get accurate page count from the Parse result
    #    (The list of pages is the source of truth)
    num_pages = len(parse_result.get("pages", []))

    # 2. Define rates (update these based on your specific model config)
    #    Example rates:
    parse_rate = 3 if parse_mode == "fast" else 15
    extract_rate = 20 if extract_mode == "multimodal" else 5
    classify_rate = 0  # Currently free

    # 3. Calculate
    total_parse_credits = num_pages * parse_rate
    total_extract_credits = num_pages * extract_rate
    total_job_credits = total_parse_credits + total_extract_credits + classify_rate

    print(f"Document Pages: {num_pages}")
    print(f"Total Credits Consumed: {total_job_credits}")
    return total_job_credits
```

By using the page count from the Parse result as the source of truth, you get a deterministic, client-side record of the credits each job consumes (a quick usage check follows below).

If you find this breakdown helpful for your debugging, please mark this response as the answer so others with similar credit-tracking needs can find it easily!
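As a purely illustrative sanity check of the helper above (the fake Parse result and the expected total are made-up numbers, not real job output):

```python
# Hypothetical usage with a fake 10-page Parse result.
fake_parse_result = {"pages": [{} for _ in range(10)]}

credits = log_credit_usage(
    fake_parse_result,
    parse_mode="fast",          # 10 pages * 3  = 30 credits
    extract_mode="multimodal",  # 10 pages * 20 = 200 credits
)
assert credits == 230  # 30 + 200 + 0 (Classify currently free)
```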
Currently, I have a workflow for extracting data from documents that involves three LlamaIndex services: Parse, Classify, and Extract. For each document, I end up running jobs across all three services. For debugging and to follow best practices, I need a way to measure how many credits each call to these services consumes. I tested using the API to look up previously executed jobs, but it seems only the Parse service returns this data, and even those fields are deprecated and will soon be removed from the API output. Extract does not provide credit consumption information, and Classify, being in beta, does not expose it yet. I need a way to access this information.