-
Notifications
You must be signed in to change notification settings - Fork 662
Open
Description
fastdeploy 部署模型报错,以下是启动命令:
export CUDA_VISIBLE_DEVICES=0
python -m fastdeploy.entrypoints.openai.api_server \
--model PaddlePaddle/ERNIE-4.5-0.3B-Paddle \
--port 8180 \
--metrics-port 8181 \
--engine-worker-queue-port 8182 \
--tensor-parallel-size 1 \
--max-model-len 1024 \
--max-num-seqs 80 \
--enable-prefix-caching \
--swap-space 10
以下是 workerlog.0 完整日志:
which: no ccache in (/root/miniconda3/envs/fastdeploy/bin:/root/miniconda3/condabin:/root/.local/bin:/root/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin)
/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/utils/cpp_extension/extension_utils.py:718: UserWarning: No ccache found. Please be aware that recompiling all source files may be required. You can download and install ccache from: https://github.com/ccache/ccache/blob/master/doc/INSTALL.md
warnings.warn(warning_message)
[2025-12-04 18:18:17,261] [ INFO] distributed_strategy.py:335 - distributed strategy initialized
======================= Modified FLAGS detected =======================
FLAGS(name='FLAGS_cudnn_dir', current_value='/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/../nvidia/cudnn/lib', default_value='')
FLAGS(name='FLAGS_enable_pir_in_executor', current_value=True, default_value=False)
FLAGS(name='FLAGS_pir_interpreter_record_stream_for_gc_cache', current_value=True, default_value=False)
FLAGS(name='FLAGS_nvidia_package_dir', current_value='/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/../nvidia', default_value='')
FLAGS(name='FLAGS_cusparse_dir', current_value='/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/../nvidia/cusparse/lib', default_value='')
FLAGS(name='FLAGS_selected_gpus', current_value='0', default_value='')
FLAGS(name='FLAGS_nccl_dir', current_value='/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/../nvidia/nccl/lib', default_value='')
FLAGS(name='FLAGS_parameters_persistent_mode_in_dy2st', current_value=True, default_value=False)
FLAGS(name='FLAGS_cublas_dir', current_value='/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/../nvidia/cublas/lib', default_value='')
FLAGS(name='FLAGS_specialize_device_in_dy2st', current_value=True, default_value=False)
FLAGS(name='FLAGS_cusolver_dir', current_value='/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/../nvidia/cusolver/lib', default_value='')
FLAGS(name='FLAGS_cupti_dir', current_value='/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/../nvidia/cuda_cupti/lib', default_value='')
FLAGS(name='FLAGS_curand_dir', current_value='/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/../nvidia/curand/lib', default_value='')
FLAGS(name='FLAGS_cuda_cccl_dir', current_value='/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/../nvidia/cuda_cccl/include/', default_value='')
=======================================================================
/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/distributed/parallel.py:1062: UserWarning: Currently not a parallel execution environment, `paddle.distributed.init_parallel_env` will not do anything.
warnings.warn(
[2025-12-04 18:18:17,264] [ INFO] topology.py:526 - Total 1 pipe comm group(s) create successfully!
/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/distributed/communication/group.py:145: UserWarning: Current global rank 0 is not in group _default_pg10
warnings.warn(
[2025-12-04 18:18:17,623] [ INFO] topology.py:526 - Total 1 data comm group(s) create successfully!
[2025-12-04 18:18:17,624] [ INFO] topology.py:526 - Total 1 model comm group(s) create successfully!
[2025-12-04 18:18:17,624] [ INFO] topology.py:526 - Total 1 sharding comm group(s) create successfully!
[2025-12-04 18:18:17,624] [ INFO] topology.py:440 - HybridParallelInfo: rank_id: 0, mp_degree: 1, sharding_degree: 1, pp_degree: 1, dp_degree: 1, sep_degree: 1, mp_group: [0], sharding_group: [0], pp_group: [0], dp_group: [0], sep:group: None, check/clip group: [0]
[2025-12-04 18:18:17,624] [ INFO] - Using download source: huggingface
[2025-12-04 18:18:17,624] [ INFO] - Loading configuration file PaddlePaddle/ERNIE-4.5-0.3B-Paddle/config.json
[2025-12-04 18:18:17,624] [ WARNING] - You are using a model of type ernie4_5 to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/model_executor/graph_optimization/utils.py:21: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you.
[2025-12-04 18:18:18,667] [ WARNING] - import w4afp8_gemm_scale_permute Failed!
[2025-12-04 18:18:20,210] [ INFO] - Enabled logits processors: []
INFO 2025-12-04 18:18:20,270 297823 cuda.py[line:59] Using APPEND ATTN backend.
[2025-12-04 18:18:20,270] [ INFO] - queue id is 8182
[2025-12-04 18:18:20,270] [ INFO] - Starting to load model Ernie4_5_ForCausalLM
[2025-12-04 18:18:20,272] [ INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,274] [ INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,276] [ INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,278] [ INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,279] [ INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,281] [ INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,283] [ INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,284] [ INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,286] [ INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,288] [ INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,289] [ INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,291] [ INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,293] [ INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,294] [ INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,296] [ INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,298] [ INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,299] [ INFO] - Attention is running in cache kv bfloat16 mode
[2025-12-04 18:18:20,301] [ INFO] - Attention is running in cache kv bfloat16 mode
Loading safetensors checkpoint shards: 100%|██████████| 1/1 [00:00<00:00, 2.10it/s]
[2025-12-04 18:18:20,783] [ INFO] - Model loading took 0.479 seconds
[2025-12-04 18:18:20,783] [ INFO] - Skip saving ,cache disabled
[2025-12-04 18:18:20,785] [ INFO] - Initializing kv cache for all layers. [0]
[2025-12-04 18:18:20,785] [ INFO] - ..creating kv cache for layer 0: (150, 2, 64, 128)
[2025-12-04 18:18:20,786] [ INFO] - ..creating kv cache for layer 1: (150, 2, 64, 128)
[2025-12-04 18:18:20,788] [ INFO] - ..creating kv cache for layer 2: (150, 2, 64, 128)
[2025-12-04 18:18:20,789] [ INFO] - ..creating kv cache for layer 3: (150, 2, 64, 128)
[2025-12-04 18:18:20,790] [ INFO] - ..creating kv cache for layer 4: (150, 2, 64, 128)
[2025-12-04 18:18:20,792] [ INFO] - ..creating kv cache for layer 5: (150, 2, 64, 128)
[2025-12-04 18:18:20,793] [ INFO] - ..creating kv cache for layer 6: (150, 2, 64, 128)
[2025-12-04 18:18:20,795] [ INFO] - ..creating kv cache for layer 7: (150, 2, 64, 128)
[2025-12-04 18:18:20,796] [ INFO] - ..creating kv cache for layer 8: (150, 2, 64, 128)
[2025-12-04 18:18:20,797] [ INFO] - ..creating kv cache for layer 9: (150, 2, 64, 128)
[2025-12-04 18:18:20,799] [ INFO] - ..creating kv cache for layer 10: (150, 2, 64, 128)
[2025-12-04 18:18:20,800] [ INFO] - ..creating kv cache for layer 11: (150, 2, 64, 128)
[2025-12-04 18:18:20,802] [ INFO] - ..creating kv cache for layer 12: (150, 2, 64, 128)
[2025-12-04 18:18:20,803] [ INFO] - ..creating kv cache for layer 13: (150, 2, 64, 128)
[2025-12-04 18:18:20,804] [ INFO] - ..creating kv cache for layer 14: (150, 2, 64, 128)
[2025-12-04 18:18:20,806] [ INFO] - ..creating kv cache for layer 15: (150, 2, 64, 128)
[2025-12-04 18:18:20,807] [ INFO] - ..creating kv cache for layer 16: (150, 2, 64, 128)
[2025-12-04 18:18:20,809] [ INFO] - ..creating kv cache for layer 17: (150, 2, 64, 128)
CUDA error 209 [/paddle/third_party/cccl/cub/cub/util_device.cuh, 83]: no kernel image is available for execution on the device
CUDA error 101 [/paddle/third_party/cccl/cub/cub/util_device.cuh, 102]: invalid device ordinal
/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/utils/decorator_utils.py:420: Warning:
Non compatible API. Please refer to https://www.paddlepaddle.org.cn/documentation/docs/en/develop/guides/model_convert/convert_from_pytorch/api_difference/torch/torch.max.html first.
CUDA error 209 [/paddle/third_party/cccl/cub/cub/util_device.cuh, 83]: no kernel image is available for execution on the device
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4653206000938731956.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:38 Assertion `id >= 0` failed. Id should no less than 0 but received an id value: -4668179459530540121.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4396825710625478315.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4557143310675426650.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Error: /paddle/paddle/phi/kernels/gpu/embedding_kernel.cu:41 Assertion `id < N` failed. Id should smaller than 103424 but received an id value: 4516835107029230021.
Traceback (most recent call last):
File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/engine/../worker/worker_process.py", line 868, in <module>
run_worker_proc()
File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/engine/../worker/worker_process.py", line 853, in run_worker_proc
worker_proc.initialize_kv_cache()
File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/engine/../worker/worker_process.py", line 374, in initialize_kv_cache
available_kv_cache_memory = self.worker.determine_available_memory()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/worker/gpu_worker.py", line 136, in determine_available_memory
self.model_runner.profile_run()
File "/root/miniconda3/envs/fastdeploy/lib/python3.12/contextlib.py", line 81, in inner
return func(*args, **kwds)
^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/worker/gpu_model_runner.py", line 2249, in profile_run
self._dummy_run(
File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/worker/gpu_model_runner.py", line 1745, in _dummy_run
model_output = self.model(
^^^^^^^^^^^
File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/nn/layer/layers.py", line 1580, in __call__
return self.forward(*inputs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/model_executor/models/ernie4_5_moe.py", line 657, in forward
hidden_states = self.ernie(ids_remove_padding=ids_remove_padding, forward_meta=forward_meta)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/model_executor/graph_optimization/decorator.py", line 68, in __call__
return self.graph_opt_backend(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/model_executor/graph_optimization/graph_optimization_backend.py", line 145, in __call__
return self.dy_runnable(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/model_executor/models/ernie4_5_moe.py", line 473, in forward
hidden_states, residual = self.layers[i](forward_meta, hidden_states, residual)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/nn/layer/layers.py", line 1580, in __call__
return self.forward(*inputs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/model_executor/models/ernie4_5_moe.py", line 366, in forward
hidden_states = self.self_attn(
^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/nn/layer/layers.py", line 1580, in __call__
return self.forward(*inputs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/model_executor/models/ernie4_5_moe.py", line 286, in forward
qkv_out = self.qkv_proj(hidden_states)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/nn/layer/layers.py", line 1580, in __call__
return self.forward(*inputs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/model_executor/layers/linear.py", line 244, in forward_cuda
linear_out = self.quant_method.apply(self, x)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/fastdeploy/model_executor/layers/linear.py", line 72, in apply
linear_out = paddle.matmul(x, layer.weight)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/fastdeploy/lib/python3.12/site-packages/paddle/base/dygraph/generated_tensor_methods_patch.py", line 67, in _matmul
return _C_ops.matmul(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OSError: (External) CUDA error(719), unspecified launch failure.
[Hint: 'cudaErrorLaunchFailure'. An exception occurred on the device while executing a kernel. Common causes include dereferencing an invalid device pointerand accessing out of bounds shared memory. Less common cases can be system specific - more information about these cases canbe found in the system specific user guide. This leaves the process in an inconsistent state and any further CUDA work willreturn the same error. To continue using CUDA, the process must be terminated and relaunched.] (at /paddle/paddle/phi/core/platform/device/gpu/gpu_info.cc:127)
/root/miniconda3/envs/fastdeploy/lib/python3.12/multiprocessing/resource_tracker.py:279: UserWarning: resource_tracker: There appear to be 2 leaked shared_memory objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
Metadata
Metadata
Assignees
Labels
No labels