@SkyAndFly Hi, it looks like the machine ran out of memory. We recommend switching to a machine with more memory. None of the operators in your configuration file use the GPU, so a CPU machine with more memory would be enough. An operator can use the GPU if it has the configuration "_accelerator = 'cuda'".
In addition, the number of processes can be set separately for each operator: add the num_proc parameter to that operator's configuration.
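As a rough sketch of the per-operator setting, assuming the config uses the usual top-level `np` plus a `process` list, and that text_length_filter is the step that exhausts memory. The `min_len` and `max_len` values are placeholders, not taken from the attached zhihu-bot.yaml, and the right num_proc depends on how much memory each worker needs:

```yaml
# Global default number of worker processes (the "np" mentioned in the question).
np: 8

process:
  - text_length_filter:
      min_len: 10        # placeholder value, assumed for illustration
      max_len: 10000     # placeholder value, assumed for illustration
      num_proc: 4        # override for this operator only; other operators keep np
```

With this layout, lighter operators can keep running with the global `np`, and only the memory-heavy operator runs with fewer processes.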
Before Asking
I have read the README carefully.
I have pulled the latest code of the main branch and run it again, and the problem still exists.
Search before asking
Question
Environment: Alibaba Cloud PAI-DSW, 8 cores, 32 GB RAM, 24 GB GPU memory.
The YAML file I used is attached below. When I run
python ./data-juicer-main/tools/process_data.py --config zhihu-bot.yaml
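the first operator, numeric_field_filter_process, finishes and the second operator, text_length_filter, starts. After "Adding new column for stats (num_proc=8): 0%|" is printed, the run hangs, the remote notebook loses its connection, and the only option is to restart.
Just before the hang, CPU and memory usage both rise. When I set np to 4, the problem does not occur, but operators that are not resource-intensive then take longer to process. Is there a configuration that solves this and makes the best use of all available resources? I could not find relevant information in the documentation.
The YAML file is as follows: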
zhihu-bot.yaml.txt
The log file is as follows:
export_zhihu_refine.jsonl_time_20250122153602.txt
Some of the messages printed in the log come from code I modified while trying to locate where the hang occurs.
Additional
No response