
torch.load: temporarily allocated GPU memory is not released when loading a pretrained model

Author: DeathYmz  Updated: 2022-10-29  Category: Programming languages

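Today, while running a model, I needed to load part of a pretrained model's parameters. I called torch.load without the map_location argument, so the checkpoint went onto a GPU by default (torch.load restores tensors to the device they were saved from, which here was a GPU). The GPU memory allocated for that variable was never released, it tied up a large amount of memory, and the GPU could not be used efficiently.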

The problem:

For example, this is how we would usually load pretrained parameters into a model we wrote ourselves:

from transformers import RobertaForMultipleChoice
import torch
model = RobertaForMultipleChoice.from_pretrained("roberta-large")
pretrained_model = torch.load("./checkpoints/txt_matching_e1.pth").roberta
pretrained_dict = pretrained_model.state_dict()
model_dict = model.roberta.state_dict()
# pretrained_dict = {k: v for k, v in pretrained_dict.items() if k in model_dict}  # drop parameters the model does not need
model_dict.update(pretrained_dict)
model.roberta.load_state_dict(model_dict)

1. Without map_location, torch.load places the checkpoint on a GPU by default and then fails

RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 3; 10.76 GiB total capacity; 350.54 MiB already allocated; 21.81 MiB free; 356.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

GPU usage before torch.load

After torch.load: out of memory, and the memory is not released either

2. When a specific GPU is passed as map_location
When I specify the GPU explicitly, GPU usage reads 2841:
pretrained_model = torch.load("./checkpoints/txt_matching_e1.pth", map_location='cuda:0').roberta

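(GPU usage with the model moved directly to CUDA)
I then copied these parameters into model, with model also on cuda:0, and GPU usage rose to 4193.

You might suspect that the model is simply this large once the pretrained parameters are loaded. To check, put the loaded checkpoint and the model on different GPUs and compare.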

Underlying mechanism: CUDA memory management
Reference blog post: How to release GPU memory when training a model in PyTorch
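
To make the mechanism a little more concrete, here is a minimal sketch, not taken from the original post. PyTorch's caching allocator keeps freed blocks in its own cache instead of returning them to the driver, so the "allocated" figure can drop while the "reserved" figure stays put until torch.cuda.empty_cache() is called. The function name caching_allocator_demo and the 64 MiB buffer size are assumptions made for illustration, and the function returns an empty list on a machine without CUDA.

```python
import torch


def caching_allocator_demo(device="cuda:0"):
    """Return (allocated, reserved) byte counts at three points, or [] without CUDA."""
    if not torch.cuda.is_available():
        return []
    dev = torch.device(device)
    readings = []

    def snapshot():
        readings.append(
            (torch.cuda.memory_allocated(dev), torch.cuda.memory_reserved(dev))
        )

    # Hypothetical 64 MiB buffer standing in for a checkpoint loaded onto the GPU.
    buffer = torch.empty(64 * 2**20, dtype=torch.uint8, device=dev)
    snapshot()  # buffer is alive: counted as allocated and reserved
    del buffer
    snapshot()  # allocated drops; reserved usually stays, the block is only cached
    torch.cuda.empty_cache()
    snapshot()  # unused cached blocks have been handed back to the driver
    return readings


readings = caching_allocator_demo()
for allocated, reserved in readings:
    print(f"allocated={allocated / 2**20:.1f} MiB, reserved={reserved / 2**20:.1f} MiB")
```

Tools that look at the GPU from outside the process generally see the reserved memory plus the CUDA context, which is one reason the numbers there can look larger than the model itself.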

Solutions:

1. To avoid using GPU memory at all, load onto the CPU, then del the checkpoint and release the memory with gc

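from transformers import RobertaForMultipleChoice
import gc
import torch

model = RobertaForMultipleChoice.from_pretrained("roberta-large")
# Load the checkpoint onto the CPU so that no GPU memory is allocated for it
pretrained_model = torch.load("./checkpoints/txt_matching_e1.pth", map_location='cpu').roberta
pretrained_dict = pretrained_model.state_dict()
model_dict = model.roberta.state_dict()
model_dict.update(pretrained_dict)
model.roberta.load_state_dict(model_dict)  # copies the values into model's own parameters
# Drop every name that still refers to the checkpoint tensors, then collect
del pretrained_model, pretrained_dict, model_dict
gc.collect()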
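
The same pattern can be tried without the real checkpoint. The following is a minimal sketch under stated assumptions: the two nn.Linear modules are hypothetical stand-ins for the pretrained and target models, the file name tiny.pth is made up, and only a state_dict is saved rather than a whole model object. It shows that map_location="cpu" leaves every loaded tensor on the CPU before the values are copied into the target.

```python
import os
import tempfile

import torch
from torch import nn


def load_weights_on_cpu(path, model):
    """Load a saved state_dict onto the CPU and copy it into `model`.

    Returns the set of device types the loaded tensors lived on.
    """
    state = torch.load(path, map_location="cpu")
    devices = {tensor.device.type for tensor in state.values()}
    model.load_state_dict(state)  # copies values into the model's own parameters
    del state  # drop the only reference to the loaded tensors
    return devices


# Hypothetical stand-ins for the pretrained model and the model being trained.
source = nn.Linear(4, 2)
target = nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as tmp:
    ckpt_path = os.path.join(tmp, "tiny.pth")
    torch.save(source.state_dict(), ckpt_path)
    loaded_devices = load_weights_on_cpu(ckpt_path, target)

print(loaded_devices)
```

After this, target can be moved to whichever GPU it should train on, and only the model's own parameters occupy GPU memory.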

2. Use torch.cuda.empty_cache() appropriately

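This requires some understanding of Python's memory management and its reference mechanism.
For example, when I load parameters from pretrain_model directly into model, with both model and pretrain_model on cuda:0, torch.cuda.empty_cache() cannot release the GPU memory held by pretrain_model.
When I move model to cuda:1 (it was originally on cuda:0), torch.cuda.empty_cache() can release it.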
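
The reference mechanism mentioned above can be sketched in plain Python, with no GPU involved. torch.cuda.empty_cache() can only hand back cached blocks that no live tensor occupies, so as long as any name, dict or module still refers to the checkpoint tensors, their memory stays in use. The class FakeWeights and the function below are hypothetical, written only to illustrate that del on one name frees nothing while a second reference exists.

```python
import gc
import weakref


class FakeWeights:
    """Hypothetical stand-in for a large tensor or a loaded checkpoint."""


def still_alive_after_del():
    checkpoint = FakeWeights()
    probe = weakref.ref(checkpoint)  # observes the object without keeping it alive
    alias = checkpoint  # second reference, e.g. a dict that still holds the tensor

    del checkpoint
    gc.collect()
    alive_with_alias = probe() is not None  # the alias keeps the object alive

    del alias
    gc.collect()
    alive_without_alias = probe() is not None  # last reference gone, object freed

    return alive_with_alias, alive_without_alias


print(still_alive_after_del())
```

In the loading code, pretrained_dict and model_dict play the role of alias: both still point at the checkpoint tensors after pretrained_model itself has been deleted.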

Original post: https://blog.csdn.net/Miranda_ymz/article/details/127577639
