layout | background-class | body-class | title | summary | category | image | author | tags | github-link | github-id | featured_image_1 | featured_image_2 | accelerator | order | demo-model-link |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
hub_detail | hub-background | hub | vgg-nets | Award winning ConvNets from 2014 Imagenet ILSVRC challenge | researchers | vgg.png | Pytorch Team | | | pytorch/vision | vgg.png | no-image | cuda-optional | 10 | |
import torch
model = torch.hub.load('pytorch/vision:v0.10.0', 'vgg11', pretrained=True)
# The following architecture variants are also available:
# model = torch.hub.load('pytorch/vision:v0.10.0', 'vgg11_bn', pretrained=True)
# model = torch.hub.load('pytorch/vision:v0.10.0', 'vgg13', pretrained=True)
# model = torch.hub.load('pytorch/vision:v0.10.0', 'vgg13_bn', pretrained=True)
# model = torch.hub.load('pytorch/vision:v0.10.0', 'vgg16', pretrained=True)
# model = torch.hub.load('pytorch/vision:v0.10.0', 'vgg16_bn', pretrained=True)
# model = torch.hub.load('pytorch/vision:v0.10.0', 'vgg19', pretrained=True)
# model = torch.hub.load('pytorch/vision:v0.10.0', 'vgg19_bn', pretrained=True)
model.eval()
All pretrained models expect input images normalized the same way as during training: mini-batches of 3-channel RGB images of shape `(3 x H x W)`, where `H` and `W` are expected to be at least `224`. Images must be loaded into the range `[0, 1]` (each RGB channel is stored as a value from 0 to 255, so the image is divided by 255) and then normalized using `mean = [0.485, 0.456, 0.406]` and `std = [0.229, 0.224, 0.225]`.
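To make the normalization arithmetic concrete, here is a minimal dependency-free sketch that applies the two steps described above (scale to `[0, 1]`, then standardize per channel) to a single pixel. The helper name `normalize_pixel` and the sample pixel are hypothetical and chosen only for illustration; in practice `transforms.ToTensor()` and `transforms.Normalize(...)` perform these steps on whole images.

```python
# Per-channel statistics quoted in the text above.
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb):
    """Scale one 8-bit RGB pixel to [0, 1], then standardize each channel."""
    return [(value / 255.0 - m) / s for value, m, s in zip(rgb, MEAN, STD)]


# Hypothetical pixel, used only to show the arithmetic.
pixel = (255, 0, 128)
normalized = normalize_pixel(pixel)
print(normalized)
```

A fully saturated channel (255) maps to `(1 - mean) / std` and an empty channel (0) maps to `-mean / std`, so normalized values are not confined to `[0, 1]`.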
Here is a sample execution with the pretrained model.
# Download an example image from the PyTorch website
import urllib.request
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
# Sample execution (requires torchvision)
from PIL import Image
from torchvision import transforms
input_image = Image.open(filename)
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
input_tensor = preprocess(input_image)
input_batch = input_tensor.unsqueeze(0)  # create a mini-batch as expected by the model
# Move the input and model to the GPU for speed, if available
if torch.cuda.is_available():
    input_batch = input_batch.to('cuda')
    model.to('cuda')
with torch.no_grad():
    output = model(input_batch)
# Tensor of shape 1000, with confidence scores over ImageNet's 1000 classes
print(output[0])
# The output holds unnormalized scores. To get probabilities, run a softmax over it.
probabilities = torch.nn.functional.softmax(output[0], dim=0)
print(probabilities)
# Download the ImageNet labels
!wget https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt
# Read the categories
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Show top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
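To show what the softmax and top-k steps in the listing above compute, here is a small pure-Python sketch on a four-element score vector. The functions `softmax` and `top_k` and the sample scores are hypothetical stand-ins written for illustration; they are not the `torch` implementations, which operate on tensors.

```python
import math


def softmax(scores):
    """Convert unnormalized scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, k):
    """Return (index, probability) pairs for the k largest probabilities."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [(i, probs[i]) for i in order]


# Hypothetical scores for four classes, used only for illustration.
logits = [2.0, -1.0, 0.5, 3.0]
probs = softmax(logits)
for class_id, p in top_k(probs, 2):
    print(class_id, round(p, 4))
```

The class with the largest raw score also has the largest probability, so softmax changes the scale of the scores but not their ranking.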
Implementations are provided for the models proposed in Very Deep Convolutional Networks for Large-Scale Image Recognition, for each configuration and its batchnorm version.
For example, configuration `A` in the paper is `vgg11`, `B` is `vgg13`, `D` is `vgg16`, and `E` is `vgg19`. The batchnorm versions are suffixed with `_bn`.
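The naming scheme above can be sketched as a small lookup. The dictionary `CONFIG_TO_MODEL` and the helper `hub_model_name` are hypothetical names introduced here for illustration; only the letter-to-model mapping and the `_bn` suffix come from the text.

```python
# Paper configuration letter -> model name, as listed in the text above.
CONFIG_TO_MODEL = {"A": "vgg11", "B": "vgg13", "D": "vgg16", "E": "vgg19"}


def hub_model_name(config, batch_norm=False):
    """Build the model name for a paper configuration letter."""
    name = CONFIG_TO_MODEL[config.upper()]
    return name + "_bn" if batch_norm else name


print(hub_model_name("D"))                   # vgg16
print(hub_model_name("E", batch_norm=True))  # vgg19_bn
```

The resulting string can be passed as the model name argument to `torch.hub.load`, in the same position as `'vgg11'` in the loading snippet.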
The 1-crop error rates of the pretrained models on the ImageNet dataset are listed below.
Model structure | Top-1 error (%) | Top-5 error (%) |
---|---|---|
vgg11 | 30.98 | 11.37 |
vgg11_bn | 26.70 | 8.58 |
vgg13 | 30.07 | 10.75 |
vgg13_bn | 28.45 | 9.63 |
vgg16 | 28.41 | 9.62 |
vgg16_bn | 26.63 | 8.50 |
vgg19 | 27.62 | 9.12 |
vgg19_bn | 25.76 | 8.15 |
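As a quick way to read the table, the sketch below converts error rates to accuracies and measures how much each batchnorm variant lowers top-1 error. The numbers are copied from the table; the names `ERRORS`, `top1_accuracy`, and `bn_top1_gain` are hypothetical helpers, not part of torchvision.

```python
# (top-1 error %, top-5 error %) copied from the table above.
ERRORS = {
    "vgg11": (30.98, 11.37),
    "vgg11_bn": (26.70, 8.58),
    "vgg13": (30.07, 10.75),
    "vgg13_bn": (28.45, 9.63),
    "vgg16": (28.41, 9.62),
    "vgg16_bn": (26.63, 8.50),
    "vgg19": (27.62, 9.12),
    "vgg19_bn": (25.76, 8.15),
}


def top1_accuracy(name):
    """Top-1 accuracy in percent (100 minus the top-1 error)."""
    return round(100.0 - ERRORS[name][0], 2)


def bn_top1_gain(base):
    """Top-1 error reduction, in percentage points, from adding batchnorm."""
    return round(ERRORS[base][0] - ERRORS[base + "_bn"][0], 2)


# Model with the lowest top-1 error in the table.
best = min(ERRORS, key=lambda name: ERRORS[name][0])
print(best, top1_accuracy(best))
for base in ("vgg11", "vgg13", "vgg16", "vgg19"):
    print(base, bn_top1_gain(base))
```

In this table every batchnorm variant has a lower top-1 error than its plain counterpart, with the largest gap for `vgg11`.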