pytorch_vision_resnext translation #55

Merged: 4 commits, Apr 2, 2023

```python
import torch
model = torch.hub.load('pytorch/vision:v0.10.0', 'resnext50_32x4d', pretrained=True)
model.eval()
```

All pre-trained models expect input images normalized in the same way,
i.e. mini-batches of 3-channel RGB images of shape `(3 x H x W)`, where `H` and `W` are expected to be at least `224`.
The images have to be loaded into a range of `[0, 1]` and then normalized using `mean = [0.485, 0.456, 0.406]`
and `std = [0.229, 0.224, 0.225]`.

Here's a sample execution.

```python
# Download an example image from the pytorch website
import urllib
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
try: urllib.URLopener().retrieve(url, filename)
except: urllib.request.urlretrieve(url, filename)
```

```python
# sample execution (requires torchvision)
from PIL import Image
from torchvision import transforms
input_image = Image.open(filename)
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
input_tensor = preprocess(input_image)
input_batch = input_tensor.unsqueeze(0) # create a mini-batch as expected by the model

# move the input and model to GPU for speed if available
if torch.cuda.is_available():
input_batch = input_batch.to('cuda')
model.to('cuda')

with torch.no_grad():
output = model(input_batch)
# Tensor of shape 1000, with confidence scores over ImageNet's 1000 classes
print(output[0])
# The output has unnormalized scores. To get probabilities, you can run a softmax on it.
probabilities = torch.nn.functional.softmax(output[0], dim=0)
print(probabilities)
```
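
As an aside, the `transforms.Normalize(mean, std)` step above is equivalent to subtracting the per-channel mean and dividing by the per-channel standard deviation. A minimal sketch of that equivalence (not part of the original example):

```python
# Manual equivalent of transforms.Normalize for a (3, H, W) float tensor
# with values in [0, 1]; broadcasting applies the stats per channel.
import torch

mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

def normalize(img):
    return (img - mean) / std
```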

```
# Download ImageNet labels
!wget https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt
```

```
# Read the categories
with open("imagenet_classes.txt", "r") as f:
categories = [s.strip() for s in f.readlines()]

# Show top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)

for i in range(top5_prob.size(0)):
print(categories[top5_catid[i]], top5_prob[i].item())
```

### Model Description

ResNeXt models were proposed in [Aggregated Residual Transformations for Deep Neural Networks](https://arxiv.org/abs/1611.05431).
Here we provide two versions of the model, with 50 and 101 layers respectively.
A comparison of the architectures of resnet50 and resnext50 can be found in Table 1 of the paper.
Their 1-crop error rates on the ImageNet dataset with pretrained models are listed below.

| Model structure | Top-1 error (%) | Top-5 error (%) |
| ----------------- | ----------- | ----------- |
| resnext50_32x4d | 22.38 | 6.30 |
| resnext101_32x8d | 20.69 | 5.47 |
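
Both variants listed above can be loaded through `torch.hub` in the same way; a minimal sketch (parameter counts are computed on the fly, not quoted from the paper):

```python
# Sketch: load both pretrained ResNeXt variants and report their sizes.
# Downloading the weights requires network access.
import torch

for name in ("resnext50_32x4d", "resnext101_32x8d"):
    model = torch.hub.load('pytorch/vision:v0.10.0', name, pretrained=True)
    model.eval()
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{name}: {n_params / 1e6:.1f}M parameters")
```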

### References

- [Aggregated Residual Transformations for Deep Neural Networks](https://arxiv.org/abs/1611.05431)