koordlet: support hugepage reporting #1744

Merged 1 commit on Dec 14, 2023


peiqiaoWang
Contributor

Ⅰ. Describe what this PR does

Adds support for reporting hugepages, covering both 2M and 1G hugepages.

Ⅱ. Does this pull request fix one issue?

Ⅲ. Describe how to verify it

The result can be checked with kubectl get noderesourcetopologies.

Ⅳ. Special notes for reviews

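This PR adds reporting of hugepage memory. The amount of regular memory that is reported has also changed: it is now the total memory size minus the amount of memory taken by hugepages.

func (s *nodeTopoInformer) calTopologyZoneList(nodeCPUInfo *metriccache.NodeCPUInfo) deserves particular attention during review.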
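To make the accounting change concrete, here is a minimal sketch of the "total memory minus hugepage memory" calculation for one NUMA node. It is not the code from this PR: the HugePageInfo type and both helper functions are hypothetical names chosen for illustration, and clamping at zero is an assumption about how an inconsistent reading might be handled.

```go
package main

import "fmt"

// HugePageInfo is a hypothetical holder for one hugepage size on one NUMA node.
type HugePageInfo struct {
	PageSizeKB uint64 // page size in KiB, e.g. 2048 for 2M or 1048576 for 1G
	NumPages   uint64 // number of pre-allocated pages of this size
}

// hugePageBytes sums the bytes pre-allocated across all hugepage sizes.
func hugePageBytes(pages []HugePageInfo) uint64 {
	var total uint64
	for _, p := range pages {
		total += p.PageSizeKB * 1024 * p.NumPages
	}
	return total
}

// reportedMemoryBytes returns the regular memory to report: total memory
// minus hugepage memory. Clamping at zero is an assumption of this sketch.
func reportedMemoryBytes(memTotalBytes uint64, pages []HugePageInfo) uint64 {
	huge := hugePageBytes(pages)
	if huge > memTotalBytes {
		return 0
	}
	return memTotalBytes - huge
}

func main() {
	const gib = uint64(1) << 30
	pages := []HugePageInfo{
		{PageSizeKB: 2048, NumPages: 512},  // 1 GiB of 2M pages
		{PageSizeKB: 1048576, NumPages: 4}, // 4 GiB of 1G pages
	}
	// Prints hugepage GiB and reported regular memory GiB: "5 59".
	fmt.Println(hugePageBytes(pages)/gib, reportedMemoryBytes(64*gib, pages)/gib)
}
```

With 64 GiB of total memory on the node, 5 GiB is held by hugepages, so 59 GiB would be reported as regular memory alongside the separate hugepage resources.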

Ⅴ. Checklist

  • I have written necessary docs and comments
  • I have added necessary unit tests and integration tests
  • All checks passed in make test


codecov bot commented Nov 14, 2023

Codecov Report

Attention: 34 lines in your changes are missing coverage. Please review.

Comparison is base (b6a7d9e) 66.08% compared to head (3e8552a) 66.32%.
Report is 18 commits behind head on main.

Files                                                   Patch %   Lines
pkg/koordlet/util/meminfo.go                            74.54%    20 Missing and 8 partials ⚠️
pkg/koordlet/util/system/system_file.go                 0.00%     4 Missing ⚠️
...statesinformer/impl/states_noderesourcetopology.go   91.30%    2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1744      +/-   ##
==========================================
+ Coverage   66.08%   66.32%   +0.23%     
==========================================
  Files         388      395       +7     
  Lines       42373    43536    +1163     
==========================================
+ Hits        28003    28874     +871     
- Misses      12296    12512     +216     
- Partials     2074     2150      +76     
Flag Coverage Δ
unittests 66.32% <75.71%> (+0.23%) ⬆️


@saintube changed the title from "feature: support hugepage" to "koordlet: support hugepage reporting" on Nov 15, 2023
@saintube
Member

/lgtm
PTAL /cc @zwzhang0107

@hormes
Member

hormes commented Nov 17, 2023

@peiqiaoWang please add a feature-gate for it, thanks.

@zwzhang0107
Contributor

I think we need to clarify whether koordlet needs to do anything for hugepages with respect to NUMA (i.e. whether or not it sets cpuset.mems).

@peiqiaoWang
Contributor Author

@zwzhang0107 The kubelet handles this by setting the value of hugetlb.1GB.limit_in_bytes.

@zwzhang0107
Contributor

zwzhang0107 commented Dec 6, 2023

> @zwzhang0107 The kubelet handles this by setting the value of hugetlb.1GB.limit_in_bytes.

@peiqiaoWang yep, I mean cpuset.mems. We should add some comments to clarify whether hugepages are limited to a single NUMA node.

@peiqiaoWang
Contributor Author

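> @zwzhang0107 The kubelet handles this by setting the value of hugetlb.1GB.limit_in_bytes.
>
> @peiqiaoWang yep, I mean cpuset.mems. We should add some comments to clarify whether hugepages are limited to a single NUMA node.

This feature only covers hugepage reporting. The NUMA assignment is made by the scheduler, and it takes effect on the pod through the CSI, which binds the pre-allocated hugepages according to the NUMA information the scheduler allocated. Whether the NUMA binding for the namespace is set here therefore matters little.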

@peiqiaoWang
Contributor Author


@zwzhang0107

// This feature supports reporting of hugepages.
// The koord-scheduler will allocate hugepage information based on the user's hugepage request and add it to the Pod's annotations.
// Format: scheduling.koordinator.sh/resource-status: '{"numaNodeResources":[{"node":1,"resources":{"hugepages-1Gi":"50Gi"}}]}'.
// Backend applications can enable the hugepages based on the allocation results.
// For example, the CSI mounts the pre-allocated hugepages into the pod.

Add some comments in pkg/features/koordlet_features.go
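As an illustration of how a backend such as a CSI driver might consume the allocation result described in the comment above, here is a minimal sketch that decodes the scheduling.koordinator.sh/resource-status annotation value. The struct types and parseResourceStatus are hypothetical and defined locally for this example. They mirror only the JSON shape shown in the comment and are not the Koordinator API types. Quantities such as "50Gi" are kept as plain strings here rather than parsed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// numaNodeResource mirrors one entry of "numaNodeResources" in the
// annotation value quoted above (hypothetical local type).
type numaNodeResource struct {
	Node      int               `json:"node"`
	Resources map[string]string `json:"resources"`
}

// resourceStatus mirrors the top-level annotation object (hypothetical local type).
type resourceStatus struct {
	NUMANodeResources []numaNodeResource `json:"numaNodeResources"`
}

// parseResourceStatus decodes the raw annotation value.
func parseResourceStatus(raw string) (*resourceStatus, error) {
	var rs resourceStatus
	if err := json.Unmarshal([]byte(raw), &rs); err != nil {
		return nil, fmt.Errorf("decode resource-status annotation: %w", err)
	}
	return &rs, nil
}

func main() {
	// Annotation value copied from the format given in the comment.
	raw := `{"numaNodeResources":[{"node":1,"resources":{"hugepages-1Gi":"50Gi"}}]}`
	rs, err := parseResourceStatus(raw)
	if err != nil {
		panic(err)
	}
	for _, n := range rs.NUMANodeResources {
		for name, qty := range n.Resources {
			// Prints: NUMA node 1: hugepages-1Gi=50Gi
			fmt.Printf("NUMA node %d: %s=%s\n", n.Node, name, qty)
		}
	}
}
```

A real consumer would likely parse the quantity strings into a typed value and validate the NUMA node index against the node's topology before binding hugepages.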

@zwzhang0107
Contributor

/lgtm

@zwzhang0107
Contributor

/approve

@saintube
Member

/lgtm

@koordinator-bot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hormes, zwzhang0107


@koordinator-bot koordinator-bot bot merged commit c9d5286 into koordinator-sh:main Dec 14, 2023
20 checks passed