Using historical build information for encapsulation #4247
I would need to read the packaging structure of previous builds to prevent a major packaging structure change. Since the previous builds (latest - 1, latest - 2) don't necessarily have tags attached to them in quay, I could ask the user to provide the versions (latest - 1 or latest - 2). But this does not remove the need to query the fcos release browser, because I would still need to use container-img-proxy to open the image and get the manifest. Since c/img-proxy forks skopeo as a process, skopeo needs to be able to understand the fcos-release-browser versions. This is potentially blocked on fedora-coreos-tracker/issues/#1367. Is there a better way of getting the manifests of older builds from skopeo?
This topic heavily relates to coreos/fedora-coreos-tracker#1367. First though, I think this technology should not be specific to Fedora CoreOS, at least not at a low level. So my strawman proposal would look something like this:
where basically a set of direct previous builds are provided, and we aim to optimize the delta from those. In Fedora CoreOS we've commonly wrapped |
Upon encapsulating a new rpm-ostree commit, there are four possible changes that can happen to the underlying packages:
This will influence the structure of the way layers of the image are packed:
This is because the most common changes that happen between builds are:
This last bin will be emptied whenever the bodhi-scraper triggers a new packing structure, which happens once every major release. This way the most optimal packing structure is computed at each major release; between two major releases the packing structure is unchanged and new packages are dumped into the last bin.
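The "unchanged structure, new packages go to the last bin" rule between major releases can be sketched roughly as follows. This is only an illustration in Python, not the rpm-ostree implementation; the function name `carry_over_layout` and the dict-based layout (package name mapped to a bin index) are assumptions made for the example.

```python
def carry_over_layout(previous, current_packages, last_bin):
    """Reuse the previous build's packing structure for the current build.

    previous: dict of package name -> bin index from the prior build
              (hypothetical representation, assumed for this sketch).
    current_packages: iterable of package names in the new commit.
    last_bin: index of the catch-all bin that receives new packages.
    """
    layout = {}
    for name in sorted(current_packages):
        # Known packages keep their old bin, so the structure is unchanged.
        # Packages that are new in this build are dumped into the last bin.
        layout[name] = previous.get(name, last_bin)
    # Packages removed since the previous build simply do not appear.
    return layout


previous = {"kernel": 0, "glibc": 1, "bash": 2}
layout = carry_over_layout(previous, {"kernel", "glibc", "vim"}, last_bin=3)
print(layout)
```

At a major release the `previous` mapping would be recomputed from scratch, which is what empties the last bin again.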
To implement the functionality above, I would need to parse the NEVRA of packages in ostree-rs-ext. There exists |
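For reference, splitting a NEVRA string into its parts can be sketched as below. This is an illustrative Python sketch, not code from ostree-rs-ext (which is written in Rust); it assumes the `name-[epoch:]version-release.arch` layout, and the `Nevra` type and `parse_nevra` name are made up for the example.

```python
from typing import NamedTuple


class Nevra(NamedTuple):
    name: str
    epoch: int
    version: str
    release: str
    arch: str


def parse_nevra(text):
    """Parse 'name-[epoch:]version-release.arch' by splitting from the right."""
    # The arch is everything after the last dot.
    rest, arch = text.rsplit(".", 1)
    # The release is everything after the last remaining dash.
    rest, release = rest.rsplit("-", 1)
    # The (epoch:)version follows the next dash; the name may itself contain dashes.
    name, evr = rest.rsplit("-", 1)
    epoch, sep, version = evr.partition(":")
    if not sep:
        # No explicit epoch: treat it as 0.
        epoch, version = "0", evr
    return Nevra(name, int(epoch), version, release, arch)


print(parse_nevra("kernel-core-6.2.9-300.fc38.x86_64"))
```

Splitting from the right matters because package names such as `kernel-core` contain dashes, while the version and release fields do not.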
Hmmm. Two things:
|
I think we can split it...how about starting like this?
Basically Then the rest of the code needs to consider |
This is a prep PR towards completing coreos#4247. It allows one to diff the layers of the current encapsulated build against any other build.
High Level Design of the data flow: There are two "sources" to get metadata about package updates from:
These sources will be modified to either have an API or a file (frequemcyupdateinfometadata.json) in their repodata that contains the metadata about the updates. This metadata has exactly the same format as the result from the Query API. It contains the list of all updates present in the current and pending release states with a stable update status, and has reduced fields (only those needed by the post-processing). A time-based cache will be inserted in this layer to reduce latency.

The metadata generated from bodhi and errata has the same format, so that it can be consumed by the same post-processor to generate an updates_frequency file for each of them. The updates_frequency_rhcos.json and updates_frequency_fcos.json files will then be committed to rhcos-config and fcos-config respectively. Either rpm-ostree or COSA can then fetch and read this file and use it as a heuristic for encapsulation.
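The post-processing step could look roughly like the sketch below. The record shape used here (each update carrying a `packages` list of names) is a hypothetical stand-in, since the actual reduced metadata format is not spelled out in this thread; `build_updates_frequency` is likewise an invented name.

```python
import json
from collections import Counter


def build_updates_frequency(updates):
    """Count how many stable updates touched each package.

    updates: list of dicts, each with a 'packages' list of package names
             (assumed shape; the real bodhi/errata fields may differ).
    """
    counts = Counter()
    for update in updates:
        # A package is counted once per update, even if listed twice.
        for name in set(update["packages"]):
            counts[name] += 1
    return dict(sorted(counts.items()))


updates = [
    {"packages": ["kernel", "glibc"]},
    {"packages": ["kernel"]},
    {"packages": ["kernel", "kernel", "bash"]},
]
frequency = build_updates_frequency(updates)
# The serialized result is what would be committed as updates_frequency_*.json.
print(json.dumps(frequency, indent=2))
```

Because both sources would emit the same format, the same function could serve for the FCOS and RHCOS files.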
From my perspective, this list is hard to read and would benefit from being un-deduplicated, so that the steps and the differences between RHCOS and FCOS are explicit. Note that we also have SCOS.
Ok updated |
After consultation with the libdnf team, a new approach for getting the list of updates made to the packages was considered. This involves using RPM Tag data API to read the rpm header data for changelog timestamps. This array of timestamps "record the changes that have happened to the package between different Version or Release builds". Hence they are build times of each update. Cons of this approach:
Pros:
Hence, discarding all of the OS specific efforts in packaging:
|
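As a rough illustration of the changelog-timestamp idea, an update frequency can be derived from the timestamp array alone. The sketch below is a pure Python function; it assumes the timestamps have already been read from the rpm header (the changelog time tag, `RPMTAG_CHANGELOGTIME`) as Unix epoch seconds, and the function name and one-year window are arbitrary choices for the example.

```python
SECONDS_PER_DAY = 86400


def updates_in_window(changelog_times, now, window_days=365):
    """Count distinct changelog entries whose timestamp falls in the window.

    changelog_times: iterable of Unix timestamps (seconds), assumed to come
                     from the package's changelog time tag.
    now: reference time as a Unix timestamp.
    """
    cutoff = now - window_days * SECONDS_PER_DAY
    # Deduplicate first: several changelog lines can share one timestamp.
    return sum(1 for t in set(changelog_times) if cutoff <= t <= now)


now = 1_700_000_000
times = [
    now - 10 * SECONDS_PER_DAY,
    now - 100 * SECONDS_PER_DAY,
    now - 100 * SECONDS_PER_DAY,
    now - 400 * SECONDS_PER_DAY,
]
print(updates_in_window(times, now))
```

A count like this per package could feed the same frequency heuristic without any OS-specific update metadata.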
Part of 4012 Solution 0
Here we take external data using historical build information and the external process decides on the chunking.