Releases: cloudfoundry/cf-networking-release
0.19.0
The first release to include a new layer-3 only CNI plugin. Highlights include:
- Silk CNI plugin to replace Flannel CNI plugin
- NetIn and NetOut rules are configured through CNI
- Networking features to enable BOSH DNS for CF apps
We do not recommend using cf-networking-release in production yet, but give it a try and give us your feedback in the #container-networking channel on cloudfoundry.slack.com.
Take a look at known issues for current limitations and known issues.
Verified with the following:
Manifest Changes
Changed Properties
- The value for cf_networking.garden_external_networker.cni_plugin_dir must be updated to /var/vcap/packages/silk/bin if you are not swapping out CNI with your own plugin. (There is no default currently, but we plan to add one in the next release.)
- The property for global ASG logging has changed from cf_networking.garden_external_networker.iptables_asg_logging to cf_networking.iptables_asg_logging.
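As a rough illustration of the two changed properties, the relevant part of a deployment manifest might look like the fragment below. The property names and the plugin path come from these notes; the surrounding properties: nesting and the value false are assumptions for illustration, and the exact placement depends on how your manifest is organized.

```yaml
# Sketch only: the nesting under `properties:` and the value `false` are assumed.
properties:
  cf_networking:
    # Renamed from cf_networking.garden_external_networker.iptables_asg_logging
    iptables_asg_logging: false
    garden_external_networker:
      # Needed unless you swap in your own CNI plugin (no default in 0.19.0)
      cni_plugin_dir: /var/vcap/packages/silk/bin
```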
Removed Properties
- cf_networking.flannel_watchdog.no_bridge is now removed.
New Properties
A new property has been added to support an upcoming feature. Users can specify DNS servers and access will be automatically allowed for link-local DNS servers:
cf_networking.dns_servers
The new feature will require garden-runc-release versions >=1.4.0.
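A minimal sketch of setting the new property, assuming it takes a list of IP addresses (these notes do not spell out the value format). The address 169.254.0.2 is only an illustrative link-local address, not a documented default.

```yaml
# Sketch only: the list format and the address are assumptions.
properties:
  cf_networking:
    dns_servers:
      - 169.254.0.2   # illustrative link-local DNS server
```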
Significant Changes
New CNI plugin
- CF Wrapper plugin fails if there is a subnet theft
- CF Networking Release can use the Silk CNI plugin instead of the flannel + bridge plugins
- Flannel watchdog has a bridgeless mode where it inspects the container metadata store
- An acceptance environment is running a BOSH deployed silkd
NetIn/NetOut Changes
- Wrapper CNI plugin can configure NetIn and NetOut
- The external networker defers to the CNI plugin to write NetIn/NetOut rules
BOSH DNS support
- An iptables input rule is written for every local DNS server
- DNS servers are returned from the external networker to garden
  - Requires garden-runc-release versions >1.3.0
Logging enhancements
- Logging for denied outbound non-c2c packets
- As an operator I know how to find the source app using a packet capture
- ASG deny logging is rate limited to a hardcoded interval
- Troubleshooting docs include information about ASG logging through BOSH property
Chores
0.18.0
Lots of good stuff in this release. Highlights include:
- Logging for c2c iptables can be enabled through a BOSH property
- Container networking scales to 20K application instances with 3 policies per application.
- Initial support for logging ASG iptables through a BOSH property. ASG logs will be prefixed with OK_ or DENY_.
- If you are running Diego release v1.10.1 you must upgrade to this release
We do not recommend using cf-networking-release in production yet, but give it a try and give us your feedback in the #container-networking channel on cloudfoundry.slack.com.
Take a look at known issues for current limitations and known issues.
Verified with the following:
New Manifest Properties
- cf_networking.rep_listen_addr_admin enables our drain scripts to wait for the Diego rep to exit. It should always be the same value as diego.rep.listen_addr_admin. It defaults to 127.0.0.1:1800.
- cf_networking.garden_external_networker.iptables_asg_logging globally enables iptables logging for all ASGs, including logging of denied packets. It defaults to false.
- cf_networking.vxlan_policy_agent.iptables_c2c_logging enables iptables logging for container-to-container traffic. It defaults to false. Note: this is already configurable at runtime.
- cf_networking.plugin.health_check_port allows BOSH to better health-check the flanneld process required for connectivity.
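The fragment below sketches how the new 0.18.0 properties might appear in a manifest. The property names and the stated defaults (127.0.0.1:1800 and false) come from these notes; the properties: nesting is assumed, and the health check port value is a hypothetical placeholder because no value is given here.

```yaml
# Sketch only: nesting is assumed; values are the defaults stated in the notes
# unless marked otherwise.
properties:
  cf_networking:
    # Should match diego.rep.listen_addr_admin
    rep_listen_addr_admin: 127.0.0.1:1800
    garden_external_networker:
      # Global ASG iptables logging, including denied packets
      iptables_asg_logging: false
    vxlan_policy_agent:
      # Container-to-container iptables logging
      iptables_c2c_logging: false
    plugin:
      # Hypothetical port chosen for illustration; not taken from the notes
      health_check_port: 19823
```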
Removed Manifest Properties
- cf_networking.policy_server.database.connection_string was deprecated in v0.10.0 and is now removed.
Significant Changes
Scalability
- Container networking is reliable with 20K app instances across 100 Diego cells
- Scalability test for popular server
- Our docs include recommendations on scaling policy server instances and DB
- The policy server can handle our scalability target of 20K AIs
Upgrades
Manifest Changes
Security
Chores
- Investigate and fix "Ginkgo timed out waiting for parallel nodes to report back"
- Improve stop behavior of monit ctl scripts
Stability
- Flannel has a healthcheck endpoint for monit
- A cell with a subnet mismatch can be recovered by a BOSH restart of the cell
- Policy server monit script checks a healthcheck endpoint
Logging
- Logging for c2c iptables is configurable through a BOSH property
- Logging for denied outbound non-c2c packets
Internal integration
0.17.0
This release reduces the Flannel subnet lease renewal interval to alleviate the effects of etcd failures. It also includes a manifest change. Take a look at the manifest change log for details.
We do not recommend using cf-networking-release in production yet, but give it a try and give us your feedback in the #container-networking channel on cloudfoundry.slack.com.
Take a look at known issues for current limitations and known issues.
Verified with the following:
Significant Changes
Flannel
- Document risks and mitigations for container networking when etcd disappears and comes back with an empty data dir
- Flannel subnet range for a cell should be configurable
Scalability
- Policy server flakes when trying to add/delete several thousand policies
- As an operator I have metrics to help evaluate policy server performance
- As a space developer, I expect list policies to work when there are a lot of policies/apps
- Our docs include recommendations on scaling policy server instances and DB
- When policies are requested by ID, policy server does not query database for all policies
Chores
0.16.0
No big manifest changes in this release. Key changes include a property to override the interface MTU, policy cleanup for deleted applications and spaces, and CLI enhancements.
We do not recommend using cf-networking-release in production yet, but give it a try and give us your feedback in the #container-networking channel on cloudfoundry.slack.com. Take a look at known issues for current limitations and known issues.
Verified with the following:
Significant Changes
Manifest Changes
Policy Cleanup
CLI
- Update CATS and runtime-ci docker image to use remove-access
- As a space developer I get a meaningful error message when I don't have network.write scope and try to configure a policy
Security
Documentation
Metrics
- As an operator I can set up an alert for when my cell has a Flannel watchdog error
- Investigate increased policy server response time on toque
- As an operator I have metrics to help evaluate policy server performance
Miscellaneous
0.15.0
This release includes significant manifest changes. Please take a look at the manifest changelog for details.
We do not recommend using cf-networking-release in production yet, but give it a try and give us your feedback in the #container-networking channel on cloudfoundry.slack.com. Take a look at known issues for current limitations and known issues.
Verified with the following:
Significant Changes
Manifest Changes
- Simplify / consolidate BOSH properties
- Rename netman to cf-networking for all container networking artifacts
Policy Cleanup
CLI
Chores
0.14.0
Netman is no more! The key change in this release is a rename from netman to cf-networking. This change is documented in the manifest changelog. At this point, there are no changes to manifest properties other than the release name.
We do not recommend using cf-networking-release in production yet, but give it a try and give us your feedback in the #container-networking channel on cloudfoundry.slack.com. Take a look at known issues for current limitations and known issues.
Verified with the following:
Significant Changes
Manifest Changes
Performance
- Compare c2c networking latency vs. router latency
- As PM I would like a measure of latency to apply a single additional policy with a number of existing policies
- As PM I would like to know the effect of having a large ASG config on adding a single policy
Troubleshooting
- CF CLI Plugin respects CF_TRACE environment variable
- As a space developer I get a meaningful error message when I don't have network.write scope and try to configure a policy
- Add a Troubleshooting page for CF networking
Chores
0.13.0
Key changes include support for self-service space developer configuration. A user can now request a network.write scope to configure policies for spaces where they have Space Developer privileges.
We do not recommend using netman-release in production yet, but give it a try and give us your feedback in the #container-networking channel on cloudfoundry.slack.com. Take a look at known issues for current limitations and known issues.
Verified with the following:
Significant Changes
Self-service for space developers
Scalability and performance
0.12.0
Key changes include configurable subnet ranges and masks, self-service policy configuration, and enhancements for reducing policy enforcement time.
We do not recommend using netman-release in production yet, but give it a try and give us your feedback in the #container-networking channel on cloudfoundry.slack.com. Take a look at known issues for current limitations and known issues.
Verified with the following:
Significant Changes
Deployment Changes
- The subnet and mask for the overlay network is configurable
- As an operator I would like to support more than 254 cells with legacy networking features
Space Developer self-service policy configuration
- Space developers with network.write scope can create policies using the API for apps in spaces they own
- Space developers with network.write scope can delete policies for apps in spaces they own
UX changes
Performance and Scalability
- As an operator I don't expect iptables to be rewritten continuously when there are no policy changes
Miscellaneous Changes
0.11.0
Key changes include logging enhancements and UX changes to the DELETE and GET APIs.
We do not recommend using netman-release in production yet, but give it a try and give us your feedback in the #container-networking channel on cloudfoundry.slack.com.
Take a look at known issues for current limitations and known issues.
Verified with the following:
Significant Changes
Deployment Changes
Logging
- When a c2c connection between containers is allowed, I see a log line in syslog at the destination
- As an operator I can get all logs related to netman by using a keyword
- Reduce log message noise due to missing policy_group_id
UX changes
0.10.0
Key changes include manifest changes related to policy server DB configuration, logging enhancements and testing related to data plane security.
We do not recommend using netman-release in production yet, but give it a try and give us your feedback in the #container-networking channel on cloudfoundry.slack.com.
Verified with the following:
Significant Changes
Manifest Changes
Logging
- Log levels for vxlan-policy-agent are reconfigurable at runtime
- Logging for c2c iptables is reconfigurable at runtime
- Log levels for policy-server are reconfigurable at runtime
Security
- Move flannel state dir to something under /var/vcap
- As an attacker my containers can reach local addresses on the host VM
- Redact tokens/passwords in policy server log messages
Miscellaneous
- netman-release has a NOTICE file with license information
- Containers can be created while policy server is down and receive traffic when the policy-server comes back up
- Masquerade rule should be written by something other than vxlan-policy-agent
- SPIKE: Containers can connect to an IP address on the host