Releases: Netflix/Priam
Graceful Shutdown of Cassandra
New Features
- (#664) Cassandra Process Manager can be configured to stop gracefully using the new `gracefulDrainHealthWaitSeconds` option. If this option is set to a non-negative integer (>= 0), then before calling the shutdown script Priam will fail healthchecks (`InstanceState.isHealthy`) for the configured number of seconds, then issue a `nodetool drain` with a 30s timeout (since drain can hang), and finally call the provided stop script. By default this is set to `-1`, which disables the feature for backwards compatibility. This is useful if you want to gracefully drain Cassandra clients off a node before running `drain` (which kills the Native/Thrift server and resets any TCP connections that were established, so in-flight requests can get dropped), then run drain to safely stop Cassandra, and then call your stop script. If your service discovery system does not integrate with Priam's health system, or your stop script already does all these things, leave this functionality disabled.
- (#664) The `/v1/cassadmin/stop` HTTP API call now takes an optional `force` parameter (e.g. `/v1/cassadmin/stop?force=true`), which skips the graceful path for that particular stop; the default value is `false`.
- (#650) Enable auth on the JMX port via the `jmxUsername` and `jmxPassword` options. By default these are null and not provided.
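The ordering described for the graceful stop can be summarised in a small sketch. This is only an illustration of the sequence stated in the notes above, not Priam's implementation: the class `GracefulStopPlan`, the method `steps`, and the step strings are hypothetical names introduced here.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the stop ordering; these are not Priam's actual classes. */
final class GracefulStopPlan {
    /** The nodetool drain timeout mentioned in the release notes. */
    static final Duration DRAIN_TIMEOUT = Duration.ofSeconds(30);

    private GracefulStopPlan() {}

    /**
     * Lists the steps a stop would take, in order.
     *
     * @param gracefulDrainHealthWaitSeconds configured wait; a negative value disables the graceful path
     * @param force value of the optional force parameter on the stop call
     */
    static List<String> steps(int gracefulDrainHealthWaitSeconds, boolean force) {
        List<String> steps = new ArrayList<>();
        boolean graceful = !force && gracefulDrainHealthWaitSeconds >= 0;
        if (graceful) {
            // Fail healthchecks first so clients can move off the node.
            steps.add("fail healthchecks for " + gracefulDrainHealthWaitSeconds + "s");
            // Then drain, bounded by a timeout because drain can hang.
            steps.add("nodetool drain (timeout " + DRAIN_TIMEOUT.getSeconds() + "s)");
        }
        // The stop script always runs last.
        steps.add("run stop script");
        return steps;
    }
}
```

With the default of `-1`, or with `force=true`, the sketch yields only the stop script step, which matches the backwards-compatible behavior described above.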
Bug Fixes
- (#659) Fix `Snapshotstatus` to actually contain `bkupMetadata`
- (#661) Update `commons-io`, `aws-java-sdk`, `snakeyaml`
Breaking changes
- (#664) If you previously implemented `ICassandraProcess` internally, the `start` method has been refactored to take a `boolean force` parameter. If you implement this interface you can supply `false` to preserve previous behavior.
Backup status bug fix
Eliminate assumption that existence of an element in a data structure means successful backup.
Autoremediate Refactor
New Features
Bugs
Autoremediate Refactor
- Autostart functionality now uses timers instead of ratelimiters so that the first autostart does not start until an interval after the first start.
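One way to picture the timer-based behavior is a gate that only opens once a full interval has passed since the last recorded start. A rate limiter, by contrast, typically grants its first permit right away. The sketch below is an assumption-laden illustration: `AutostartGate`, `recordStart`, and `mayAutostart` are hypothetical names, and refusing an autostart before any start has been recorded is a choice made here, not something the notes specify.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical illustration of a timer-style gate; names are not taken from Priam. */
final class AutostartGate {
    private final Duration interval;
    private Instant lastStart; // null until the first start is recorded

    AutostartGate(Duration interval) {
        this.interval = interval;
    }

    /** Records a start (manual or automatic) and restarts the interval. */
    void recordStart(Instant now) {
        lastStart = now;
    }

    /** Allows an autostart only once a full interval has elapsed since the last start. */
    boolean mayAutostart(Instant now) {
        if (lastStart == null) {
            return false; // assumption: no autostart before any start has happened
        }
        return !now.isBefore(lastStart.plus(interval));
    }
}
```

Passing the current time in explicitly keeps the sketch deterministic and easy to test; a real scheduler would more likely read a clock itself.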
Metrics for Cassandra Process Manager and bug fixes
New Features
- Cassandra Process Manager and Monitor now record metrics when C* is stopped, started, or auto-started with the recent autorestart functionality.
- Location of the backup status file is now configurable via `priam.backup.status.location`.
- SDBInstance for token management, with default binding to us-east-1 but configurable via `priam.sdb.instanceIdentity.region`.
Bugs
- Exclude duplicate slf4j module binding.
- Shut down quartz at application stop
Gradle 4.4 and Autoremediate Bugfixes
- Gradle 4.4 Support
- Autostart functionality now only sets the `shouldCassandraBeAlive` flag from the start API, to prevent a race against the stop API in the monitoring thread.
Priam autoremediates dead Cassandra
Bugs
- None
New Features
- Priam will now automatically restart Cassandra if it fails. If you use Priam to stop Cassandra (via the API), it will not automatically restart Cassandra until a subsequent start via the API. You can control this via the `priam.remediate.dead.cassandra.rate` configuration option. If negative, auto-remediation is disabled; if zero, Priam auto-remediates immediately on any failure; and if a positive integer, auto-remediation waits that number of seconds between restarts. The default is 3600 seconds (one hour).
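The three cases for `priam.remediate.dead.cassandra.rate` can be read as a small mapping from the configured value to a behavior. The sketch below is only an interpretation of the description above: the class `RemediationRate` and the method `waitBetweenRestarts` are hypothetical and do not come from Priam.

```java
import java.time.Duration;
import java.util.Optional;

/** Hypothetical helper that interprets the remediation rate setting. */
final class RemediationRate {
    private RemediationRate() {}

    /**
     * Maps the configured rate (in seconds) to a wait between restarts.
     *
     * @return empty when auto-remediation is disabled (negative rate),
     *         a zero duration for immediate remediation,
     *         otherwise the configured number of seconds
     */
    static Optional<Duration> waitBetweenRestarts(int rateSeconds) {
        if (rateSeconds < 0) {
            return Optional.empty(); // negative disables auto-remediation
        }
        return Optional.of(Duration.ofSeconds(rateSeconds));
    }
}
```

Returning an `Optional` keeps the disabled case distinct from a zero wait, which the description treats as two separate behaviors.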
Breaking Changes
- None