Releases: gogama/incite
Dynamically adapt to noisy neighbors
Big Features:
`QueryManager` now dynamically adapts to noisy neighbor problems (and resource constraint issues) such as being throttled by the CloudWatch Logs Insights API or failing to invoke the CloudWatch Logs Insights API's `StartQuery` operation due to exceeding the query concurrency limit. Throttling and concurrency limit problems are often caused by other users of the API (noisy neighbors), usually operating within the same AWS customer account. Dynamic adaptation both helps avoid overuse of the Insights API and prevents Incite queries from failing with errors simply because a given chunk is throttled too many times. This makes big, long-running Incite queries more robust. Dynamic adaptation is on by default.
API changes:
- Added new `DisableAdaptation` field to `Config` structure used to construct a new `QueryManager`. This field allows you to disable the new dynamic adaptive behavior if desired.
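These notes do not describe how the adaptation works internally. As a rough illustration of the general idea only, the sketch below backs off when the service throttles and recovers gradually on success (an additive-increase, multiplicative-decrease scheme). The `adaptiveLimit` type and its methods are hypothetical and are not part of Incite's API.

```go
package main

import "fmt"

// adaptiveLimit is a hypothetical AIMD-style concurrency limiter. It is NOT
// Incite's implementation; it only illustrates backing off under throttling
// and recovering when requests succeed.
type adaptiveLimit struct {
	lo, hi, cur int
}

// newAdaptiveLimit starts at the highest allowed parallelism.
func newAdaptiveLimit(lo, hi int) *adaptiveLimit {
	return &adaptiveLimit{lo: lo, hi: hi, cur: hi}
}

// onThrottle halves the current limit, never going below lo.
func (a *adaptiveLimit) onThrottle() {
	a.cur /= 2
	if a.cur < a.lo {
		a.cur = a.lo
	}
}

// onSuccess raises the current limit by one, never going above hi.
func (a *adaptiveLimit) onSuccess() {
	if a.cur < a.hi {
		a.cur++
	}
}

func main() {
	lim := newAdaptiveLimit(1, 10)
	lim.onThrottle() // 10 -> 5
	lim.onThrottle() // 5 -> 2
	lim.onSuccess()  // 2 -> 3
	fmt.Println("current parallelism:", lim.cur) // prints "current parallelism: 3"
}
```

A caller that prefers fixed behavior would simply never call `onThrottle`, which is loosely analogous to setting `DisableAdaptation` in `Config`.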
Improvements:
- None.
Maintenance release (bug fix)
API changes:
- None.
Big Features:
- None
Improvements:
- Fix bug where dynamic chunk splitting (i.e. `QuerySpec.SplitUntil`) could result in an error from the CloudWatch Logs Insights service like `InvalidParameterException: End time cannot be less than Start time`, resulting in the whole query stream being aborted. (See issue #25).
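To make the failure mode concrete, here is a minimal sketch of a splitting routine that can never produce a sub-chunk whose end precedes its start. It assumes chunks are half-open ranges in Unix milliseconds and are split in half; the `chunk` type and `splitChunk` function are hypothetical, and Incite's actual splitting strategy may differ.

```go
package main

import "fmt"

// chunk is a half-open time range [startMs, endMs) in Unix milliseconds.
// Hypothetical type for illustration only.
type chunk struct{ startMs, endMs int64 }

// splitChunk halves c unless a half would be empty or narrower than
// splitUntilMs, in which case c is returned unchanged. Every returned chunk
// satisfies startMs < endMs whenever the input does.
func splitChunk(c chunk, splitUntilMs int64) []chunk {
	width := c.endMs - c.startMs
	half := width / 2
	if half < 1 || half < splitUntilMs {
		return []chunk{c}
	}
	mid := c.startMs + half
	return []chunk{{c.startMs, mid}, {mid, c.endMs}}
}

func main() {
	for _, c := range splitChunk(chunk{0, 1000}, 500) {
		fmt.Printf("[%d, %d)\n", c.startMs, c.endMs)
	}
}
```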
Maintenance release (small improvements)
API changes:
- None.
Big Features:
- None
Improvements:
- Add Incite library name to User-Agent header when sending requests to CloudWatch Logs API. (issue #18)
- Add Incite version to User-Agent header as well, when it is available. (issue #18)
- Increase `QueryConcurrencyQuotaLimit` per recent AWS update to CloudWatch Logs Service Quotas Limits for query concurrency. (issue #19)
- Fix example code that wasn't using millisecond granularity (leftover from fixing issue #16)
- Improve documentation about chunks (issue #20)
Millisecond granularity
API changes:
- None.†
Big Features:
- None.
Improvements:
- Allowed `Start`/`End` timestamps and `Chunk` and `SplitUntil` durations to have millisecond granularity.
  - Before this release they were not allowed to have sub-second granularity.
  - This is a backwardly-compatible change (no valid usage will break) but makes the API more powerful for log groups that have enormous amounts of messages, such as might be produced from a highly-trafficked web service that handles more than 10,000 transactions per second.†
- Fixed a bug caused by bugs in the underlying CWL Insights API documentation and implementation which sometimes caused log events to be missed. (Issue #16)
- Fixed a dynamic chunk splitting bug in which Incite would sometimes do one extra split causing chunks smaller than `SplitUntil`. (Commit 7a80c011)
- Small improvements to the documentation.
†: While the API hasn't strictly changed, and is backwardly-compatible with v1.3.1, it is now slightly more powerful.
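As an illustration of what millisecond granularity means for the four values named above, the following sketch rejects any timestamp or duration with a sub-millisecond component. The `spec` type here is a hypothetical stand-in that only borrows the field names from these notes; it is not Incite's `QuerySpec`, and the library's real validation rules may include more checks.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// spec is a hypothetical stand-in carrying only the four fields discussed.
type spec struct {
	Start, End        time.Time
	Chunk, SplitUntil time.Duration
}

// validate returns an error if any field has a sub-millisecond component.
func (s spec) validate() error {
	ms := int(time.Millisecond) // nanoseconds per millisecond
	switch {
	case s.Start.Nanosecond()%ms != 0:
		return errors.New("start has sub-millisecond granularity")
	case s.End.Nanosecond()%ms != 0:
		return errors.New("end has sub-millisecond granularity")
	case s.Chunk%time.Millisecond != 0:
		return errors.New("chunk has sub-millisecond granularity")
	case s.SplitUntil%time.Millisecond != 0:
		return errors.New("splitUntil has sub-millisecond granularity")
	}
	return nil
}

func main() {
	start := time.Date(2021, 6, 1, 0, 0, 0, 0, time.UTC)
	good := spec{Start: start, End: start.Add(250 * time.Millisecond), Chunk: 50 * time.Millisecond}
	bad := good
	bad.Chunk = 1500 * time.Microsecond // 1.5 ms is not a whole millisecond

	fmt.Println("good:", good.validate())
	fmt.Println("bad: ", bad.validate())
}
```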
Maintenance release (bug and documentation fixes)
API changes:
- None.
Big Features:
- None.
Improvements:
- Fixed a pair of related regressions introduced in the v1.3.0 concurrency refactor relating to retryable temporary errors on chunk start and chunk poll:
  - Query manager had stopped respecting the `Config.Parallel` limit. This was tending to cause unnecessary `LimitExceededException: Account maximum query concurrency limit of [10] reached` errors. These were being retried up to the temporary errors limit so sometimes they did not affect the outcome of a query, but sometimes they revealed the second issue:
  - Issue #15. In some cases, if a chunk's temporary start or polling error limit was exceeded, this would cause the chunk to basically be "forgotten" without its termination being recorded in the stream bookkeeping structures. This would eventually result in someone blocking on a stream read forever. The fix is to ensure the stream errors out if any chunk fails due to exceeding its temporary retry max.
- Corrected a logic error in the dynamic splitting documentation.
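For readers unfamiliar with how a parallelism limit like `Config.Parallel` is typically enforced, the sketch below uses a buffered channel as a counting semaphore and reports the peak concurrency it observed. This is a generic Go pattern shown under the assumption that `parallel` is at least 1; `runBounded` is a hypothetical helper, not a description of Incite's internals.

```go
package main

import (
	"fmt"
	"sync"
)

// runBounded runs n tasks with at most parallel of them in flight at once
// and returns the highest concurrency actually observed.
// It assumes parallel >= 1.
func runBounded(n, parallel int, task func(i int)) int {
	sem := make(chan struct{}, parallel) // counting semaphore
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		inFlight int
		peak     int
	)
	for i := 0; i < n; i++ {
		sem <- struct{}{} // blocks while parallel tasks are already running
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot after bookkeeping

			mu.Lock()
			inFlight++
			if inFlight > peak {
				peak = inFlight
			}
			mu.Unlock()

			task(i)

			mu.Lock()
			inFlight--
			mu.Unlock()
		}(i)
	}
	wg.Wait()
	return peak
}

func main() {
	peak := runBounded(100, 4, func(int) {})
	fmt.Println("peak concurrency within limit:", peak <= 4)
}
```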
Better query performance through higher concurrency
API changes:
- `Stats` structure now includes `RangeMaxed` to indicate how much of the query time range had maxed out chunks, meaning CloudWatch Logs Insights returned `Limit` results for that chunk. If this field has a non-zero value it indicates that at least some of the chunks in the query time range likely had more results available than were returned. Typically if dynamic chunk splitting is enabled, this will always be zero. [Issue #4]
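One way to picture the quantity `RangeMaxed` describes is as the total time covered by chunks whose result count reached the limit. The sketch below computes that sum; the `chunkResult` type and `rangeMaxed` function are hypothetical and only model the description above, not the library's bookkeeping.

```go
package main

import (
	"fmt"
	"time"
)

// chunkResult is a hypothetical record of one finished chunk.
type chunkResult struct {
	Range   time.Duration // width of the chunk's time range
	Results int           // number of results the chunk returned
}

// rangeMaxed sums the time range of every chunk that returned at least
// limit results, which suggests more results may have been available.
func rangeMaxed(chunks []chunkResult, limit int) time.Duration {
	var total time.Duration
	for _, c := range chunks {
		if c.Results >= limit {
			total += c.Range
		}
	}
	return total
}

func main() {
	chunks := []chunkResult{
		{Range: time.Minute, Results: 1000}, // hit the limit
		{Range: time.Minute, Results: 12},
		{Range: 30 * time.Second, Results: 1000}, // hit the limit
	}
	fmt.Println(rangeMaxed(chunks, 1000)) // prints "1m30s"
}
```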
Big Features:
- Performance is improved significantly by improving concurrency. Chunks are now started and polled in independent goroutines, meaning that chunk starting, which previously had a higher priority, can't block polling of completed chunks. [Issue #9]
Improvements:
- Retry of transient network errors is improved to handle some rare cases. [Issue #13]
- Make further small improvements to the documentation.
- Fix some unit testing concurrency edge cases that failed with extremely low probability like 1 in 20K iterations.
- Fix some goroutine leaks caused by a few unit tests that failed to clean up after themselves, which made debugging painfully difficult in some cases due to large numbers of zombie goroutines.
Dynamic chunk splitting and progress stats
API changes:
- `QuerySpec` structure adds new `SplitUntil` field for requesting dynamic chunk splitting. [Issue #3]
- `Stats` structure adds four new stats fields for reporting query progress data both within a single query (`Stream`) and within a group of queries (`QueryManager`). [Issue #1]
Big Features:
- With dynamic chunk splitting, when CloudWatch Logs Insights returns the maximum possible number of results, `MaxLimit` (an indicator that more results may be available within the chunk), users can ask the query manager to split the chunk time range into smaller time ranges and retry them in order to capture all possible results. [Issue #3]
- With granular progress statistics in the `Stats` structure, developers now have the data they need to create progress bars and user experiences to inform their application's users how much of the requested query work has been finished and how much remains. [Issue #1]
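As a sketch of the progress bar use case, the code below turns two progress figures into a percentage and a text bar. The `progressStats` type and its field names are assumptions made for this example, since these notes do not name the four new fields; consult the `Stats` documentation for the real ones.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// progressStats is a hypothetical subset of progress data. The field names
// are assumptions for illustration, not confirmed by these release notes.
type progressStats struct {
	RangeRequested time.Duration // total time range the query asked for
	RangeDone      time.Duration // time range whose chunks have finished
}

// percentDone returns completed work as a percentage in [0, 100].
func percentDone(s progressStats) float64 {
	if s.RangeRequested <= 0 {
		return 0
	}
	pct := float64(s.RangeDone) / float64(s.RangeRequested) * 100
	if pct > 100 {
		pct = 100
	}
	return pct
}

// progressBar renders pct as a fixed-width text bar.
func progressBar(pct float64, width int) string {
	if pct < 0 {
		pct = 0
	}
	if pct > 100 {
		pct = 100
	}
	filled := int(pct / 100 * float64(width))
	return "[" + strings.Repeat("#", filled) + strings.Repeat("-", width-filled) + "]"
}

func main() {
	s := progressStats{RangeRequested: 4 * time.Hour, RangeDone: time.Hour}
	pct := percentDone(s)
	fmt.Printf("%s %.0f%%\n", progressBar(pct, 8), pct) // prints "[##------] 25%"
}
```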
Improvements:
- Transient network errors are now retried. [Issue #10]
- Some small documentation issues are addressed.
Full Changelog: v1.1.0...v1.2.0
Increased resiliency and better logging
API changes:
- None
Improvements:
- Queries that have transient failures on the server are automatically retried. Issue#5
- Fuzzy JSON unmarshalling into `map[string]interface{}` tries to unmarshal Insights time fields `@timestamp` and `@ingestionTime` into `time.Time` if possible. Issue#7
- Log messages about chunks now contain the query chunk ID. Issue#6
  - Where possible the CWL query ID for the chunks is logged, as long as it has been assigned from a successful `StartQuery` call.
  - Always, an internal chunk ID is logged. The chunk ID is basically the chunk index within the stream, plus a suffix to indicate how many times it has been retried.
- Finishing of a chunk is now logged consistently. Issue#8
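The fuzzy handling of time fields can be pictured roughly as follows: try to parse the two known time fields and fall back to the raw string when parsing fails. The layout string is an assumption about how Insights formats these fields, and `fuzzyValue` is a hypothetical helper rather than the library's actual code.

```go
package main

import (
	"fmt"
	"time"
)

// insightsTimeLayout is an ASSUMED layout for Insights time fields, for
// example "2021-06-19 03:59:59.936". Verify against real query results.
const insightsTimeLayout = "2006-01-02 15:04:05.000"

// fuzzyValue returns a time.Time for @timestamp and @ingestionTime when the
// value parses, and the original string in every other case.
func fuzzyValue(field, value string) interface{} {
	if field == "@timestamp" || field == "@ingestionTime" {
		if t, err := time.Parse(insightsTimeLayout, value); err == nil {
			return t
		}
	}
	return value
}

func main() {
	row := map[string]interface{}{}
	row["@timestamp"] = fuzzyValue("@timestamp", "2021-06-19 03:59:59.936")
	row["@message"] = fuzzyValue("@message", "hello")

	for _, k := range []string{"@timestamp", "@message"} {
		fmt.Printf("%s is a %T\n", k, row[k])
	}
}
```

Falling back to the string rather than returning an error matches the "if possible" wording above: a value that does not parse is still delivered to the caller.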
Initial release, stable API
At long last, we have a stable API and the first major version!
API changes:
- Remove `Hint` from `QuerySpec` to simplify the library interface.
- Change `Unmarshal` behavior so it keeps trying on a best effort basis after encountering an error.
Other changes:
- Log more events and improve existing log messages.
- Improve documentation and add useful `README.md`.
- Correct minor slips in copyright documentation.
Preview release with Unmarshaling bug fixes
Bug fixes
- Fix a bug where `Unmarshal` would fail to unmarshal into an array (not slice) if the array was big enough to hold the entire unmarshaled `data`.
- Fix a bug where fuzzy JSON decoding did not work correctly on the JSON literals `null`, `false`, and `true`.
Other improvements
- Increase test coverage for `Unmarshal`.
- Minor housekeeping/tidying.