Merge develop to integration 0.9.28 #1876
Merged
…b#1831) * misc(core): Adding unit tests for histograms for StitchRvsExec
* Fix bugs caused by stitching empty and non-empty data. The current Transient row can only handle the schema (Long, Double), but histograms, Avg, and others use different schemas. Create a NaNRowReader to handle any schema. --------- Co-authored-by: Yu Zhang <[email protected]>
…rved field (filodb#1842) Adding the _type_ field to the index is now configurable per cluster, and is false by default for now. Also, if the part-key already has a _type_ field, we don't index it, since _type_ is a reserved field that we populate ourselves.
…ilodb#1819)" (filodb#1838) * Revert "feat(core): Now metadata queries support _type_ filter (filodb#1819)" This reverts commit 8ce88de.
This reverts commit 8108083.
…er (filodb#1819)" cherry-pick and revert a hotfix commit from main to prevent conflicts during downstream merges
…nants instead of workspace (filodb#1849)
) New behavior: This change adds support for the Tantivy indexing library as an alternative to Lucene for time series indexing. In several cases it has been found to outperform Lucene, especially in memory usage and the predictability of memory spikes. This feature is opt-in via a configuration setting to avoid any unexpected changes during upgrade. For the moment only the raw time series index is supported. Downsample support may come in a future PR.

BREAKING CHANGES: This change requires a working Rust & C compiler to build, given the Tantivy code is written in Rust. README docs have been updated to reflect this. There are no runtime breaking changes.
… aggregation metric if applicable based on the given tags (filodb#1844)
Add a qualifier to imports to support slightly older Rust versions. Tag cargo metadata with min tested version to give a better error.
Co-authored-by: Kier Petrov <[email protected]>
Rust names the x86-64 architecture as x86_64. Java names it as amd64. This mismatch causes errors during library load as they can't agree on the file path. The fix is to normalize the Rust name into the Java name, so it can locate the output binaries.
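A minimal sketch of the normalization described above, not the project's actual build code. The function name `normalize_arch` is hypothetical, and only the x86_64 to amd64 mapping comes from the commit message; every other name is assumed to pass through unchanged.

```rust
/// Map a Rust target architecture name to the name the JVM uses.
/// Hypothetical helper: only the x86_64 -> amd64 mapping is taken from
/// the commit message; all other names are returned unchanged (assumption).
fn normalize_arch(rust_arch: &str) -> &str {
    match rust_arch {
        "x86_64" => "amd64",
        other => other,
    }
}
```

With a mapping like this, both sides of the build would derive the same directory name for the native library.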
If you're targeting an older Linux distro, the default glibc version being linked against may be too high to produce runnable images. cargo zigbuild supports specifying a glibc version for a link target by appending ".<version>" to the target triple. For example, "x86_64-unknown-linux-gnu.2.17" will target v2.17. For the most part this just works, but we need to strip this suffix when looking for output binaries. This fix adds that logic.
Co-authored-by: Kier Petrov <[email protected]>
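The suffix stripping for cargo zigbuild targets could look roughly like the sketch below. The name `strip_glibc_suffix` is hypothetical, and the sketch assumes the target triple itself never contains a '.', so everything from the first '.' onward is the glibc version.

```rust
/// Drop a cargo-zigbuild glibc suffix such as ".2.17" from a target string.
/// Hypothetical helper; assumes the triple itself contains no '.' character.
fn strip_glibc_suffix(target: &str) -> &str {
    match target.split_once('.') {
        // Everything before the first '.' is the plain target triple.
        Some((triple, _glibc_version)) => triple,
        // No suffix present: the target is already a plain triple.
        None => target,
    }
}
```

The stripped triple is what would be used when locating the output binaries.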
… segments (filodb#1864) Columns in the column cache hold a reference to the mmaped file data that backs the segment. These segments can be deleted during segment merging, but if a column for that segment is in the cache it prevents the mmap from closing and releasing RAM. To fix this we subscribe for notifications on segment list changes and clear the column cache when these occur so stale segments can be reclaimed.
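To illustrate why clearing the cache releases memory, here is a simplified sketch. All types and names (`CachedColumn`, `ColumnCache`, `on_segments_changed`) are hypothetical, and an `Arc<Vec<u8>>` stands in for the mmapped segment data; the real change hooks into the index's segment change notifications rather than this toy cache.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical cached column: holding the Arc keeps the segment's bytes alive.
struct CachedColumn {
    _segment_data: Arc<Vec<u8>>,
}

/// Hypothetical column cache keyed by (segment id, field name).
#[derive(Default)]
struct ColumnCache {
    columns: Mutex<HashMap<(u32, String), CachedColumn>>,
}

impl ColumnCache {
    /// Cache a column, taking a reference to the data backing its segment.
    fn insert(&self, segment_id: u32, field: &str, segment_data: Arc<Vec<u8>>) {
        self.columns.lock().unwrap().insert(
            (segment_id, field.to_string()),
            CachedColumn { _segment_data: segment_data },
        );
    }

    /// Number of columns currently cached.
    fn len(&self) -> usize {
        self.columns.lock().unwrap().len()
    }

    /// Callback for segment list changes: dropping every cached column
    /// releases the references that kept stale segment data alive.
    fn on_segments_changed(&self) {
        self.columns.lock().unwrap().clear();
    }
}
```

Clearing the whole cache is coarser than evicting only the columns of deleted segments, but it matches the behavior described in the commit message.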
…1855) * Fixed mismatched schema regarding fixedVectorLen. * Do not compare against colIds on schema match. --------- Co-authored-by: Yu Zhang <[email protected]>
…l aggregated metric (filodb#1863) * misc(query): increment counter when query plan updated with next level aggregated metric * Adding unit test to test if metric is being incremented as expected
) indexValues was falling way behind Lucene for a few reasons:
1. We were copying results directly into Java objects, which incurred a lot of JNI back-and-forth overhead.
2. When querying the entire index we were looking at docs instead of the reverse index, which increased the count of items to process.

This PR does a few things:
1. Add perf benchmarks for the missing functions.
2. Add a new IndexCollector trait that can be used to walk the index vs docs.
3. Remove the JNI object usage in indexValues in favor of byte-serialized data.
4. Glue all these optimizations together.

With this, Tantivy is still a bit behind Lucene for this path, but it's almost 100x faster than before.
…Experience logicalPlan update (filodb#1869) * Supporting multiple agg rules for a single promql query. Example of a query whose hierarchical logical plan update is now supported: sum(metric1:::suffix1{}) + sum(metric2:::suffix2{})
When bootstrapping the raw index we skip over tracking items with invalid schemas, signified by partId = -1. However, today we still index them, which can create query errors later on like the following:
```
java.lang.IllegalStateException: This shouldn't happen since every document should have a partIdDv
    at filodb.core.memstore.PartIdCollector.collect(PartKeyLuceneIndex.scala:963)
    at org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:305)
    at org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:247)
    at org.apache.lucene.search.BulkScorer.score(BulkScorer.java:38)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:776)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:551)
    at filodb.core.memstore.PartKeyLuceneIndex.$anonfun$searchFromFilters$1(PartKeyLuceneIndex.scala:635)
    at filodb.core.memstore.PartKeyLuceneIndex.$anonfun$searchFromFilters$1$adapted(PartKeyLuceneIndex.scala:635)
    at filodb.core.memstore.PartKeyLuceneIndex.withNewSearcher(PartKeyLuceneIndex.scala:279)
    at filodb.core.memstore.PartKeyLuceneIndex.searchFromFilters(PartKeyLuceneIndex.scala:635)
    at filodb.core.memstore.PartKeyLuceneIndex.partIdsFromFilters(PartKeyLuceneIndex.scala:591)
    at filodb.core.memstore.TimeSeriesShard.labelValuesWithFilters(TimeSeriesShard.scala:1782)
```
This fix ensures that we don't index part keys we skip during bootstrap, so that the in-memory shard and index are consistent with each other.
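The invariant behind this fix can be sketched as follows. This is an illustration only, not the Scala code in the PR: `ShardState`, `bootstrap`, and the two sets are hypothetical stand-ins for the in-memory shard and the index, and only the partId = -1 sentinel comes from the commit message.

```rust
use std::collections::BTreeSet;

/// Sentinel from the commit message: partId = -1 marks an invalid schema.
const INVALID_PART_ID: i32 = -1;

/// Hypothetical stand-in for the in-memory shard plus its index.
#[derive(Default)]
struct ShardState {
    tracked: BTreeSet<i32>,
    indexed: BTreeSet<i32>,
}

/// Bootstrap from a list of part IDs, skipping invalid ones for both
/// tracking and indexing so the two structures never diverge.
fn bootstrap(part_ids: &[i32]) -> ShardState {
    let mut state = ShardState::default();
    for &part_id in part_ids {
        if part_id == INVALID_PART_ID {
            // Skipped entirely: not tracked, and therefore not indexed either.
            continue;
        }
        state.tracked.insert(part_id);
        state.indexed.insert(part_id);
    }
    state
}
```

Placing the skip before both inserts is what keeps the shard and the index consistent: a part key is either in both or in neither.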
… data. (filodb#1868) Co-authored-by: Yu Zhang <[email protected]>
…erience (filodb#1873) * fix(query): removing max/min aggregations from hierarchical query experience
kvpetrov
approved these changes
Nov 1, 2024
Approved if integ tests pass.