Embedding Projector: fix projector knn computation (#6269)
## Motivation for features / changes

Fix a bug in the KNN computation in the Embedding Projector.

## Technical description of changes

If KNN was previously computed for 1000 points and we then sample 100
points, we cannot reuse the old KNN result, because its neighbor lists
could contain points that are not part of the sample.
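
The problem can be reproduced on a toy example. The sketch below is not taken from `data.ts`: `knnIndices` and `allInSample` are hypothetical helpers, the data is one-dimensional, and the sample is assumed to be the first four points, purely for illustration. It compares the rows kept from a full KNN result with a KNN result computed on the sample alone.

```typescript
// Hypothetical helper, not part of data.ts: brute-force KNN over 1-D values.
// Returns, for each point, the indices of its k nearest other points.
function knnIndices(values: number[], k: number): number[][] {
  return values.map((v, i) =>
    values
      .map((w, j) => ({index: j, dist: Math.abs(v - w)}))
      .filter((entry) => entry.index !== i)
      .sort((a, b) => a.dist - b.dist || a.index - b.index)
      .slice(0, k)
      .map((entry) => entry.index)
  );
}

// Hypothetical helper: true when every neighbor index refers to a point
// inside a sample of the given size.
function allInSample(knn: number[][], size: number): boolean {
  return knn.every((row) => row.every((index) => index < size));
}

// Ten points; the sample is assumed to be the first four of them.
const allValues = [0, 10, 20, 30, 1, 11, 21, 31, 2, 12];
const sampleSize = 4;

// KNN computed over all ten points.
const fullKnn = knnIndices(allValues, 2);

// Reusing the old result by keeping only the first `sampleSize` rows.
const reusedKnn = fullKnn.slice(0, sampleSize);

// Recomputing KNN on the sample itself.
const freshKnn = knnIndices(allValues.slice(0, sampleSize), 2);

console.log('reused rows stay inside sample:', allInSample(reusedKnn, sampleSize));
console.log('fresh rows stay inside sample:', allInSample(freshKnn, sampleSize));
```

In this toy data the nearest neighbors of point 0 in the full set are points 4 and 8, neither of which belongs to the four-point sample, so the reused rows point outside the sample while the recomputed rows do not.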

## Screenshots of UI changes

N/A

## Detailed steps to verify changes work correctly (as executed by you)
1. Build and launch the
[projector](https://github.com/tensorflow/tensorboard/blob/bbc9e4f29a55d48478c3f23a7d80221b5b1b1e3c/tensorboard/plugins/projector/README.md).
2. Use the default demo tensor (Word2Vec 10K).
3. Change the projection type from PCA to T-SNE. This should compute KNN
for 10k sample points with 90 neighbors.
4. Change the projection type from T-SNE to UMAP. Before this change, you
will see an "Initializing UMAP..." screen loading indefinitely. UMAP uses
5k sample points and 15 neighbors for KNN by default.

Verify that step 4 results in a successful UMAP rendering after the
changes are applied.
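
The scenario in steps 3 and 4 can also be checked against the two versions of the reuse condition. The sketch below is a simplified restatement, not the code in `data.ts`: the `KnnCache` type and the `canReuseBefore` / `canReuseAfter` functions are hypothetical and model only the row count and the neighbor count of a cached result.

```typescript
// Hypothetical summary of a cached KNN result: how many points it covers and
// how many neighbors are stored per point.
interface KnnCache {
  rows: number;
  neighborsPerRow: number;
}

// Reuse condition before this commit: a cached result covering more points
// than the requested sample was accepted.
function canReuseBefore(
  cache: KnnCache | null,
  sampleSize: number,
  nNeighbors: number
): boolean {
  return (
    cache !== null &&
    cache.rows >= sampleSize &&
    cache.neighborsPerRow >= nNeighbors
  );
}

// Reuse condition after this commit: the cached result must cover exactly
// as many points as the requested sample.
function canReuseAfter(
  cache: KnnCache | null,
  sampleSize: number,
  nNeighbors: number
): boolean {
  return (
    cache !== null &&
    cache.rows === sampleSize &&
    cache.neighborsPerRow >= nNeighbors
  );
}

// Step 3: T-SNE computes KNN for 10k sample points with 90 neighbors.
const afterTsne: KnnCache = {rows: 10000, neighborsPerRow: 90};

// Step 4: UMAP asks for 5k sample points with 15 neighbors.
const staleReuse = canReuseBefore(afterTsne, 5000, 15);
const fixedReuse = canReuseAfter(afterTsne, 5000, 15);

// Same sample size with fewer neighbors: trimming each row remains valid.
const sameSizeReuse = canReuseAfter(afterTsne, 10000, 15);

console.log({staleReuse, fixedReuse, sameSizeReuse});
```

Under these assumptions the old condition accepts the 10k-point result for the 5k-point UMAP request, while the new condition rejects it and forces a recomputation. Reuse with the same sample size and a smaller neighbor count is still allowed.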

## Alternate designs / implementations considered
alicialics authored Apr 26, 2023
1 parent c0e7447 commit b72d751
Showing 1 changed file with 3 additions and 9 deletions.
12 changes: 3 additions & 9 deletions tensorboard/plugins/projector/vz_projector/data.ts
@@ -87,7 +87,7 @@ export interface DataPoint {
  };
  }
  const IS_FIREFOX = navigator.userAgent.toLowerCase().indexOf('firefox') >= 0;
- /** Controls whether nearest neighbors computation is done on the GPU or CPU. */
+ /** Maximum sample size for each projection type. */
  export const TSNE_SAMPLE_SIZE = 10000;
  export const UMAP_SAMPLE_SIZE = 5000;
  export const PCA_SAMPLE_SIZE = 50000;
@@ -459,20 +459,14 @@ export class DataSet {
  this.nearest && this.nearest.length ? this.nearest[0].length : 0;
  if (
  this.nearest != null &&
- this.nearest.length >= data.length &&
+ this.nearest.length === data.length &&
  previouslyComputedNNeighbors >= nNeighbors
  ) {
  return Promise.resolve(
  this.nearest
- // `this.points` is only set and constructor and `data` is subset of
- // it. If `nearest` is calculated with N = 1000 sampled points before
- // and we are asked to calculate KNN ofN = 50, pretend like we
- // recalculated the KNN for N = 50 by taking first 50 of result from
- // N = 1000.
- .slice(0, data.length)
  // NearestEntry has list of K-nearest vector indices at given index.
  // Hence, if we already precomputed K = 100 before and later seek
- // K-10, we just have ot take the first ten.
+ // K = 10, we just have ot take the first ten.
  .map((neighbors) => neighbors.slice(0, nNeighbors))
  );
  } else {
