feat: Lists of mutations in metadata file #2459

Open
theosanderson opened this issue Aug 19, 2024 · 6 comments
Labels
silo/LAPIS upstream silo/LAPIS issue

Comments

@theosanderson
Member

theosanderson commented Aug 19, 2024

While thinking about the casing discussions, I realised that we don't support something that is quite popular in metadata files: lists of mutations (nucleotide and amino acid) relative to the reference, provided as a "metadata" column. This is pretty useful in the early days of an outbreak.

In terms of implementation, I'm not sure whether this is supported in LAPIS and, if not, whether there is appetite for it. If there isn't, it could be another argument for some sort of proxy one day. Obviously all of this is very post-MVP.

theosanderson changed the title from "Lists of mutations in metadata file" to "feat: Lists of mutations in metadata file" on Aug 19, 2024
@chaoran-chen
Member

LAPIS does not support the inclusion of mutations on a per-sample basis but if we want this, wouldn't it be very easy to add the two corresponding fields to the metadata and let the preprocessing pipeline generate the values?
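To illustrate what "let the preprocessing pipeline generate the values" could look like, here is a minimal sketch in Python. It assumes the pipeline already has each sample aligned to the reference at the same length; the function name `list_substitutions`, the `C241T`-style notation, and the comma-joined column format are all assumptions for illustration, not part of LAPIS or the existing pipeline.

```python
def list_substitutions(reference: str, aligned: str, unknown: str = "N") -> list[str]:
    """Return mutations of an aligned sequence relative to the reference.

    Hypothetical helper: positions are 1-based, deletions appear as '-',
    and positions with an unknown base in the sample are skipped.
    """
    if len(reference) != len(aligned):
        raise ValueError("sequences must be aligned to the same length")
    mutations = []
    pairs = zip(reference.upper(), aligned.upper())
    for position, (ref_base, sample_base) in enumerate(pairs, start=1):
        # Unknown or unchanged positions are not reported as mutations.
        if sample_base == unknown or sample_base == ref_base:
            continue
        mutations.append(f"{ref_base}{position}{sample_base}")
    return mutations


# Toy example: one substitution, one unknown base, one deletion.
reference = "ACGTACGT"
sample = "ACTTNC-T"
mutation_column = ",".join(list_substitutions(reference, sample))
```

The same function could be applied to amino acid sequences per gene by passing `unknown="X"`; how insertions would be represented is left open here.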

@theosanderson
Member Author

Yes, good point. It would be a slightly weird situation, because we'd be getting the mutation data shown on the sequence details page from one source (LAPIS's mutation store) and the mutation data in the bulk download from another (LAPIS's metadata store), and I guess it would increase memory usage. But it's definitely a possible option.

@theosanderson
Member Author

(Either the preprocessing pipeline or the export for SILO)

@chaoran-chen
Member

Good point. Maybe we should indeed consider adding a corresponding feature to LAPIS. Would you like to create an issue as it's your idea? :)

@chaoran-chen
Member

chaoran-chen commented Aug 19, 2024

How would you handle multiple segments or genes? Should it only have one column with mutations on all segments or would it be useful to be able to only get data for selected segments? If the latter, what would a good API look like?
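One way to picture the two options in this question is sketched below in Python: keep mutations grouped per segment internally, then either flatten everything into one column or restrict the output to selected segments. This is only a thought experiment; the function `flatten_mutations`, the `segment:mutation` prefix notation, and the segment names are assumptions, not an existing or proposed LAPIS API.

```python
from typing import Optional


def flatten_mutations(
    by_segment: dict[str, list[str]],
    segments: Optional[list[str]] = None,
) -> str:
    """Join per-segment mutation lists into a single column value.

    Hypothetical helper: if `segments` is given, only those segments are
    included, in the order requested; unknown segment names are ignored.
    """
    if segments is None:
        selected = by_segment
    else:
        selected = {name: by_segment[name] for name in segments if name in by_segment}
    # Prefix each mutation with its segment so one column stays unambiguous.
    return ",".join(
        f"{name}:{mutation}"
        for name, mutations in selected.items()
        for mutation in mutations
    )


# Toy data with made-up segment names and mutations.
by_segment = {"L": ["A10G"], "M": ["C5T", "G7-"], "S": []}
all_segments = flatten_mutations(by_segment)
only_m = flatten_mutations(by_segment, segments=["M"])
```

With a prefix per mutation, a single column can cover all segments while a segment filter remains possible as an optional parameter.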

@theosanderson
Member Author

Thanks for considering. I made GenSpectrum/LAPIS#896 for starters. Sorry, I don't have many thoughts on an API yet. Anyone, feel free to chime in!

theosanderson added the silo/LAPIS label (upstream silo/LAPIS issue) on Aug 19, 2024
2 participants