Skip to content
This repository has been archived by the owner on Oct 28, 2022. It is now read-only.

Remove created and updated timestamps from API #839

Open
andrewjesaitis opened this issue Mar 6, 2017 · 4 comments
Open

Remove created and updated timestamps from API #839

andrewjesaitis opened this issue Mar 6, 2017 · 4 comments

Comments

@andrewjesaitis
Copy link
Member

There has been a lot of discussion surrounding timestamps both on the Reference Server Task Team and within Github issues (#568, #682, #685, #690).

The key takeaway is that there is a lot of confusion surrounding what these fields represent, they lack clearly defined use cases, and have no consistent representation across the API. As discussed in other threads, do these fields represent the time the sample was taken, the time the first sequencing run was done, the time when the downstream bioinformatic pipeline was run, the time when the data was added to the database, the file created time, the possible timestamp in the file? The issue is only further muddied when a sample has undergone multiple analysis runs or the application backend has no concept of created/updated (as is the case with the file-backed reference server) or when comparing timestamps from two different API implementations. These limitations lead me to believe these fields are implementation specific and thus should not be included in the schema definitions.

I propose the removal of created and updated fields from all messages. If there is a well defined need for a timestamp on a message than it can be added with documentation clearly stating what the timestamp represents and in a format consistent with the API (ISO8601). I think this approach of including timestamps only where necessary will reduce confusion, increase consistency, and ease future support.

@kozbo
Copy link
Contributor

kozbo commented Mar 9, 2017

@mbaudis

@mbaudis
Copy link
Member

mbaudis commented Apr 12, 2017

@andrewjesaitis From previous discussions and the documentation it is actually clear that created and updated refer to record specific dates; there is no indication that they have to do anything with record-external time points (like sample collection or such):

  // The :ref:`ISO 8601<metadata_date_time>` time at which this Biosample record
  // was created.
  string created = 6;

  // The :ref:`ISO 8601<metadata_date_time>` time at which this Biosample record was
  // updated.
  string updated = 7;

Also, all the values are (or should be) ISO 8601, which also has been settled before.

We know from many use case discussions and personal experience that such information (i.e. beyond server time stamps etc.) is of practical value. How these attributes are used locally (i.e. if they are linked to audit-grade time stamping) is irrelevant for the API; also, this cannot be solved on a global scale.

There are 2 points which I seem worthwhile for discussion:

  • the awkward naming; logically it should be record_creation_time && record_modification_time, but this is/can be made clear enough in the documentation
  • the need for a created attribute, since modification is probably more relevant; alas, there are always use cases, and the attribute is optional (at least on the schema level)

So my take: keep/implement throughout the schema (not necessary everywhere - a variant or call don't need them, since they would be implemented in the variantset, callset etc.). Obviously ISO; everything else is a violation of the schema docs. I don't mind about name changes (created & updated were actually decided against before, but somehow they stuck then around).

@andrewjesaitis
Copy link
Member Author

I agree that at a minimum they should be renamed record_created_time and record_modification_time and use ISO8601.

I'll let @kozbo, @david4096 and @ejacox chime in with their thoughts as well.

@mbaudis
Copy link
Member

mbaudis commented Apr 14, 2017

@andrewjesaitis No problem with me, since this is what I wrote back in #568 (comment)

@jeromekelleher While I'm fine with created | updated, this was seen as ambiguous in var. discussions ("updated" could be a Boolean/flag) & so @diekhans switched it to the current. I had started the whole ISO8601 switch & documentation, but very explicitly called those recordCreateTime | recordUpdateTime, originally. ... But as said, all fine with me, since the documentation should make it clear.

So I'll leave the clean-up to whoever feels responsible.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants