(input): Add EDTF as an option for date representation #284
Adopting EDTF sounds like a good plan to me -- much more compact, still readable, allows ranges (pandoc currently handles these by using an array of the year/month/day objects). |
So we could define a regex of at least some set of it we support?
I found this, BTW:
https://github.com/richardtallent/EdtfDotNet/blob/master/Edtf/EdtfRegexPattern.txt
|
We should just adopt EDTF. It has defined levels of features, right? For citations, we need date ranges, uncertain dates, approximate dates, CE/BCE, seasons, and dates to specified precision (e.g., 194x). I think that covers it. What level of EDTF had that? @retorquere you’ve explored this in BBT right? @fbennett you’ve also explored accommodating Japanese imperial dates in CSLm. How does that work? Could we incorporate that and extend to Arabic calendar dates? |
I just pushed a commit that removes the date-part pattern and replaces it with a
issued:
  edtf: 2004-02-01?
Note:
FWIW, EDTF has been around for a while, and I participated in the discussions largely to ensure it would work for our use case. It's been a few years since I participated, but it seems a) they finished, and b) it does. |
Emiliano has done a lot of work on date parsing, so I think we can draw on that. |
I think we'll need base + parts of 1 and 2. So we'd choose the "Level 0 is supported, and in addition the following features of levels 1 are supported (list features)" option under "compliance," and so itemize what features we use beyond base. |
EDTF: yes please! Would be a massive simplification. I don't see what could speak against this... |
This could also be useful for the proposed date range condition: #251 |
I’m all for using the EDTF date format, but there’s one thing I am aware of that can be expressed in the current CSL JSON and CSL YAML date format, but, regrettably, has not been included in EDTF, and that’s season ranges, e.g., edtf.js does support these (“Seasons in intervals are supported at the experimental/non-standard level 3.“), and I feel CSL should declare support for EDTF dates extended by season ranges in similar terms. See also this discussion on the edtf.js issue tracker. biblatex, BTW, supports the |
Thanks much @njbart; very helpful! I agree, this is what we should do. I revised the first post to include a TODO, of which this is part of the first one. |
My date parser works in stages (one of which is EDTF, done by edtf.js), but that's mostly because I try to make sense of dates that I should hope will never turn up in CSL:
You'll understand this is slow as molasses, but it makes sense of a lot of stuff I've been offered in test cases over the years. |
This adds EDTF, part of ISO 8601-2, as an alternative to the current, more verbose, representation.
OK, all, after some back-and-forth with @bwiernik and @dhimmel, I revised the PR to give two choices:
I also added 2 to 1, with a new TBD still:
|
I haven't tested the new JSON schema, but the changes look great @bdarcus (if they achieve the intended effect). Defining edtf-datatype separately was a smart move.
I did test it, and it works as expected. |
I forgot about season ranges, but just added it.
OK. To be clear, the current code allows one representation, or the other. But you could indeed do this:
issued:
  edtf: 1950
  season: Trinity
On the longer term question, beyond status of date-parts, would be whether the above would be allowed too; e.g. whether it'd just be the EDTF string. PS - wouldn't this be an EDTF representation of 13th century: |
We do, e.g., for multivolume works where not all planned volumes have appeared yet (e.g., |
Yes, and newer biblatex versions do support this as EDTF input – but I doubt whether even biblatex can actually render this as “13th century” (haven’t tested though). |
I think at this point, I just want the regex to reject dates that have characters for features we don't want/need to support. The remaining questions are sets (I agree, we can add), and approximate ( |
Maybe we should just do as biblatex does? See its manual, section 2.3.8, Date and Time Specifications, which explicitly discusses its support for ISO 8601-2. FWIW, it supports both approximate (circa) and uncertain dates. Maybe we support both in this PR, and then finalize the decision on the documentation? |
I'm all in favour. CSL could still support literal and season elements (biblatex recommends its |
What about this though: |
Probably not. I can’t remember having encountered anything like this, neither in the wild, nor in CMOS or APA. Theoretically, such a feature might make sense if you wanted to highlight a particularly unusual publication history (“Doe (1901, 1902, 1910, 1969), My Autobiography in Four Volumes.“), but, again, I’ve never actually seen this anywhere. |
OK. I'm going to remove support for the brackets and comma then. We can revisit later. |
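To make the "valid characters only" idea concrete, here is a minimal sketch in Python. The character set is an assumption made for illustration, not the pattern in the PR's schema: it admits digits, the hyphen, the interval slash, the dot (for `..`), `X`, and the three qualifier flags, and it leaves out brackets, braces, and commas so that sets are rejected. The names `EDTF_CHARS` and `has_only_supported_chars` are hypothetical.

```python
import re

# Hypothetical whitelist (an assumption, not the schema's actual pattern):
# digits, "X" (unspecified digit), "." (for ".."), "/" (interval),
# the qualifiers "?", "~", "%", and "-" (placed last so it is literal).
# Brackets, braces, and commas are deliberately absent, so sets fail.
EDTF_CHARS = re.compile(r"[0-9X./?~%-]+")


def has_only_supported_chars(value: str) -> bool:
    """Return True if every character is on the whitelist.

    This checks characters only, not EDTF syntax: a string such as
    "9999-99-99" still passes.
    """
    return EDTF_CHARS.fullmatch(value) is not None
```

A check like this is cheap and keeps unsupported features out of the data, but anything stricter (valid months, well-formed intervals) would still need a real EDTF parser.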
Just one question, I guess. Will the |
I’d support all three “flags” in EDTF input data, i.e., |
To go back to your question, @fbennett, EDTF does not have support for literal or raw content in the current date representation. But it's a superset of date-parts + circa + season. |
From the above, I’m not sure where we stand on open ended date ranges like |
Yes, open ended ranges are needed for journals or multi-volume works that are not complete yet. |
Nothing here would preclude that. Just a topic for documentation. I think this PR is RTM. Would it be possible for @bwiernik, @denismaier, and @njbart to collaborate on a documentation PR sometime in the near future? As I suggested, I think biblatex (which claims support for level 0 and 1) would be the place to start, given a) it's well-designed in general, and b) they already integrated and implemented this support. |
PS - the other advantage of following biblatex as closely as possible is compatibility. |
I missed this discussion, but I am personally not a fan of serializing information that could be represented differently. Converting CSL-JSON with EDTF to BibTeX or RIS now requires parsing EDTF, which is potentially pretty slow, versus an array access. I'd rather improve the current format; I think something like the below could encode at least the same as EDTF.
{
accessed: [{ date: [2020, 7, 1] }],
issued: [{ date: [1960], season: 'Spring' }],
'original-date': [{ literal: '13th century' }],
'event-date': [
{ date: [1960, 3, 2] },
{ date: [1960, 3, 5] }
]
}
(You could even remove the outer array for non-range dates.) However, I do see the use with YAML and I get it if this is considered a very minor concern. |
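For a sense of what the conversion cost looks like in the simplest case, here is a rough Python sketch that turns a plain Level 0 date or interval into nested arrays of the date-parts kind. It is an illustration under stated assumptions, not code from any CSL processor: the function name `edtf_to_date_parts` is hypothetical, and qualifiers, seasons, negative years, and open-ended intervals are rejected rather than handled.

```python
import re

# Plain calendar dates only: YYYY, YYYY-MM, or YYYY-MM-DD.
_PLAIN_DATE = re.compile(r"\d{4}(-\d{2}){0,2}")


def edtf_to_date_parts(value: str) -> list[list[int]]:
    """Convert a plain Level 0 EDTF date or interval to nested int lists.

    Hypothetical helper for illustration. Anything beyond plain dates
    raises ValueError. Month and day ranges are not validated.
    """
    result = []
    for endpoint in value.split("/"):
        if _PLAIN_DATE.fullmatch(endpoint) is None:
            raise ValueError(f"unsupported EDTF endpoint: {endpoint!r}")
        result.append([int(field) for field in endpoint.split("-")])
    return result
```

For this subset the work is a split and a few integer conversions; the slower path only appears once the richer Level 1 and 2 features have to be recognized.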
Except biblatex does support EDTF, so no parsing needed there.
It could be we enhance the structured option so it supports the exact same
features that we support in EDTF, for cases where that's a better option.
|
I meant specifically BibTeX, not BibLaTeX, which has |
That’s basically adding the pandoc fields as an option, but providing for all of those options would mean requiring processors to support every format. Personally I think BibLaTeX is a more important thing to be compatible with; supporting both requires EDTF anyway. |
…language#284) As part of citation-style-language#278, and to harmonize the JSON and YAML representations around a much more concise and expressive date format, this adds an option to use EDTF; either as a preferred string on any date, or as an "edtf" string property on the more verbose alternative object representation. While EDTF was originally an initiative of the US Library of Congress, ISO adopted it as part of 8601-2 in 2019. Note: The current regular expression pattern only checks for valid characters.
Description
As part of #278, and to harmonize the JSON and YAML representations around a much more concise and expressive date format, this adds an option to use EDTF.
While EDTF was originally an initiative of the US Library of Congress, it seems that in 2019, ISO adopted it as part of 8601-2.
Example of a date-range with a circa qualifier in the new alternative (using the YAML syntax):
TODO
date-parts; do we say deprecated now and remove later, for example, or recommend EDTF?
Compliance Details
We need to say "Level 0 is supported, and in addition the following features of levels 1 and 2 are supported (list features)."
Note: biblatex formally supports levels 0 and 1.
Level 0
2010-03-10
2010-03-10/2010-03-11
Note: level 0 includes date-times, which I don't think we need.
Level 1
2010-21
-0500
I'm confident in the above; the rest below are uncertainties; I think I'd prefer to not support, on input at least.
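As an illustration of the season form shown above (2010-21), the following Python sketch maps the EDTF Level 1 season codes 21 to 24 onto names. The function name `parse_season` and the choice to return `None` for non-season input are assumptions made for this example.

```python
import re

# EDTF Level 1 season codes occupy the month position.
SEASONS = {21: "Spring", 22: "Summer", 23: "Autumn", 24: "Winter"}


def parse_season(value: str):
    """Return (year, season_name) for "YYYY-SS", or None otherwise.

    Hypothetical helper; only codes 21-24 are recognized here.
    """
    match = re.fullmatch(r"(\d{4})-(2[1-4])", value)
    if match is None:
        return None
    return int(match.group(1)), SEASONS[int(match.group(2))]
```

Because the season code sits where a month would, a processor has to check for 21 to 24 before treating the second field as a month.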
Uncertain vs Approximate Dates
2010-03-04? means "uncertain"; ~ indicates "approximate" and % both.
I remember this discussion during EDTF development, but never understood the practical distinction.
In our case, do we support one or the other, or both?
How?
I vote for only one, which we treat as equivalent to our "circa." But no pun intended, I'm not 100% sure.
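One way to read the "treat it as circa" option is sketched below in Python: any of the three trailing flags is stripped and collapsed into a single boolean. This is only one possible interpretation, not a decision recorded in the PR; `split_qualifier` is a hypothetical name, and only a qualifier at the very end of the string (the Level 1 form) is handled.

```python
def split_qualifier(value: str) -> tuple[str, bool]:
    """Return (date_without_flag, circa).

    Hypothetical helper: "?", "~", and "%" are all collapsed into a
    single circa flag, which discards the uncertain/approximate
    distinction that EDTF itself draws.
    """
    if value and value[-1] in "?~%":
        return value[:-1], True
    return value, False
```

Keeping the original flag alongside the boolean would let a later version distinguish the three cases without changing the input format.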
Other Issues
I'm not sure we need this:
2010-03-XX
We need these "Extended Interval" features on styles, but on input?
1900/..
../1900
1900/
/1900
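The four interval forms listed here could be handled on input along the lines of this Python sketch. It is a simplification under stated assumptions: EDTF distinguishes an open endpoint (`..`) from an unknown one (empty), while this hypothetical `split_interval` helper maps both to `None`.

```python
from typing import Optional


def split_interval(value: str) -> tuple[Optional[str], Optional[str]]:
    """Split "start/end" into its endpoints.

    Hypothetical helper: both ".." (open) and "" (unknown) become None,
    which is a deliberate simplification of the EDTF semantics.
    """
    start, separator, end = value.partition("/")
    if not separator:
        raise ValueError(f"not an interval: {value!r}")

    def endpoint(text: str) -> Optional[str]:
        return None if text in ("", "..") else text

    return endpoint(start), endpoint(end)
```

For a journal or multivolume work that is still appearing, a `None` end would simply mean "no end date to render".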
Level 2
Do we need "all member" sets? ({2010-03-04,2010-03-05})
Extension
2020-21/2020-22
EDTF libraries