Retagging Joyce’s dialogue #19
Comments
Having a to-do list for this seems wise. FYI: the following prints every name used as a @who value on <said>:
ack -o "(?<=<said who=\")[\w\'\. ]*" *.xml
This will compile a sorted list of all the names across the corpus:
ack -ho "(?<=<said who=\")[\w\'\. ]*" *.xml | sort | uniq
I was using it as a sanity check to catch misspellings when I marked up "Telemachus." Could we also assign, or let people claim, episodes to mark up with dialogue on this, or another issue? I am going to tackle another episode as soon as I can, and want to avoid duplicating labor. |
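For anyone without ack installed, the same sanity check can be sketched in Python. This is only an illustration: the function names `who_values` and `count_speakers` are made up here, and the pattern assumes @who always sits on a <said> start tag. Unlike the ack look-behind, it tolerates other attributes (such as xml:id) appearing before @who.

```python
import re
from collections import Counter
from pathlib import Path

# Capture the value of @who on a <said> start tag, whatever attributes precede it.
WHO_PATTERN = re.compile(r'<said\b[^>]*?\bwho="([^"]*)"')


def who_values(text):
    """Return every @who value found on a <said> tag, in document order."""
    return WHO_PATTERN.findall(text)


def count_speakers(directory="."):
    """Tally @who values across all *.xml files in a directory (hypothetical layout)."""
    counts = Counter()
    for path in sorted(Path(directory).glob("*.xml")):
        counts.update(who_values(path.read_text(encoding="utf-8")))
    return counts
```

Calling `sorted(count_speakers())` in the episode directory gives the same kind of list as the `sort | uniq` pipeline, and the counts make one-off misspellings easier to spot.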
Good idea. Can we formally assign them or do we just call dibs here? After you started Those ack commands will come in very handy once we start figuring out the speaking parts. |
Going to do the |
Going to tackle |
@c-forster, that |
I’m simplifying this. A |
Sounds good. I'm not seeing the key in persons.txt, though? Anyway when it's there, if it's in some kind of regular format, like comma- or tab-separated, then it'll be easy to make a list of these keys to add to the header. |
I’m doing it all offline while I go through all eighteen episodes. I’ll merge them all into the repository once done. My local persons.txt looks like this: db [tab]Davy Byrne That could be the basis for a |
Awesome, sounds great.
Ronan Crowley writes:
… I’m doing it all offline while I go through all eighteen episodes. I’ll merge them all into the repository once done.
My local persons.txt looks like this:
db Davy Byrne
dbc Davy Byrne's curate
dbm D.B. Murphy
dd Dan Dawson
did Dilly Dedalus
That could be the basis for a `<listPerson>` – information I’d love to see added but too much for us right now (I feel).
|
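As a rough sketch of how a persons.txt in that shape could seed a `<listPerson>`, something like the following might work. It assumes each line is a key, a tab, and a display name, as in the excerpt above; the function names are hypothetical, and the `<person>`/`<persName>` structure is just one minimal TEI shape, not a convention the project has agreed on.

```python
from xml.sax.saxutils import escape, quoteattr


def parse_persons(text):
    """Parse 'key<TAB>name' lines into (key, name) pairs, skipping blank lines."""
    pairs = []
    for line in text.splitlines():
        if not line.strip():
            continue
        key, name = line.split("\t", 1)
        pairs.append((key.strip(), name.strip()))
    return pairs


def to_list_person(pairs):
    """Render the pairs as a minimal TEI <listPerson> fragment."""
    lines = ["<listPerson>"]
    for key, name in pairs:
        lines.append(
            f"  <person xml:id={quoteattr(key)}>"
            f"<persName>{escape(name)}</persName></person>"
        )
    lines.append("</listPerson>")
    return "\n".join(lines)
```

The keys from `parse_persons` could also be dumped on their own to produce the list of values for the header mentioned earlier in the thread.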
Some content that was marooned in the closed #9 was your suggestion, Jonathan, for unclear @who values. Something like:
<lb n="060004"/><said xml:id="060004-a" who="Cunningham">―Come on, Simon.
<certainty target="#060004-a" match="@who" locus="value" assertedValue="Power" degree="0.5">
<desc>It's unclear here whether it's Cunningham or Power speaking.</desc>
</certainty>
</said>
I’m going to go ahead and use this encoding whenever an unclear speaker is limited to a handful of candidates. Unless you’ve another idea? |
persons.txt contains a list of all speakers in the novel.
Sounds great. Let's do it. I'll make a note of this in our conventions
list, too.
Ronan Crowley writes:
… Some content that was marooned in the closed #9 was your suggestion, Jonathan,
for unclear @who values. Something like:
<lb n="060004"/><said xml:id="060004-a" who="Cunningham">―Come on, Simon.
<certainty target="#060004-a" match="@who" locus="value" assertedValue="Power" degree="0.5">
<desc>It's unclear here whether it's Cunningham or Power speaking.</desc>
</certainty>
</said>
I’m going to go ahead and use this encoding whenever an unclear speaker is
limited to a handful of candidates. Unless you’ve another idea?
|
How do we attribute dialogue in an exchange between several people? There’s a spot like this in Hades where no speakers are given for several lines of dialogue:
<lb n="060114"/><said who="lb">―I met M'Coy this morning,</said> Mr Bloom said. <said who="lb">He said he'd try to come.</said></p>
<p><lb n="060115"/>The carriage halted short.
<lb n="060116"/><said who="unclear">―What's wrong?</said>
<lb n="060117"/><said who="unclear">―We're stopped.</said>
<lb n="060118"/><said who="unclear">―Where are we?</said></p>
<p><lb n="060119"/>Mr Bloom put his head out of the window.
<lb n="060120"/><said who="lb">―The grand canal,</said> he said.</p>
The unclears can only be Cunningham, Power or Simon Dedalus (with Bloom, perhaps, chiming in at U 6.117). How best would that be encoded? |
I read the TEI docs on <lb n="060114"/><said who="lb">―I met M'Coy this morning,</said> Mr Bloom said. <said who="lb">He said he'd try to come.</said></p>
<p><lb n="060115"/>The carriage halted short.
<lb n="060116"/><said who="unclear">―What's wrong?
<certainty match="@who" locus="value" assertedValue="Power" degree="0.33" />
<certainty match="@who" locus="value" assertedValue="Cunningham" degree="0.33" />
<certainty match="@who" locus="value" assertedValue="Simon Dedalus" degree="0.33" />
</said>
<lb n="060117"/><said who="unclear">―We're stopped.
<certainty match="@who" locus="value" assertedValue="Power" degree="0.33" />
<certainty match="@who" locus="value" assertedValue="Cunningham" degree="0.33" />
<certainty match="@who" locus="value" assertedValue="Simon Dedalus" degree="0.33" />
</said>
<lb n="060118"/><said who="unclear">―Where are we?
<certainty match="@who" locus="value" assertedValue="Power" degree="0.33" />
<certainty match="@who" locus="value" assertedValue="Cunningham" degree="0.33" />
<certainty match="@who" locus="value" assertedValue="Simon Dedalus" degree="0.33" />
</said></p>
<p><lb n="060119"/>Mr Bloom put his head out of the window.
<lb n="060120"/><said who="lb">―The grand canal,</said> he said.</p> ...which is super kludgey and not very DRY. Ideally we could do @tcatapano, any ideas? |
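One way to take some of the repetition out of the markup above, at least while typing it, would be to generate the `<certainty>` elements from a list of candidates. The sketch below is only that: `certainty_elements` is a hypothetical helper, and splitting the degree evenly across candidates is an assumption carried over from the 0.33 and 0.5 values used in this thread. It does not settle whether repeated `<certainty>` elements are the right encoding.

```python
from xml.sax.saxutils import quoteattr


def certainty_elements(candidates, target=None):
    """Build one <certainty> element per candidate speaker.

    The degree is split evenly across candidates (an assumption, not a TEI rule).
    An empty candidate list raises ZeroDivisionError.
    """
    degree = round(1 / len(candidates), 2)
    target_attr = f" target={quoteattr('#' + target)}" if target else ""
    return [
        f'<certainty{target_attr} match="@who" locus="value" '
        f'assertedValue={quoteattr(name)} degree="{degree}"/>'
        for name in candidates
    ]
```

For the carriage exchange, `certainty_elements(["Power", "Cunningham", "Simon Dedalus"])` would return the three lines that are pasted by hand under each `<said who="unclear">` above.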
This is a to-do issue to pick out the various tasks discussed in #9:
- Convert all double-hyphen dialogue dashes to the quotation dash or horizontal bar.
- Shift the </said> tags in <said>―</said> structures to the end of character speech. Add all intermedial <said> tagging.
- Proof the </said> tagging for every episode. How? We will visualize all of the episodes in a browser and colour just the </said> tagged dialogue. Episodes remaining: 1. “Telemachus” 2. “Nestor” 3. “Proteus” 4. “Calypso” 5. “Lotus Eaters” 6. “Hades” 7. “Aeolus” 8. “Lestrygonians” 9. “Scylla and Charybdis” 10. “Wandering Rocks” 11. “Sirens” 12. “Cyclops” 13. “Nausicaa” 14. “Oxen of the Sun” 15. “Circe” 16. “Eumaeus” 17. “Ithaca” 18. “Penelope”
- Disambiguate the appropriate <emph> to <said> tagging. [there might be a few other stragglers]
- Add @who attribution for every instance of <said> (or in “Circe” <sp>). Use character names for the values.
- Switch @who values to @xml:id.
- Compile a <listPerson> dossier of speakers.
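The dash conversion in the first task could be sketched roughly as below. The big assumption, which would need checking against the actual files, is that a dialogue dash is a double hyphen at the start of a line, preceded only by spaces, tabs, or tags such as `<lb n="..."/>` and `<said ...>`. Double hyphens elsewhere in a line are left alone, as is the `-->` that closes an XML comment. The names here are hypothetical.

```python
import re

QUOTATION_DASH = "\u2015"  # HORIZONTAL BAR, the character used in the excerpts in this thread

# Assumed shape of a dialogue dash: "--" at the start of a line, after optional
# spaces/tabs and tags, and not followed by another hyphen or by ">" (comment close).
DIALOGUE_DASH = re.compile(r'^((?:[ \t]|<[^>]+>)*)--(?![->])', re.MULTILINE)


def convert_dialogue_dashes(text):
    """Replace line-initial double hyphens with the quotation dash."""
    return DIALOGUE_DASH.sub(lambda m: m.group(1) + QUOTATION_DASH, text)
```

Running this over a copy of an episode and diffing the result would show quickly whether the line-initial assumption holds before anything is committed.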