-
Notifications
You must be signed in to change notification settings - Fork 39
37 CFR 1 - notice amendment generates unexpected instruction label #369
Comments
Hey @gregoryfoster, I just submitted #370 to fix the explosion here -- @meriouma was definitely right. A quick note for context: even though you're parsing part 1, we use a single structure to represent notices (which may include amendments to multiple parts.) This is leading to a seemingly unrelated amendment messing up your parse -- sorry that that's not obvious! Second, regarding the specific amendment: it's adding a "heading" in front of two sections, which sounds similar to a subpart or a subject group, but doesn't match any concept we have; we have headings within sections, but not between them. In any event, something to keep an eye on if you're planning to parse part 2 later. |
Thanks for your quick fix, @cmc333333. There's an identical problem for interpretations which I've fixed in PR #371, though please see my comment about my accidental inclusion of your commit from #370. Unfortunately, I'm continuing to have trouble with this particular amendment for the reason you anticipated in your comment: headings between sections. Aside, this notice (68 FR 48286) repeats that same instruction pattern in amendment 20. @meriouma identified this error and coded a workaround here. However, I've begun to feel like we're kicking the can down the road as the last three errors in 37 CFR 1 are caused by this particular notice amendment pattern. After applying PRs #370 & #371, the parser stops on ipdb> print amended_labels
[(POST, ['2']), (POST, ['2', '200']), (POST, ['2', '201'])]
ipdb> print section
None
ipdb> print notice_changes
<regparser.notice.changes.NoticeChanges object at 0x10a74add0>
ipdb> print content
Content(struct=None, amends=[(POST, ['2']), (POST, ['2', '200']), (POST, ['2', '201'])]) My interpretation is the parser isn't sure where to place this first instruction (the title between sections). What's your read, and where do you think we should go from here? |
Hey @gregoryfoster, sorry for the delay! As a quick reproduction recipe, we can run eregs clear
eregs preprocess_notice 03-20489
eregs write_to somewhere You'll see the error in the last step; when we added the notice-and-comment feature, we began including the list of changes in the output JSON for each notice. Unfortunately, this increased the chance that misparses of an unrelated regulation part might cause yours to explode. I think the most immediate path forward is to wrap that step in a try-catch (with a warning); this field isn't used unless we're displaying a notice during the N&C period. Accounting for the specific error's a bit more tricky as that header is causing two issues. When we try to piece together what changes amendments are making, we effectively take two passes; the header causes an issue with each. In the first phase (a preprocessing plugin), the AMDParser looks at the amendment instructions (
Combined with contextual information those tokens lead to instructions to create a new heading for all of part 2 and create new sections 2.206 and 2.209. This is clearly not what we wanted -- not only should the heading not apply to the whole of 37 CFR 2, we're not accounting for the "through", which would imply the creation of 2.207 and 2.209 as well. So, we'd want to change the AMDParser's logic (and potentially grammar) to account for a situation like this. I'll note here that the preprocessing step will skip any amendments which already have The second phase involves finding XML for specific instructions (i.e. the "contents" of the change rather than the instruction). The error you're seeing indicates that the parser's not finding the appropriate XML for the 2, 2.206, nor 2.209 instructions. As noted above, the first is understandable (and should probably be fixed in the AMDParsing stage) but the latter two are likely a bug with the logic that determines which XML section to grab. It looks like this logic wouldn't find the sections due to that extra header. That should be a pretty small change, I'd think. Hope all the context helps! Let us know if we can help. |
#369: adding try/except to amendments getter
Dev environment: current Master [ 89535d0 ] +
fr-notices
PRs 7 & 8.Running
eregs --debug fill_with_rules 37 1
results in a failure when processing 68 FR 48286 amendment 19 atregparser/notice/amendments/appendix.py:16
. This failure was previously encountered by @meriouma and a workaround implemented here.Here's the code which fails when trying to access an invalid index in
label_parts
.ipdb
gives the following context:instruction_xml
:<POST label="2[title]"/>
len(label_parts)
: 1etree.tostring(amdpar)
:I am not familiar enough with the FR Notice parsing code to recognize what this might be, any ideas?
The text was updated successfully, but these errors were encountered: