[Bug]: Latest monty release doesn't find regrep matches on last line #741

Closed
2 of 4 tasks
kavanase opened this issue Jan 17, 2025 · 4 comments

@kavanase

Email (Optional)

No response

Version

v2025.1.9

Which OS(es) are you using?

  • MacOS
  • Windows
  • Linux

What happened?

In the latest release of monty, regrep fails to match text which is on the last line of the file:
Image

Works fine with previous release:
Image

Noticed by automated tests with ShakeNBreak

Code snippet

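from monty.re import regrep

# Write 70 newline-terminated lines followed by a last line with no trailing "\n"
with open("test_file", "w+") as f:
    f.write("some_text\n"*70 + "final line")


# Search from the end of the file; "final" sits on the unterminated last line
regrep(filename="test_file",
       patterns={"match": "final"},
       reverse=True,
       terminate_on_match=True,
       )["match"]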

Log output

Code of Conduct

  • I agree to follow this project's Code of Conduct
@kavanase kavanase added the bug label Jan 17, 2025
@kavanase
Author

Works fine when reverse=False, so it's something related to changes in reverse_readfile:

Image

I think due to changes in #730?

@DanielYang59
Contributor

DanielYang59 commented Jan 18, 2025

Hi @kavanase happy new year and thanks for reporting this.

However, I believe this is expected behaviour (introduced in #712): the last line-ending character (\n in this case) is treated as the marker of the last line.

Reason being: according to the POSIX standard definition of "line":

3.185 Line
A sequence of zero or more non-<newline> characters plus a terminating <newline> character.

In other words, a "correctly formatted" text file (one that follows the POSIX standard) should end every line with a line-ending character, including the last one (i.e. it should end with last line\n instead of last line).
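To make the definition concrete, here is a small standard-library sketch (not monty code) using the same text as the code snippet in the report. Counting newline characters gives the number of complete POSIX lines, which is also what `wc -l` counts, while `str.splitlines()` additionally returns the unterminated tail. The variable names are only for illustration.

```python
# Same content as the reproduction above: 70 terminated lines plus an unterminated tail.
text = "some_text\n" * 70 + "final line"

# Complete lines in the POSIX sense: one per terminating newline character.
posix_line_count = text.count("\n")

# splitlines() also returns the trailing text that has no terminating newline.
python_line_count = len(text.splitlines())

# Whatever follows the last newline is not a complete POSIX line.
unterminated_tail = "" if text.endswith("\n") else text.rsplit("\n", 1)[-1]

print(posix_line_count, python_line_count, repr(unterminated_tail))
```

The two counts differ by one, and the difference is exactly the "final line" text that the report is searching for.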

@kavanase
Author

Hi @DanielYang59 , happy new year!

Ah ok, thanks for linking the PR. I guess it could be an issue with incorrectly-formatted outputs, but I see that was discussed in #712 already.
Thanks for the quick reply!

@DanielYang59
Contributor

No worries at all.

I guess it could be an issue with incorrectly-formatted outputs, but I see that was discussed in #712 already.

Perhaps take my answer with a pinch of salt, as I'm not 100% sure whether "all text files should follow the POSIX standard" or it's just the "recommended way". Meanwhile, we noticed that some LOBSTER output files may show similar behaviour.
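For output files that do lack a final newline, one possible workaround on the caller's side is to append the missing terminator before searching with reverse=True. This is only a sketch under that assumption: `ensure_trailing_newline` is a hypothetical helper, not part of monty, and it modifies the file in place, so it may not suit read-only or shared outputs.

```python
import os


def ensure_trailing_newline(path):
    """Append a newline to the file at `path` if its last byte is not one.

    Hypothetical helper (not a monty function). Returns True if the file
    was modified, False if it was empty or already newline-terminated.
    """
    with open(path, "rb+") as f:
        f.seek(0, os.SEEK_END)
        if f.tell() == 0:
            return False  # empty file: there is no line to terminate
        f.seek(-1, os.SEEK_END)
        if f.read(1) == b"\n":
            return False  # already ends with a newline
        f.seek(0, os.SEEK_END)  # reposition explicitly before switching to writing
        f.write(b"\n")
        return True
```

Calling `ensure_trailing_newline("test_file")` before the `regrep` call from the report would turn `final line` into a newline-terminated line, which per the explanation above is what the reverse search expects.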

Feel free to comment if you have any findings :)
