Fix/correct behavior on trailling cr #65

Merged: cjyetman merged 2 commits into main from fix/correct-behavior-on-trailling-cr on Jan 24, 2024

Conversation

AlexAxthelm
Collaborator

I made a mistake in my review of #60: \r should not be recognized as a trailing newline, since no modern system writes files with \r as the final character. Both the Unix (LF) and Windows (CRLF) line endings end with \n.
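
A minimal sketch of the rule described above, written in Python purely for illustration (the package is written in R, and this is not its implementation). The function names `ends_with_lf` and `file_ends_with_lf` are hypothetical, and treating an empty file as having no trailing newline is an assumption rather than something stated in this PR.

```python
import os


def ends_with_lf(data: bytes) -> bool:
    """Return True only when the final byte is LF (0x0A).

    Both Unix (LF) and Windows (CRLF) line endings finish with LF,
    so a lone trailing CR (0x0D) is not counted as a trailing newline.
    Assumption: empty input has no trailing newline.
    """
    return data.endswith(b"\n")


def file_ends_with_lf(path: str) -> bool:
    """Apply the same check to a file by reading only its last byte."""
    with open(path, "rb") as fh:
        fh.seek(0, os.SEEK_END)
        if fh.tell() == 0:
            return False  # assumption: an empty file has no trailing newline
        fh.seek(-1, os.SEEK_END)
        return ends_with_lf(fh.read(1))
```

Under this rule, content ending in LF or CRLF passes, while content ending in a bare CR, or with no line terminator at all, does not.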

Trailing return (CR, \r) should not be recognized as a trailing newline
change variable name to avoid conflict with builtin
@AlexAxthelm AlexAxthelm requested a review from cjyetman January 23, 2024 16:03

Coverage Report
file                                 head  main  diff
Overall                              58%   58%   0%
R/canonize_path.R                    91%   91%   0%
R/char_to_.R                         0%    0%    0%
R/determine_headers.R                72%   72%   0%
R/get_csv_specs.R                    0%    0%    0%
R/guess_delimiter.R                  100%  100%  0%
R/guess_file_encoding.R              89%   89%   0%
R/guess_numerical_mark.R             94%   94%   0%
R/has_binary_null.R                  100%  100%  0%
R/has_consistent_fields_per_line.R   100%  100%  0%
R/has_newline_at_end.R               100%  100%  0%
R/is_file_accessible.R               100%  100%  0%
R/is_readable_file.R                 100%  100%  0%
R/is_text_file.R                     100%  100%  0%
R/is_valid_currency_code.R           100%  100%  0%
R/is_valid_cusip.R                   100%  100%  0%
R/is_valid_isin.R                    100%  100%  0%
R/read_portfolio_csv.R               86%   86%   0%
R/simplify_if_one_col_df.R           100%  100%  0%

Member

@cjyetman cjyetman left a comment

I feel like checking for an ending "\r" was done in response to a real-world example I had, but I can't remember specifically. I just tested the 482 sample files I used way back when and didn't find any like that, so I'll somewhat hesitantly accept this change in behavior. I can't think of any substantial improvement this adds other than maybe aligning with current common standards, whereas this repo's main purpose is to identify deviations from any current standard and fix them.

@AlexAxthelm AlexAxthelm added this pull request to the merge queue Jan 24, 2024
@cjyetman cjyetman removed this pull request from the merge queue due to a manual request Jan 24, 2024
@cjyetman cjyetman merged commit d4cb150 into main Jan 24, 2024
9 checks passed
@cjyetman cjyetman deleted the fix/correct-behavior-on-trailling-cr branch January 24, 2024 10:01
@jdhoffa
Member

jdhoffa commented Jan 24, 2024

> I feel like checking for an ending "\r" was done in response to a real-world example I had, but I can't remember specifically. I just tested the 482 sample files I used way back when and didn't find any like that, so I'll somewhat hesitantly accept this change in behavior. I can't think of any substantial improvement this adds other than maybe aligning with current common standards, whereas this repo's main purpose is to identify deviations from any current standard and fix them.

NB: Consider translating as many attributes of those 482 datasets as possible into unit tests, so we don't need to rely on manual checks. By attributes I mean the quirks specific to those files.

@cjyetman
Member

> NB: Consider translating as many attributes of those 482 datasets as possible into unit tests, so we don't need to rely on manual checks. By attributes I mean the quirks specific to those files.

This is for the most part done already by the many unit tests in this repo, which do exactly that. I consider this repo the most critical in terms of allowing us to adapt to the many weird things we've encountered in the wild, while also shielding us from the strange or incorrectly imported portfolio data that plagued us for so long.

I was searching for a real-world case of the behavior that had a unit test which this PR removed.

@jdhoffa
Member

jdhoffa commented Jan 24, 2024

Awesome :-)

3 participants