fix[lang]: fix encoding of string literals (#3091)
this commit fixes bad runtime encoding of unicode strings. `parse_Str` used the utility function `string_to_bytes`, which rejects characters with value larger than 255 and otherwise produces the ascii encoding of the string. the issue is that bytes in the range 128-255 specify different characters in utf-8 than in ascii encodings, resulting in different values at runtime than at compile-time. this can be seen from differing compile-vs-runtime behavior of `keccak256` (this example was provided in GH issue 3088):

```vyper
@external
@view
def compile_hash() -> bytes32:
    return keccak256("è")

@external
@view
def runtime_hash() -> bytes32:
    s: String[1] = "è"
    return keccak256(s)
```

this commit fixes and simplifies `parse_Str` by using python's `str.encode()` builtin, which encodes using utf-8 by default. it also increases strictness of string validation to reject bytes in the range 128-255, since in utf-8 these can encode multibyte characters, which we reject in vyper (see more discussion in GH issue 2338).
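the mismatch described above can be reproduced in plain python. this is a minimal sketch, not the actual vyper code: `legacy_string_to_bytes` and `strict_encode` are hypothetical names that only mimic the behavior the commit message describes (one byte per character up to 255 before the fix, `str.encode()` plus rejection of anything above 127 after it).

```python
def legacy_string_to_bytes(s: str) -> bytes:
    """Hypothetical stand-in for the old behavior: one byte per character."""
    out = bytearray()
    for ch in s:
        if ord(ch) > 255:
            raise ValueError(f"character out of range: {ch!r}")
        out.append(ord(ch))
    return bytes(out)


def strict_encode(s: str) -> bytes:
    """Hypothetical stand-in for the new behavior: 7-bit characters only, utf-8 output."""
    for ch in s:
        if ord(ch) >= 128:
            raise ValueError(f"non-ascii character rejected: {ch!r}")
    return s.encode()  # str.encode() defaults to utf-8


old_bytes = legacy_string_to_bytes("è")  # a single byte, 0xE8
utf8_bytes = "è".encode()                # two bytes, 0xC3 0xA8

# the two encodings differ, so any hash computed over them differs too
encodings_differ = old_bytes != utf8_bytes

try:
    strict_encode("è")
    rejected = False
except ValueError:
    rejected = True
```

since `keccak256` hashes the raw bytes, the one-byte and two-byte encodings of "è" necessarily produce different digests, which matches the compile-vs-runtime difference reported in the issue.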
1 parent 9b5523e, commit 43259f8
Showing 4 changed files with 43 additions and 49 deletions.