Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate unnecessary escape characters #4176

Merged
merged 1 commit into from
Aug 11, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
23 changes: 19 additions & 4 deletions doc/templates.rst
Original file line number Diff line number Diff line change
Expand Up @@ -522,10 +522,25 @@ exist:
* ``"Hello World"``: Everything between two double or single quotes is a
string. They are useful whenever you need a string in the template (for
example as arguments to function calls, filters or just to extend or include
a template). A string can contain a delimiter if it is preceded by a
backslash (``\``) -- like in ``'It\'s good'``. If the string contains a
backslash (e.g. ``'c:\Program Files'``) escape it by doubling it
(e.g. ``'c:\\Program Files'``).
a template).

Note that certain characters require escaping:
* ``\f``: Form feed
* ``\n``: New line
* ``\r``: Carriage return
* ``\t``: Horizontal tab
* ``\v``: Vertical tab
* ``\x``: Hexadecimal escape sequence
* ``\0`` to ``\377``: Octal escape sequences representing characters
* ``\``: Backslash

When using single-quoted strings, the single quote character (``'``) needs to be escaped with a backslash (``\'``).
ruudk marked this conversation as resolved.
Show resolved Hide resolved
When using double-quoted strings, the double quote character (``"``) needs to be escaped with a backslash (``\"``).
ruudk marked this conversation as resolved.
Show resolved Hide resolved

For example, a single quoted string can contain a delimiter if it is preceded by a
backslash (``\``) -- like in ``'It\'s good'``. If the string contains a
backslash (e.g. ``'c:\Program Files'``) escape it by doubling it
(e.g. ``'c:\\Program Files'``).

* ``42`` / ``42.23``: Integers and floating point numbers are created by
writing the number down. If a dot is present the number is a float,
Expand Down
79 changes: 77 additions & 2 deletions src/Lexer.php
Original file line number Diff line number Diff line change
Expand Up @@ -367,7 +367,7 @@ private function lexExpression(): void
}
// strings
elseif (preg_match(self::REGEX_STRING, $this->code, $match, 0, $this->cursor)) {
$this->pushToken(/* Token::STRING_TYPE */ 7, stripcslashes(substr($match[0], 1, -1)));
$this->pushToken(/* Token::STRING_TYPE */ 7, $this->stripcslashes(substr($match[0], 1, -1), substr($match[0], 0, 1)));
$this->moveCursor($match[0]);
}
// opening double quoted string
Expand All @@ -382,6 +382,81 @@ private function lexExpression(): void
}
}

private function stripcslashes(string $str, string $quoteType): string
{
if (!str_contains($str, '\\')) {
return $str;
}

$result = '';
$length = strlen($str);
$specialChars = [
'f' => "\f",
'n' => "\n",
'r' => "\r",
't' => "\t",
'v' => "\v",
];

for ($i = 0; $i < $length; $i++) {
if ($str[$i] !== '\\' || $i + 1 >= $length) {
$result .= $str[$i];

continue;
ruudk marked this conversation as resolved.
Show resolved Hide resolved
}

$nextChar = $str[$i + 1];
if (isset($specialChars[$nextChar])) {
$result .= $specialChars[$nextChar];
$i++;
} elseif ($nextChar === '\\') {
$result .= '\\';
$i++;
} elseif ($nextChar === "'" || $nextChar === '"') {
if ($nextChar !== $quoteType) {
trigger_deprecation('twig/twig', '3.12', 'Character "%s" at position %d does not need to be escaped anymore.', $nextChar, $i + 2);
}
$result .= $nextChar;
$i++;
} elseif ($nextChar === '#' && $i + 2 < $length && $str[$i + 2] === '{') {
$result .= '#{';
$i += 2;
} elseif ($nextChar === 'x' && $i + 2 < $length && ctype_xdigit($str[$i + 2])) {
$hex = $str[$i + 2];
if ($i + 3 < $length && ctype_xdigit($str[$i + 3])) {
$hex .= $str[$i + 3];
$i++;
}
$result .= chr(hexdec($hex));
$i += 2;
} elseif (ctype_digit($nextChar) && $nextChar < '8') {
$octal = $nextChar;
for ($j = $i + 1; $j <= 3; $j++) {
$o = $j + 1;
if ($o >= $length) {
break;
}
if (!ctype_digit($str[$o])) {
break;
}
if ($str[$o] >= '8') {
break;
}
$octal .= $str[$o];
$i++;
}
$result .= chr(octdec($octal));
$i++;
} else {
trigger_deprecation('twig/twig', '3.12', 'Character "%s" at position %d does not need to be escaped anymore.', $nextChar, $i + 2);
$result .= $nextChar;
$i++;
}
}

return $result;
ruudk marked this conversation as resolved.
Show resolved Hide resolved
}

private function lexRawData(): void
{
if (!preg_match($this->regexes['lex_raw_data'], $this->code, $match, \PREG_OFFSET_CAPTURE, $this->cursor)) {
Expand Down Expand Up @@ -423,7 +498,7 @@ private function lexString(): void
$this->moveCursor($match[0]);
$this->pushState(self::STATE_INTERPOLATION);
} elseif (preg_match(self::REGEX_DQ_STRING_PART, $this->code, $match, 0, $this->cursor) && '' !== $match[0]) {
$this->pushToken(/* Token::STRING_TYPE */ 7, stripcslashes($match[0]));
$this->pushToken(/* Token::STRING_TYPE */ 7, $this->stripcslashes($match[0], '"'));
$this->moveCursor($match[0]);
} elseif (preg_match(self::REGEX_DQ_STRING_DELIM, $this->code, $match, 0, $this->cursor)) {
[$expect, $lineno] = array_pop($this->brackets);
Expand Down
93 changes: 81 additions & 12 deletions tests/LexerTest.php
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
*/

use PHPUnit\Framework\TestCase;
use Symfony\Bridge\PhpUnit\ExpectDeprecationTrait;
use Twig\Environment;
use Twig\Error\SyntaxError;
use Twig\Lexer;
Expand All @@ -21,6 +22,8 @@

class LexerTest extends TestCase
{
use ExpectDeprecationTrait;

public function testNameLabelForTag()
{
$template = '{% § %}';
Expand Down Expand Up @@ -178,23 +181,89 @@ public function testBigNumbers()
$this->assertEquals('922337203685477580700', $node->getValue());
}

public function testStringWithEscapedDelimiter()
/**
* @dataProvider getStringWithEscapedDelimiter
*/
public function testStringWithEscapedDelimiter(string $template, string $expected)
{
$lexer = new Lexer(new Environment(new ArrayLoader()));
$stream = $lexer->tokenize(new Source($template, 'index'));
$stream->expect(Token::VAR_START_TYPE);
$token = $stream->expect(Token::STRING_TYPE);
$this->assertSame($expected, $token->getValue());
}

public function getStringWithEscapedDelimiter()
{
$tests = [
"{{ 'foo \' bar' }}" => 'foo \' bar',
'{{ "foo \" bar" }}' => 'foo " bar',
yield '{{ \'\x6\' }} => \x6' => [
'{{ \'\x6\' }}',
"\x6",
];
yield '{{ \'\065\x64\' }} => \065\x64' => [
'{{ \'\065\x64\' }}',
"\065\x64",
];
yield '{{ \'App\\\\Test\' }} => App\Test' => [
'{{ \'App\\\\Test\' }}',
'App\\Test',
];
yield '{{ "App\#{var}" }} => App#{var}' => [
'{{ "App\#{var}" }}',
'App#{var}',
];
yield '{{ \'foo \\\' bar\' }} => foo \' bar' => [
'{{ \'foo \\\' bar\' }}',
'foo \' bar',
];
yield '{{ "foo \" bar" }} => foo " bar' => [
'{{ "foo \\" bar" }}',
'foo " bar',
];
yield '{{ \'\f\n\r\t\v\' }} => \f\n\r\t\v' => [
'{{ \'\\f\\n\\r\\t\\v\' }}',
"\f\n\r\t\v",
];
yield '{{ \'\\\\f\\\\n\\\\r\\\\t\\\\v\' }} => \\f\\n\\r\\t\\v' => [
'{{ \'\\\\f\\\\n\\\\r\\\\t\\\\v\' }}',
'\\f\\n\\r\\t\\v',
];
}

/**
* @group legacy
* @dataProvider getStringWithEscapedDelimiterProducingDeprecation
*/
public function testStringWithEscapedDelimiterProducingDeprecation(string $template, string $expected, string $expectedDeprecation)
{
$this->expectDeprecation($expectedDeprecation);

$lexer = new Lexer(new Environment(new ArrayLoader()));
foreach ($tests as $template => $expected) {
$stream = $lexer->tokenize(new Source($template, 'index'));
$stream->expect(Token::VAR_START_TYPE);
$stream->expect(Token::STRING_TYPE, $expected);
$stream = $lexer->tokenize(new Source($template, 'index'));
$stream->expect(Token::VAR_START_TYPE);
$stream->expect(Token::STRING_TYPE, $expected);

// add a dummy assertion here to satisfy PHPUnit, the only thing we want to test is that the code above
// can be executed without throwing any exceptions
$this->addToAssertionCount(1);
}
// add a dummy assertion here to satisfy PHPUnit, the only thing we want to test is that the code above
// can be executed without throwing any exceptions
$this->addToAssertionCount(1);
}

public function getStringWithEscapedDelimiterProducingDeprecation()
{
yield '{{ \'App\Test\' }} => AppTest' => [
'{{ \'App\\Test\' }}',
'AppTest',
"Since twig/twig 3.12: Character \"T\" at position 5 does not need to be escaped anymore.",
];
yield '{{ "foo \\\' bar" }} => foo \' bar' => [
'{{ "foo \\\' bar" }}',
'foo \' bar',
"Since twig/twig 3.12: Character \"'\" at position 6 does not need to be escaped anymore.",
];
yield '{{ \'foo \" bar\' }} => foo " bar' => [
'{{ \'foo \\" bar\' }}',
'foo " bar',
'Since twig/twig 3.12: Character """ at position 6 does not need to be escaped anymore.',
];
}

public function testStringWithInterpolation()
Expand Down