Fix the code points of ≪⃒ and add a test case (#67) #68

Open
wants to merge 3 commits into
base: master
Changes from 2 commits
2 changes: 1 addition & 1 deletion Sources/HTMLEntities/Constants.swift
@@ -454,7 +454,7 @@ let namedCharactersDecodeMap2: [String: Character] = [
  "nlarr;":"\u{219A}","nldr;":"\u{2025}","nlE;":"\u{2266}\u{338}","nle;":"\u{2270}",
  "nLeftarrow;":"\u{21CD}","nleftarrow;":"\u{219A}","nLeftrightarrow;":"\u{21CE}","nleftrightarrow;":"\u{21AE}",
  "nleq;":"\u{2270}","nleqq;":"\u{2266}\u{338}","nleqslant;":"\u{2A7D}\u{338}","nles;":"\u{2A7D}\u{338}",
- "nless;":"\u{226E}","nLl;":"\u{22D8}\u{338}","nlsim;":"\u{2274}","nLt;":"\u{226A}\u{338}",
+ "nless;":"\u{226E}","nLl;":"\u{22D8}\u{338}","nlsim;":"\u{2274}","nLt;":"\u{226A}\u{20D2}",
  "nlt;":"\u{226E}","nltri;":"\u{22EA}","nltrie;":"\u{22EC}","nLtv;":"\u{226A}\u{338}",
  "nmid;":"\u{2224}","NoBreak;":"\u{2060}","NonBreakingSpace;":"\u{A0}","Nopf;":"\u{2115}",
  "nopf;":"\u{1D55F}","Not;":"\u{2AEC}","not;":"\u{AC}","NotCongruent;":"\u{2262}",
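
The one-line change above swaps the second scalar of `nLt;` from U+0338 (combining long solidus overlay) to U+20D2 (combining long vertical line overlay), which leaves `nLtv;` as the only one of the two that carries U+0338. A minimal standalone sketch of the difference, assuming nothing from the library itself (`scalarValues` is a hypothetical helper introduced only for this illustration):

```swift
// Scalar sequences for the two entities after the fix.
let nLt: Character = "\u{226A}\u{20D2}"   // much-less-than + combining long vertical line overlay
let nLtv: Character = "\u{226A}\u{338}"   // much-less-than + combining long solidus overlay

// Hypothetical helper: the code points that make up one Character.
func scalarValues(_ character: Character) -> [UInt32] {
    return character.unicodeScalars.map { $0.value }
}

print(scalarValues(nLt))   // [8810, 8402]
print(scalarValues(nLtv))  // [8810, 824]
```

Both values are single grapheme clusters, so they still fit the `[String: Character]` type of the decode map; only the combining mark differs.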
53 changes: 52 additions & 1 deletion Tests/HTMLEntitiesTests/HTMLEntitiesTests.swift
@@ -17,6 +17,10 @@
  import XCTest
  @testable import HTMLEntities

+ #if canImport(FoundationNetworking)
+ import FoundationNetworking
+ #endif
+
  let replacementCharacterAsString = "\u{FFFD}"

  /// HTML snippet
@@ -398,6 +402,52 @@ class HTMLEntitiesTests: XCTestCase {
          XCTAssertEqual(try text.htmlUnescape(strict: false), "한")
      }

+     func testDecodeMaps() throws {
+         struct CodePointsAndCharacters: Codable {
+             var codepoints: [UInt32]
+             var characters: String
+         }
+
+         var entitiesData: Data?
+         var dataTaskError: Error?
+         let expectation = self.expectation(description: "Downloading entities.json")
+
+         let url = URL(string: "https://html.spec.whatwg.org/entities.json")!
+         URLSession.shared.dataTask(with: url) { data, _, error in
+             if let error = error {
+                 dataTaskError = error
+             } else if let data = data {
+                 entitiesData = data
+             }
+             expectation.fulfill()
+         }.resume()
+
+         self.wait(for: [expectation], timeout: 60)
Contributor

Nice touch adding a test against a known reference. I'm not sure it's a good idea to hit the server on every unit test run, though. Other things could go wrong (network issues, unexpected upstream changes, etc.). Perhaps it'd be better to keep a copy of the file in the repo?

At 140k raw, it's not "huge". Assuming it's not something that changes often, it's probably fine to keep in the repo.

Contributor Author

Thank you for reviewing.

Your concerns are reasonable. I also thought about it at first. However, I realized that I would have to verify that the file was correct. It seems to me that using a local file to validate *DecodeMaps simply postpones the problem, since another way is needed to validate the file itself.

Moreover, since the HTML specs keep changing, there will come a time when the file will have to be changed. We must then verify in some way that those changes are correct.

What do you think?

Contributor

That's actually an added reason to keep a copy of the file in the repo, so that it can be manually inspected each time it changes. If the download is automated, the test can break at any time and it'll be unclear why.

Contributor Author

Thanks for replying. My main point was that adding a file to the repo didn't make sense, because I thought verifying it wasn't much different from verifying the Swift constants. But it's a copy of a file published as part of the spec, so it is easier to verify than the Swift constants. I've changed my mind: I'm going to add the file and rewrite the test.
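
For illustration only, an offline version of the check might look roughly like the sketch below. It is not the code from this PR: the `mismatches(in:against:)` helper is hypothetical, and a short inline JSON string stands in for the checked-in copy of entities.json, which a real test would presumably read from disk with something like `Data(contentsOf:)`.

```swift
import Foundation

// Same shape as the entries in entities.json.
struct CodePointsAndCharacters: Codable {
    var codepoints: [UInt32]
    var characters: String
}

// Hypothetical helper: names in `map` whose scalars differ from the reference
// table, or that are missing from it. Keys in entities.json start with "&".
func mismatches(in map: [String: Character],
                against reference: [String: CodePointsAndCharacters]) -> [String] {
    var bad: [String] = []
    for (name, character) in map {
        let actual = character.unicodeScalars.map { $0.value }
        if reference["&\(name)"]?.codepoints != actual {
            bad.append(name)
        }
    }
    return bad.sorted()
}

// Inline stand-in for the local copy of entities.json (assumption: two entries only).
let json = """
{"&nLt;": {"codepoints": [8810, 8402], "characters": "\\u226A\\u20D2"},
 "&nLtv;": {"codepoints": [8810, 824], "characters": "\\u226A\\u0338"}}
"""
let reference = try JSONDecoder().decode([String: CodePointsAndCharacters].self,
                                         from: Data(json.utf8))

// The map before and after the fix in Constants.swift, reduced to two entries.
let before: [String: Character] = ["nLt;": "\u{226A}\u{338}", "nLtv;": "\u{226A}\u{338}"]
let after: [String: Character] = ["nLt;": "\u{226A}\u{20D2}", "nLtv;": "\u{226A}\u{338}"]

print(mismatches(in: before, against: reference))  // ["nLt;"]
print(mismatches(in: after, against: reference))   // []
```

Returning the list of offending names instead of force-unwrapping the lookup means a key missing from the reference shows up as a reported mismatch instead of a crash.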

Contributor Author

I pushed the commit b86aa37.

Contributor Author

@dannys42 I am not in a hurry, but what do you think?


+         if let dataTaskError = dataTaskError {
+             throw dataTaskError
+         }
+
+         guard let data = entitiesData else {
+             XCTFail("Failed to download entities.json")
+             return
+         }
+
+         let dict = try JSONDecoder().decode([String: CodePointsAndCharacters].self, from: data)
+
+         for (k, v) in specialNamedCharactersDecodeMap {
+             XCTAssertEqual(dict["&\(k)"]!.codepoints, v.unicodeScalars.map(\.value), k)
+         }
+
+         for (k, v) in legacyNamedCharactersDecodeMap {
+             XCTAssertEqual(dict["&\(k)"]!.codepoints, v.unicodeScalars.map(\.value), k)
+         }
+
+         for (k, v) in namedCharactersDecodeMap {
+             XCTAssertEqual(dict["&\(k)"]!.codepoints, v.unicodeScalars.map(\.value), k)
+         }
+     }
+
      static var allTests : [(String, (HTMLEntitiesTests) -> () throws -> Void)] {
          return [
              ("testNamedCharacterReferences", testNamedCharacterReferences),
@@ -406,7 +456,8 @@ class HTMLEntitiesTests: XCTestCase {
              ("testDecode", testDecode),
              ("testInvertibility", testInvertibility),
              ("testEdgeCases", testEdgeCases),
-             ("testREADMEExamples", testREADMEExamples)
+             ("testREADMEExamples", testREADMEExamples),
+             ("testDecodeMaps", testDecodeMaps)
          ]
      }
  }