ChangeLog.old
678d45d (HEAD -> master, origin/master, origin/HEAD) Treat all tests as XFAIL_TESTS if zhfst support is disabled.
a3f39d9 Merge pull request #31 from hfst/configure-explicit-error
bc1c9ea Separate counters (fixes #37, closes #38)
a4aeae0 Post-case-mangling unique test
1d088e1 Avoid shadowing multicharacter ascii-beginning symbols
69ab95d Merge pull request #35 from unhammer/renamespace-hfst_ol
d39b93c rename hfst_ol namespace to hfst_ospell to avoid conflicts
eccbd6f Bump actual version; Modernize for() loops
a9919f4 Merge pull request #33 from unhammer/analysisSymbols
f36de02 bump minor version because of new API function
019f303 New API fn analyseSymbols giving unconcatenated vector<string>'s
da508b7 Dirty hack to parse XML a bit better; ToDo: use actual portable XML library
dad7c5c Minimal XML parsing to get locale, title, and description for Voikko and other frontends
f40d4b5 (origin/configure-explicit-error) let './configure --enable-zhfst' fail if no libarchive found
a737b52 When using help2man remember -n description
1fc3fca Update copyright and authors listing
d791a41 (tag: v0.4.5) Release 0.4.5.
4f2026c Add missing files to dist.
bae21c3 Forgot to commit configure.ac... Update hfst-ospell manual and add man pages to dist.
7c8690a (tag: v0.4.4) Ready for release 0.4.4.
671efe6 Trusty has C++11
32842b1 Disable nightly repo
2a4e556 Move more files
0ab57c9 Remove HFST dependency - fixes #26; Move tests to folder
687b3c1 The --with-extract arg now sets the default - the other option is still tried if default fails; Oddly enough, this fixes issue #26 for no good reason - it's happily extracting to memory now, when it wasn't before.
f061774 Apply the notimestamp patch in trunk
3322fd2 Extra early checks for archive extraction failures
0cc1a4a Use std::runtime_exception
14b1b2c Rename stdafx.h and install it
4569e9f Windows DLL fixes
88aa8dd Allow building without XML support
139790f (tag: v0.4.3) Ready for release 0.4.3.
9562377 Fixes for big endian conversions; Make use of AC_C_BIGENDIAN and WORDS_BIGENDIAN; UCHAR_MAX should be +1 to allow hitting index 255
bf79d25 Add returns
8573403 Merge branch 'big_endian_compatibility_suggestion'. In hopes that this approach to fixing build problems on the problematic architectures will get tested for Debian builds. This is experimental, but at least shouldn't break anything - feel free to revert if unhelpful.
b756b2e Killed an error. Don't know whether it was the correct solution - please review.
3ca7756 max version for tinyxml
662f79c (origin/big_endian_compatibility_suggestion) Forgot to actually assign header length to a variable when reading raw bytes
abed65f A suggestion for big-endian compatibility
43f5dc6 (tag: v0.4.2) Release 0.4.2.
e9cc1da Prevent segfaults from no-errormodel.sh test. This also makes the test succeed rather than XFAIL, so removed it from XFAIL_TESTS (why not?)
6d359c0 Remove unused $(HFST_FLAGS) variable that may be leading to some test failures
b31fec1 markdown
9fa49a2 not supporting default autostuff like promised...
c53090d Merge branch 'master' of github.com:hfst/hfst-ospell
4b07bf5 experimental ci
f582bac (tag: v0.4.1) Release 0.4.1.
29649b1 Set gitignores.
31e6842 Bump patch version since revision has decreased
d2fb71e set_time_cutoff(6.0)
08bc7a9 When upping to 16 variants, also make room for all 16 variants
080042a Added section header, correct tool name to man page.
a95e4aa 0.4.0
2565275 Release 0.4.0.
3e31ef1 Update help message, copyright extends to 2016.
6ca58bc Make no-errormodel.sh an xfail test.
047e9d9 Only call clock() every millionth time for speed
ae8a660 Fix time cutoff handling
b222053 Update man pages
96e17f7 Update man page
d92e8a7 If the input token had any upper case, also try a first-upper variant of it
5fee947 Special case nuvviDspeller
b5946d9 Bugfix: class -> struct (libvoikko complained under OSX, and err’ed out).
929f926 If we are looking for suggestions, don't use the cache
04e4843 Fix a couple of typos in argument list and verbose prints. Add a missing statement that actually sets the time cutoff for the speller.
cf8ee5c Fix a typo in wide string conversion function.
30940be Time cutoff option during correction for library and demo app
77bed9b Apply Brendan Molloy's alignment patch for ARM CPUs & reindent to namespace=0
da6fcc5 Propagate first-upper and all-upper to suggestions
b57d248 Add option --enable-hfst-ospell-office to configure (defaults to yes). Add support for using libarchive2 if libarchive3 is not available.
26a4e37 Add --verbatim to disable the variations
ece3455 Don't use UTF-8 length for UTF-16 buffer
3b07200 I meant 20480
df80db9 Use ICU to determine what's alphanumeric - should fix https://github.com/TinoDidriksen/spellers/issues/1
bc4b3a2 If pkg-config fails, fall back on icu-config - blame Ubuntu 12.04
3266e7b Silly mistake
5226fb2 Use ICU instead
be5f73a Use wide chars during case folding; Set locale
90d7fcb Fix typo that broke UNKNOWN handling in the error source
0ff3b88 Fix bug in novel symbol tokenization
791ca3e Bounds check table accesses. Now that new symbols from input can be added to the lexicon's alphabet, it's possible that during checking we try to access beyond the lexicon's table boundaries, so that needs to be prevented.
0865615 CentOS 6 fixes: Move AC_INIT up to prevent _m4_divert_diversion error; Use older ax_check_compile_flag.m4
833e6f6 autoconf-archive doesn't exist on RPM platforms, so bundle ax_check_compile_flag.m4
e59a06e Determine and use highest supported C++ standard
f8d3893 Reinstate legacy function Transducer::lookup() due to Voikko's use of it
b645b28 Unknown and identity arcs can now appear both in the lexicon and error model. This is a rather large overhaul of everything having to do with symbols, and causes an approximately 8% slowdown at this time. Some more efforts will be made to bring that down.
28cd37b Make weight checking methods const
d76526a Correct nbest handling for cached results too + remove extraneous comparison class
a10455e Special-case limit checking for nbest
568d606 Correct limit setting
22bd0b5 Remove debugging print
ed57354 Use a list to emulate a sorted double-ended weight queue
effe39d Removed keyword 'typename', it caused GCC with C++03 to bark.
57891b8 Drop down to boring ol' C++03
524f259 Fix declaration order and unicode issues
1feb23e Add --default-weight, --outputfile, pair specification syntax for regex output
d40e733 Add a --regex mode to produce a large regular expression with weighted replace rules
612aa05 Add a first-symbol cache and do some reorg
6a94a9b Fix -Wall -Wextra issues
2fcd06e New protocol: Let client choose max suggestions, input must now match "^(\d+) (.+)$"
dcb9474 Search up the normalization list until something gives suggestions
ad14157 Disable log2 max-dist and instead set hard limit due to FST weights
e7cfbc6 Tokenization is done by the host, so don't second guess
e484ef1 Add hfst-ospell-office helper so the algorithm is in one place, instead of in every frontend
14c72ca Arguably clearer scheme for combining limiting options
0c5eeec Reinstate legacy spelling, slightly adding to but not altering interface
4d1fa2b Add option --beam=W for restricting the search to a margin above the optimum. Any combination of n-best, weight limit and beam is possible, all restrict each other.
29180f3 Use vector instead of deque now that breadth search isn't used for anything
ea15127 Be a bit more systematic about search-limiting behaviour, also fail earlier, this gives a small speed improvement
a9d9e55 Make argument parsing further agree with documentation, reinstating -X
e2b9b7e nbest wasn't producing any runtime benefit due to incorrect comparison
94064b5 Reinstate the : lost from the -n arg in getopt_long()
b164f2b Change max_weight to max-weight in the help string and argument name
48d6d15 Check that maxweight is positively set before using it, default is negative
a782763 Version 0.3.1 - this introduces API/linking changes. Support limiting the correction search to a maximum weight. The hfst-ospell command supports this via --max_weight / -w.
76d9052 Make -n (max n corrections) work again
4d6f7dd Have separate switches for each XML mangling library
d39d419 Updated hfst-ospell man page.
1f38ba3 Just error out when missing XML
d7058e8 Allow selecting real-word errors
8866483 Missing stdarg reqd in >gcc 4.8 or somesuch
561760e Added configure and Makefile files for Windows compilation.
f19070d A tiny performance improvement (about 5%, this is not what was discussed on irc, but every little helps)
a674152 Fix missing newline in verbose message
ba22ebb This is how I got the analysing speller to work
f884489 Simple example of analysis spelling logic
4019232 can analyse if speller or sugger
f621a37 Fix bug #239. When correcting, keep track of mutator's output / lexicon's input; when analysing, keep track of lexicon's output
bfd40d9 Fix bug #238. Now that alphabets are translated even if the error source contains symbols not found in the lexicon, the lexicon needs to be prepared to be asked whether there are transitions with NO_SYMBOL
f7b6dcc documentation front page
0a17be0 Dist changes, 0.3.0 API stable though pending other fixes for .1
03d40cc Some beginnings for a documented stable API.
a8dc221 This has been missing all along?
7f70384 Allow setting max suggestions from command line tool
edc36e7 Remove legacy, allow setting suggestion limits
a61dcd2 Tentative support for analyse-as-you-error-correct
8847e56 Errmodel with out of lexicon alphabet fails
f702e28 Better test case for analyser
c96a2e8 Add tests for analysing speller and missing errmodels
104a24d Analysis interface and implementation for two-tape language models. The suggestion engine still erroneously gives the analysis side of suggestion LM though.
75fc585 Now printing and reading to/from console works in hfst-ospell if it is compiled with -DWINDOWS.
94031ca Added a rule for man page creation in Makefile.am.
e293f96 Added a test for an empty zhfst file. It should fail, but presently hfst-ospell behaves as if all is ok, just that it isn't able to correct anything.
add8ddd Corrected return value to what it should be for a SKIP result, which I assume was intended.
09ea12d Added files to the distro.
de621bd Added test to check whether error models with extra chars are correctly handled. The test passes with revision 3793 of ospell.cc, and fails with revision 3729.
4860369 Silently ignore non-lexicon symbols in the error source, translating them to NO_SYMBOL on the output side.
9e8ae05 revert
8e74395 What happens if I do this?
41bb8cf Moving implementations?
9629091 Install men
5a0fbdb A man page for debian's lint
f4850b0 Probably libarchive deprecation will go away with different function name here too
f76b1fa Move some of the code away from headers
db0c56d some old changes on work desktop?
16511d0 Ensure other potentially empty tags for libxmlpp
832f9d0 Bumped version for tinyxml2 check to get lengthful Parse by Børre Gaup
201acd8 unarchive to mem and tinyxml2 xml length fixes from Harri Pitkänen
220e6cb Reverted the patch by Børre to clean the xml off of random byte noise after the end of the xml document. The issue was caused by zip compression incompatibilities between different platforms, and the solution (for now) is to just skip compression altogether, using 'zip -Z store' instead when building the zhfst files. This creates a zip file readable everywhere (but without compression).
c57f0bd Test for trailing spaces in xml structure
ad60602 Patch by Børre Gaup to clean corrupted xml data returned from Libarchive when reading xml data from a zhfst file stored in RAM:
b0ba291 Whitespace changes to make the file easier to my eyes, comments. Added tests for empty locale and empty title elements.
42da63d Data and shell scripts to test empty titles and empty locale.
6b5145f Check for empty locale node.
0838caa Clean and ignore index.xml.
26e6b99 One more test for empty data.
2b1089f Added newline before error cause.
a440da7 Empty descriptions will throw (there might be others left)
58047b3 index.xml with empty descriptions
2f93bcc Test.strings shall not be deleted when cleaning
c4adeb0 Added check for the availability of pkg-config - without it configuration will fail in subtle but bad ways.
cd24885 Document for next release candidate
3365813 Missing model attribute parsing for errm
5ca88f6 Use Elements instead of Nodes and other such fixes
bd8eada Use automake conditionals to avoid pkg-config linking to libraries that are not in configure's code paths
0322db3 update configuration for that
2153668 Add tinyxml2 versions of XML parsing
658ff68 conference demos as configure option
b2ff3fb stats -> status
e20d008 Allow to end correct correction
c16d604 Merge lookup() from devbranch
0761df3 This should be useful
fb1c6c1 Fix the output tsv a bit more
299b94d Slight update for new measurements
44f6276 Few starts
f25c609 Allow selection of xml backend
4be4b96 Switch <3 into >3 because it looks nicer.
5d80f3f strndup() fixed by Tommi.
e11a913 Pending proper understanding of why strndup is getting defined several times over, let's always use hfst_strndup() instead.
248d05d Add include guard to custom strndup so it doesn't get compiled more than once
af41cbf Move our custom strndup to ospell.h so it will be seen by all compilation units that need it
bd813b0 Renamed the hfstospell package and variables to what they used to be, in case that can help solve a build issue with libvoikko+hfst.
83b1a09 Wrap and load
f03ef00 versioninfo 3:1:1
d70a9e6 Preparing for 0.2.3 release.
c8ba084 Changed version number to 0.2.3, and renamed the package and tool name to 'hfst-ospell', since that is what the command-line tool is actually called, and it is consistent with the rest of the hfst tool names.
3d40399 Should fix #176. Flag state from the stack back was getting clobbered in some situations. Now we restore flag state after modifying it for new nodes.
ac10ddd Fix dist stuff, wrap.
3fa2b5d Use pkg-config instead of autoconf to check libarchive
7e422b7 Prepare files for 0.2.2
89a4a99 Fix few leaks
44b313c Fixes to [#165] by Harri Pitkänen
3ad822b Made some changes to depth-first searching to prevent just-added nodes getting removed. This appears to fix a very long-standing and serious bug dating from r2763.
36fa55c When data is in memory, allocate & copy it in ospell rather than expect caller to keep track of it (which it usually doesn't)
8196d8a Add access to metadata
1da34c4 When initialised from memory, don't assume responsibility for freeing it
01a43fd Somewhat experimental (no checking) way of saving memory by ignoring alignment issues (4-5 x memory savings)
db95a32 Remove leftover agenda variable
2ee0efc Add stuff for survey article
a439eaf Revert to breadth-first searching pending bugfix
8166a16 Search depth-first by preference for hope
876ebbf Forgot some important checks that we actually want to limit the results
b1b6b24 Enforce an n-best limit for continuing the search, just for breadth-first for now
78f2345 Don’t just break on empty lines
ab3f591 Switch order of errmodels and acceptors in legacy read
5baaaed set spelling caps on legacy read
7c60d3e Test fallback
a1b1e35 The VPATH approach to the test shell scripts was wrong. Now everything is working as it should.
fcef585 Enabling VPATH building for 'make check'.
f8d7a54 Whitespace change only in configure.ac. More generated files to ignore.
b6f6482 Added some whitespace to ease readability. Replaced pattern rules with suffix rules to make automake happy.
3027315 Fix compilation without xml, as suggested by Harri Pitkänen on libvoikko-devel
9e2503d do not close unopened files, handle legacy fallback reading errors
9e3739d print details of xml parsing errors
5791425 Add the very basic test suite
e39e40b Throw exception if reading a zip without existing zip file.
3ebb0aa Optionalise the libraries
4ab1a3a Update for automake-1.12 and AM_PROG_AR
99f80d4 kill everyone if windows linebreaks
9459c17 Fix documentation and set for 0.2.1 release
ab90ec9 free more things and stuff
bca66ba Move Xml metadata parsing and storing into own file and class
686d305 Increment *raw when reading bools
5e91e84 Added utility function for iterating c strings in raw memory, use it in every branch of symbol reading
246c2b2 Fixed problems having to do with reading strings from a transducer in raw memory.
15b3184 fix the tmpdir'd version again
bc2d9af Version that extracts zhfst to memory iff it fits on one throw
69ff631 Added raw memory constructors and table readers for slurping in transducers from char *
4772d28 Missing files lol
77c23cf ...and in that case the final identity states also are greater by one.
58e9512 When avoiding initial edits, there needs to be an extra initial state before we do any edits. So add the value of options.no_initial to the state range loop.
433e917 Commented out hfst-ospell-cicling from Makefile.
73f8871 Comment out missing cicling target
b6a3317 "else if" instead of incorrect "if" in option handling
620392f --no-string-initial should have transitions to state 1, not 0
9f8cf3c Don't exit on empty lines
2ab0594 Don't print unhelpful warnings
4fb22d1 Silly default for minimum edit
2e58cd8 fallback_spell()
ca9a3b0 --no-string-initial-correction
76c87fe Corrected eliminations (hopefully), added --minimum-edit option
713aff2 Profiled version for fsmnlp-2012 measurements
999ca00 Don't skip state numbers when certain options are turned off (and don't print debugging lines)
e864859 Corrections to redundancy elimination
49b4394 redundancy elimination was being performed one state too late (too little)
bb93e12 Add option to disable redundancy elimination
731db86 Initial support for redundancy elimination (refuse to do insertion after deletion or deletion after insertion)
83f85ce Bugfix: identity transitions were being forgotten in the last edit state
9ca92d4 Enforce @_UNKNOWN_SYMBOL_@ instead of @?@, which users didn't know about
2b001e6 Fixed one remaining UTF-8 stderr printing bug.
aee7a2a Fixed bug with the newline character not being stripped from excluded symbols
92ab53a Lines that start with ## are comments
fd5f560 Updated help message & added exclusion of symbols by prepending a ~
45442ab Wrap stderr with a utf-8 codec so we can print non-ascii symbols when verbose
5c27eca Write alphabet with weights when verbose
f9b426f Order of preference of alphabet definition is now configfile - commandline - transducer. If configfile gives a weight after a tab for symbols, they are used additively for all edits involving those symbols.
6629f88 Some clarifying comments
33e3c95 Rescue identities from being considered substitutions
4ce7bb0 utf-8 -decode the user-specified transitions from a conf file (so easy to forget one of these...)
42ec180 use same includedir in pc as makefile
40afbca don't declare strndup in public headers
b5a6c7a make ol exceptions in hfst_ol namespace, provide stdexception style what()
3a58fbb Add verbose, quiet
c821e19 Parse all of the metadata if possible; use C++ structs for metadata
3f5bb8b * Use temporary filenames by tmpsnam * Do not delete Transducers in data structures since it will segfault all enchant-based applications in dtor
5449a28 libhfst-style exception macros and some more informative messages
e9c551d Document dependencies
105b6e2 mac os x fixes: * strndup * libarchive installed without headers
a2b4ea5 Preliminary zhfst support
d7bf3e8 Example from: http://norvig.com/spell-correct.html and similar tests
a2b9d15 The test for final weighted transitions involves target_index == 1 instead of weight == INFINITE_WEIGHT
3cb5d8a Fixed bug involving bytewise casting of longs to floats (I misunderstood what static_cast really does I guess).
7ff6a13 fread returns element count, not byte count
9c01f47 duplicate definitions
a433823 fix msvc problems
4f2c186 check fread return value as advised by gcc
b324b05 Removed unnecessary test
9f801d0 Understand hfst3 headers, don't demand weightedness at header-reading stage
693eaea use hfstospell library name for compatibility or whatever
c0a2985 MSVC fixes: * include <string> when using string * use boolean operators instead of aliases?
4f16033 add pkgconfig stuff
5ebbfec autoconfiscate :-)
37ad031 Add licences everywhere for release
4e7c098 make directories that do not exist
5c450f7 Install to destdir
a93f59d Ignore compiled libraries.
7e0a5b4 fix missing dash in mac dylib magic
3bd799d Silently ignore if empty labels are missing
b6bc407 Make dynamic or share libraries
3767ad9 Added some fancy rule autodetection
5b87174 Fixes to input format handling
9a7418c New input file syntax
fd3eb19 More speed improvements
a78427f Various optimizations
dff1eab Critical bugfix, output now believed to be correct
c227d79 Diagnostics and info about expected transducer model in test script
196ff02 More helpful help message for test script
1d1acfd Should be executable.
c50bb9b Support for OTHER symbol in test script
44e6e9c Added profiler flag to debug compilation target and made demo utility exit on empty lines
2588216 Trivial cosmetic changes
a059914 More header cleanup
9c551c6 Renamed variable
7035738 Misc. code cleanup and memory savings
7339e8c Free memory holding transducer data after parsing
915ef1d Some more const sprinkling
59040cc Misc. nonfunctional cleanup
5852517 Hack to make the test script handle unicode
915a6e2 Added character swaps to edit distance script. You have to enable them with the -s flag - they generate A(A-1)*D new states and twice that many transitions, where A is the size of the alphabet and D is the edit distance. Pretty expensive. Is there a better way?
f77baa8 Improvements to editdist.py - see test/editdist.py --help
fa61c7d Put 1.0 weights on the test generator script
2bea19e Minor enhancement to test script
fb6b65f Added helpful runtime error for alphabet translation problems, updated demo utility to make use of it
31a3561 Better checking of read operations, added relevant exceptions
397a4b5 Made example formatting in README more consistent - I may have broken Tommi's commit a bit, but I think it's ok now...
2e933a8 Minor improvement to demo
a7cf1f3 Static lib and fixes to examples in readme
161d40d Added comment
b383a42 Reversed previous commit, which did the opposite of what the commit message said it would. Committer will go to bed now...
a4877dc Return results in reverse order by weight, ie. in order of quality (instead of the opposite)
5827928 Removed obsolete dependency on cassert
d8f388e Fixed some comments
5d1d58a One more formatting fix
403d7ed Fixed typo in comment
d49e37d Moved getopt dependency from ospell.h to the demo utility proper
5e1b2f1 Formatting improvements
f11ed65 Introduced an exception for handling alphabet translation failure, fixed typo in help string, updated README
73647b3 Made some changes to correction-storing data structures to make sure each correction string only appears once
25207f7 Fatal bug(s) fixed, (more) correct flag diacritic functionality
57441ea New test script
82620ee Fixed typo
778d477 Fixed a braindead bug that subtly broke everything, this should make some code redundant
2a3ffcf A way to handle flag diacritics
47d35eb Trivial Makefile fix for commandline tester
564e7f2 Replaced some ungracious exits with exceptions and made small change to README
d92e82c Added README and some fixes
bd2c6c3 Implemented spellchecking and correction library functions; documentation, proper packaging and esp. functioning flag diacritics still to be done.
d3ed0b7 Temporarily de-autotooled ospell
7344a8e Incorporated queue in speller proper
dbc3bae Fixed behaviour, added weightedness scaffolding
e4851c1 Corrected typo.
9df6f72 Initial commit of hfst-ospell. Basic functionality including OTHER symbol (@?@) and runtime alphabet translation is present; weighted transducers (probably to be the only option) and flag diacritic states for the mutator and lexicon forthcoming.