Switch to spaCy as the default parser #4
Labels
enhancement
New feature or request
Comments
lukehsiao changed the title from "Switch to spacy as the default parser" to "Switch to spaCy as the default parser" on Feb 7, 2018
Output from CoreNLP on the simple documents:
CoreNLP is splitting different formatting (e.g. italics, bold, etc.) into different phrases.
Inspecting 5 candidates using the following code at the end of the stg_temp_max tutorial:

    from fonduer.features import features

    cand = []
    log = open('scapy_log_features.txt', 'w')

    # Collect the candidates whose first span starts with 'BC856' and whose second span is '150'.
    for i, c in enumerate(train_cands):
        if c[0].get_span().startswith('BC856') and c[1].get_span() == '150':
            print("###", i)
            cand.append(c)
    print("Candidates: {}".format(len(cand)))

    # Write every feature of each selected candidate to the log file.
    for c in cand:
        log.write("Candidate: {}\n".format(c))
        for f in list(features.get_all_feats([c])):
            log.write("  Feature: {}\n".format(f))
    log.close()
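To compare the features produced under two parsers, one option is to diff two such log files. The following is a minimal sketch, not part of Fonduer: `parse_feature_log` and `diff_feature_logs` are hypothetical helper names, and it assumes both logs use the `Candidate: ...` / `Feature: ...` line format written by the snippet above and that the same candidate prints identically in both runs.

```python
from collections import defaultdict


def parse_feature_log(text):
    """Map each 'Candidate: ...' line to the set of 'Feature: ...' lines below it."""
    feats = defaultdict(set)
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Candidate: "):
            current = line[len("Candidate: "):]
            feats[current]  # register the candidate even if it has no features
        elif line.startswith("Feature: ") and current is not None:
            feats[current].add(line[len("Feature: "):])
    return dict(feats)


def diff_feature_logs(text_a, text_b):
    """For candidates in both logs, return (only_in_a, only_in_b) where features differ."""
    a = parse_feature_log(text_a)
    b = parse_feature_log(text_b)
    return {
        c: (a[c] - b[c], b[c] - a[c])
        for c in a.keys() & b.keys()
        if a[c] != b[c]
    }
```

Reading each log with `open(path).read()` and passing both strings to `diff_feature_logs` gives, per candidate, the features that appear in only one of the two runs. Candidates whose feature sets match are omitted.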
lukehsiao added a commit that referenced this issue on Feb 11, 2018
stackoverflowed pushed a commit to stackoverflowed/multimodal that referenced this issue on Dec 4, 2021
* Update wording of content
* Remove paleo tutorial
  The paleo dataset is too noisy to perform stably without much more data than would be reasonable to run in this tutorial. Moving it to a separate branch.
Support using spaCy as the lingual parser for the old parser (i.e. the one that does not support pdftotree output).
TODO: