eCourts Portal import proof of concept #431
Merged
+5,255
−7
34 commits
671c436 copelco: portal import proof of concept
b5a86f7 robert-w-gries: Automatically embed username (#432)
343ed34 copelco: run prettier
f176a8e copelco: iterate on parsing model; parse charges
268bf4f copelco: parse case type
a7277e8 copelco: parse case status
c2fb97e copelco: refactor into parser modules
b5b6388 copelco: remove unused loggers
d654ca2 copelco: reconnect with transform_portal_record
1b75985 copelco: fix comment
dc82bf7 copelco: fix tests
ede3c48 copelco: fix import
770f5f9 copelco: start parsing dispositions
6d8be85 copelco: transform offenses
9519ed9 copelco: test parsers
aa3f672 copelco: use localhost during development
48a49ce copelco: add env name to bookmarklet
674a676 copelco: save source HTML in record data
dbdbd60 copelco: run create_batch_petitions
ccdb059 copelco: Merge remote-tracking branch 'origin/master' into portal-import
17d12a3 copelco: run prettier
d0fb35e copelco: transform jurisdiction, offense date, offense action, and offense dis…
1cc027b copelco: remove README
30ecee0 copelco: reorg parser tests
73f0d56 copelco: migrate constants
109fdf2 copelco: Merge branch 'master' into portal-import
f0b6c2a copelco: add severity constants
8f8ff44 copelco: Merge branch 'portal-import' of github.com:deardurham/dear-petition i…
b402407 copelco: add extract test
0881bc7 copelco: test transform
f9ea96f copelco: Update dear_petition/portal/etl/parsers/case_info.py
6ddd3e3 copelco: add success alert
bcbffef copelco: save page address to metadata
b0c1516 copelco: use window.location.origin for bookmarklet
@@ -0,0 +1,6 @@
from django.apps import AppConfig


class PortalConfig(AppConfig):
    default_auto_field = "django.db.models.BigAutoField"
    name = "dear_petition.portal"
@@ -0,0 +1,29 @@
from bs4 import BeautifulSoup

from .models import CaseSummary, PartyInfo, PortalRecord
from .parsers import case_summary, dispositions, case_info, party_info


def extract_portal_record(source):
    """Parse HTML source to extract eCourts Portal record"""
    soup = BeautifulSoup(source, features="html.parser")
    return PortalRecord(
        case_summary=parse_case_summary(soup),
        case_info=case_info.parse_case_information(soup),
        party_info=parse_party_information(soup),
        dispositions=dispositions.parse_dispositions(soup),
    )


def parse_case_summary(soup):
    """Case Summary section"""
    return CaseSummary(
        case_number=case_summary.parse_case_number(soup) or "",
        county=case_summary.parse_county(soup) or "",
        court=case_summary.parse_court(soup) or "",
    )


def parse_party_information(soup):
    """Party Information section"""
    return PartyInfo(defendant_name=party_info.parse_defendant_name(soup))
@@ -0,0 +1,21 @@
import logging

from dear_petition.petition.models import Batch, CIPRSRecord
from dear_petition.petition.etl.load import create_batch_petitions

from .transform import transform_portal_record

__all__ = ("import_portal_record",)

logger = logging.getLogger(__name__)


def import_portal_record(user, source: str, location: str):
    """Import eCourts Portal records into models."""
    logger.info("Importing Portal record")
    data = transform_portal_record(source, location)
    batch, _ = Batch.objects.get_or_create(user=user, label=data["Defendant"]["Name"])
    record = CIPRSRecord(batch=batch, data=data)
    record.refresh_record_from_data()
    record.save()
    create_batch_petitions(batch)
@@ -0,0 +1,101 @@
import datetime as dt
from typing import List, Union

from pydantic import BaseModel, field_validator

from dear_petition.petition import constants


class CaseSummary(BaseModel):
    case_number: str
    county: str
    court: str


class Charge(BaseModel):
    number: Union[int, None]
    offense: str
    statute: str
    degree: str
    offense_date: Union[dt.date, None]
    filed_date: Union[dt.date, None]

    @field_validator("offense_date", "filed_date", mode="before")
    @classmethod
    def parse_date(cls, v):
        if isinstance(v, str):
            return dt.datetime.strptime(v, "%m/%d/%Y").date()  # date, not datetime
        return v

    def transform_severity(self):
        """Attempt to convert Portal's degree to CIPRS severity"""
        severity = self.degree
        if self.degree in constants.CHARGED_DEGREE_FELONY:
            severity = constants.SEVERITY_FELONY
        elif self.degree in constants.CHARGED_DEGREE_MISDEMEANOR:
            severity = constants.SEVERITY_MISDEMEANOR
        return severity


class CaseInfo(BaseModel):
    case_type: str
    case_status: str
    case_status_date: Union[dt.date, None]
    charges: List[Charge]

    @field_validator("case_status_date", mode="before")
    @classmethod
    def parse_date(cls, v):
        if isinstance(v, str):
            return dt.datetime.strptime(v, "%m/%d/%Y").date()  # date, not datetime
        return v


class PartyInfo(BaseModel):
    defendant_name: str


class Disposition(BaseModel):
    event_date: Union[dt.date, None]
    event: str
    charge_number: int
    charge_offense: str
    criminal_disposition: str

    @field_validator("event_date", mode="before")
    @classmethod
    def parse_date(cls, v):
        if isinstance(v, str):
            return dt.datetime.strptime(v, "%m/%d/%Y").date()  # date, not datetime
        return v

    def is_dismissed(self) -> bool:
        return self.criminal_disposition in constants.DISMISSED_DISPOSITION_METHODS

    def transform_action(self) -> str:
        action = self.event
        if self.is_dismissed():
            action = constants.CHARGED
        return action

    def transform_disposition_method(self) -> str:
        if self.is_dismissed():
            return constants.DISTRICT_COURT_WITHOUT_DA_LEAVE
        return self.criminal_disposition


class PortalRecord(BaseModel):
    case_summary: CaseSummary
    case_info: CaseInfo
    party_info: PartyInfo
    dispositions: List[Disposition]

    def get_charge_by_number(self, charge_number: int) -> Union[Charge, None]:
        """Return matching CaseInfo.charges Charge by charge_number"""
        for charge in self.case_info.charges:
            if charge.number == charge_number:
                return charge

    def transform_offense_date(self) -> str:
        offense_dates = [c.offense_date for c in self.case_info.charges if c.offense_date]
        return min(offense_dates).isoformat()  # ValueError if no charge has a date
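
Because `Charge.offense_date` may be `None`, computing the earliest offense date has to account for missing values: `min()` raises `TypeError` when asked to order `None` against a date, and `ValueError` on an empty sequence. The following is a minimal stdlib-only sketch of that behavior. The `earliest_offense_date` helper is hypothetical and not part of the PR, and returning `None` when no date is known is an assumption about what callers would want.

```python
import datetime as dt
from typing import Iterable, Optional


def earliest_offense_date(dates: Iterable[Optional[dt.date]]) -> Optional[str]:
    """Return the earliest known date as an ISO 8601 string, or None.

    Hypothetical helper for illustration; not part of the PR.
    """
    # Drop missing dates before comparing.
    known = [d for d in dates if d is not None]
    if not known:
        # Assumed fallback: signal "no date available" rather than raise.
        return None
    return min(known).isoformat()


dates = [dt.date(2020, 3, 1), None, dt.date(2019, 12, 25)]

# Mixing None and dates makes the built-in min() raise TypeError.
try:
    min(dates)
except TypeError:
    print("min() cannot order None against a date")

print(earliest_offense_date(dates))
print(earliest_offense_date([None]))
```

Whether a missing date should map to `None`, an empty string, or an error depends on what the downstream record data expects, which this sketch does not settle.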
What happens if the date string is in a different format? That's a common issue with CIPRS pdfs.
Currently it'll throw a ValidationError, which hopefully will get logged in Sentry.
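
To make the exchange above concrete: `datetime.strptime` raises `ValueError` when a string does not match the given format, and pydantic reports a `ValueError` raised inside a field validator as a `ValidationError`. If other formats do turn up, one option is a parser that tries several formats in order. This is a minimal stdlib sketch, not the PR's approach: `parse_portal_date` is a hypothetical name, and every format other than `"%m/%d/%Y"` is an assumption, since only that one appears in the diff.

```python
import datetime as dt

# Only "%m/%d/%Y" appears in the PR; the remaining formats are assumed examples.
DATE_FORMATS = ("%m/%d/%Y", "%Y-%m-%d", "%m/%d/%y")


def parse_portal_date(value: str) -> dt.date:
    """Try each candidate format in order; raise ValueError if none match.

    Hypothetical helper for illustration; not part of the PR.
    """
    text = value.strip()
    for fmt in DATE_FORMATS:
        try:
            return dt.datetime.strptime(text, fmt).date()
        except ValueError:
            # This format did not match; try the next one.
            continue
    raise ValueError(f"Unrecognized date format: {value!r}")


print(parse_portal_date("01/15/2020"))
print(parse_portal_date("2020-01-15"))
```

Raising `ValueError` for unmatched input keeps the current behavior inside a pydantic validator: the failure would still surface as a `ValidationError` rather than being silently swallowed.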