Add timedelta, timedelta64 and datetime64 plus respective conversions #509

Open. Wants to merge 25 commits into base: main.
Changes from 7 commits.
Commits (25):
093bdcb
add support for `@py from <module> import`
Feb 21, 2022
c63ab23
Merge branch 'cjdoris:main' into main
hhaensel Jul 5, 2023
5a65a52
Merge branch 'main' of https://github.com/hhaensel/PythonCall.jl
hhaensel Jun 16, 2024
5740b59
support timedelta, timedelta64, datetime64 and respective conversions
hhaensel Jun 16, 2024
f897600
fix week kw in pytimedelta64, typo (space) in builtins
hhaensel Jun 16, 2024
1e8d410
correct handling of count in 64-bit conversion rules
Jun 17, 2024
b3cc79f
Merge remote-tracking branch 'origin/main' into hh-timedelta64
hhaensel Jul 28, 2024
9dfc0dd
Apply suggestions from code review
hhaensel Sep 6, 2024
a3a2b97
Apply suggestions from code review part II
hhaensel Sep 6, 2024
daf9759
reviewers suggestions part III
hhaensel Sep 6, 2024
60e0daa
Merge branch 'JuliaPy:main' into hh-timedelta64
hhaensel Jan 19, 2025
7391b8d
add tests for pytimedelta
Jan 19, 2025
8f28567
fix micro/millisecond in pytimedelta,
Jan 20, 2025
46efe53
add tests for pytimedelta, pytimedelta64 and conversion of pytimedelt…
Jan 20, 2025
d36c113
fix pytimdelta(years/months=0), add pydatetime64(::Union{Date, DateTi…
hhaensel Jan 21, 2025
66459c6
add tests for pytimedelta64, pydatetime64
hhaensel Jan 21, 2025
3c51b54
support unitless timedelta64, keep unit per default, add keyword cano…
Jan 21, 2025
fac1ef8
add tests for timedelta64 canonicalize
Jan 21, 2025
d00a788
Merge branch 'JuliaPy:main' into hh-timedelta64
hhaensel Jan 21, 2025
5abcf1d
add CondaPkg and DataFrames as extras in Project.toml
Jan 21, 2025
34f35ce
fix pandas testing
Jan 20, 2025
9864173
fix compat with julia < 1.8
Jan 21, 2025
3148bae
specialize Base.convert(::<:Period, CompoundPeriod) for julia < 1.8
Jan 21, 2025
94b47df
change CondaPkg environment
Jan 21, 2025
f66f526
fix test/Project.toml, adapt runtests.jl
Jan 21, 2025
1 change: 1 addition & 0 deletions src/Convert/Convert.jl
@@ -45,6 +45,7 @@ using ..Core:
pythrow,
pybool_asbool
using Dates: Date, Time, DateTime, Second, Millisecond, Microsecond, Nanosecond
using Dates: Year, Month, Day, Hour, Minute, Week, Period, CompoundPeriod, canonicalize

import ..Core: pyconvert

70 changes: 70 additions & 0 deletions src/Convert/numpy.jl
@@ -27,6 +27,66 @@ const NUMPY_SIMPLE_TYPES = [
("complex128", ComplexF64),
]

function pydatetime64(
_year::Int=0, _month::Int=1, _day::Int=1, _hour::Int=0, _minute::Int=0,_second::Int=0, _millisecond::Int=0, _microsecond::Int=0, _nanosecond::Int=0;
year::Int=_year, month::Int=_month, day::Int=_day, hour::Int=_hour, minute::Int=_minute, second::Int=_second,
millisecond::Int=_millisecond, microsecond::Int=_microsecond, nanosecond::Int=_nanosecond
)
pyimport("numpy").datetime64("$(DateTime(year, month, day, hour, minute, second))") + pytimedelta64(;millisecond, microsecond, nanosecond)

Contributor:
Is pyimport("numpy") the correct API call, or is that just to be used in user packages?

Contributor Author:
I saw similar calls in other places in the package, so I took this approach. I also wouldn't know how to construct a timedelta64 without calling pyimport.
Please let me know if there's a better solution.

end
function pydatetime64(@nospecialize(x::T)) where T <: Period
T <: Union{Week, Day, Hour, Minute, Second, Millisecond, Microsecond} ||
error("Unsupported Period type: ", "Year, Month and Nanosecond are not supported, consider using pytimedelta64 instead.")
args = T .== (Day, Second, Millisecond, Microsecond, Minute, Hour, Week)
pydatetime64(x.value .* args...)

Contributor:
Probably better to rewrite this with Base.Cartesian.@nif rather than doing a masked sum, since you know there will be only one element in the sum.

Contributor Author:
This sum is damned fast (16 ns), and I couldn't beat it with a different version.
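
For reference, the masking step discussed in this thread can be reproduced in plain Julia, without any Python call. This is only a sketch: `ARG_ORDER` and `masked_args` are hypothetical names that do not appear in the PR, and the tuple is copied verbatim from the `args = ...` line in this diff.

```julia
using Dates

# Hypothetical names, not part of the PR. The tuple is copied from the
# `args = T .== (...)` line in the diff.
const ARG_ORDER = (Day, Second, Millisecond, Microsecond, Minute, Hour, Week)

# `T .== ARG_ORDER` broadcasts the type against the tuple and yields a tuple of
# Bools with at most one `true`; multiplying by the value puts it into the
# matching slot and leaves zeros everywhere else.
masked_args(x::T) where {T<:Period} = x.value .* (T .== ARG_ORDER)

masked_args(Hour(3))   # (0, 0, 0, 0, 0, 3, 0)
masked_args(Week(2))   # (0, 0, 0, 0, 0, 0, 2)
```

The resulting tuple is what gets splatted into the positional constructor, so only one slot is ever non-zero.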

end
function pydatetime64(x::CompoundPeriod)
x = canonicalize(x)
isempty(x.periods) ? pydatetime64(Second(0)) : sum(pydatetime64.(x.periods))
end
export pydatetime64

function pytimedelta64(
_year::Int=0, _month::Int=0, _day::Int=0, _hour::Int=0, _minute::Int=0, _second::Int=0, _millisecond::Int=0, _microsecond::Int=0, _nanosecond::Int=0, _week::Int=0;
year::Int=_year, month::Int=_month, day::Int=_day, hour::Int=_hour, minute::Int=_minute, second::Int=_second, microsecond::Int=_microsecond, millisecond::Int=_millisecond, nanosecond::Int=_nanosecond, week::Int=_week)
pytimedelta64(sum((
Year(year), Month(month), # you cannot mix year or month with any of the below units in python, the error will be thrown by `pytimedelta64(::CompoundPeriod)`

Contributor (@MilesCranmer, Aug 23, 2024):
This comment

> you cannot mix year or month with any of the below units in python, the error will be thrown by pytimedelta64(::CompoundPeriod)

should be presented to the user as a descriptive error message rather than as a comment in the function.

Contributor Author:
Maybe the comment isn't clear enough.
Python already throws a descriptive, easily understood error in case of wrong usage, so there is no need for us to do so. Agree?

Day(day), Hour(hour), Minute(minute), Second(second), Millisecond(millisecond), Microsecond(microsecond), Nanosecond(nanosecond), Week(week))
))
end
function pytimedelta64(@nospecialize(x::T)) where T <: Period
index = findfirst(==(T), (Year, Month, Week, Day, Hour, Minute, Second, Millisecond, Microsecond, Nanosecond, T))
unit = ("Y", "M", "W", "D", "h", "m", "s", "ms", "us", "ns", "")[index]
pyimport("numpy").timedelta64(x.value, unit)
end
function pytimedelta64(x::CompoundPeriod)
x = canonicalize(x)
isempty(x.periods) ? pytimedelta64(Second(0)) : sum(pytimedelta64.(x.periods))
end
export pytimedelta64

function pyconvert_rule_datetime64(::Type{DateTime}, x::Py)
unit, count = pyconvert(Tuple, pyimport("numpy").datetime_data(x))
value = reinterpret(Int64, pyconvert(Vector, x))[1]
units = ("Y", "M", "W", "D", "h", "m", "s", "ms", "us", "ns")
types = (Year, Month, Week, Day, Hour, Minute, Second, Millisecond, Microsecond, Nanosecond)
T = types[findfirst(==(unit), units)]
pyconvert_return(DateTime(_base_datetime) + T(value * count))

Contributor:
Similar to the other comment: you should write this using Base.Cartesian.@nif over the types tuple to avoid dynamic dispatch.

Contributor Author:
Again, I tested this with Julia function calls and found that the Julia part takes around 25 ns, whereas the Python call takes around 1.5 µs.

end

function pyconvert_rule_timedelta64(::Type{CompoundPeriod}, x::Py)
unit, count = pyconvert(Tuple, pyimport("numpy").datetime_data(x))
value = reinterpret(Int64, pyconvert(Vector, x))[1]

Contributor:
Is reinterpret safe here? Is there a better alternative to use?

Contributor Author:
I thought pyconvert creates a new Julia Vector that is not mapped onto Python data. If it were mapped, we'd need to wrap the vector in a copy().

units = ("Y", "M", "W", "D", "h", "m", "s", "ms", "us", "ns")
types = (Year, Month, Week, Day, Hour, Minute, Second, Millisecond, Microsecond, Nanosecond)
T = types[findfirst(==(unit), units)]
pyconvert_return(CompoundPeriod(T(value * count)) |> canonicalize)

Contributor:
The proper way to do this would be to use Base.Cartesian.@nif. That way you could write this code to avoid dynamic dispatch on types (which will be very slow).
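
As an illustration of what the `@nif` suggestion might look like, here is a minimal sketch. It is not code from the PR: `UNIT_NAMES`, `PERIOD_TYPES`, and `period_from_unit` are hypothetical names, and the unit and type tuples are copied from the diff. The return type is still a union of period types, so this only shows how the lookup unrolls into an explicit `if`/`elseif` chain.

```julia
using Dates

# Hypothetical constants; contents copied from the `units`/`types` tuples in the diff.
const UNIT_NAMES = ("Y", "M", "W", "D", "h", "m", "s", "ms", "us", "ns")
const PERIOD_TYPES = (Year, Month, Week, Day, Hour, Minute, Second, Millisecond, Microsecond, Nanosecond)

# Hypothetical helper. `@nif 11` expands to an if/elseif chain: the condition is
# generated for i = 1..10, and the last expression becomes the final `else` branch.
function period_from_unit(unit::AbstractString, value::Integer)
    Base.Cartesian.@nif(
        11,
        i -> (unit == UNIT_NAMES[i]),
        i -> PERIOD_TYPES[i](value),
        i -> error("unsupported unit: ", unit)
    )
end

period_from_unit("h", 3)    # Hour(3)
period_from_unit("ns", 7)   # Nanosecond(7)
```

Each branch calls one concrete constructor, in contrast to indexing the types tuple with a runtime index. Whether that matters next to the cost of the Python call is the open question in this review.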

end

function pyconvert_rule_timedelta64(::Type{T}, x::Py) where T<:Period
pyconvert_return(convert(T, pyconvert_rule_timedelta64(CompoundPeriod, x)))
end

function init_numpy()
for (t, T) in NUMPY_SIMPLE_TYPES
isbool = occursin("bool", t)
@@ -54,4 +114,14 @@ function init_numpy()
iscomplex && pyconvert_add_rule(name, Complex, rule)
isnumber && pyconvert_add_rule(name, Number, rule)
end

priority = PYCONVERT_PRIORITY_ARRAY
pyconvert_add_rule("numpy:datetime64", DateTime, pyconvert_rule_datetime64, priority)
for T in (CompoundPeriod, Year, Month, Day, Hour, Minute, Second, Millisecond, Microsecond, Nanosecond, Week)
pyconvert_add_rule("numpy:timedelta64", T, pyconvert_rule_timedelta64, priority)
end

Contributor (@MilesCranmer, Aug 23, 2024):
Since Julia is unlikely to unroll this loop, you should use Base.Cartesian.@nexprs here to avoid dynamic dispatch.

Contributor Author:
I tried my best, but I'm not sure how to test whether this will speed things up.


priority = PYCONVERT_PRIORITY_CANONICAL
pyconvert_add_rule("numpy:datetime64", DateTime, pyconvert_rule_datetime64, priority)
pyconvert_add_rule("numpy:timedelta64", Nanosecond, pyconvert_rule_timedelta, priority)
end
6 changes: 6 additions & 0 deletions src/Convert/pyconvert.jl
@@ -428,6 +428,12 @@ function init_pyconvert()
pyimport("collections.abc" => ("Iterable", "Sequence", "Set", "Mapping"))...,
)

priority = PYCONVERT_PRIORITY_ARRAY
pyconvert_add_rule("datetime:datetime", DateTime, pyconvert_rule_datetime, priority)
for T in (Millisecond, Second, Nanosecond, Day, Hour, Minute, Second, Millisecond, Week, CompoundPeriod)
pyconvert_add_rule("datetime:timedelta", T, pyconvert_rule_timedelta, priority)
end

priority = PYCONVERT_PRIORITY_CANONICAL
pyconvert_add_rule("builtins:NoneType", Nothing, pyconvert_rule_none, priority)
pyconvert_add_rule("builtins:bool", Bool, pyconvert_rule_bool, priority)
13 changes: 13 additions & 0 deletions src/Convert/rules.jl
@@ -512,3 +512,16 @@ function pyconvert_rule_timedelta(::Type{Second}, x::Py)
end
return Second(days * 3600 * 24 + seconds)
end

function pyconvert_rule_timedelta(::Type{<:CompoundPeriod}, x::Py)
days = pyconvert(Int, x.days)
seconds = pyconvert(Int, x.seconds)
microseconds = pyconvert(Int, x.microseconds)
nanoseconds = pyhasattr(x, "nanoseconds") ? pyconvert(Int, x.nanoseconds) : 0
timedelta = Day(days) + Second(seconds) + Microsecond(microseconds) + Nanosecond(nanoseconds)
return pyconvert_return(timedelta)
end

function pyconvert_rule_timedelta(::Type{T}, x::Py) where T<:Period
pyconvert_return(convert(T, pyconvert_rule_timedelta(CompoundPeriod, x)))
end
12 changes: 11 additions & 1 deletion src/Core/Core.jl
@@ -25,7 +25,17 @@ using Dates:
second,
millisecond,
microsecond,
nanosecond
nanosecond,
Day,
Hour,
Week,
Minute,
Second,
Millisecond,
Microsecond,
Period,
CompoundPeriod,
canonicalize
using MacroTools: MacroTools, @capture
using Markdown: Markdown

1 change: 1 addition & 0 deletions src/Core/Py.jl
@@ -158,6 +158,7 @@ Py(
Py(x::Date) = pydate(x)
Py(x::Time) = pytime(x)
Py(x::DateTime) = pydatetime(x)
Py(x::Union{Period, CompoundPeriod}) = pytimedelta(x)

Base.string(x::Py) = pyisnull(x) ? "<py NULL>" : pystr(String, x)
Base.print(io::IO, x::Py) = print(io, string(x))
18 changes: 18 additions & 0 deletions src/Core/builtins.jl
@@ -1167,6 +1167,24 @@ end
pydatetime(x::Date) = pydatetime(year(x), month(x), day(x))
export pydatetime

function pytimedelta(
_day::Int=0, _second::Int=0, _microsecond::Int=0, _millisecond::Int=0, _minute::Int=0, _hour::Int=0, _week::Int=0;
day::Int=_day, second::Int=_second, microsecond::Int=_microsecond, millisecond::Int=_millisecond, minute::Int=_minute, hour::Int=_hour, week::Int=_week
)
pyimport("datetime").timedelta(day, second, microsecond, millisecond, minute, hour, week)
end
function pytimedelta(@nospecialize(x::T)) where T <: Period
T <: Union{Week, Day, Hour, Minute, Second, Millisecond, Microsecond} ||
error("Unsupported Period type: ", "Year, Month and Nanosecond are not supported, consider using pytimedelta64 instead.")
args = T .== (Day, Second, Millisecond, Microsecond, Minute, Hour, Week)
pytimedelta(x.value .* args...)
end
function pytimedelta(x::CompoundPeriod)
x = canonicalize(x)
isempty(x.periods) ? pytimedelta(Second(0)) : sum(pytimedelta.(x.periods))
end
export pytimedelta

function pytime_isaware(x)
tzinfo = pygetattr(x, "tzinfo")
if pyisnone(tzinfo)
16 changes: 16 additions & 0 deletions src/PyMacro/PyMacro.jl
@@ -886,6 +886,9 @@ For example:
- `import x: f as g` is translated to `g = pyimport("x" => "f")` (`from x import f as g` in Python)

Compound statements such as `begin`, `if`, `while` and `for` are supported.
Import statements are supported, e.g.
- `import foo, bar`
- `from os.path import join as py_joinpath, exists`

See the online documentation for more details.

@@ -895,6 +898,19 @@ See the online documentation for more details.
macro py(ex)
esc(py_macro(ex, __module__, __source__))
end

macro py(keyword, modulename, ex)
keyword == :from || return :( nothing )

d = Dict(isa(a.args[1], Symbol) ? a.args[1] => a.args[1] : a.args[1].args[1] => a.args[2] for a in ex.args)
vars = Expr(:tuple, values(d)...)
imports = Tuple(keys(d))

esc(quote
$vars = pyimport($(string(modulename)) => $(string.(imports)))
end)
end

export @py

end