Summing columns that are pyarrow StringScalar types.... #1316
Unanswered
jayceslesar asked this question in Q&A
Replies: 0 comments
Hey all, I convert a large CSV (usually ~50 GB, but it will get bigger) to HDF5 and have Vaex churn through it for a lot of stats and analytics. What is the best way to sum a column that is of StringScalar type? How I do it now is `amount = int(np.sum(np.asarray(current_sig_vals.values, dtype=int)))`, and I am wondering if Vaex has a solution for dealing with the pyarrow StringScalars (they are all "0" or "1", by the way). `current_sig_vals` is the Vaex expression that filters my `Vaex.dataframe` down to the data I want to look at. Would I want to convert it with something like `df.sum([column]).astype("int8")`, or something similar?