Sync with base #4

Merged 51 commits on May 3, 2024
dee176c
ci(rust): Pin coverage job to MacOS 13 for now (#15918)
stinodego Apr 26, 2024
e0d242d
Revert "build: use jemalloc in lts-cpu" (#15924)
ritchie46 Apr 27, 2024
ec1e4dc
build: pin mimalloc and macos-13 (#15925)
ritchie46 Apr 27, 2024
4a995b4
build: replace all macos-latest referrals with macos-13 (#15926)
ritchie46 Apr 27, 2024
3564a77
feat(rust!): Rename to `CsvParserOptions` to `CsvReaderOptions`, use …
stinodego Apr 27, 2024
4e7a0e1
fix: Remove ffspec from parquet reader (#15927)
ritchie46 Apr 27, 2024
f1846a9
feat: Add option to disable globbing in parquet (#15928)
ritchie46 Apr 27, 2024
7ae1e58
fix(python): series.search_sorted could support more types of input (…
reswqa Apr 28, 2024
49ef964
fix(python): Change recognition of numba ufunc (#15916)
deanm0000 Apr 28, 2024
9a1d8ae
feat: Add option to disable globbing in csv (#15930)
ritchie46 Apr 28, 2024
d247f1b
feat(python): don't require pyarrow for converting pandas to Polars i…
MarcoGorelli Apr 28, 2024
14b352f
fix(rust): typo in add_half_life takes ln(negative) (#15932)
jr200 Apr 28, 2024
031c926
fix: Set default limit for String column display to 30 and fix edge c…
stinodego Apr 28, 2024
c1474f6
build: Use default allocator for lts-cpu (#15941)
ritchie46 Apr 28, 2024
b6441d0
build: Don't import jemalloc (#15942)
ritchie46 Apr 28, 2024
ced6cde
test(python): Fix failing test (#15936)
stinodego Apr 29, 2024
2e28176
fix(python): Add missing "truncate_ragged_lines" parameter to `read_c…
alexander-beedie Apr 29, 2024
95cbf34
fix: Join validation for multiple keys (#15947)
ritchie46 Apr 29, 2024
c3f4201
feat(rust, python): Add `by` argument for `Expr.top_k` and `Expr.bott…
CanglongCl Apr 29, 2024
3bf32f0
fix: do not panic when comparing against categorical with incompatibl…
c-peters Apr 29, 2024
2805eca
fix(python): Fix dtype parameter in `pandas_to_pyseries` function (#1…
Apr 29, 2024
f0dbb6a
refactor(rust!): prepare for join coalescing argument (#15418)
ritchie46 Apr 29, 2024
9c96dca
fix: Finish adding `typed_lit` to help schema determination in SQL "e…
alexander-beedie Apr 30, 2024
b285a7f
feat: Add typed collection from par iterators (#15961)
ritchie46 Apr 30, 2024
81f4ac2
feat(python): Expose plan and expression nodes through `NodeTraverser…
wence- Apr 30, 2024
4b23768
chore(python): Even more Pyo3 0.21 Bound<> APIs (#15914)
itamarst Apr 30, 2024
075817b
fix(rust): Do not reverse null indices in descending arg_sort (#15974)
wence- Apr 30, 2024
9771e94
docs: Update link to R API docs (#15973)
eitsupi Apr 30, 2024
c9e786b
test(python): Set up TPC-H benchmark tests (#15908)
stinodego Apr 30, 2024
31eaabe
feat: Support Decimal read from IPC (#15965)
nameexhaustion May 1, 2024
556cc83
feat(python): Add post-optimization callback (#15972)
ritchie46 May 1, 2024
be09246
fix: Fix PartialEq for DataType::Unknown (#15992)
ritchie46 May 1, 2024
8929395
fix: Treat splitting by empty string as iterating over chars (#15922)
haocheng6 May 1, 2024
8bc4db5
ci: bump crate-ci/typos from 1.20.10 to 1.21.0 (#15990)
dependabot[bot] May 1, 2024
e900f72
chore(python): bump typos from 1.20.10 to 1.21.0 in /py-polars (#15985)
dependabot[bot] May 1, 2024
65bbdaf
chore(python): bump pytest from 8.1.1 to 8.2.0 in /py-polars (#15986)
dependabot[bot] May 1, 2024
80233b8
fix: Ternary supertype dynamics (#15995)
ritchie46 May 1, 2024
6f8274d
chore(python): bump pytest-xdist from 3.5.0 to 3.6.1 in /py-polars (#…
dependabot[bot] May 1, 2024
c3c5bef
fix(python): converting from numpy datetime64 and overriding dtype wi…
MarcoGorelli May 1, 2024
d5cf038
feat: Additional `uint` datatype support for the SQL interface (#15993)
alexander-beedie May 2, 2024
ebd8aec
fix: Crash/incorrect group_by/n_unique on categoricals created by (q)…
nameexhaustion May 2, 2024
5062732
test(python): Improve hypothesis strategy for decimals (#16001)
stinodego May 2, 2024
5caf2d8
docs(python): Improve user-guide doc of UDF (#15923)
May 2, 2024
e259d7f
refactor: Add some comments (#16008)
ritchie46 May 2, 2024
51f507f
feat: Improve dynamic supertypes (#16009)
ritchie46 May 2, 2024
414e5f6
docs(python): correct default in rolling_* function examples (#16000)
MarcoGorelli May 2, 2024
b6e0844
fix: Fix CSE case where upper plan has no projection (#16011)
ritchie46 May 2, 2024
bf955d9
fix: properly handle nulls in DictionaryArray::iter_typed (#16013)
orlp May 2, 2024
f03e7e0
docs(python): Remove unwanted linebreaks from docstrings (#16002)
bertiewooster May 2, 2024
b7b3da6
feat: Convert concat during IR conversion (#16016)
ritchie46 May 2, 2024
864e750
feat: raise more informative error messages in rolling_* aggregations…
MarcoGorelli May 2, 2024
2 changes: 1 addition & 1 deletion .github/workflows/lint-global.yml
@@ -15,4 +15,4 @@ jobs:
       - name: Lint Markdown and TOML
         uses: dprint/[email protected]
       - name: Spell Check with Typos
-        uses: crate-ci/typos@v1.20.10
+        uses: crate-ci/typos@v1.21.0
9 changes: 5 additions & 4 deletions .github/workflows/release-python.yml
@@ -86,14 +86,14 @@ jobs:
       fail-fast: false
       matrix:
         package: [polars, polars-lts-cpu, polars-u64-idx]
-        os: [ubuntu-latest, macos-latest, windows-32gb-ram]
+        os: [ubuntu-latest, macos-13, windows-32gb-ram]
         architecture: [x86-64, aarch64]
         exclude:
           - os: windows-32gb-ram
             architecture: aarch64

     env:
-      SED_INPLACE: ${{ matrix.os == 'macos-latest' && '-i ''''' || '-i'}}
+      SED_INPLACE: ${{ matrix.os == 'macos-13' && '-i ''''' || '-i'}}
       CPU_CHECK_MODULE: py-polars/polars/_cpu_check.py

     steps:
@@ -128,7 +128,7 @@ jobs:
         if: matrix.architecture == 'x86-64'
         env:
           IS_LTS_CPU: ${{ matrix.package == 'polars-lts-cpu' }}
-          IS_MACOS: ${{ matrix.os == 'macos-latest' }}
+          IS_MACOS: ${{ matrix.os == 'macos-13' }}
         # IMPORTANT: All features enabled here should also be included in py-polars/polars/_cpu_check.py
         run: |
           if [[ "$IS_LTS_CPU" = true ]]; then
@@ -144,6 +144,7 @@ jobs:
         if: matrix.architecture == 'x86-64'
         env:
           FEATURES: ${{ steps.features.outputs.features }}
+          CFG: ${{ matrix.package == 'polars-lts-cpu' && '--cfg default_allocator' || '' }}
         run: echo "RUSTFLAGS=-C target-feature=${{ steps.features.outputs.features }} $CFG" >> $GITHUB_ENV

       - name: Set variables in CPU check module
@@ -159,7 +160,7 @@ jobs:
         if: matrix.architecture == 'aarch64'
         id: target
         run: |
-          TARGET=${{ matrix.os == 'macos-latest' && 'aarch64-apple-darwin' || 'aarch64-unknown-linux-gnu'}}
+          TARGET=${{ matrix.os == 'macos-13' && 'aarch64-apple-darwin' || 'aarch64-unknown-linux-gnu'}}
           echo "target=$TARGET" >> $GITHUB_OUTPUT

       - name: Set jemalloc for aarch64 Linux
6 changes: 4 additions & 2 deletions .github/workflows/test-coverage.yml
@@ -34,7 +34,9 @@ jobs:
   coverage-rust:
     # Running under ubuntu doesn't seem to work:
     # https://github.com/pola-rs/polars/issues/14255
-    runs-on: macos-latest
+    # Pinned on macos-13 because latest does not work:
+    # https://github.com/pola-rs/polars/issues/15917
+    runs-on: macos-13
     steps:
       - uses: actions/checkout@v4

@@ -85,7 +87,7 @@ jobs:
   coverage-python:
     # Running under ubuntu doesn't seem to work:
     # https://github.com/pola-rs/polars/issues/14255
-    runs-on: macos-latest
+    runs-on: macos-13
     steps:
       - uses: actions/checkout@v4

7 changes: 6 additions & 1 deletion .gitignore
@@ -8,7 +8,6 @@
 .yarn/
 coverage.lcov
 coverage.xml
-data/
 polars/vendor

 # OS
@@ -32,6 +31,12 @@ __pycache__/
 .cargo/
 target/

+# Data
+*.csv
+*.parquet
+*.feather
+*.tbl
+
 # Project
 /docs/data/
 /docs/images/
5 changes: 3 additions & 2 deletions Cargo.lock

Some generated files are not rendered by default.

1 change: 1 addition & 0 deletions LICENSE
@@ -1,4 +1,5 @@
 Copyright (c) 2020 Ritchie Vink
+Some portions Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.

 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal
2 changes: 1 addition & 1 deletion README.md
@@ -29,7 +29,7 @@
   -
   <a href="https://pola-rs.github.io/nodejs-polars/index.html">Node.js</a>
   -
-  <a href="https://rpolars.github.io/index.html">R</a>
+  <a href="https://pola-rs.github.io/r-polars/index.html">R</a>
   |
   <b>StackOverflow</b>:
   <a href="https://stackoverflow.com/questions/tagged/python-polars">Python</a>
2 changes: 1 addition & 1 deletion _typos.toml
@@ -29,4 +29,4 @@ extend-glob = ["*.gz"]
 check-file = false

 [files]
-extend-exclude = ["_typos.toml"]
+extend-exclude = ["_typos.toml", "dists.dss"]
22 changes: 8 additions & 14 deletions crates/polars-arrow/src/array/dictionary/mod.rs
@@ -25,7 +25,9 @@ use polars_error::{polars_bail, PolarsResult};
 use super::primitive::PrimitiveArray;
 use super::specification::check_indexes;
 use super::{new_empty_array, new_null_array, Array};
-use crate::array::dictionary::typed_iterator::{DictValue, DictionaryValuesIterTyped};
+use crate::array::dictionary::typed_iterator::{
+    DictValue, DictionaryIterTyped, DictionaryValuesIterTyped,
+};

 /// Trait denoting [`NativeType`]s that can be used as keys of a dictionary.
 /// # Safety
@@ -241,30 +243,22 @@ impl<K: DictionaryKey> DictionaryArray<K> {
     ///
     /// # Panics
     ///
-    /// Panics if the keys of this [`DictionaryArray`] have any null types.
-    /// If they do [`DictionaryArray::iter_typed`] should be called
+    /// Panics if the keys of this [`DictionaryArray`] has any nulls.
+    /// If they do [`DictionaryArray::iter_typed`] should be used.
     pub fn values_iter_typed<V: DictValue>(&self) -> PolarsResult<DictionaryValuesIterTyped<K, V>> {
         let keys = &self.keys;
         assert_eq!(keys.null_count(), 0);
         let values = self.values.as_ref();
         let values = V::downcast_values(values)?;
-        Ok(unsafe { DictionaryValuesIterTyped::new(keys, values) })
+        Ok(DictionaryValuesIterTyped::new(keys, values))
     }

     /// Returns an iterator over the optional values of [`Option<V::IterValue>`].
-    ///
-    /// # Panics
-    ///
-    /// This function panics if the `values` array
-    pub fn iter_typed<V: DictValue>(
-        &self,
-    ) -> PolarsResult<ZipValidity<V::IterValue<'_>, DictionaryValuesIterTyped<K, V>, BitmapIter>>
-    {
+    pub fn iter_typed<V: DictValue>(&self) -> PolarsResult<DictionaryIterTyped<K, V>> {
         let keys = &self.keys;
         let values = self.values.as_ref();
         let values = V::downcast_values(values)?;
-        let values_iter = unsafe { DictionaryValuesIterTyped::new(keys, values) };
-        Ok(ZipValidity::new_with_validity(values_iter, self.validity()))
+        Ok(DictionaryIterTyped::new(keys, values))
     }

     /// Returns the [`ArrowDataType`] of this [`DictionaryArray`]
70 changes: 68 additions & 2 deletions crates/polars-arrow/src/array/dictionary/typed_iterator.rs
@@ -1,7 +1,7 @@
 use polars_error::{polars_err, PolarsResult};

 use super::DictionaryKey;
-use crate::array::{Array, PrimitiveArray, Utf8Array, Utf8ViewArray};
+use crate::array::{Array, PrimitiveArray, StaticArray, Utf8Array, Utf8ViewArray};
 use crate::trusted_len::TrustedLen;
 use crate::types::Offset;

@@ -85,7 +85,8 @@ pub struct DictionaryValuesIterTyped<'a, K: DictionaryKey, V: DictValue> {
 }

 impl<'a, K: DictionaryKey, V: DictValue> DictionaryValuesIterTyped<'a, K, V> {
-    pub(super) unsafe fn new(keys: &'a PrimitiveArray<K>, values: &'a V) -> Self {
+    pub(super) fn new(keys: &'a PrimitiveArray<K>, values: &'a V) -> Self {
+        assert_eq!(keys.null_count(), 0);
         Self {
             keys,
             values,
@@ -137,3 +138,68 @@ impl<'a, K: DictionaryKey, V: DictValue> DoubleEndedIterator
         }
     }
 }
+
+pub struct DictionaryIterTyped<'a, K: DictionaryKey, V: DictValue> {
+    keys: &'a PrimitiveArray<K>,
+    values: &'a V,
+    index: usize,
+    end: usize,
+}
+
+impl<'a, K: DictionaryKey, V: DictValue> DictionaryIterTyped<'a, K, V> {
+    pub(super) fn new(keys: &'a PrimitiveArray<K>, values: &'a V) -> Self {
+        Self {
+            keys,
+            values,
+            index: 0,
+            end: keys.len(),
+        }
+    }
+}
+
+impl<'a, K: DictionaryKey, V: DictValue> Iterator for DictionaryIterTyped<'a, K, V> {
+    type Item = Option<V::IterValue<'a>>;
+
+    #[inline]
+    fn next(&mut self) -> Option<Self::Item> {
+        if self.index == self.end {
+            return None;
+        }
+        let old = self.index;
+        self.index += 1;
+        unsafe {
+            if let Some(key) = self.keys.get_unchecked(old) {
+                let idx = key.as_usize();
+                Some(Some(self.values.get_unchecked(idx)))
+            } else {
+                Some(None)
+            }
+        }
+    }
+
+    #[inline]
+    fn size_hint(&self) -> (usize, Option<usize>) {
+        (self.end - self.index, Some(self.end - self.index))
+    }
+}
+
+unsafe impl<'a, K: DictionaryKey, V: DictValue> TrustedLen for DictionaryIterTyped<'a, K, V> {}
+
+impl<'a, K: DictionaryKey, V: DictValue> DoubleEndedIterator for DictionaryIterTyped<'a, K, V> {
+    #[inline]
+    fn next_back(&mut self) -> Option<Self::Item> {
+        if self.index == self.end {
+            None
+        } else {
+            self.end -= 1;
+            unsafe {
+                if let Some(key) = self.keys.get_unchecked(self.end) {
+                    let idx = key.as_usize();
+                    Some(Some(self.values.get_unchecked(idx)))
+                } else {
+                    Some(None)
+                }
+            }
+        }
+    }
+}
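
Reviewer note on `typed_iterator.rs`: the new `DictionaryIterTyped` walks the keys with an `index`/`end` cursor pair and yields `None` for a null key instead of looking up a value. The sketch below restates that pattern on plain slices so it can be read and run in isolation. `DictIter`, the `&[Option<usize>]` key representation, and the `&str` values are stand-ins invented for illustration; they are not polars-arrow types, and the sketch uses checked indexing where the PR uses `get_unchecked`.

```rust
/// Hypothetical stand-in for a nullable dictionary iterator:
/// `keys[i]` is either `None` (null) or an index into `values`.
pub struct DictIter<'a> {
    keys: &'a [Option<usize>],
    values: &'a [&'a str],
    index: usize, // next position to yield from the front
    end: usize,   // one past the next position to yield from the back
}

impl<'a> DictIter<'a> {
    pub fn new(keys: &'a [Option<usize>], values: &'a [&'a str]) -> Self {
        Self { keys, values, index: 0, end: keys.len() }
    }
}

impl<'a> Iterator for DictIter<'a> {
    // Outer Option: iteration finished or not. Inner Option: null key or value.
    type Item = Option<&'a str>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.index == self.end {
            return None;
        }
        let old = self.index;
        self.index += 1;
        let values = self.values;
        // A null key maps to a null item; a valid key is resolved in `values`.
        Some(self.keys[old].map(|k| values[k]))
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        let remaining = self.end - self.index;
        (remaining, Some(remaining))
    }
}

impl<'a> DoubleEndedIterator for DictIter<'a> {
    fn next_back(&mut self) -> Option<Self::Item> {
        if self.index == self.end {
            return None;
        }
        self.end -= 1;
        let values = self.values;
        Some(self.keys[self.end].map(|k| values[k]))
    }
}
```

Because nulls are handled per element, this shape needs no up-front null-count assertion, which matches the split in the PR: `values_iter_typed` asserts that there are no null keys, while `iter_typed` accepts them.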
2 changes: 1 addition & 1 deletion crates/polars-arrow/src/array/mod.rs
@@ -266,7 +266,7 @@ impl std::fmt::Debug for dyn Array + '_ {
         match self.data_type().to_physical_type() {
             Null => fmt_dyn!(self, NullArray, f),
             Boolean => fmt_dyn!(self, BooleanArray, f),
-            Primitive(primitive) => with_match_primitive_type!(primitive, |$T| {
+            Primitive(primitive) => with_match_primitive_type_full!(primitive, |$T| {
                 fmt_dyn!(self, PrimitiveArray<$T>, f)
             }),
             BinaryView => fmt_dyn!(self, BinaryViewArray, f),
2 changes: 1 addition & 1 deletion crates/polars-arrow/src/legacy/kernels/ewm/mod.rs
@@ -57,7 +57,7 @@ impl EWMOptions {
     }
     pub fn and_half_life(mut self, half_life: f64) -> Self {
         assert!(half_life > 0.0);
-        self.alpha = 1.0 - ((-2.0f64).ln() / half_life).exp();
+        self.alpha = 1.0 - (-(2.0f64.ln()) / half_life).exp();
         self
     }
     pub fn and_com(mut self, com: f64) -> Self {
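
Reviewer note on the `and_half_life` fix: the old expression took the logarithm of a negative number, `(-2.0f64).ln()`, which is NaN, so `alpha` came out as NaN for every half-life. The corrected form computes `alpha = 1 - exp(-ln(2) / half_life)`. The sketch below isolates both expressions as free functions so the difference can be checked directly. The function names are hypothetical and exist only for this illustration; only the two arithmetic expressions are taken from the diff.

```rust
/// Smoothing factor for an exponentially weighted mean, derived from a
/// half-life: alpha = 1 - exp(-ln(2) / half_life).
/// (Hypothetical helper; mirrors the corrected line in the diff.)
pub fn alpha_from_half_life(half_life: f64) -> f64 {
    assert!(half_life > 0.0);
    1.0 - (-(2.0f64.ln()) / half_life).exp()
}

/// The pre-fix expression, kept only to show the failure mode:
/// ln of a negative number is NaN, and NaN propagates to the result.
pub fn alpha_from_half_life_buggy(half_life: f64) -> f64 {
    1.0 - ((-2.0f64).ln() / half_life).exp()
}
```

With the corrected formula, a weight decays to one half after `half_life` steps, that is `(1 - alpha)^half_life = 0.5`, which is the defining property of the parameter.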