[BUG] Decimal Division Mismatch On Some Versions Of Spark #9998

Open
razajafri opened this issue Dec 8, 2023 · 0 comments
Labels
bug Something isn't working

Comments

@razajafri
Collaborator

razajafri commented Dec 8, 2023

Describe the bug
When resultDecimalType was introduced in Spark 3.4.0+ and Databricks 330+, the plugin started producing the correctly rounded result when dividing DecimalTypes, while Spark itself still gives the wrong result, as described in the original bug.

Steps/Code to reproduce bug

scala> val df = Seq(("-0.172787979", "533704665545018957788294905796.5"), ("1", "2")).toDF("_1", "_2")
scala> df.repartition(3).selectExpr("cast(_1 as decimal(9,9)) / cast(_2 as decimal(31,1))").show(false)
+------------------------------------------------------+
|(CAST(_1 AS DECIMAL(9,9)) / CAST(_2 AS DECIMAL(31,1)))|
+------------------------------------------------------+
|-0.0000000000000000000000000000003237520              |
|NULL                                                  |
+------------------------------------------------------+

scala> spark.conf.set("spark.rapids.sql.enabled", false)

scala> df.repartition(3).selectExpr("cast(_1 as decimal(9,9)) / cast(_2 as decimal(31,1))").show(false) 
+------------------------------------------------------+
|(CAST(_1 AS DECIMAL(9,9)) / CAST(_2 AS DECIMAL(31,1)))|
+------------------------------------------------------+
|-0.0000000000000000000000000000003237521              |
|NULL                                                  |
+------------------------------------------------------+
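For context (not part of the original report), the 37 fractional digits in both outputs follow from Spark's decimal-division typing rules: the ideal result type exceeds precision 38, so with precision loss allowed Spark trims the scale to fit. A minimal sketch of that derivation, assuming the standard formulas from DecimalPrecision and DecimalType.adjustPrecisionScale:

// Sketch (assumed, not copied from Spark) of the decimal-division result type rules.
object DecimalDivType {
  val MaxPrecision = 38
  val MinAdjustedScale = 6

  // For decimal(p1, s1) / decimal(p2, s2):
  //   scale     = max(6, s1 + p2 + 1)
  //   precision = p1 - s1 + s2 + scale
  // and if precision exceeds 38, the scale is trimmed to fit.
  def divide(p1: Int, s1: Int, p2: Int, s2: Int): (Int, Int) = {
    val scale = math.max(MinAdjustedScale, s1 + p2 + 1)
    val precision = p1 - s1 + s2 + scale
    if (precision <= MaxPrecision) {
      (precision, scale)
    } else {
      val intDigits = precision - scale
      val minScale = math.min(scale, MinAdjustedScale)
      val adjustedScale = math.max(MaxPrecision - intDigits, minScale)
      (MaxPrecision, adjustedScale)
    }
  }
}

// decimal(9,9) / decimal(31,1) -> (38,37): 37 digits after the decimal point,
// matching the width of the results shown above.
println(DecimalDivType.divide(9, 9, 31, 1))

Both engines therefore round their quotient to 37 fractional digits; the mismatch above is confined to the last of those digits.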

Expected behavior
We should match Spark bug for bug on Databricks 330+ and Spark 3.4.0+.
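
A rough spark-shell check of that expectation (hypothetical, not from the report): run the same projection with the plugin enabled and disabled, and require identical rows once the bug-for-bug behavior is in place.

// Hypothetical check: once the plugin matches Spark bug for bug,
// the GPU and CPU runs of the same query should return identical rows.
val expr = "cast(_1 as decimal(9,9)) / cast(_2 as decimal(31,1))"

spark.conf.set("spark.rapids.sql.enabled", true)
val gpuRows = df.repartition(3).selectExpr(expr).collect().toSet

spark.conf.set("spark.rapids.sql.enabled", false)
val cpuRows = df.repartition(3).selectExpr(expr).collect().toSet

assert(gpuRows == cpuRows, s"GPU rows $gpuRows did not match CPU rows $cpuRows")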

Additional context
The original bug was filed as part of the audit process. While resolving it in the spark-rapids-jni PR, it was discovered that this division had broken on the Spark versions listed above, after the release of Spark 330db.

@razajafri razajafri added bug Something isn't working ? - Needs Triage Need team to review and classify labels Dec 8, 2023
@mattahrens mattahrens removed the ? - Needs Triage Need team to review and classify label Dec 12, 2023