-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-50558][SQL] Introduce simpleString for ExpressionSet #49650
base: master
Are you sure you want to change the base?
[SPARK-50558][SQL] Introduce simpleString for ExpressionSet #49650
Conversation
cc @MaxGekk |
|
||
/** Returns a length limited string that must be used for logging only. */ | ||
def simpleString(maxFields: Int): String = | ||
s"Set(${truncatedString(originals.map(_.simpleString(maxFields)).toSeq, ", ", maxFields)})" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you convert elements via simpleString
even if it could be not needed. Let's imagine originals
has 10000 elements, and maxFields
is 5. In any case you convert all 10000 elements to strings.
How about to add override truncatedString
w/ additional parameter: a function which converts T
to String
:
truncatedString(originals.toSeq, "Set(", ", ", ")", maxFields, _.simpleString(maxFields))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had the exact same in mind, will pick that up. Default value is toString
…le-string-for-expression-set
seq.take(numFields).map(customToString.get).mkString(start, sep, ending) | ||
} else { | ||
seq.take(numFields).mkString(start, sep, ending) | ||
} | ||
} else { | ||
seq.mkString(start, sep, end) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about applying the custom converter here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you write a test for the case, please. Like top elements are less than maxFields, but some nested elem has more than maxFields.
def simpleString(maxFields: Int): String = { | ||
val customToString = { e: Expression => e.simpleString(maxFields) } | ||
s"Set(${SparkStringUtils.truncatedString( | ||
seq = originals.toSeq, start = "", sep = ", ", end = "", maxFields, Some(customToString))})" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just set start = "Set("
, and end = ")"
What changes were proposed in this pull request?
Why are the changes needed?
Filter expressions can be arbitrarily large and should not be logged completely in these cases
Does this PR introduce any user-facing change?
No, logging is not user facing
How was this patch tested?
Added new tests
Was this patch authored or co-authored using generative AI tooling?
No