@@ -1278,18 +1278,19 @@ object StaxXmlParser {

// Try parsing the value as decimal
val decimalParser = ExprUtils.getDecimalParser(options.locale)
Contributor:

Unrelated to the issue at hand, decimalParser should be reused rather than initialized for every value.
Can the caller pass it instead?

Contributor Author:

We may have had this conversation in the original PR. Yes, we can, but it requires changing a few function signatures. We can improve it later.
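
A minimal sketch of what the reviewer's suggestion could look like: the caller builds the parser once and passes it down, instead of each value constructing its own. The names `ParserReuseSketch`, `makeDecimalParser`, and `parseAll` are hypothetical, and `makeDecimalParser` is only a stand-in for `ExprUtils.getDecimalParser(options.locale)`; it is not Spark's implementation.

```scala
import java.math.BigDecimal
import java.util.Locale
import scala.util.Try

object ParserReuseSketch {
  // Hypothetical stand-in for ExprUtils.getDecimalParser(options.locale).
  // The locale is accepted but ignored here; a real parser may be locale-aware.
  def makeDecimalParser(locale: Locale): String => BigDecimal =
    (s: String) => new BigDecimal(s.trim)

  // The parser is a parameter, so it is created once by the caller
  // and reused for every value.
  def parseAll(values: Seq[String], parse: String => BigDecimal): Seq[Option[BigDecimal]] =
    values.map(v => Try(parse(v)).toOption)

  def main(args: Array[String]): Unit = {
    val parser = makeDecimalParser(Locale.US) // built once
    println(parseAll(Seq("1.5", "abc", "42"), parser))
  }
}
```

The cost the author mentions is visible here: every function between the caller and the parsing site needs an extra parameter.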

-allCatch opt decimalParser(value) match {
-  case Some(decimalValue) =>
-    var d = decimalValue
-    if (d.scale() < 0) {
-      d = d.setScale(0)
-    }
-    if (d.scale <= VariantUtil.MAX_DECIMAL16_PRECISION &&
-      d.precision <= VariantUtil.MAX_DECIMAL16_PRECISION) {
-      builder.appendDecimal(d)
-      return
-    }
-  case _ =>
+try {
+  var d = decimalParser(value)
+  if (d.scale() < 0) {
+    d = d.setScale(0)
+  }
+  if (d.scale <= VariantUtil.MAX_DECIMAL16_PRECISION &&
+    d.precision <= VariantUtil.MAX_DECIMAL16_PRECISION) {
+    builder.appendDecimal(d)
+    return
+  }
+} catch {
+  case NonFatal(_) =>
+    // Ignore the exception and parse it as a string below
 }

// If the character is of other primitive types, parse it as a string
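
Reading the hunk above, the removed `allCatch opt` wrapped only the `decimalParser(value)` call, so an exception thrown later by `d.setScale(0)` escaped; the added `try` covers both steps. The sketch below isolates that pattern outside Spark. `DecimalFallbackSketch`, `parseValue`, and `MaxPrecision` are hypothetical names; the value 38 is an assumption standing in for `VariantUtil.MAX_DECIMAL16_PRECISION`, and `new BigDecimal(_)` stands in for the locale-aware parser.

```scala
import java.math.BigDecimal
import scala.util.control.NonFatal

object DecimalFallbackSketch {
  // Assumed stand-in for VariantUtil.MAX_DECIMAL16_PRECISION.
  val MaxPrecision = 38

  // Right(decimal) when the value fits; Left(original string) as the fallback.
  def parseValue(value: String): Either[String, BigDecimal] =
    try {
      val parsed = new BigDecimal(value)
      // Rescaling is inside the try, so a failure here also falls back.
      val d = if (parsed.scale() < 0) parsed.setScale(0) else parsed
      if (d.scale() <= MaxPrecision && d.precision() <= MaxPrecision) Right(d)
      else Left(value)
    } catch {
      case NonFatal(_) => Left(value) // parse or rescale failed: keep the string
    }

  def main(args: Array[String]): Unit =
    Seq("12.5", "1E+3", "abc", "1E+2147483647").foreach { s =>
      println(s"$s -> ${parseValue(s)}")
    }
}
```

`NonFatal` deliberately does not match fatal errors such as `OutOfMemoryError`, so those still propagate.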
@@ -942,6 +942,26 @@ class XmlVariantSuite extends QueryTest with SharedSparkSession with TestXmlData
.map(_.getString(0).replaceAll("\\s+", ""))
assert(xmlResult.head === xmlStr)
}

+test(
+  "[SPARK-54099] XML variant parser should fall back to string " +
+    "when failing to parse decimal values"
+) {
+  // Decimals with extreme exponents. The variant parser should throw ArithmeticException when
+  // parsing these values as Decimal:
+  val decimalString = Seq(
+    "1E+2147483647", // Maximum int exponent - scale would be -2147483647
+    "5E+1000000000", // 1 billion exponent
+    "1.23E+999999999", // Very large exponent
+    "0.001E+2147483640" // Still results in huge effective exponent
+  )
+  decimalString.foreach { str =>
+    testParser(
+      xml = s"<ROW><decimal>$str</decimal></ROW>",
+      expectedJsonStr = s"""{"decimal":"$str"}"""
+    )
+  }
+}
}

class XmlVariantSuiteWithLegacyParser extends XmlVariantSuite {
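
To see why the test inputs above are problematic, it may help to look at the scale `java.math.BigDecimal` assigns them: a large positive exponent becomes a large negative scale, and rescaling to 0 would require multiplying by a correspondingly huge power of ten. The sketch below only inspects the scale and does not attempt the rescale. `ExtremeExponentSketch` and `scaleOf` are hypothetical names, and plain `new BigDecimal(_)` is an assumption standing in for the parser used in Spark.

```scala
import java.math.BigDecimal

object ExtremeExponentSketch {
  // Same strings as in the test above.
  val inputs: Seq[String] = Seq(
    "1E+2147483647",
    "5E+1000000000",
    "1.23E+999999999",
    "0.001E+2147483640"
  )

  // Scale = digits after the decimal point minus the exponent.
  def scaleOf(s: String): Int = new BigDecimal(s).scale()

  def main(args: Array[String]): Unit =
    inputs.foreach(s => println(s"$s -> scale ${scaleOf(s)}"))
}
```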