[GLUTEN-11921] Enable Parquet read/write test for NullType#12056
wecharyu wants to merge 1 commit
/** A test suite that tests basic Parquet I/O. */
class GlutenParquetIOSuite extends ParquetIOSuite with GlutenSQLTestsBaseTrait {
  override def testNameBlackList: Seq[String] =
Can we keep this test excluded in VeloxTestSettings.scala? That way, it won't affect other backends' test enabling controls.
testGluten("SPARK-54220: xxx") adds a prefix to the test name, so I assume excludeByPrefix("SPARK-54220") will not exclude this test variant.
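A minimal sketch of why the prefix match is expected to miss the variant. The "Gluten - " prefix and the `excludedByPrefix` helper are assumptions made for illustration, not Gluten's actual test-settings code:

```scala
object PrefixExclusionSketch {
  // Assumed prefix that testGluten prepends; the real value is defined in Gluten's test traits.
  val glutenPrefix: String = "Gluten - "

  // Hypothetical stand-in for excludeByPrefix: true if the test name would be excluded.
  def excludedByPrefix(prefix: String, testName: String): Boolean =
    testName.startsWith(prefix)

  // Name of the inherited Spark test and of its testGluten variant.
  val original: String = "SPARK-54220: NullType"
  val variant: String = glutenPrefix + original
}
```

Under these assumptions the inherited Spark test matches the "SPARK-54220" prefix, while the variant does not, because its name starts with the added prefix.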
I'd prefer to exclude it here because this test failure is not Velox-specific; it is caused by Gluten itself rather than by a backend.
The Spark test collects the leaf scan node FileSourceScanExecTransformer and runs executeColumnar() in VerifyNoAdditionalScanOutputExec.
For the parquetVectorizedReaderEnabled=false test path the inputRDD is an RDD[UnsafeRow], so it always fails with:
java.lang.ClassCastException: class org.apache.spark.sql.catalyst.expressions.UnsafeRow cannot be cast to class org.apache.spark.sql.vectorized.ColumnarBatch
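The general mechanism behind this kind of exception can be sketched without Spark. `Row`, `Batch`, and `executeColumnarLike` below are hypothetical stand-ins for UnsafeRow, ColumnarBatch, and executeColumnar(); this illustrates an unchecked cast on an erased element type, not Gluten's actual code path:

```scala
object ErasedCastSketch {
  final class Row(val value: Int)      // stand-in for UnsafeRow
  final class Batch(val numRows: Int)  // stand-in for ColumnarBatch

  // Claims to return batches but is backed by rows. The cast itself succeeds,
  // because the element type is erased at runtime.
  def executeColumnarLike(rows: Iterator[Row]): Iterator[Batch] =
    rows.asInstanceOf[Iterator[Batch]]

  // The failure only surfaces when an element is actually used as a Batch.
  def firstBatchSize(rows: Seq[Row]): Either[String, Int] =
    try Right(executeColumnarLike(rows.iterator).next().numRows)
    catch { case e: ClassCastException => Left(e.getClass.getSimpleName) }
}
```

In this sketch the mismatch is not detected where the iterator is created; it shows up later, at the first element access.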
philo-he left a comment
Thanks for the clarification. I updated the PR description according to my understanding.
Please rebase the code and check if the CI reported failures are related. Thank you!
The failure is expected; we first need to support this in Velox. The PR is facebookincubator/velox#17391.
@wecharyu, thank you for the clarification. If you need to validate the Velox PR, you can push a temporary commit with the PR ID added in the code referenced below.
gluten/ep/build-velox/src/get-velox.sh, line 28 in 97632d8
What changes are proposed in this pull request?
Enable the Parquet NullType read/write test.
With excludeByPrefix("SPARK-54220") removed, the "SPARK-54220: NullType" test is put back for testing. The "SPARK-54220: vectorized reader: missing all struct fields, struct with NullType only" test is replaced by a variant with one check removed, since that check is only applicable to vanilla Spark.
This PR depends on facebookincubator/velox#17391.
How was this patch tested?
Pass UT in GlutenParquetIOSuite.
Was this patch authored or co-authored using generative AI tooling?
No.
Related issue: #11921