DRILL-8239: Convert JSON UDF to EVF #2567
Open
cgivre wants to merge 1 commit into
vdiravka
reviewed
Jun 7, 2022
Rewrite the convert_fromJSON UDFs to use the EVF JsonLoader (ResultSetLoader)
instead of the legacy JsonReader, mirroring the HTTP storage plugin UDFs.
JsonConverterUtils builds the loader from either the system JSON options or the
explicit allTextMode/readNumbersAsDouble arguments, and centralises the per-row
conversion.
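
A minimal sketch of the two configuration paths described above, assuming a hypothetical `JsonOptions` holder and a plain map standing in for Drill's option manager. The class and method names are illustrative and are not the actual JsonConverterUtils API. The two option keys are the standard Drill JSON session options.

```java
import java.util.Map;

public class JsonOptionsSketch {
    /** Hypothetical holder for the two settings named in the commit message. */
    public static final class JsonOptions {
        public final boolean allTextMode;
        public final boolean readNumbersAsDouble;

        public JsonOptions(boolean allTextMode, boolean readNumbersAsDouble) {
            this.allTextMode = allTextMode;
            this.readNumbersAsDouble = readNumbersAsDouble;
        }
    }

    // Path 1: no explicit arguments, so read the system/session JSON options.
    // The map is an assumed stand-in for the real option manager.
    public static JsonOptions fromSystemOptions(Map<String, Boolean> options) {
        return new JsonOptions(
            options.getOrDefault("store.json.all_text_mode", false),
            options.getOrDefault("store.json.read_numbers_as_double", false));
    }

    // Path 2: the caller passed both flags explicitly, so they are used as given.
    public static JsonOptions fromArguments(boolean allTextMode, boolean readNumbersAsDouble) {
        return new JsonOptions(allTextMode, readNumbersAsDouble);
    }

    public static void main(String[] args) {
        JsonOptions sys = fromSystemOptions(Map.of("store.json.all_text_mode", true));
        JsonOptions explicit = fromArguments(false, true);
        System.out.println(sys.allTextMode + " " + sys.readNumbersAsDouble);
        System.out.println(explicit.allTextMode + " " + explicit.readNumbersAsDouble);
    }
}
```

Keeping both paths behind one options object means the per-row conversion code does not need to know where its settings came from.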
To preserve the full convert_fromJSON contract, the EVF complex-writer support
in ProjectRecordBatch is extended:
* Multiple complex-writer functions per project list. addLoader now keeps a
list of (loader, output-column) pairs -- captured at codegen in
DrillComplexWriterFuncHolder -- so each loader's output lands in the column
reserved for it (fixes SELECT convert_from(a) m1, convert_from(b) m2 and
cases that project columns before/after the function).
* Top-level scalars and arrays. The UDF wraps each input value in a single
marker field so the record-oriented loader reads {scalar, array, object}
uniformly; ProjectRecordBatch unwraps that marker column by transferring it
directly (preserving the value's own type), and otherwise wraps the loader's
columns in a map (the HTTP-UDF behaviour).
* Per-output-batch lifecycle. Loaders are re-started before each batch, fixing
the "Unexpected state: HARVESTED" failure on multi-row/multi-batch input.
* Null/empty input writes an aligned (null) row so the loader row count matches
the surrounding batch.
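
The marker-field wrapping and the aligned null row can be sketched as plain string handling. This is an illustration only: the marker name `__value` and the `wrap` helper are hypothetical, and the real UDF feeds the loader through Drill's own buffers rather than Java strings.

```java
public class MarkerWrapSketch {
    // Hypothetical marker field name; the PR does not state the real one.
    static final String MARKER = "__value";

    /** Wraps any JSON value (scalar, array, or object) in a one-field object. */
    public static String wrap(String json) {
        // Null or blank input still yields exactly one record, carrying a JSON
        // null, so the loader's row count stays aligned with the input batch.
        String value = (json == null || json.trim().isEmpty()) ? "null" : json.trim();
        return "{\"" + MARKER + "\": " + value + "}";
    }

    public static void main(String[] args) {
        String[] inputs = {"42", "[1, 2, 3]", "{\"a\": 1}", "", null};
        for (String in : inputs) {
            System.out.println(wrap(in));
        }
    }
}
```

Because every input becomes a single-field object, a record-oriented reader sees one row per input value regardless of the value's shape. The consumer can then recognise the marker column and pass its contents through unchanged.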
Description
This PR switches the convert_fromJSON UDFs over to the EVF JsonLoader instead of the old JsonReader. The bulk of the work was in ProjectRecordBatch, which had to learn how to handle more than one complex-writer function per query, top-level JSON scalars and arrays, and multi-row batches, none of which the existing EVF path supported. Tests that broke along the way (TestComplexTypeWriter, TestConvertFunctions, the HTTP UDFs, and others) all pass again.
Documentation
No user-facing changes.
Testing
Ran unit tests.