QueryPlan — Structured Query Plan
QueryPlan is the part of Catalyst that models a tree of relational operators, i.e. a query.
In Scala terms, QueryPlan is an abstract class that is the base class of LogicalPlan and SparkPlan (for logical and physical plans, respectively).
A QueryPlan has output attributes (that serve as the base for the schema), a collection of expressions, and a schema.
QueryPlan has a statePrefix that is used when displaying a plan: ! marks an invalid plan and ' marks an unresolved plan.
A QueryPlan is invalid if it has missing input attributes and its collection of child nodes is non-empty.
A QueryPlan is unresolved if the column names have not been verified and column types have not been looked up in the Catalog.
Missing Input Attributes — missingInput Property
def missingInput: AttributeSet
missingInput is the set of attributes that are referenced in expressions but are neither provided by this node's children (as inputSet) nor produced by this node (as producedAttributes).
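The set arithmetic behind missingInput, and how it feeds the ! prefix, can be sketched with a toy model. This is only an illustration under simplifying assumptions: attributes are plain strings, and Node with its fields is a hypothetical stand-in, not Spark's QueryPlan API.

```scala
// Toy model only: attributes are plain strings and a node is a small case class.
// Node and its members are illustrative stand-ins, not Spark's QueryPlan API.
final case class Node(
    references: Set[String],            // attributes used in this node's expressions
    produced: Set[String] = Set.empty,  // attributes this node creates itself
    children: Seq[Node] = Seq.empty,
    output: Set[String] = Set.empty) {

  // Everything the children make available to this node.
  def inputSet: Set[String] = children.flatMap(_.output).toSet

  // Referenced, but neither supplied by the children nor produced here.
  def missingInput: Set[String] = references -- inputSet -- produced

  // "!" marks an invalid node: something is missing and there are children.
  def statePrefix: String =
    if (missingInput.nonEmpty && children.nonEmpty) "!" else ""
}

val scan    = Node(references = Set.empty, output = Set("id"))
val valid   = Node(references = Set("id"), children = Seq(scan), output = Set("id"))
val invalid = Node(references = Set("id", "name"), children = Seq(scan), output = Set("name"))

println(valid.missingInput)    // Set()
println(invalid.missingInput)  // Set(name)
println(invalid.statePrefix)   // !
```

The leaf node scan references nothing and has no children, so it is never flagged as invalid, which matches the requirement that children be non-empty.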
Query Output Schema — schema Property
You can request the schema of a QueryPlan using schema, which builds a StructType from the output attributes.
// the query
val dataset = spark.range(3)
scala> dataset.queryExecution.analyzed.schema
res6: org.apache.spark.sql.types.StructType = StructType(StructField(id,LongType,false))
Output Schema — output Property
def output: Seq[Attribute]
The output property holds the attributes that represent the result of a projection in a query; they are later used to build the schema.
Note: The output property is also called the output schema or result schema.
You can access the output schema through a LogicalPlan (or, as the last two examples show, through a SparkPlan).
// the query
val dataset = spark.range(3)
scala> dataset.queryExecution.analyzed.output
res0: Seq[org.apache.spark.sql.catalyst.expressions.Attribute] = List(id#0L)
scala> dataset.queryExecution.withCachedData.output
res1: Seq[org.apache.spark.sql.catalyst.expressions.Attribute] = List(id#0L)
scala> dataset.queryExecution.optimizedPlan.output
res2: Seq[org.apache.spark.sql.catalyst.expressions.Attribute] = List(id#0L)
scala> dataset.queryExecution.sparkPlan.output
res3: Seq[org.apache.spark.sql.catalyst.expressions.Attribute] = List(id#0L)
scala> dataset.queryExecution.executedPlan.output
res4: Seq[org.apache.spark.sql.catalyst.expressions.Attribute] = List(id#0L)
You can build a StructType from the output collection of attributes using the toStructType method (available through the implicit class AttributeSeq).
scala> dataset.queryExecution.analyzed.output.toStructType
res5: org.apache.spark.sql.types.StructType = StructType(StructField(id,LongType,false))
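Since schema is described above as being built from the output attributes, the two should agree for any plan. The following is a minimal sketch of that check, assuming a local SparkSession (in spark-shell, spark already exists and the builder lines can be skipped); the application name is arbitrary.

```scala
import org.apache.spark.sql.SparkSession

// Local session for the sketch; in spark-shell, `spark` is already defined.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("queryplan-schema-sketch") // arbitrary name
  .getOrCreate()

val dataset = spark.range(3)
val qe = dataset.queryExecution

// Logical and physical plans are both QueryPlans, so both expose schema and output.
val analyzed = qe.analyzed
val executed = qe.executedPlan

// schema should match the StructType built from output via toStructType.
assert(analyzed.schema == analyzed.output.toStructType)
assert(executed.schema == executed.output.toStructType)
```

The same comparison can be repeated for withCachedData, optimizedPlan and sparkPlan, which are the other plans used in the examples above.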