case class HadoopFsRelation(
    location: FileIndex,
    partitionSchema: StructType,
    dataSchema: StructType,
    bucketSpec: Option[BucketSpec],
    fileFormat: FileFormat,
    options: Map[String, String])(val sparkSession: SparkSession)
  extends BaseRelation with FileRelation
BaseRelation
BaseRelation represents a collection of data with a given schema (as StructType) in a SQLContext. BaseRelation knows its estimated size (as sizeInBytes) and whether it needs a conversion, and it computes the list of Filter that this data source may not be able to handle.
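| Name | Behaviour |
|---|---|
| sqlContext | Returns the current SQLContext. |
| schema | Returns the current StructType. |
| sizeInBytes | Computes an estimated size of this relation in bytes. |
| needConversion | Whether the relation needs a conversion of the objects in Row to the internal representation. |
| unhandledFilters | Computes the list of Filter that this data source may not be able to handle. |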
Note: "Data source" and "relation" appear to be used as synonyms.
BaseRelation is an abstract class in the org.apache.spark.sql.sources package.
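To make the contract above concrete, here is a minimal sketch of a custom relation. RangeRelation is a hypothetical class invented for illustration (it is not part of Spark), and mixing in TableScan to produce rows is an assumption of this sketch rather than something BaseRelation itself requires.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext, SparkSession}
import org.apache.spark.sql.sources.{BaseRelation, TableScan}
import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

// Hypothetical relation that serves the integers 0 until n (illustration only).
class RangeRelation(n: Int)(val sparkSession: SparkSession)
    extends BaseRelation with TableScan {

  // The two abstract members of BaseRelation.
  override def sqlContext: SQLContext = sparkSession.sqlContext
  override def schema: StructType =
    StructType(Seq(StructField("id", IntegerType, nullable = false)))

  // Optional override: a rough estimate of 4 bytes per Int.
  override def sizeInBytes: Long = n.toLong * 4

  // needConversion and unhandledFilters keep their default implementations.

  // TableScan: return every row of the relation.
  override def buildScan(): RDD[Row] =
    sparkSession.sparkContext.parallelize(0 until n).map(Row(_))
}

// A local session used only for this sketch.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("base-relation-sketch")
  .getOrCreate()

val rangeRelation = new RangeRelation(3)(spark)
val rangeDF = spark.baseRelationToDataFrame(rangeRelation)
```

Calling rangeDF.show() should list the three ids. The point of the sketch is only that sqlContext and schema are the members a subclass has to supply, while the remaining ones have defaults that can be overridden.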
HadoopFsRelation
HadoopFsRelation is a BaseRelation that is created with a SparkSession (through which it accesses the current SQLContext).
HadoopFsRelation takes the input dataSchema (as StructType) and expands it with the fields of the input partitionSchema to build its schema.
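The schema expansion can be sketched roughly as follows. This is a simplified approximation, not the code HadoopFsRelation actually runs: it assumes case-sensitive column names and ignores the version-specific handling of columns that appear in both schemas. The field names are made up for illustration.

```scala
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

// Hypothetical schemas: "year" appears in both the data files and the partition directories.
val dataSchema = StructType(Seq(
  StructField("name", StringType),
  StructField("year", IntegerType)))

val partitionSchema = StructType(Seq(
  StructField("year", IntegerType),
  StructField("country", StringType)))

// Keep all data fields, then append the partition fields that are not already present.
val dataFieldNames = dataSchema.fieldNames.toSet
val schema = StructType(
  dataSchema.fields ++ partitionSchema.fields.filterNot(f => dataFieldNames.contains(f.name)))

println(schema.fieldNames.mkString(", "))
```

The resulting schema lists the data columns first, followed by the partition columns that the data files do not already contain.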
sizeInBytes (from BaseRelation) and inputFiles (from FileRelation) use the input FileIndex (location) to compute the size and the input files, respectively.
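One way to observe these properties is to pull the HadoopFsRelation out of the analyzed plan of a file-based DataFrame. The sketch below assumes a local session and a throwaway parquet directory. It relies on queryExecution and LogicalRelation, which are internal developer APIs that may differ between Spark versions, and the column name parity is invented for the example.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.datasources.{HadoopFsRelation, LogicalRelation}

val spark = SparkSession.builder()
  .master("local[1]")
  .appName("hadoopfs-relation-sketch")
  .getOrCreate()

// Write a small dataset partitioned by a hypothetical "parity" column.
val path = Files.createTempDirectory("hadoopfs-sketch").resolve("numbers").toString
spark.range(10)
  .selectExpr("id", "id % 2 AS parity")
  .write
  .partitionBy("parity")
  .parquet(path)

// Find the HadoopFsRelation behind the DataFrame (internal API, may change across versions).
val df = spark.read.parquet(path)
val relation = df.queryExecution.analyzed
  .collect { case l: LogicalRelation => l.relation }
  .collectFirst { case r: HadoopFsRelation => r }
  .get

println(relation.dataSchema.fieldNames.mkString(", "))
println(relation.partitionSchema.fieldNames.mkString(", "))
println(relation.sizeInBytes)               // estimated size derived from the FileIndex
relation.inputFiles.foreach(println)        // files listed by the FileIndex
```

Both sizeInBytes and inputFiles are answered from relation.location, so no data files are read to compute them.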