sparkSession.sharedState.cacheManager
CacheManager — In-Memory Cache for Cached Tables
CacheManager is an in-memory cache for cached tables (as logical plans). It uses the internal cachedData collection of CachedData to track logical plans and their cached InMemoryRelation representation.
CacheManager is shared across SparkSessions through SharedState.
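The sharing can be observed from user code. The following is a minimal sketch, assuming a Spark version in which `SparkSession.sharedState` is publicly accessible (it is marked unstable) and a local master. The object name `SharedCacheManagerDemo` and the helper `sharesCacheManager` are hypothetical names made up for this illustration.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical demo object, not part of Spark
object SharedCacheManagerDemo {

  /** True if both sessions hand out the very same CacheManager instance. */
  def sharesCacheManager(a: SparkSession, b: SparkSession): Boolean =
    a.sharedState.cacheManager eq b.sharedState.cacheManager

  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session is good enough for the demo
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("shared-cache-manager")
      .getOrCreate()
    try {
      // newSession() creates a second session on top of the same SharedState
      val other = spark.newSession()
      println(s"Same CacheManager: ${sharesCacheManager(spark, other)}")
    } finally {
      spark.stop()
    }
  }
}
```

Reference equality (`eq`) is used on purpose: the point is that there is one CacheManager instance, not two equal ones.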
cachedData Internal Registry
cachedData is a collection of CachedData with logical plans and their cached InMemoryRelation representation.
A new CachedData is added when a Dataset is cached and removed when a Dataset is uncached or when invalidating cache data with a resource path.
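The add-on-cache and remove-on-uncache behaviour can be pictured with a toy model. This is a sketch under loud assumptions, not Spark's implementation: `ToyCacheManager` and `ToyCachedData` are hypothetical classes, a logical plan is reduced to a plain string, the cached representation is just a label, and plans are matched by string equality. Guarding reads with a read lock is also an assumption of the sketch.

```scala
import java.util.concurrent.locks.{Lock, ReentrantReadWriteLock}

// Toy stand-in (NOT a Spark class): a plan and the label of its cached form
final case class ToyCachedData(plan: String, cachedRepresentation: String)

// Toy stand-in (NOT Spark's CacheManager): a lock-guarded registry of entries
final class ToyCacheManager {
  private var cachedData = Vector.empty[ToyCachedData]
  private val lock = new ReentrantReadWriteLock

  private def withLock[A](l: Lock)(body: => A): A = {
    l.lock()
    try body finally l.unlock()
  }
  private def readLock[A](body: => A): A = withLock(lock.readLock())(body)
  private def writeLock[A](body: => A): A = withLock(lock.writeLock())(body)

  /** Finds the entry registered for the given plan, if any. */
  def lookupCachedData(plan: String): Option[ToyCachedData] =
    readLock { cachedData.find(_.plan == plan) }

  def isEmpty: Boolean = readLock { cachedData.isEmpty }

  /** Adds an entry; returns false and changes nothing if the plan is already registered. */
  def cacheQuery(plan: String): Boolean = writeLock {
    if (cachedData.exists(_.plan == plan)) false
    else {
      cachedData = cachedData :+ ToyCachedData(plan, s"InMemoryRelation($plan)")
      true
    }
  }

  /** Removes the entry for the given plan, if present. */
  def uncacheQuery(plan: String): Unit = writeLock {
    cachedData = cachedData.filterNot(_.plan == plan)
  }

  /** Drops every entry while holding the write lock. */
  def clearCache(): Unit = writeLock {
    cachedData = Vector.empty
  }
}
```

With this model, `cacheQuery("Range(0, 10)")` followed by `uncacheQuery("Range(0, 10)")` leaves the registry empty again, which mirrors the add and remove life cycle described above.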
invalidateCachedPath Method
Caution: FIXME
invalidateCache Method
Caution: FIXME
lookupCachedData Method
Caution: FIXME
uncacheQuery Method
Caution: FIXME
isEmpty Method
Caution: FIXME
Caching Dataset — cacheQuery Method
cacheQuery(
query: Dataset[_],
tableName: Option[String] = None,
storageLevel: StorageLevel = MEMORY_AND_DISK): Unit
cacheQuery obtains the analyzed logical plan of the query and saves it as an InMemoryRelation in the internal cachedData collection of cached queries.
If, however, the query has already been cached, cacheQuery logs the following WARN message instead:
WARN CacheManager: Asked to cache already cached data.
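Both paths of cacheQuery can be exercised directly. The following is a minimal sketch, assuming the signature shown above, a publicly accessible `SparkSession.sharedState`, and a local master. `CacheQueryDemo`, `cacheTwice` and the table name `demo_range` are hypothetical names chosen for the example. The lookup goes through `lookupCachedData(query: Dataset[_])`, which returns an `Option[CachedData]`.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

// Hypothetical demo object, not part of Spark
object CacheQueryDemo {

  /** Caches the same Dataset twice and reports whether the registry knows it. */
  def cacheTwice(spark: SparkSession): Boolean = {
    val cacheManager = spark.sharedState.cacheManager
    val ds = spark.range(5)

    // First request: the analyzed plan is registered under an explicit
    // name and storage level (both picked arbitrarily for the demo)
    cacheManager.cacheQuery(ds, Some("demo_range"), StorageLevel.MEMORY_ONLY)

    // Second request for the same plan: expect the WARN message in the logs
    cacheManager.cacheQuery(ds)

    cacheManager.lookupCachedData(ds).isDefined
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session is good enough for the demo
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("cache-query")
      .getOrCreate()
    try {
      println(s"Registered in cachedData: ${cacheTwice(spark)}")
    } finally {
      spark.stop()
    }
  }
}
```

In application code the usual entry points are `Dataset.cache` and `Dataset.persist`; calling cacheQuery directly is done here only to show the method described in this section.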
Removing All Cached Tables From In-Memory Cache — clearCache Method
clearCache(): Unit
clearCache acquires a write lock and unpersists the RDD[CachedBatch]s of the queries in cachedData before removing them altogether.
Note: clearCache is executed when the CatalogImpl is requested to clearCache.
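The route through the catalog can be sketched as follows, assuming a local master and a publicly accessible `SparkSession.sharedState`. `ClearCacheDemo` and `emptinessAroundClear` are hypothetical names. The sketch caches two Datasets, calls `spark.catalog.clearCache()`, and uses the CacheManager's `isEmpty` to check the registry before and after.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical demo object, not part of Spark
object ClearCacheDemo {

  /** Returns whether the registry is empty (before, after) clearing the cache. */
  def emptinessAroundClear(spark: SparkSession): (Boolean, Boolean) = {
    val cacheManager = spark.sharedState.cacheManager

    // Register two plans in the in-memory cache
    spark.range(3).cache()
    spark.range(7).cache()
    val before = cacheManager.isEmpty

    // The catalog forwards the request to the shared CacheManager
    spark.catalog.clearCache()
    (before, cacheManager.isEmpty)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session is good enough for the demo
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("clear-cache")
      .getOrCreate()
    try {
      val (before, after) = emptinessAroundClear(spark)
      println(s"Empty before clearCache: $before, after clearCache: $after")
    } finally {
      spark.stop()
    }
  }
}
```

Because the CacheManager is shared, clearing the cache from one session also drops entries cached through other sessions built on the same SharedState.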
CachedData
Caution: FIXME