FileSystemCache

class omniduct.caches.filesystem.FileSystemCache(**kwargs)[source]

Bases: omniduct.caches.base.Cache

An implementation of Cache that wraps around a FileSystemClient.

Attributes inherited from Duct:
protocol (str): The name of the protocol for which this instance was
created (especially useful if a Duct subclass supports multiple protocols).
name (str): The name given to this Duct instance (defaults to class
name).
host (str): The host name providing the service (will be ‘127.0.0.1’, if
service is port forwarded from remote; use ._host to see remote host).
port (int): The port number of the service (will be the port-forwarded
local port, if relevant; for remote port use ._port).

username (str, bool): The username to use for the service.
password (str, bool): The password to use for the service.
registry (None, omniduct.registry.DuctRegistry): A reference to a
DuctRegistry instance for runtime lookup of other services.
remote (None, omniduct.remotes.base.RemoteClient): A reference to a
RemoteClient instance to manage connections to remote services.
cache (None, omniduct.caches.base.Cache): A reference to a Cache
instance to add support for caching, if applicable.
connection_fields (tuple<str>, list<str>): A list of instance attributes
to monitor for changes, whereupon the Duct instance should automatically disconnect. By default, the following attributes are monitored: ‘host’, ‘port’, ‘remote’, ‘username’, and ‘password’.
prepared_fields (tuple<str>, list<str>): A list of instance attributes to
be populated (if their values are callable) when the instance first connects to a service. Refer to Duct.prepare and Duct._prepare for more details. By default, the following attributes are prepared: ‘_host’, ‘_port’, ‘_username’, and ‘_password’.

Additional attributes including host, port, username and password are documented inline.

Class Attributes:
AUTO_LOGGING_SCOPE (bool): Whether this class should be used by omniduct
logging code as a “scope”. Should be overridden by subclasses as appropriate.
DUCT_TYPE (Duct.Type): The type of Duct service that is provided by
this Duct instance. Should be overridden by subclasses as appropriate.
PROTOCOLS (list<str>): The name(s) of any protocols that should be
associated with this class. Should be overridden by subclasses as appropriate.
class Type

Bases: enum.Enum

The Duct.Type enum specifies all of the permissible values of Duct.DUCT_TYPE. Also determines the order in which ducts are loaded by DuctRegistry.

__init__(**kwargs)
protocol (str, None): Name of protocol (used by Duct registries to inform
Duct instances of how they were instantiated).
name (str, None): The name to be used by the Duct instance (defaults to
class name if not specified).
registry (DuctRegistry, None): The registry to use to lookup remote
and/or cache instance specified by name.
remote (str, RemoteClient): The remote by which the ducted service
should be contacted.

host (str): The hostname of the service to be used by this client.
port (int): The port of the service to be used by this client.
username (str, bool, None): The username to authenticate with if necessary.
If True, then users will be prompted at runtime for credentials.
password (str, bool, None): The password to authenticate with if necessary.
If True, then users will be prompted at runtime for credentials.
cache (Cache, None): The cache client to be attached to this instance.
The cache will only be used by specific methods as configured by the client.
cache_namespace (str, None): The namespace to use by default when writing
to the cache.
FileSystemCache Quirks:

path (str): The top-level path of the cache in the filesystem.
fs (FileSystemClient, str): The filesystem client to use as the
datastore of this cache. If not specified, this will default to the local filesystem using LocalFsClient. If specified as a string, and connected to a DuctRegistry, upon first use an attempt will be made to look up a FileSystemClient instance in the registry by this name.
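
To make the roles of path and fs more concrete, here is a minimal stand-in rather than omniduct code: a hypothetical TinyFileCache that stores pickled values beneath a top-level directory, with one subdirectory per namespace. The class name, the on-disk layout, and the "__default__" directory used when namespace is None are all assumptions made for illustration. The real FileSystemCache delegates storage to a FileSystemClient, and its layout may differ.

```python
import os
import pickle
import tempfile


class TinyFileCache:
    """Hypothetical stand-in; NOT omniduct's FileSystemCache."""

    def __init__(self, path):
        self.path = path  # top-level directory of the cache

    def _key_path(self, key, namespace=None):
        # Assumed layout: <path>/<namespace>/<key>
        return os.path.join(self.path, namespace or "__default__", key)

    def set(self, key, value, namespace=None):
        target = self._key_path(key, namespace)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(target, "wb") as fh:
            pickle.dump(value, fh)

    def get(self, key, namespace=None):
        with open(self._key_path(key, namespace), "rb") as fh:
            return pickle.load(fh)

    def has_key(self, key, namespace=None):
        return os.path.exists(self._key_path(key, namespace))


# Store a value in the "demo" namespace and read it back.
cache = TinyFileCache(tempfile.mkdtemp())
cache.set("answer", {"value": 42}, namespace="demo")
restored = cache.get("answer", namespace="demo")
```

In this sketch the same key can exist independently in different namespaces, because each namespace maps to its own directory.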
connect()

Connect to the service backing this client.

It is not normally necessary for a user to manually call this function, since when a connection is required, it is automatically created.

Returns: A reference to the current object.
Return type: Duct instance
describe(namespaces=None)

Return a pandas DataFrame showing all keys and their metadata.

Parameters: namespaces (list<str,None>) – The namespaces to which the summary should be restricted.
Returns: A representation of keys in the cache. Will include at least the following columns: [‘bytes’, ‘namespace’, ‘key’, ‘created’, ‘last_accessed’]. Any additional metadata for keys will be appended to these columns.
Return type: pandas.DataFrame
disconnect()

Disconnect this client from backing service.

This method is automatically called during reconnections and/or at Python interpreter shutdown. It first calls Duct._disconnect (which should be implemented by subclasses) and then notifies the RemoteClient subclass, if present, to stop port-forwarding the remote service.

Returns: A reference to this object.
Return type: Duct instance
classmethod for_protocol(protocol)

Retrieve a Duct subclass for a given protocol.

Parameters: protocol (str) – The protocol of interest.
Returns: The appropriate class for the provided protocol, partially constructed with the protocol keyword argument set appropriately.
Return type: functools.partial object
Raises: DuctProtocolUnknown – If no class has been defined that offers the named protocol.
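
The "partially constructed" return value can be illustrated with a small sketch. DemoDuct, ProtocolUnknown, and the classes argument below are hypothetical stand-ins (the real method searches registered Duct subclasses and raises DuctProtocolUnknown); only functools.partial is the actual mechanism named above.

```python
import functools


class DemoDuct:
    """Hypothetical stand-in for a Duct subclass."""

    PROTOCOLS = ["demo", "demo2"]

    def __init__(self, protocol=None, **kwargs):
        self.protocol = protocol
        self.kwargs = kwargs


class ProtocolUnknown(KeyError):
    """Hypothetical stand-in for DuctProtocolUnknown."""


def for_protocol(protocol, classes=(DemoDuct,)):
    # Find a class advertising the protocol and pre-bind the keyword.
    for cls in classes:
        if protocol in cls.PROTOCOLS:
            return functools.partial(cls, protocol=protocol)
    raise ProtocolUnknown(protocol)


# The partial behaves like the class with protocol already supplied.
factory = for_protocol("demo2")
duct = factory(name="example")
```

Calling the returned partial with the remaining keyword arguments yields an instance that knows which protocol it was created for.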
get(key, namespace=None, serializer=None)

Retrieve the value associated with the nominated key from the cache.

Parameters:
  • key (str) – The key for which value should be retrieved.
  • namespace (str, None) – The namespace to be used.
  • serializer (Serializer) – The Serializer subclass to use for the deserialisation of value from the cache. (default=PickleSerializer)
Returns:

The (appropriately deserialized) object stored in the cache.

Return type:

object

get_bytecount(key, namespace=None)

Retrieve the number of bytes used by a stored key.

This bytecount may or may not include metadata storage, depending on the backend.

Parameters:
  • key (str) – The key for which to extract the bytecount.
  • namespace (str, None) – The namespace to be used.
Returns:

The number of bytes used by the stored value associated with the nominated key and namespace.

Return type:

int

get_metadata(key, namespace=None)

Retrieve metadata associated with the nominated key from the cache.

Parameters:
  • key (str) – The key for which to extract metadata.
  • namespace (str, None) – The namespace to be used.
Returns:

The metadata associated with this namespace and key.

Return type:

dict

get_total_bytecount(namespaces=None)

Retrieve the total number of bytes used by the cache.

This method iterates over all (nominated) namespaces and the keys therein, summing the result of .get_bytecount(…) on each.

Parameters: namespaces (list<str,None>) – The namespaces to which the bytecount should be restricted.
Returns: The total number of bytes used by the nominated namespaces.
Return type: int
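
The summation described above can be sketched as follows. The STORE dictionary and both functions are hypothetical stand-ins operating on in-memory data, and treating namespaces=None as "all namespaces" is an assumption of this sketch.

```python
# Hypothetical in-memory data: namespace -> {key: byte count}.
STORE = {
    "models": {"a": 120, "b": 30},
    "queries": {"q1": 50},
}


def get_bytecount(key, namespace):
    return STORE[namespace][key]


def get_total_bytecount(namespaces=None):
    # None means "all namespaces" in this sketch.
    selected = STORE.keys() if namespaces is None else namespaces
    return sum(
        get_bytecount(key, namespace)
        for namespace in selected
        for key in STORE[namespace]
    )


total_all = get_total_bytecount()
total_models = get_total_bytecount(["models"])
```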
has_key(key, namespace=None)

Check whether the cache has a nominated key.

Parameters:
  • key (str) – The key for which to check existence.
  • namespace (str, None) – The namespace in which to look for the key.
Returns:

Whether the cache has a value for the nominated namespace and key.

Return type:

bool

has_namespace(namespace=None)

Check whether the cache has the nominated namespace.

Parameters: namespace (str, None) – The namespace for which to check for existence.
Returns: Whether the cache has the nominated namespace.
Return type: bool
host

The host name providing the service, or ‘127.0.0.1’ if self.remote is not None, whereupon the service will be port-forwarded locally. You can view the remote hostname using duct._host, and change the remote host at runtime using: duct.host = ‘<host>’.

Type: str
is_connected()

Check whether this Duct instance is currently connected.

This method checks to see whether a Duct instance is currently connected. This is performed by verifying that the remote host and port are still accessible, and then by calling Duct._is_connected, which should be implemented by subclasses.

Returns: Whether this Duct instance is currently connected.
Return type: bool
keys(namespace=None)

Collect a list of all the keys present in the nominated namespace.

Parameters: namespace (str, None) – The namespace from which to extract all of the keys.
Returns: The keys stored in the cache for the nominated namespace.
Return type: list<str>
namespaces

A list of the namespaces stored in the cache.

Type: list<str,None>
password

Some services require authentication in order to connect to the service, in which case the appropriate password can be specified. If True was provided at instantiation, you will be prompted to type your password at runtime when necessary. If False was provided, then None will be returned. You can specify a different password at runtime using: duct.password = ‘<password>’.

Type: str
port

The local port for the service. If self.remote is not None, the port will be port-forwarded from the remote host. To see the port used on the remote host refer to duct._port. You can change the remote port at runtime using: duct.port = <port>.

Type: int
prepare()

Prepare a Duct subclass for use (if not already prepared).

This method is called before the value of any of the fields referenced in self.connection_fields is retrieved. The fields include, by default: ‘host’, ‘port’, ‘remote’, ‘cache’, ‘username’, and ‘password’. Subclasses may add or subtract from these special fields.

When called, it first checks whether the instance has already been prepared, and if not calls _prepare and then records that the instance has been successfully prepared.
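
A rough sketch of this prepare-once pattern follows. PreparedDemo is a hypothetical stand-in, not the Duct implementation: the _prepared flag, the prepare_calls counter, and the choice to invoke callables with no arguments are all assumptions made for illustration.

```python
class PreparedDemo:
    """Hypothetical stand-in showing lazy, one-time preparation."""

    prepared_fields = ("_host", "_port")

    def __init__(self, host, port):
        self._host = host  # may be a plain value or a callable
        self._port = port
        self._prepared = False
        self.prepare_calls = 0

    def prepare(self):
        if self._prepared:
            return  # already prepared; nothing to do
        self._prepare()
        self._prepared = True

    def _prepare(self):
        self.prepare_calls += 1
        for field in self.prepared_fields:
            value = getattr(self, field)
            if callable(value):
                # Assumption: callables take no arguments in this sketch.
                setattr(self, field, value())

    @property
    def host(self):
        self.prepare()  # preparation happens before the field is read
        return self._host


demo = PreparedDemo(host=lambda: "db.internal", port=5432)
first = demo.host
second = demo.host
```

Reading host twice triggers _prepare only once, which is the behaviour the paragraph above describes.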

prune(namespaces=None, max_age=None, max_bytes=None, total_bytes=None, total_count=None)

Remove keys from the cache in order to satisfy nominated constraints.

Parameters:
  • namespaces (list<str, None>) – The namespaces to consider for pruning.
  • max_age (None, int, timedelta, relativedelta, date, datetime) – The number of days, a timedelta, or a relativedelta, indicating the maximum age of items in the cache (based on last accessed date). Deltas are expected to be positive.
  • max_bytes (None, int) – The maximum number of bytes for each key, allowing the pruning of larger keys.
  • total_bytes (None, int) – The total number of bytes for the entire cache. Keys will be removed from least recently accessed to most recently accessed until the constraint is satisfied. This constraint will be applied after max_age and max_bytes.
  • total_count (None, int) – The maximum number of items to keep in the cache. Keys will be removed from least recently accessed to most recently accessed until the constraint is satisfied. This constraint will be applied after max_age and max_bytes.
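
The ordering of these constraints can be sketched with a standalone function. select_for_removal and its entry dictionaries are hypothetical and only model the documented order (per-key limits first, then cache-wide limits from least recently accessed); this sketch accepts max_age only as a timedelta, and whether the real comparisons are strict is an assumption.

```python
from datetime import datetime, timedelta


def select_for_removal(entries, now, max_age=None, max_bytes=None,
                       total_bytes=None, total_count=None):
    """Return the keys to remove from a list of entry dicts.

    Each entry has 'key', 'bytes', and 'last_accessed'.
    """
    removed = []
    kept = []
    # Step 1: per-key constraints (age and size).
    for entry in entries:
        too_old = max_age is not None and now - entry["last_accessed"] > max_age
        too_big = max_bytes is not None and entry["bytes"] > max_bytes
        (removed if too_old or too_big else kept).append(entry)
    # Step 2: cache-wide constraints, least recently accessed first.
    kept.sort(key=lambda e: e["last_accessed"])
    while kept and (
        (total_count is not None and len(kept) > total_count)
        or (total_bytes is not None
            and sum(e["bytes"] for e in kept) > total_bytes)
    ):
        removed.append(kept.pop(0))
    return [e["key"] for e in removed]


now = datetime(2024, 1, 10)
entries = [
    {"key": "old", "bytes": 10, "last_accessed": datetime(2023, 12, 1)},
    {"key": "huge", "bytes": 5000, "last_accessed": datetime(2024, 1, 9)},
    {"key": "a", "bytes": 100, "last_accessed": datetime(2024, 1, 5)},
    {"key": "b", "bytes": 100, "last_accessed": datetime(2024, 1, 8)},
]
to_remove = select_for_removal(
    entries, now,
    max_age=timedelta(days=30), max_bytes=1000, total_bytes=150,
)
```

Here "old" fails the age limit and "huge" fails the per-key size limit. Of the two survivors, "a" was accessed less recently, so it is removed to bring the total under 150 bytes, leaving only "b".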
reconnect()

Disconnects, and then reconnects, this client.

Note: This is equivalent to duct.disconnect().connect().

Returns: A reference to this object.
Return type: Duct instance
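
The equivalence noted above relies on connect() and disconnect() each returning the instance itself. A small sketch of that chaining pattern, using a hypothetical ChainableDemo class that records calls instead of contacting a service:

```python
class ChainableDemo:
    """Hypothetical stand-in; records calls instead of touching a service."""

    def __init__(self):
        self.connected = False
        self.log = []

    def connect(self):
        self.connected = True
        self.log.append("connect")
        return self  # returning self enables chaining

    def disconnect(self):
        self.connected = False
        self.log.append("disconnect")
        return self

    def reconnect(self):
        return self.disconnect().connect()


demo = ChainableDemo().connect()
same = demo.reconnect()
```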
reset()

Reset this Duct instance to its pre-preparation state.

This method disconnects from the service, resets any temporary authentication and restores the values of the attributes listed in prepared_fields to their values as of when Duct.prepare was called.

Returns: A reference to this object.
Return type: Duct instance
set(key, value, namespace=None, serializer=None, metadata=None)

Set the value of a key.

Parameters:
  • key (str) – The key for which value should be stored.
  • value (object) – The value to be stored.
  • namespace (str, None) – The namespace to be used.
  • serializer (Serializer) – The Serializer subclass to use for the serialisation of value into the cache. (default=PickleSerializer)
  • metadata (dict, None) – Additional metadata to be stored with the value in the cache. Values must be serializable via yaml.safe_dump.
set_metadata(key, metadata, namespace=None, replace=False)

Set the metadata associated with a stored key, creating the key if it is missing.

Parameters:
  • key (str) – The key for which value should be stored.
  • metadata (dict, None) – Additional/override metadata to be stored for key in the cache. Values must be serializable via yaml.safe_dump.
  • namespace (str, None) – The namespace to be used.
  • replace (bool) – Whether the provided metadata should entirely replace any existing metadata, or just update it. (default=False)
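
The difference between updating and replacing metadata can be sketched with plain dictionaries. merge_metadata is a hypothetical helper, not part of omniduct, and treating metadata=None as an empty dictionary is an assumption of this sketch.

```python
def merge_metadata(existing, new, replace=False):
    """Return the metadata that would be stored after an update."""
    new = dict(new or {})  # assumption: None is treated as "no metadata"
    if replace:
        return new
    merged = dict(existing)
    merged.update(new)  # new values override matching keys
    return merged


existing = {"owner": "alice", "version": 1}
updated = merge_metadata(existing, {"version": 2})
replaced = merge_metadata(existing, {"version": 2}, replace=True)
```

With replace=False the untouched "owner" field survives; with replace=True only the newly supplied fields remain.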
unset(key, namespace=None)

Remove the nominated key from the cache.

Parameters:
  • key (str) – The key which should be unset.
  • namespace (str, None) – The namespace to be used.
unset_namespace(namespace=None)

Remove an entire namespace from the cache.

Parameters: namespace (str, None) – The namespace to be removed.
username

Some services require authentication in order to connect to the service, in which case the appropriate username can be specified. If not specified at instantiation, your local login name will be used. If True was provided, you will be prompted to type your username at runtime as necessary. If False was provided, then None will be returned. You can specify a different username at runtime using: duct.username = ‘<username>’.

Type: str