Caches

All cache clients are expected to be subclasses of Cache, and so share a common API. Protocol implementations are also free to add extra methods, which are documented in the “Subclass Reference” section below.

Common API

omniduct.caches.base.cached_method(key, namespace=<function <lambda>>, cache=<function <lambda>>, use_cache=<function <lambda>>, renew=<function <lambda>>, serializer=<function <lambda>>, metadata=<function <lambda>>)

Wrap a method of a Duct class and add caching capabilities.

All arguments of this function are expected to be functions taking two arguments: a reference to the current instance of the class (self) and a dictionary of the arguments passed to the wrapped function (kwargs).

Parameters:
  • key (function -> str) – The key under which the value returned by the wrapped function should be stored.
  • namespace (function -> str) – The namespace under which the key should be stored (default: “<duct class name>.<duct instance name>”).
  • cache (function -> Cache) – The instance of cache via which to store the output of the wrapped function (default: self.cache).
  • use_cache (function -> bool) – Whether or not to use the caching functionality (default: True).
  • renew (function -> bool) – Whether to renew the stored cache, overwriting any value that has already been stored (default: False).
  • serializer (function -> Serializer) – The Serializer subclass to use when storing the return object (default: PickleSerializer).
  • metadata (function -> None, dict) – A dictionary of additional metadata to be stored alongside the wrapped function’s output (default: None).
Returns:

The (potentially cached) object returned when calling the wrapped function.

Return type:

object

Raises:

Exception – If cache fails to store the output of the wrapped function, and the omniduct configuration key cache_fail_hard is True, then the underlying exceptions raised by the Cache instance will be reraised.
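
The convention that every option is a function of (self, kwargs) can be illustrated with a small stand-in. The sketch below does not import omniduct and is not its implementation: toy_cached_method, DictCache and ToyClient are hypothetical names, the default namespace is simplified to the class name, and the serializer, metadata and cache_fail_hard behaviour is left out.

```python
import functools


class DictCache:
    """Hypothetical in-memory stand-in for a Cache instance."""

    def __init__(self):
        self._store = {}

    def has_key(self, key, namespace=None):
        return (namespace, key) in self._store

    def get(self, key, namespace=None):
        return self._store[(namespace, key)]

    def set(self, key, value, namespace=None):
        self._store[(namespace, key)] = value


def toy_cached_method(key,
                      namespace=lambda self, kwargs: type(self).__name__,
                      cache=lambda self, kwargs: self.cache,
                      use_cache=lambda self, kwargs: True,
                      renew=lambda self, kwargs: False):
    """Illustrative decorator: each option is a function of (self, kwargs)."""
    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, **kwargs):
            _cache = cache(self, kwargs)
            if _cache is None or not use_cache(self, kwargs):
                return method(self, **kwargs)
            _key = key(self, kwargs)
            _namespace = namespace(self, kwargs)
            # Reuse a stored value unless a renewal was requested.
            if not renew(self, kwargs) and _cache.has_key(_key, namespace=_namespace):
                return _cache.get(_key, namespace=_namespace)
            value = method(self, **kwargs)
            _cache.set(_key, value, namespace=_namespace)
            return value
        return wrapper
    return decorator


class ToyClient:
    def __init__(self):
        self.cache = DictCache()
        self.calls = 0

    @toy_cached_method(
        key=lambda self, kwargs: "square:{}".format(kwargs["x"]),
        renew=lambda self, kwargs: kwargs.get("force", False),
    )
    def square(self, x, force=False):
        self.calls += 1
        return x * x


client = ToyClient()
first = client.square(x=3)               # computed and stored
second = client.square(x=3)              # served from the cache
third = client.square(x=3, force=True)   # renew forces recomputation
```

Because the options are evaluated per call, the cache key and the renew decision can both depend on the arguments of that particular call, which is what the force flag relies on here.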

class omniduct.caches.base.Cache(**kwargs)

Bases: omniduct.duct.Duct

An abstract class providing the common API for all cache clients.

Attributes inherited from Duct:
protocol (str): The name of the protocol for which this instance was
created (especially useful if a Duct subclass supports multiple protocols).
name (str): The name given to this Duct instance (defaults to class
name).
host (str): The host name providing the service (will be ‘127.0.0.1’, if
service is port forwarded from remote; use ._host to see remote host).
port (int): The port number of the service (will be the port-forwarded
local port, if relevant; for remote port use ._port).

username (str, bool): The username to use for the service.
password (str, bool): The password to use for the service.
registry (None, omniduct.registry.DuctRegistry): A reference to a
DuctRegistry instance for runtime lookup of other services.
remote (None, omniduct.remotes.base.RemoteClient): A reference to a
RemoteClient instance to manage connections to remote services.
cache (None, omniduct.caches.base.Cache): A reference to a Cache
instance to add support for caching, if applicable.
connection_fields (tuple<str>, list<str>): A list of instance attributes
to monitor for changes, whereupon the Duct instance should automatically disconnect. By default, the following attributes are monitored: ‘host’, ‘port’, ‘remote’, ‘username’, and ‘password’.
prepared_fields (tuple<str>, list<str>): A list of instance attributes to
be populated (if their values are callable) when the instance first connects to a service. Refer to Duct.prepare and Duct._prepare for more details. By default, the following attributes are prepared: ‘_host’, ‘_port’, ‘_username’, and ‘_password’.

Additional attributes including host, port, username and password are documented inline.

Class Attributes:
AUTO_LOGGING_SCOPE (bool): Whether this class should be used by omniduct
logging code as a “scope”. Should be overridden by subclasses as appropriate.
DUCT_TYPE (Duct.Type): The type of Duct service that is provided by
this Duct instance. Should be overridden by subclasses as appropriate.
PROTOCOLS (list<str>): The name(s) of any protocols that should be
associated with this class. Should be overridden by subclasses as appropriate.
__init__(**kwargs)
protocol (str, None): Name of protocol (used by Duct registries to inform
Duct instances of how they were instantiated).
name (str, None): The name to be used by the Duct instance (defaults to
class name if not specified).
registry (DuctRegistry, None): The registry to use to lookup remote
and/or cache instance specified by name.
remote (str, RemoteClient): The remote by which the ducted service
should be contacted.

host (str): The hostname of the service to be used by this client.
port (int): The port of the service to be used by this client.
username (str, bool, None): The username to authenticate with if necessary.
If True, then users will be prompted at runtime for credentials.
password (str, bool, None): The password to authenticate with if necessary.
If True, then users will be prompted at runtime for credentials.
cache (Cache, None): The cache client to be attached to this instance.
Cache will only be used by specific methods as configured by the client.
cache_namespace (str, None): The namespace to use by default when writing
to the cache.
set(key, value, namespace=None, serializer=None, metadata=None)

Set the value of a key.

Parameters:
  • key (str) – The key for which value should be stored.
  • value (object) – The value to be stored.
  • namespace (str, None) – The namespace to be used.
  • serializer (Serializer) – The Serializer subclass to use for the serialisation of value into the cache. (default=PickleSerializer)
  • metadata (dict, None) – Additional metadata to be stored with the value in the cache. Values must be serializable via yaml.safe_dump.
set_metadata(key, metadata, namespace=None, replace=False)

Set the metadata associated with a stored key, creating the key if it is missing.

Parameters:
  • key (str) – The key for which metadata should be stored.
  • metadata (dict, None) – Additional/override metadata to be stored for key in the cache. Values must be serializable via yaml.safe_dump.
  • namespace (str, None) – The namespace to be used.
  • replace (bool) – Whether the provided metadata should entirely replace any existing metadata, or just update it. (default=False)
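
The difference between updating and replacing metadata can be shown with plain dictionaries. This is only a sketch of the documented replace flag: merge_metadata is a hypothetical helper, not part of omniduct, and it assumes that "update" means a shallow dict update.

```python
def merge_metadata(existing, new, replace=False):
    """Hypothetical helper mirroring the documented `replace` flag."""
    if replace:
        # Discard whatever was stored before.
        return dict(new or {})
    # Otherwise keep existing entries and override only the supplied ones.
    merged = dict(existing or {})
    merged.update(new or {})
    return merged


stored = {"owner": "alice", "source": "warehouse"}
updated = merge_metadata(stored, {"source": "lake"})
replaced = merge_metadata(stored, {"source": "lake"}, replace=True)
```

With replace left at False the owner entry survives; with replace=True only the newly supplied entries remain.
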
get(key, namespace=None, serializer=None)

Retrieve the value associated with the nominated key from the cache.

Parameters:
  • key (str) – The key for which value should be retrieved.
  • namespace (str, None) – The namespace to be used.
  • serializer (Serializer) – The Serializer subclass to use for the deserialisation of value from the cache. (default=PickleSerializer)
Returns:

The (appropriately deserialized) object stored in the cache.

Return type:

object
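
A round trip through set and get might look like the sketch below. ToyPickleCache is a hypothetical in-memory stand-in rather than an omniduct class; it assumes, as the signatures suggest, that a key only needs to be unique within its namespace, and it uses the standard pickle module in place of a Serializer subclass.

```python
import pickle


class ToyPickleCache:
    """Hypothetical stand-in: stores pickled bytes under (namespace, key)."""

    def __init__(self):
        self._blobs = {}

    def set(self, key, value, namespace=None):
        # Serialise the value on the way in.
        self._blobs[(namespace, key)] = pickle.dumps(value)

    def get(self, key, namespace=None):
        # Deserialise the stored bytes on the way out.
        return pickle.loads(self._blobs[(namespace, key)])


cache = ToyPickleCache()
cache.set("rows", [1, 2, 3], namespace="reports")
cache.set("rows", {"a": 1}, namespace="scratch")

reports_rows = cache.get("rows", namespace="reports")
scratch_rows = cache.get("rows", namespace="scratch")
```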

get_bytecount(key, namespace=None)

Retrieve the number of bytes used by a stored key.

This bytecount may or may not include metadata storage, depending on the backend.

Parameters:
  • key (str) – The key for which to extract the bytecount.
  • namespace (str, None) – The namespace to be used.
Returns:

The number of bytes used by the stored value associated with the nominated key and namespace.

Return type:

int

get_metadata(key, namespace=None)

Retrieve metadata associated with the nominated key from the cache.

Parameters:
  • key (str) – The key for which to extract metadata.
  • namespace (str, None) – The namespace to be used.
Returns:

The metadata associated with this namespace and key.

Return type:

dict

unset(key, namespace=None)

Remove the nominated key from the cache.

Parameters:
  • key (str) – The key which should be unset.
  • namespace (str, None) – The namespace to be used.
unset_namespace(namespace=None)

Remove an entire namespace from the cache.

Parameters: namespace (str, None) – The namespace to be removed.
namespaces

A list of the namespaces stored in the cache.

Type: list<str, None>
has_namespace(namespace=None)

Check whether the cache has the nominated namespace.

Parameters: namespace (str, None) – The namespace for which to check for existence.
Returns: Whether the cache has the nominated namespace.
Return type: bool
keys(namespace=None)

Collect a list of all the keys present in the nominated namespace.

Parameters: namespace (str, None) – The namespace from which to extract all of the keys.
Returns: The keys stored in the cache for the nominated namespace.
Return type: list<str>
has_key(key, namespace=None)

Check whether the cache has a nominated key.

Parameters:
  • key (str) – The key for which to check existence.
  • namespace (str, None) – The namespace in which to check for the key.
Returns:

Whether the cache has a value for the nominated namespace and key.

Return type:

bool

get_total_bytecount(namespaces=None)

Retrieve the total number of bytes used by the cache.

This method iterates over all (nominated) namespaces and the keys therein, summing the result of .get_bytecount(…) on each.

Parameters: namespaces (list<str, None>) – The namespaces to which the bytecount should be restricted.
Returns: The total number of bytes used by the nominated namespaces.
Return type: int
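
The iteration described above can be sketched against any object exposing namespaces, keys and get_bytecount. ToySizedCache and total_bytecount are hypothetical names used for illustration only, and treating namespaces=None as "all namespaces" is an assumption.

```python
class ToySizedCache:
    """Hypothetical stand-in holding raw bytes as {namespace: {key: bytes}}."""

    def __init__(self, blobs):
        self._blobs = blobs

    @property
    def namespaces(self):
        return list(self._blobs)

    def keys(self, namespace=None):
        return list(self._blobs[namespace])

    def get_bytecount(self, key, namespace=None):
        return len(self._blobs[namespace][key])


def total_bytecount(cache, namespaces=None):
    """Sum per-key bytecounts over the nominated namespaces."""
    if namespaces is None:
        namespaces = cache.namespaces  # assumption: None means every namespace
    return sum(
        cache.get_bytecount(key, namespace=namespace)
        for namespace in namespaces
        for key in cache.keys(namespace=namespace)
    )


cache = ToySizedCache({"a": {"k1": b"12345", "k2": b"12"}, None: {"k3": b"123"}})
everything = total_bytecount(cache)
only_a = total_bytecount(cache, namespaces=["a"])
```

Since the total is built from one get_bytecount call per key, the cost of this approach grows with the number of stored keys.
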
describe(namespaces=None)

Return a pandas DataFrame showing all keys and their metadata.

Parameters: namespaces (list<str, None>) – The namespaces to which the summary should be restricted.
Returns: A representation of keys in the cache. Will include at least the following columns: [‘bytes’, ‘namespace’, ‘key’, ‘created’, ‘last_accessed’]. Any additional metadata for keys will be appended to these columns.
Return type: pandas.DataFrame
prune(namespaces=None, max_age=None, max_bytes=None, total_bytes=None, total_count=None)

Remove keys from the cache in order to satisfy nominated constraints.

Parameters:
  • namespaces (list<str, None>) – The namespaces to consider for pruning.
  • max_age (None, int, timedelta, relativedelta, date, datetime) – The number of days, a timedelta, or a relativedelta, indicating the maximum age of items in the cache (based on last accessed date). Deltas are expected to be positive.
  • max_bytes (None, int) – The maximum number of bytes for each key, allowing the pruning of larger keys.
  • total_bytes (None, int) – The total number of bytes for the entire cache. Keys will be removed from least recently accessed to most recently accessed until the constraint is satisfied. This constraint will be applied after max_age and max_bytes.
  • total_count (None, int) – The maximum number of items to keep in the cache. Keys will be removed from least recently accessed to most recently accessed until the constraint is satisfied. This constraint will be applied after max_age and max_bytes.
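
The ordering of these constraints can be sketched as a pure function over a list of entries. This is an illustration under stated assumptions, not omniduct's implementation: select_for_pruning is a hypothetical name, only integer days and timedelta values are handled for max_age, and total_bytes and total_count are checked together in a single eviction loop.

```python
from datetime import datetime, timedelta


def select_for_pruning(entries, now, max_age=None, max_bytes=None,
                       total_bytes=None, total_count=None):
    """Return the keys a prune pass would remove (illustrative only).

    `entries` is a list of dicts with 'key', 'bytes' and 'last_accessed'.
    """
    if isinstance(max_age, int):
        max_age = timedelta(days=max_age)  # integers are read as days

    removed, kept = [], []
    # Per-key constraints (max_age, max_bytes) are applied first.
    for entry in entries:
        too_old = max_age is not None and now - entry["last_accessed"] > max_age
        too_big = max_bytes is not None and entry["bytes"] > max_bytes
        (removed if too_old or too_big else kept).append(entry)

    # Whole-cache constraints: evict least recently accessed entries first.
    kept.sort(key=lambda e: e["last_accessed"])
    while kept and (
        (total_bytes is not None and sum(e["bytes"] for e in kept) > total_bytes)
        or (total_count is not None and len(kept) > total_count)
    ):
        removed.append(kept.pop(0))

    return [e["key"] for e in removed]


now = datetime(2024, 1, 31)
entries = [
    {"key": "old", "bytes": 10, "last_accessed": datetime(2023, 12, 1)},
    {"key": "huge", "bytes": 900, "last_accessed": datetime(2024, 1, 30)},
    {"key": "a", "bytes": 40, "last_accessed": datetime(2024, 1, 20)},
    {"key": "b", "bytes": 40, "last_accessed": datetime(2024, 1, 25)},
    {"key": "c", "bytes": 40, "last_accessed": datetime(2024, 1, 29)},
]
pruned = select_for_pruning(entries, now, max_age=30, max_bytes=500, total_bytes=100)
```

In this example "old" fails max_age, "huge" fails max_bytes, and "a" is evicted as the least recently accessed survivor so that the remaining 80 bytes fit within total_bytes.
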
connect()

Connect to the service backing this client.

It is not normally necessary for a user to manually call this function, since when a connection is required, it is automatically created.

Returns: A reference to the current object.
Return type: Duct instance
disconnect()

Disconnect this client from the backing service.

This method is automatically called during reconnections and/or at Python interpreter shutdown. It first calls Duct._disconnect (which should be implemented by subclasses) and then notifies the RemoteClient subclass, if present, to stop port-forwarding the remote service.

Returns: A reference to this object.
Return type: Duct instance
is_connected()

Check whether this Duct instance is currently connected.

This method checks to see whether a Duct instance is currently connected. This is performed by verifying that the remote host and port are still accessible, and then by calling Duct._is_connected, which should be implemented by subclasses.

Returns: Whether this Duct instance is currently connected.
Return type: bool
prepare()

Prepare a Duct subclass for use (if not already prepared).

This method is called before the value of any of the fields referenced in self.connection_fields is retrieved. The fields include, by default: ‘host’, ‘port’, ‘remote’, ‘cache’, ‘username’, and ‘password’. Subclasses may add to or remove from these special fields.

When called, it first checks whether the instance has already been prepared and, if not, calls _prepare and then records that the instance has been successfully prepared.

Cache Quirks:

This method may be overridden by subclasses, but provides the following default behaviour:

  • Ensures self.registry, self.remote and self.cache values are instances of the right types.
  • It replaces string values of self.remote and self.cache with remotes and caches looked up using self.registry.lookup.
  • It looks through each of the fields nominated in self.prepared_fields and, if the corresponding value is callable, sets the value of that field to the result of calling that value with a reference to self. By default, prepared_fields contains ‘_host’, ‘_port’, ‘_username’, and ‘_password’.
  • Ensures value of self.port is an integer (or None).
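
The handling of prepared_fields and the port coercion listed above can be sketched as follows. ToyDuct is a hypothetical stand-in, not the omniduct Duct class; it skips the registry, remote and cache lookups, and the hostname used is made up.

```python
class ToyDuct:
    """Hypothetical stand-in showing the described preparation steps."""

    prepared_fields = ("_host", "_port", "_username", "_password")

    def __init__(self, host, port, username=None, password=None):
        self._host = host
        self._port = port
        self._username = username
        self._password = password
        self._prepared = False

    def prepare(self):
        if self._prepared:  # preparation only happens once
            return
        for field in self.prepared_fields:
            value = getattr(self, field)
            if callable(value):
                # Callables are resolved with a reference to the instance.
                setattr(self, field, value(self))
        if self._port is not None:
            self._port = int(self._port)  # port ends up as an integer (or None)
        self._prepared = True


duct = ToyDuct(host=lambda self: "cache.internal", port="6379")
duct.prepare()
```

Deferring callables until preparation lets values such as hostnames or credentials be computed lazily, at the point where a connection is first needed.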

Subclass Reference

For comprehensive documentation on any particular subclass, please refer to one of the documents below.