Utils#

Common utilities that can be used throughout the codebase.

class gordo_core.utils.PredictionResult(name, predictions, error_messages)#

Bases: tuple

Prediction result representation.

error_messages#: Alias for field number 2

name#: Alias for field number 0

predictions#: Alias for field number 1
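
The "Alias for field number" entries are what a named tuple generates for its fields: each field is reachable both by name and by position. A minimal sketch, assuming a locally defined stand-in with the same field order (it is not imported from gordo_core) and hypothetical values:

```python
from collections import namedtuple

# Stand-in with the documented field order; NOT the real gordo_core class.
PredictionResult = namedtuple(
    "PredictionResult", ["name", "predictions", "error_messages"]
)

# Hypothetical values, only for illustration.
result = PredictionResult(name="machine-1", predictions=[0.1, 0.2], error_messages=[])

# Each field can be read by name or by its position in the tuple.
assert result.name == result[0]
assert result.predictions == result[1]
assert result.error_messages == result[2]

# Being a tuple, it also unpacks positionally.
name, predictions, error_messages = result
```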

gordo_core.utils.capture_args(method: Callable)[source]#: Equivalent to capture_args_ext() applied without arguments.

gordo_core.utils.capture_args_ext(ignore: Iterable[str] | None = None)[source]#

Decorator that captures args and kwargs passed to a given method. This assumes the decorated method has a self, which has a dict of kwargs assigned as an attribute named _params.

Parameters: ignore – Names of the arguments to ignore when capturing.
Returns: Whatever the original method would return.
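
The capturing behaviour described above can be illustrated with a small stand-alone decorator. This is a sketch of the general technique, not the gordo_core implementation: capture_args_sketch and the Example class are hypothetical names, and recording default values in _params is an assumption of this sketch.

```python
import functools
import inspect
from typing import Callable, Iterable, Optional


def capture_args_sketch(ignore: Optional[Iterable[str]] = None) -> Callable:
    """Illustrative decorator factory; not the gordo_core implementation."""
    ignored = set(ignore or ())

    def decorator(method: Callable) -> Callable:
        signature = inspect.signature(method)

        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            # Map positional and keyword arguments onto parameter names.
            bound = signature.bind(self, *args, **kwargs)
            # Assumption of this sketch: defaults are recorded as well.
            bound.apply_defaults()
            self._params = {
                name: value
                for name, value in bound.arguments.items()
                if name != "self" and name not in ignored
            }
            # The wrapped method runs unchanged and its result is returned.
            return method(self, *args, **kwargs)

        return wrapper

    return decorator


class Example:  # hypothetical class used only for this demonstration
    @capture_args_sketch(ignore=["secret"])
    def __init__(self, a, b=2, secret=None):
        self.a = a


obj = Example(1, secret="x")
print(obj._params)  # {'a': 1, 'b': 2}
```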

gordo_core.utils.fill_series_with_look_back_points(series: Series, look_back_point: Series, end_time: datetime, interpolation_limit: str, resolution: str | Timedelta) → Series[source]#

Fill the NaNs of the given Series using interpolated look-back points.

Parameters:

series – Series whose NaNs should be filled.
look_back_point – Series holding the single closest earlier point (by time). Contains only one point.
end_time – Latest time up to which points of the Series may be filled.
interpolation_limit – Time limit for interpolation.
resolution – Resolution of the DatetimeIndex of the Series.

Returns:

A new copy of the Series with NaNs filled where possible.
The given Series is not modified.

gordo_core.utils.find_gaps(values: Series | DataFrame, resolution: str, resampling_startpoint: datetime, resampling_endpoint: datetime) → DataFrame[source]#: Find the gaps in the index of a Series or DataFrame and return them in a DataFrame with the columns start and end.
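
To make the idea of a gap concrete, here is a dependency-free sketch that works on plain datetime values instead of a pandas index. The name find_gaps_sketch is hypothetical, the resolution is a timedelta rather than a string, and the criterion used (consecutive points further apart than the resolution) is an assumption of this sketch, not a statement about how find_gaps() decides.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple


def find_gaps_sketch(
    timestamps: Iterable[datetime],
    resolution: timedelta,
    start: datetime,
    end: datetime,
) -> List[Tuple[datetime, datetime]]:
    """Return (gap_start, gap_end) pairs inside the window [start, end]."""
    # Include the window boundaries so leading and trailing gaps are seen too.
    points = [start] + sorted(t for t in timestamps if start <= t <= end) + [end]
    gaps = []
    for previous, current in zip(points, points[1:]):
        # Assumed criterion: a gap is any step larger than the resolution.
        if current - previous > resolution:
            gaps.append((previous, current))
    return gaps


samples = [
    datetime(2020, 1, 1, 0, 0),
    datetime(2020, 1, 1, 0, 10),
    datetime(2020, 1, 1, 1, 0),
]
gaps = find_gaps_sketch(
    samples,
    resolution=timedelta(minutes=10),
    start=datetime(2020, 1, 1, 0, 0),
    end=datetime(2020, 1, 1, 1, 0),
)
```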

gordo_core.utils.gaps_df_to_dict(df: DataFrame) → dict[source]#

Extract metadata information from the DataFrame returned by find_gaps().

Returns: A dict containing the columns start and end.

gordo_core.utils.get_version() → str | None[source]#

gordo_core.utils.influx_client_from_uri(uri: str, api_key: str | None = None, api_key_header: str | None = 'Ocp-Apim-Subscription-Key', recreate: bool = False, dataframe_client: bool = False, proxies: Mapping[str, str] = mappingproxy({'https': '', 'http': ''})) → InfluxDBClient | DataFrameClient[source]#

Get an InfluxDBClient or DataFrameClient from a SQLAlchemy-like URI.

Parameters:

uri – Connection string format: <username>:<password>@<host>:<port>/<optional-path>/<db_name>
api_key – Any api key required for the client connection
api_key_header – The name of the header the api key should be assigned to
recreate – Re/create the database named in the URI
dataframe_client – Return a DataFrameClient instead of a standard InfluxDBClient
proxies – A mapping of any proxies to pass to the influx client
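
The connection string format given for uri can be taken apart with plain string operations. The following is a sketch of that format only: split_influx_uri and InfluxUriParts are hypothetical helpers, not part of gordo_core, and no client is created.

```python
from typing import NamedTuple


class InfluxUriParts(NamedTuple):  # hypothetical container for the pieces
    username: str
    password: str
    host: str
    port: int
    path: str
    db_name: str


def split_influx_uri(uri: str) -> InfluxUriParts:
    """Split <username>:<password>@<host>:<port>/<optional-path>/<db_name>."""
    # Split on the last "@" so the password itself may contain "@".
    credentials, _, location = uri.rpartition("@")
    username, _, password = credentials.partition(":")
    host_port, _, remainder = location.partition("/")
    host, _, port = host_port.partition(":")
    # Everything before the final "/" is the optional path (may be empty).
    path, _, db_name = remainder.rpartition("/")
    return InfluxUriParts(username, password, host, int(port), path, db_name)


with_path = split_influx_uri("user:secret@localhost:8086/api/v1/sensors")
without_path = split_influx_uri("user:secret@localhost:8086/sensors")
```

In this sketch the optional path comes back as an empty string when the URI names only the database.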

gordo_core.utils.resample(series: Series, resampling_startpoint: datetime, resampling_endpoint: datetime, resolution: str, interpolation_limit: str, aggregation_methods: str | list[str] | Callable = 'mean', interpolation_method: str = 'linear_interpolation')[source]#: Resample a single series according to the given parameters.

Note

NaNs are no longer dropped by this function after resampling.

Helper functions for module imports.

gordo_core.import_utils.import_location(location: str, *, import_path: str | None = None, back_compatibles: dict[tuple[Optional[str], str], tuple[Optional[str], str]] | None = None) → Any[source]#

Imports an entity from the provided location, or finds an entity named location in the import_path module.

Example

>>> import_location("multiprocessing.Process")
<class 'multiprocessing.context.Process'>
>>> import_location("Process", import_path="multiprocessing")
<class 'multiprocessing.context.Process'>

Parameters:

location – Import location. Could be either a full import path or just a class name.
import_path – Should be provided if location contains only the class name.
back_compatibles – See prepare_back_compatible_locations() function for reference.

Return type:

Imported entity.
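
The two lookup modes described above (full import path, or a bare name combined with import_path) can be sketched with importlib. This is an illustration of the idea under stated assumptions, not the gordo_core implementation: import_location_sketch is a hypothetical name and the back_compatibles handling is left out.

```python
import importlib
from collections import OrderedDict
from typing import Any, Optional


def import_location_sketch(location: str, import_path: Optional[str] = None) -> Any:
    """Resolve "package.module.Name", or "Name" together with import_path."""
    module_name, _, attribute = location.rpartition(".")
    if not module_name:
        # Only a bare name was given, so the module must come from import_path.
        if import_path is None:
            raise ValueError(f"import_path is required for bare name {location!r}")
        module_name = import_path
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


by_full_path = import_location_sketch("collections.OrderedDict")
by_name = import_location_sketch("OrderedDict", import_path="collections")
```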

gordo_core.import_utils.prepare_back_compatible_locations(locations: Iterable[tuple[str, str]]) → dict[tuple[Optional[str], str], tuple[Optional[str], str]][source]#

The result of this function can be used as the back_compatibles argument of the import_location() function.

Example

>>> prepare_back_compatible_locations([('old_module.MyClass', 'new_module.MyClass'),('OldClass', 'NewClass')])
{('old_module', 'MyClass'): ('new_module', 'MyClass'), (None, 'OldClass'): (None, 'NewClass')}

Each key and value in this example is a tuple[Optional[str], str], where the first item is a module location (None when only a class name was given) and the second is a class name.

Parameters: locations – Iterable of location pairs. The first item of each tuple is the location in the previous version, the second item is the location in the current version.
Returns: Mapping from locations in the previous version to locations in the current version.
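
The transformation shown in the example above amounts to splitting each location at its last dot. A minimal sketch that reproduces the documented output; split_location and prepare_sketch are hypothetical names, and this is not the gordo_core source:

```python
from typing import Dict, Iterable, Optional, Tuple

Location = Tuple[Optional[str], str]


def split_location(location: str) -> Location:
    """Split "module.Name" into (module, name); a bare name gets module None."""
    module, _, name = location.rpartition(".")
    return (module or None, name)


def prepare_sketch(locations: Iterable[Tuple[str, str]]) -> Dict[Location, Location]:
    # Map every previous location to its current location.
    return {split_location(old): split_location(new) for old, new in locations}


mapping = prepare_sketch(
    [("old_module.MyClass", "new_module.MyClass"), ("OldClass", "NewClass")]
)

# A caller can then translate an outdated location before importing it.
current = mapping.get(("old_module", "MyClass"))
```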

This module contains a list of import paths that were broken by the latest refactoring. See also gordo_core.import_utils

gordo_core.back_compatibles.DEFAULT_BACK_COMPATIBLES: Final[dict[tuple[Optional[str], str], tuple[Optional[str], str]]] = {(None, 'TimeSeriesDataset'): ('gordo_dataset.time_series', 'TimeSeriesDataset'), ('gordo_dataset.datasets', 'TimeSeriesDataset'): ('gordo_dataset.time_series', 'TimeSeriesDataset'), (None, 'RandomDataset'): ('gordo_dataset.time_series', 'RandomDataset'), ('gordo_dataset.datasets', 'RandomDataset'): ('gordo_dataset.time_series', 'RandomDataset'), (None, 'DataLakeProvider'): ('gordo_dataset.data_providers.dl.providers', 'DataLakeProvider'), ('gordo_dataset.data_provider.providers', 'DataLakeProvider'): ('gordo_dataset.data_providers.dl.providers', 'DataLakeProvider'), (None, 'InfluxDataProvider'): ('gordo_dataset.data_providers.providers', 'InfluxDataProvider'), ('gordo_dataset.data_provider.providers', 'InfluxDataProvider'): ('gordo_dataset.data_providers.providers', 'InfluxDataProvider'), (None, 'RandomDataProvider'): ('gordo_dataset.data_providers.providers', 'RandomDataProvider'), ('gordo_dataset.data_provider.providers', 'RandomDataProvider'): ('gordo_dataset.data_providers.providers', 'RandomDataProvider'), ('gordo_dataset.base', 'GordoBaseDataset'): ('gordo_core.base', 'GordoBaseDataset'), ('gordo_dataset.time_series', 'TimeSeriesDataset'): ('gordo_core.time_series', 'TimeSeriesDataset'), ('gordo_dataset.time_series', 'RandomDataset'): ('gordo_core.time_series', 'RandomDataset'), ('gordo_dataset.data_providers.base', 'GordoBaseDataProvider'): ('gordo_core.data_providers.base', 'GordoBaseDataProvider'), ('gordo_dataset.data_providers.providers', 'RandomDataProvider'): ('gordo_core.data_providers.providers', 'RandomDataProvider'), ('gordo_dataset.data_providers.providers', 'InfluxDataProvider'): ('gordo_core.data_providers.providers', 'InfluxDataProvider')}#: This constant should be used as the default value of the back_compatibles argument of gordo_core.import_utils.import_location(). Most of these paths are temporary and will be deprecated soon.

Class attribute validators. Mainly used in classes extended from gordo_core.base.GordoBaseDataset.

class gordo_core.validators.BaseDescriptor[source]#

Bases: object

Base descriptor class

Subclasses should override the __set__(self, instance, value) method to check that value meets the requirements.

class gordo_core.validators.ValidDataProvider[source]#

Bases: BaseDescriptor

Descriptor for attributes requiring type gordo_core.data_providers.base.GordoBaseDataProvider

class gordo_core.validators.ValidDataset[source]#

Bases: BaseDescriptor

Descriptor for attributes requiring type gordo_core.base.GordoBaseDataset

class gordo_core.validators.ValidDatasetKwargs[source]#

Bases: BaseDescriptor

Descriptor for attributes requiring type gordo_core.base.GordoBaseDataset

class gordo_core.validators.ValidDatetime[source]#

Bases: BaseDescriptor

Descriptor for attributes requiring a valid datetime.datetime value

class gordo_core.validators.ValidTagList[source]#

Bases: BaseDescriptor

Descriptor for attributes requiring a non-empty list of strings
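
The validators above rely on Python's descriptor protocol: assigning to the attribute calls the descriptor's __set__, which can reject the value. A minimal sketch of that pattern for the "non-empty list of strings" rule; the class names, the error type, and the storage in the instance __dict__ are assumptions of this sketch, not taken from gordo_core:

```python
class BaseDescriptorSketch:
    """Illustrative base class; subclasses validate inside __set__."""

    def __set_name__(self, owner, name):
        # Remember the attribute name the descriptor was assigned to.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        raise NotImplementedError("subclasses must validate and store value")


class ValidTagListSketch(BaseDescriptorSketch):
    def __set__(self, instance, value):
        is_valid = (
            isinstance(value, list)
            and len(value) > 0
            and all(isinstance(tag, str) for tag in value)
        )
        if not is_valid:
            raise ValueError(f"{self.name} must be a non-empty list of strings")
        instance.__dict__[self.name] = value


class DatasetSketch:  # hypothetical owner class
    tag_list = ValidTagListSketch()

    def __init__(self, tag_list):
        # This assignment goes through ValidTagListSketch.__set__.
        self.tag_list = tag_list


dataset = DatasetSketch(["TAG-1", "TAG-2"])
```

Constructing DatasetSketch([]) raises ValueError in this sketch, because the check runs on every assignment to tag_list.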

Date partitions helper functions.

Todo

Move module gordo_core.data_providers.partition to gordo_core.partition

class gordo_core.data_providers.partition.MonthPartition(year: int, month: int)[source]#

Bases: object

month: int#

year: int#

class gordo_core.data_providers.partition.PartitionBy(value)[source]#

Bases: Enum

An enumeration.

MONTH = 'month'#

YEAR = 'year'#

classmethod find_by_name(name) → PartitionBy | None[source]#

class gordo_core.data_providers.partition.YearPartition(year: int)[source]#

Bases: object

year: int#

gordo_core.data_providers.partition.split_by_partitions(partition_by: PartitionBy, start_period: datetime, end_period: datetime) → Iterable[YearPartition | MonthPartition][source]#

Split time span by partitions

Parameters:

partition_by – Partition chunk size, either year or month.
start_period – First date of time span.
end_period – Last date of time span.
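
Splitting a time span into year or month partitions can be sketched as follows. The classes here are local stand-ins mirroring the documented fields, split_by_partitions_sketch is a hypothetical name, and treating end_period as inclusive is an assumption of this sketch:

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Iterator, Union


class PartitionBy(Enum):  # stand-in mirroring the documented enum values
    MONTH = "month"
    YEAR = "year"


@dataclass(frozen=True)
class YearPartition:
    year: int


@dataclass(frozen=True)
class MonthPartition:
    year: int
    month: int


def split_by_partitions_sketch(
    partition_by: PartitionBy, start_period: datetime, end_period: datetime
) -> Iterator[Union[YearPartition, MonthPartition]]:
    """Yield every partition touched by [start_period, end_period]."""
    if partition_by is PartitionBy.YEAR:
        for year in range(start_period.year, end_period.year + 1):
            yield YearPartition(year)
    else:
        year, month = start_period.year, start_period.month
        # Tuple comparison orders (year, month) pairs chronologically.
        while (year, month) <= (end_period.year, end_period.month):
            yield MonthPartition(year, month)
            year, month = (year + 1, 1) if month == 12 else (year, month + 1)


months = list(
    split_by_partitions_sketch(
        PartitionBy.MONTH, datetime(2020, 11, 15), datetime(2021, 2, 1)
    )
)
years = list(
    split_by_partitions_sketch(
        PartitionBy.YEAR, datetime(2019, 6, 1), datetime(2021, 1, 1)
    )
)
```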

Data provider utils.

gordo_core.data_providers.utils.build_dir_path(storage: FileSystem, base_dir: str, field_values: Iterable[Tuple[str, str]]) → str[source]#: Deprecated since version 0.3.0: Will be removed.

gordo_core.data_providers.utils.partition_dir_name(field: str, value: str)[source]#: Deprecated since version 0.3.0: Will be removed.