Utils#

Common utilities that can be used throughout the codebase.

class gordo_core.utils.PredictionResult(name, predictions, error_messages)#

Bases: tuple

Prediction result representation.

error_messages#

Alias for field number 2

name#

Alias for field number 0

predictions#

Alias for field number 1

gordo_core.utils.capture_args(method: Callable)[source]#

Equivalent to capture_args_ext() called without arguments.

gordo_core.utils.capture_args_ext(ignore: Iterable[str] | None = None)[source]#

Decorator that captures args and kwargs passed to a given method. This assumes the decorated method has a self, which has a dict of kwargs assigned as an attribute named _params.

Parameters:

ignore – Names of arguments to ignore when capturing

Return type:

Returns whatever the original method would return
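The capturing behaviour described above can be sketched with the standard library alone. The decorator below is a hypothetical stand-in named capture_params, not the gordo_core implementation; recording default values and skipping self are assumptions of this sketch.

```python
import functools
import inspect
from typing import Callable, Iterable, Optional


def capture_params(ignore: Optional[Iterable[str]] = None) -> Callable:
    """Hypothetical stand-in for capture_args_ext(); not the library's code."""
    ignored = set(ignore or ())

    def decorator(method: Callable) -> Callable:
        signature = inspect.signature(method)

        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            # Bind the call to the signature so positional arguments get names.
            bound = signature.bind(self, *args, **kwargs)
            # Assumption: defaults are recorded as well as explicit arguments.
            bound.apply_defaults()
            self._params = {
                name: value
                for name, value in bound.arguments.items()
                if name != "self" and name not in ignored
            }
            # Return whatever the original method returns.
            return method(self, *args, **kwargs)

        return wrapper

    return decorator


class Model:
    @capture_params(ignore=["secret"])
    def __init__(self, units: int, activation: str = "tanh", secret: str = ""):
        self.units = units


model = Model(8, secret="token")
print(model._params)  # {'units': 8, 'activation': 'tanh'}
```

Binding through inspect.signature gives positional arguments their parameter names, so _params looks the same whether the caller used positional or keyword arguments.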

gordo_core.utils.fill_series_with_look_back_points(series: Series, look_back_point: Series, end_time: datetime, interpolation_limit: str, resolution: str | Timedelta) Series[source]#

Fill NaNs in the given Series using interpolated look-back points.

Parameters:
  • series – Series in which NaNs are filled.

  • look_back_point – Series containing exactly one point: the closest preceding point in time.

  • end_time – Latest time up to which points may be filled.

  • interpolation_limit – Time limit for interpolation.

  • resolution – Resolution of the DatetimeIndex of the Series.

Returns:

  • A new copy of the Series with NaNs filled where possible.

  • The given Series is not modified.

gordo_core.utils.find_gaps(values: Series | DataFrame, resolution: str, resampling_startpoint: datetime, resampling_endpoint: datetime) DataFrame[source]#

Find the gaps in a series’s index and create a dataframe to store them, with columns: start, end
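The idea of locating gaps in an index can be illustrated without pandas. The helper below, find_index_gaps, is a hypothetical standard-library sketch: it works on a plain list of timestamps, takes the resolution as a timedelta, and ignores the resampling start and end points that the real function accepts.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def find_index_gaps(
    timestamps: List[datetime], resolution: timedelta
) -> List[Tuple[datetime, datetime]]:
    """Return (start, end) pairs where neighbours are further apart than resolution."""
    ordered = sorted(timestamps)
    gaps = []
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > resolution:
            gaps.append((previous, current))
    return gaps


index = [
    datetime(2020, 1, 1, 0, 0),
    datetime(2020, 1, 1, 0, 10),
    datetime(2020, 1, 1, 0, 50),
    datetime(2020, 1, 1, 1, 0),
]
gaps = find_index_gaps(index, resolution=timedelta(minutes=10))
# One gap is found: it starts at 00:10 and ends at 00:50.
```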

gordo_core.utils.gaps_df_to_dict(df: DataFrame) dict[source]#

Extract metadata from the DataFrame returned by find_gaps()

Return type:

Contains columns start, end

gordo_core.utils.get_version() str | None[source]#

gordo_core.utils.influx_client_from_uri(uri: str, api_key: str | None = None, api_key_header: str | None = 'Ocp-Apim-Subscription-Key', recreate: bool = False, dataframe_client: bool = False, proxies: Mapping[str, str] = mappingproxy({'https': '', 'http': ''})) InfluxDBClient | DataFrameClient[source]#

Get an InfluxDBClient or DataFrameClient from a SQLAlchemy-like URI

Parameters:
  • uri – Connection string format: <username>:<password>@<host>:<port>/<optional-path>/<db_name>

  • api_key – Any api key required for the client connection

  • api_key_header – The name of the header to which the API key should be assigned

  • recreate – Re/create the database named in the URI

  • dataframe_client – Return a DataFrameClient instead of a standard InfluxDBClient

  • proxies – A mapping of any proxies to pass to the influx client
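To make the connection string format concrete, the sketch below splits such a URI into its parts with urllib.parse. The names parse_influx_uri and InfluxUriParts are hypothetical, and this is not necessarily how gordo_core parses the URI.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class InfluxUriParts(NamedTuple):
    """Hypothetical container for the pieces of the connection string."""

    username: Optional[str]
    password: Optional[str]
    host: Optional[str]
    port: Optional[int]
    path: str
    db_name: str


def parse_influx_uri(uri: str) -> InfluxUriParts:
    # Prefix with "//" so urlsplit treats the leading part as the network location.
    parts = urlsplit("//" + uri)
    # The last path segment is the database name; the rest is the optional path.
    path, _, db_name = parts.path.rpartition("/")
    return InfluxUriParts(
        parts.username, parts.password, parts.hostname, parts.port, path, db_name
    )


parsed = parse_influx_uri("user:secret@localhost:8086/api/v1/sensors")
print(parsed.host, parsed.port, parsed.path, parsed.db_name)
# localhost 8086 /api/v1 sensors
```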

gordo_core.utils.resample(series: Series, resampling_startpoint: datetime, resampling_endpoint: datetime, resolution: str, interpolation_limit: str, aggregation_methods: str | list[str] | Callable = 'mean', interpolation_method: str = 'linear_interpolation')[source]#

Resample a single series according to the given parameters.

Note

NaNs are no longer dropped after resampling in this function.
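As a rough illustration of resampling with the default 'mean' aggregation, the standard-library sketch below averages points into fixed-width buckets and keeps empty buckets in the result. The function bucket_mean is hypothetical, uses None in place of NaN, treats the end point as exclusive, and does not perform the interpolation step.

```python
from datetime import datetime, timedelta
from statistics import mean
from typing import Dict, List, Optional, Tuple


def bucket_mean(
    points: List[Tuple[datetime, float]],
    start: datetime,
    end: datetime,
    resolution: timedelta,
) -> Dict[datetime, Optional[float]]:
    """Average points into fixed-width buckets covering [start, end)."""
    buckets: Dict[datetime, List[float]] = {}
    cursor = start
    while cursor < end:
        buckets[cursor] = []
        cursor += resolution
    for timestamp, value in points:
        if start <= timestamp < end:
            # Floor the timestamp to the beginning of its bucket.
            offset = (timestamp - start) // resolution
            buckets[start + offset * resolution].append(value)
    # Empty buckets are kept as None rather than dropped.
    return {key: (mean(values) if values else None) for key, values in buckets.items()}


start = datetime(2020, 1, 1, 0, 0)
resampled = bucket_mean(
    [
        (datetime(2020, 1, 1, 0, 1), 1.0),
        (datetime(2020, 1, 1, 0, 4), 3.0),
        (datetime(2020, 1, 1, 0, 25), 5.0),
    ],
    start=start,
    end=datetime(2020, 1, 1, 0, 30),
    resolution=timedelta(minutes=10),
)
# The 00:00 bucket averages to 2.0, the 00:10 bucket is None, the 00:20 bucket is 5.0.
```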

Helper functions for module imports.

gordo_core.import_utils.import_location(location: str, *, import_path: str | None = None, back_compatibles: dict[tuple[Optional[str], str], tuple[Optional[str], str]] | None = None) Any[source]#

Imports an entity from the provided location, or finds an entity named location in the import_path module.

Example

>>> import_location("multiprocessing.Process")
<class 'multiprocessing.context.Process'>
>>> import_location("Process", import_path="multiprocessing")
<class 'multiprocessing.context.Process'>

Parameters:
  • location – Import location. Could be either a full import path or just a class name.

  • import_path – Should be provided if location contains only the class name.

  • back_compatibles – See prepare_back_compatible_locations() function for reference.

Return type:

Imported entity.
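The lookup described above can be approximated with importlib. The helper import_by_location below is a hypothetical sketch, not the gordo_core implementation, and it leaves out the back_compatibles handling.

```python
import collections
import importlib
from typing import Any, Optional


def import_by_location(location: str, import_path: Optional[str] = None) -> Any:
    """Hypothetical helper; not gordo_core's implementation."""
    # Split "package.module.Name" into the module path and the entity name.
    module_path, _, name = location.rpartition(".")
    if not module_path:
        if import_path is None:
            raise ValueError(f"import_path is required for bare name {location!r}")
        module_path = import_path
    module = importlib.import_module(module_path)
    return getattr(module, name)


full = import_by_location("collections.OrderedDict")
short = import_by_location("OrderedDict", import_path="collections")
print(full is short)  # True
```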

gordo_core.import_utils.prepare_back_compatible_locations(locations: Iterable[tuple[str, str]]) dict[tuple[Optional[str], str], tuple[Optional[str], str]][source]#

The result of this function can be used as the back_compatibles argument of import_location().

Example

>>> prepare_back_compatible_locations([('old_module.MyClass', 'new_module.MyClass'),('OldClass', 'NewClass')])
{('old_module', 'MyClass'): ('new_module', 'MyClass'), (None, 'OldClass'): (None, 'NewClass')}

Each key and value in this example is a tuple whose first item is a module location (None when the location is a bare class name) and whose second item is a class name.

Parameters:

locations – List of locations. The first item of each tuple is the location in the previous version; the second item is the location in the current version.

Return type:

Mapping from locations in the previous version to locations in the current version.
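The transformation shown in the example can be reproduced in a few lines. The helpers split_location and build_back_compatibles are hypothetical names for this sketch, which may differ from the actual implementation.

```python
from typing import Dict, Iterable, Optional, Tuple

Location = Tuple[Optional[str], str]


def split_location(location: str) -> Location:
    # "pkg.module.Name" -> ("pkg.module", "Name"); "Name" -> (None, "Name").
    module_path, _, name = location.rpartition(".")
    return (module_path or None, name)


def build_back_compatibles(
    locations: Iterable[Tuple[str, str]]
) -> Dict[Location, Location]:
    return {split_location(old): split_location(new) for old, new in locations}


mapping = build_back_compatibles(
    [("old_module.MyClass", "new_module.MyClass"), ("OldClass", "NewClass")]
)
print(mapping)
# {('old_module', 'MyClass'): ('new_module', 'MyClass'), (None, 'OldClass'): (None, 'NewClass')}
```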

This module contains a list of import paths that were broken by the latest refactoring. See also gordo_core.import_utils

gordo_core.back_compatibles.DEFAULT_BACK_COMPATIBLES: Final[dict[tuple[Optional[str], str], tuple[Optional[str], str]]] = {(None, 'TimeSeriesDataset'): ('gordo_dataset.time_series', 'TimeSeriesDataset'), ('gordo_dataset.datasets', 'TimeSeriesDataset'): ('gordo_dataset.time_series', 'TimeSeriesDataset'), (None, 'RandomDataset'): ('gordo_dataset.time_series', 'RandomDataset'), ('gordo_dataset.datasets', 'RandomDataset'): ('gordo_dataset.time_series', 'RandomDataset'), (None, 'DataLakeProvider'): ('gordo_dataset.data_providers.dl.providers', 'DataLakeProvider'), ('gordo_dataset.data_provider.providers', 'DataLakeProvider'): ('gordo_dataset.data_providers.dl.providers', 'DataLakeProvider'), (None, 'InfluxDataProvider'): ('gordo_dataset.data_providers.providers', 'InfluxDataProvider'), ('gordo_dataset.data_provider.providers', 'InfluxDataProvider'): ('gordo_dataset.data_providers.providers', 'InfluxDataProvider'), (None, 'RandomDataProvider'): ('gordo_dataset.data_providers.providers', 'RandomDataProvider'), ('gordo_dataset.data_provider.providers', 'RandomDataProvider'): ('gordo_dataset.data_providers.providers', 'RandomDataProvider'), ('gordo_dataset.base', 'GordoBaseDataset'): ('gordo_core.base', 'GordoBaseDataset'), ('gordo_dataset.time_series', 'TimeSeriesDataset'): ('gordo_core.time_series', 'TimeSeriesDataset'), ('gordo_dataset.time_series', 'RandomDataset'): ('gordo_core.time_series', 'RandomDataset'), ('gordo_dataset.data_providers.base', 'GordoBaseDataProvider'): ('gordo_core.data_providers.base', 'GordoBaseDataProvider'), ('gordo_dataset.data_providers.providers', 'RandomDataProvider'): ('gordo_core.data_providers.providers', 'RandomDataProvider'), ('gordo_dataset.data_providers.providers', 'InfluxDataProvider'): ('gordo_core.data_providers.providers', 'InfluxDataProvider')}#

This constant should be used as the default value of the back_compatibles argument of gordo_core.import_utils.import_location(). Most of these paths are temporary and will be deprecated soon.
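Some entries in the mapping above chain: a bare class name maps to a gordo_dataset location, which in turn maps to a gordo_core location. The sketch below resolves a legacy location through three entries copied from the constant. The resolve helper is hypothetical, and whether import_location() follows such chains is not stated here, so the loop is an assumption.

```python
from typing import Dict, Optional, Tuple

Location = Tuple[Optional[str], str]

# Three entries copied from DEFAULT_BACK_COMPATIBLES.
BACK_COMPATIBLES: Dict[Location, Location] = {
    (None, "RandomDataset"): ("gordo_dataset.time_series", "RandomDataset"),
    ("gordo_dataset.datasets", "RandomDataset"): (
        "gordo_dataset.time_series",
        "RandomDataset",
    ),
    ("gordo_dataset.time_series", "RandomDataset"): (
        "gordo_core.time_series",
        "RandomDataset",
    ),
}


def resolve(location: Location, mapping: Dict[Location, Location]) -> Location:
    """Follow mapping entries until no newer location is known (assumed behaviour)."""
    seen = set()
    # The seen set guards against accidental cycles in the mapping.
    while location in mapping and location not in seen:
        seen.add(location)
        location = mapping[location]
    return location


resolved = resolve((None, "RandomDataset"), BACK_COMPATIBLES)
print(resolved)  # ('gordo_core.time_series', 'RandomDataset')
```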

Class attribute validators. Mainly used in classes extended from gordo_core.base.GordoBaseDataset.

class gordo_core.validators.BaseDescriptor[source]#

Bases: object

Base descriptor class

Subclasses should override the __set__(self, instance, value) method to check whether ‘value’ meets the requirements.
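A validating descriptor of this kind can be sketched as follows. NonEmptyStringList and TagHolder are hypothetical classes written for illustration; they do not subclass BaseDescriptor, and storing the value in the instance __dict__ is an assumption of the sketch.

```python
class NonEmptyStringList:
    """Hypothetical validating descriptor; not a gordo_core class."""

    def __set_name__(self, owner, name):
        # Remember the attribute name the descriptor was assigned to.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        # Validate before storing; reject anything but a non-empty list of str.
        if not isinstance(value, list) or not value:
            raise ValueError(f"{self.name} must be a non-empty list")
        if not all(isinstance(item, str) for item in value):
            raise ValueError(f"{self.name} must contain only strings")
        instance.__dict__[self.name] = value


class TagHolder:
    tag_list = NonEmptyStringList()

    def __init__(self, tag_list):
        # Assignment goes through NonEmptyStringList.__set__.
        self.tag_list = tag_list


holder = TagHolder(["tag-1", "tag-2"])
print(holder.tag_list)  # ['tag-1', 'tag-2']
```

Because the check lives in __set__, it runs on every assignment, including the one in __init__, so an invalid value never reaches the instance.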

class gordo_core.validators.ValidDataProvider[source]#

Bases: BaseDescriptor

Descriptor for attributes requiring type gordo_core.data_providers.base.GordoBaseDataProvider

class gordo_core.validators.ValidDataset[source]#

Bases: BaseDescriptor

Descriptor for attributes requiring type gordo_core.base.GordoBaseDataset

class gordo_core.validators.ValidDatasetKwargs[source]#

Bases: BaseDescriptor

Descriptor for attributes requiring type gordo_core.base.GordoBaseDataset

class gordo_core.validators.ValidDatetime[source]#

Bases: BaseDescriptor

Descriptor for attributes requiring a valid datetime.datetime value

class gordo_core.validators.ValidTagList[source]#

Bases: BaseDescriptor

Descriptor for attributes requiring a non-empty list of strings

Date partitions helper functions.

Todo

Move module gordo_core.data_providers.partition to gordo_core.partition

class gordo_core.data_providers.partition.MonthPartition(year: int, month: int)[source]#

Bases: object

month: int#
year: int#

class gordo_core.data_providers.partition.PartitionBy(value)[source]#

Bases: Enum

An enumeration.

MONTH = 'month'#
YEAR = 'year'#

classmethod find_by_name(name) PartitionBy | None[source]#

class gordo_core.data_providers.partition.YearPartition(year: int)[source]#

Bases: object

year: int#

gordo_core.data_providers.partition.split_by_partitions(partition_by: PartitionBy, start_period: datetime, end_period: datetime) Iterable[YearPartition | MonthPartition][source]#

Split time span by partitions

Parameters:
  • partition_by – Partition chunk size, either year or month.

  • start_period – First date of time span.

  • end_period – Last date of time span.
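Splitting a time span into year or month partitions can be sketched as below. The classes are local stand-ins that mirror the names documented above, split_span is a hypothetical function, and treating both the first and last dates as inclusive is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Iterator, Union


class PartitionBy(Enum):  # local stand-in for the documented enum
    YEAR = "year"
    MONTH = "month"


@dataclass(frozen=True)
class YearPartition:  # local stand-in
    year: int


@dataclass(frozen=True)
class MonthPartition:  # local stand-in
    year: int
    month: int


def split_span(
    partition_by: PartitionBy, start: datetime, end: datetime
) -> Iterator[Union[YearPartition, MonthPartition]]:
    """Yield every partition touched by the span (both ends assumed inclusive)."""
    if partition_by is PartitionBy.YEAR:
        for year in range(start.year, end.year + 1):
            yield YearPartition(year)
        return
    # Count months from year 0 so that year boundaries need no special casing.
    first = start.year * 12 + (start.month - 1)
    last = end.year * 12 + (end.month - 1)
    for index in range(first, last + 1):
        yield MonthPartition(year=index // 12, month=index % 12 + 1)


months = list(
    split_span(PartitionBy.MONTH, datetime(2020, 11, 15), datetime(2021, 2, 1))
)
years = list(split_span(PartitionBy.YEAR, datetime(2020, 11, 15), datetime(2021, 2, 1)))
# months covers 2020-11, 2020-12, 2021-01 and 2021-02; years covers 2020 and 2021.
```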

Data provider utils.

gordo_core.data_providers.utils.build_dir_path(storage: FileSystem, base_dir: str, field_values: Iterable[Tuple[str, str]]) str[source]#

Deprecated since version 0.3.0: Will be removed.

gordo_core.data_providers.utils.partition_dir_name(field: str, value: str)[source]#

Deprecated since version 0.3.0: Will be removed.