geometamaker.geometamaker module
- geometamaker.geometamaker.describe(source_dataset_path, compute_stats=False)[source]
Create a metadata resource instance with properties of the dataset.
Properties of the dataset are used to populate as many metadata properties as possible. Default/placeholder values are used for properties that require user input.
- Parameters:
source_dataset_path (string) – path or URL to dataset to which the metadata applies
compute_stats (bool) – whether to compute statistics for each band in a raster.
- Returns:
a metadata object
- Return type:
- Raises:
ValueError – if the file type or protocol of the dataset is not supported.
FileNotFoundError – if the path does not exist.
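A minimal sketch of calling describe while handling the two documented exceptions. The helper name try_describe and the choice to return None on failure are hypothetical and used only for illustration. The import path follows the module name shown on this page.

```python
def try_describe(path, compute_stats=False):
    """Return a metadata object for `path`, or None if it cannot be described.

    `try_describe` is a hypothetical helper, not part of geometamaker.
    """
    try:
        # Import path taken from the module name documented on this page.
        from geometamaker.geometamaker import describe
    except ImportError:
        return None  # geometamaker is not installed in this environment

    try:
        return describe(path, compute_stats=compute_stats)
    except FileNotFoundError:
        # Documented: raised if the path does not exist.
        print(f"not found: {path}")
    except ValueError as error:
        # Documented: raised for an unsupported file type or protocol.
        print(f"unsupported file type or protocol: {error}")
    return None
```

Returning None keeps a batch loop over many paths running when one dataset is missing or unsupported. Callers that need to distinguish the two failures can let the exceptions propagate instead.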
- geometamaker.geometamaker.describe_archive(source_dataset_path, scheme, **kwargs)[source]
Describe file properties of an archive file.
- Parameters:
source_dataset_path (str) – path to a file.
scheme (str) – the protocol prefix of the filepath
kwargs (dict) – additional options when describing a dataset.
- Returns:
dict
- geometamaker.geometamaker.describe_collection(directory, depth=32767, exclude_regex=None, exclude_hidden=True, describe_files=False, backup=True, target_filename=None, **kwargs)[source]
Create a single metadata document to describe a collection of files.
Describe all the files within a directory as members of a “collection”. The resulting metadata resource should include a list of all the files included in the collection along with a description and metadata filepath (or placeholder). Optionally create individual metadata files for each supported file in a directory.
- Parameters:
directory (str) – path to collection
depth (int, optional) – maximum number of subdirectory levels to traverse when walking through directory to find files included in the collection. A value of 1 limits the walk to files in the top-level directory only. A value of 2 allows descending into immediate subdirectories, etc. All files in all subdirectories in the collection will be included by default.
exclude_regex (str, optional) – a regular expression to pattern-match any files you do not want included in the output metadata yml.
exclude_hidden (bool, default True) – whether to exclude hidden files (files that start with “.”).
describe_files (bool, default False) – whether to describe all files, i.e., create individual metadata files for each supported resource in the collection.
backup (bool) – whether to write a backup of a pre-existing metadata file before overwriting it in cases where that file is not a valid geometamaker document.
kwargs (dict) – optional keyword arguments accepted by describe.
- Returns:
Collection metadata
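The depth, exclude_regex and exclude_hidden parameters can be illustrated with a standard-library sketch. This is not geometamaker's implementation. The function name list_collection_files is hypothetical, and three behaviours are assumptions: the regular expression is searched against the path relative to directory, hidden directories are skipped along with hidden files, and results are returned sorted.

```python
import os
import re


def list_collection_files(directory, depth=32767, exclude_regex=None,
                          exclude_hidden=True):
    """List file paths relative to `directory`, applying the documented filters.

    Hypothetical helper for illustration; not geometamaker's own code.
    """
    pattern = re.compile(exclude_regex) if exclude_regex else None
    root = os.path.abspath(directory)
    selected = []
    for current, subdirs, files in os.walk(root):
        relative_dir = os.path.relpath(current, root)
        # The top-level directory is level 1, its children are level 2, etc.
        if relative_dir == os.curdir:
            level = 1
        else:
            level = relative_dir.count(os.sep) + 2
        if level >= depth:
            subdirs[:] = []  # stop os.walk from descending any further
        if exclude_hidden:
            # Assumption: hidden directories are skipped as well as hidden files.
            subdirs[:] = [d for d in subdirs if not d.startswith('.')]
        for name in files:
            if exclude_hidden and name.startswith('.'):
                continue
            relative_path = os.path.normpath(os.path.join(relative_dir, name))
            # Assumption: the pattern is searched against the relative path.
            if pattern and pattern.search(relative_path):
                continue
            selected.append(relative_path)
    return sorted(selected)
```

With a tree containing a.tif, sub/b.csv and sub/deep/c.shp, depth=1 selects only a.tif, depth=2 adds sub/b.csv, and the default depth selects all three.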
- geometamaker.geometamaker.describe_file(source_dataset_path, scheme)[source]
Describe basic properties of a file.
- Parameters:
source_dataset_path (str) – path to a file.
scheme (str) – the protocol prefix of the filepath
- Returns:
dict
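Several functions in this module take a scheme argument, described as the protocol prefix of the filepath. A rough standard-library sketch of deriving such a prefix follows. The function name guess_scheme is hypothetical, and both the 'file' default for local paths and the handling of Windows drive letters are assumptions, not documented geometamaker behaviour.

```python
from urllib.parse import urlparse


def guess_scheme(path):
    """Return the protocol prefix of `path`, defaulting to 'file'.

    Hypothetical helper for illustration; not part of geometamaker.
    """
    scheme = urlparse(path).scheme
    # An empty scheme means a plain local path. A single letter is treated
    # as a Windows drive (e.g. "C:\\data"), not as a protocol.
    if len(scheme) <= 1:
        return 'file'  # assumption: local paths map to 'file'
    return scheme
```

For example, guess_scheme("https://example.com/data/dem.tif") gives "https", while a relative path such as "data/dem.tif" falls back to "file".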
- geometamaker.geometamaker.describe_raster(source_dataset_path, scheme, **kwargs)[source]
Describe properties of a GDAL raster file.
- Parameters:
source_dataset_path (str) – path to a GDAL raster.
scheme (str) – the protocol prefix of the filepath
kwargs (dict) – additional options when describing a dataset:
'compute_stats' (bool): whether to compute statistics for each band in the raster. Default is False.
- Returns:
dict
- geometamaker.geometamaker.describe_table(source_dataset_path, scheme, **kwargs)[source]
Describe properties of a tabular dataset.
- Parameters:
source_dataset_path (str) – path to a file representing a table.
scheme (str) – the protocol prefix of the filepath
kwargs (dict) – additional options when describing a dataset.
- Returns:
dict
- Raises:
ValueError – if the file cannot be read as a table.
- geometamaker.geometamaker.describe_vector(source_dataset_path, scheme, **kwargs)[source]
Describe properties of a GDAL vector file.
- Parameters:
source_dataset_path (str) – path to a GDAL vector.
scheme (str) – the protocol prefix of the filepath
kwargs (dict) – additional options when describing a dataset.
- Returns:
dict
- geometamaker.geometamaker.detect_file_type(filepath, scheme)[source]
Detect the type of resource contained in the file.
- Parameters:
filepath (str) – path to a file
scheme (str) – the protocol prefix of the filepath
- Returns:
str
- Raises:
ValueError – on unsupported file formats.
- geometamaker.geometamaker.validate(filepath)[source]
Validate a YAML metadata document.
Validation includes type-checking of property values and checking for the presence of required properties.
- Parameters:
filepath (string) – path to a YAML file
- Returns:
pydantic.ValidationError
- Raises:
ValueError – if the YAML document is not a geometamaker metadata doc.
- geometamaker.geometamaker.validate_dir(directory, depth=32767)[source]
Validate all compatible yml documents in the directory.
- Parameters:
directory (string) – path to a directory
depth (int) – maximum number of subdirectory levels to traverse when walking through directory.
- Returns:
a list of the filepaths that were validated and an equal-length list of the validation messages.
- Return type:
tuple (list, list)
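Because validate_dir returns two parallel lists, it can help to pair them before reporting. The sketch below uses a hypothetical helper, pair_validation_results, and makes no assumption about what each validation message contains.

```python
def pair_validation_results(filepaths, messages):
    """Map each validated filepath to its validation message.

    Hypothetical helper for illustration; not part of geometamaker.
    """
    if len(filepaths) != len(messages):
        raise ValueError('expected two equal-length lists')
    # zip pairs the i-th filepath with the i-th message.
    return dict(zip(filepaths, messages))
```

Typical use would be to unpack the documented return value, for example `filepaths, messages = validate_dir(directory)`, and pass both lists to the helper.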