pardata.get_dataset_metadata
- pardata.get_dataset_metadata(name, *, version='latest')
Return a dataset’s metadata either in human-readable form or as a copy of its schema.
- Parameters
name (str) – Name of the dataset whose metadata you want to get. You can get a list of available datasets by calling list_all_datasets().
version (str) – Version of the dataset to load. The latest version is used by default. You can get a list of all available versions for a dataset by calling list_all_datasets().
- Returns
A dataset’s metadata.
- Return type
Dict[str, Any]
Example:
>>> import pprint
>>> from pardata import get_dataset_metadata
>>> metadata = get_dataset_metadata('gmb')
>>> metadata['name']
'Groningen Meaning Bank Modified'
>>> metadata['description']
'A dataset of multi-sentence texts, together with annotations for parts-of-speech...
>>> pprint.pprint(metadata['subdatasets'])
{'gmb_subset_full': {'description': 'A full version of the raw dataset. Used '
                                    'to train MAX model – Named Entity Tagger.',
                     'format': 'text/plain',
                     'name': 'GMB Subset Full',
                     'path': 'groningen_meaning_bank_modified/gmb_subset_full.txt'}}
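Because the return value is a plain dictionary, it can be processed with ordinary Python. The sketch below is not part of pardata: `summarize_subdatasets` is a hypothetical helper, and the sample dictionary is a trimmed copy of the 'gmb' output in the example, used so the snippet runs without downloading anything. It assumes each entry under 'subdatasets' carries 'format' and 'path' keys, as that output suggests.

```python
from typing import Any, Dict, List, Tuple

# Trimmed sample copied from the 'gmb' example; in real use this would be
# the dictionary returned by get_dataset_metadata('gmb').
metadata: Dict[str, Any] = {
    'name': 'Groningen Meaning Bank Modified',
    'subdatasets': {
        'gmb_subset_full': {
            'format': 'text/plain',
            'name': 'GMB Subset Full',
            'path': 'groningen_meaning_bank_modified/gmb_subset_full.txt',
        },
    },
}


def summarize_subdatasets(metadata: Dict[str, Any]) -> List[Tuple[str, str, str]]:
    """Hypothetical helper: return (key, format, path) for each subdataset."""
    subdatasets = metadata.get('subdatasets', {})
    # Sort by key so the result order is stable; fall back to an empty
    # string when a subdataset lacks 'format' or 'path'.
    return [
        (key, info.get('format', ''), info.get('path', ''))
        for key, info in sorted(subdatasets.items())
    ]


for key, fmt, path in summarize_subdatasets(metadata):
    print(f'{key}: {fmt} at {path}')
```

Using `.get()` with defaults keeps the helper tolerant of datasets whose metadata omits 'subdatasets' or one of the per-subdataset fields.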