Datasets group multiple versions of the same data (“snapshots”) and associate them with relevant metadata that remains constant from snapshot to snapshot. A dataset does not need to have a snapshot in order to exist; however, it will not be visible in the user interface without one. Every snapshot must be associated with a dataset.
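The relationship described above can be sketched as a small data model. This is an illustration only: the class names `Dataset` and `Snapshot`, the `dataset_id` field, and the `visible_in_ui` helper are hypothetical and not part of the API. Only `id`, `display_name`, and `current_snapshot` correspond to attributes documented on this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Snapshot:
    id: str
    dataset_id: str  # hypothetical field: every snapshot must belong to a dataset

    def __post_init__(self) -> None:
        # Reject snapshots that are not tied to a dataset.
        if not self.dataset_id:
            raise ValueError("a snapshot must be associated with a dataset")


@dataclass
class Dataset:
    id: str
    display_name: str
    # A dataset can exist before it has any snapshot.
    current_snapshot: Optional[Snapshot] = None

    def visible_in_ui(self) -> bool:
        # Hypothetical helper that models only the snapshot condition;
        # publication status and access control are ignored here.
        return self.current_snapshot is not None
```

A dataset created without a snapshot is valid in this sketch, but `visible_in_ui()` returns `False` until `current_snapshot` is set.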

Dataset attributes

| Attribute | Type | Description | Flag |
| --- | --- | --- | --- |
| ancestors | array | An ordered array of collections describing the hierarchy of collections that contain this dataset. The first item in this array is the topmost collection in the hierarchy, and the last item is the immediate parent of the dataset. | READ |
| citation | string | Identifies the source of the dataset. | |
| created_at | string <date-time> | Date-time stamp indicating when the dataset was created. | READ |
| current_snapshot | object<Snapshot> | The snapshot attributes [2] for the dataset’s most recent (“current”) snapshot. | READ |
| data_updated_at | string <date-time> | A date-time stamp indicating the last time the data changed (not just a new snapshot with identical data). | READ |
| dependencies | array | Administrative metadata [4]. | |
| derivatives | array | Administrative metadata [4]. | |
| description | string | Description of the dataset (displayed in the Assembly client interface). | |
| description_short | string | Short description of the dataset. | |
| display_name | string | Human-readable name for the dataset (used to refer to it in lists and elsewhere in the user interface). | REQUIRED |
| editable | boolean | true if administrators can edit the dataset metadata through the user interface; false otherwise. | READ |
| extraction_type | string | Administrative metadata [4]. | |
| highlights | array | For datasets returned based on a search string, this provides positional information indicating where instances of the specified query string occur. | READ |
| id | string <uuid> | The dataset’s unique identifier. This never changes. | |
| key_value | object | A dictionary object containing key-value pairs (used for custom metadata) [3]. | |
| modified_at | string <date-time> | A date-time stamp indicating the last time the dataset was modified. | READ |
| one_time_ingestion | boolean | Administrative metadata [4]. | |
| parent_collection | object | The collection that directly contains this dataset. | |
| published | boolean | true if this dataset is visible to non-admin users; false otherwise [1]. | |
| resource_owner | string | Administrative metadata [4]. | |
| resource_owner_contact | string | Administrative metadata [4]. | |
| schema_updated_at | string <date-time> | A date-time stamp indicating the last time the schema changed (not just a new snapshot with an identical schema). | READ |
| score | number | A relevance score assigned when using the query parameter to search for matching datasets. | READ |
| source_description | string | Administrative metadata [4]. | |
| source_lag_time | object | Administrative metadata [4]. | |
| source_type | string | Administrative metadata [4]. | |
| source_update_schedule | object | Administrative metadata [4]. | |
| tags | array | The tags assigned to this dataset [5]. | |
| technical_owner | string | Administrative metadata [4]. | |
| technical_owner_contact | string | Administrative metadata [4]. | |
| terms_of_service | string | Administrative metadata [4]. | |

1 It’s possible to have an access control policy on a parent collection that makes unpublished datasets visible to non-admin users.

2 See Snapshot attributes for details.

3 key_value supports custom metadata. Any custom metadata is displayed under the “Custom fields” heading in the dataset inspector pane in the Enigma Public user interface.

4 Administrative metadata provides additional information about the dataset. See Administrative metadata below.

5 The tags attribute is not yet supported.
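When preparing dataset metadata to send back to the API, it can help to separate the attributes marked READ from the rest. The sketch below assumes that READ means the attribute is returned by the API but cannot be set by the client, and that REQUIRED means the attribute must be present when a dataset is created. `writable_payload` is a hypothetical helper, not part of any client library.

```python
# Attributes marked READ in the attribute list above (assumed read-only).
READ_ONLY = frozenset({
    "ancestors", "created_at", "current_snapshot", "data_updated_at",
    "editable", "highlights", "modified_at", "schema_updated_at", "score",
})

# Attributes marked REQUIRED in the attribute list above.
REQUIRED = frozenset({"display_name"})


def writable_payload(dataset: dict) -> dict:
    """Return a copy of `dataset` with read-only attributes removed.

    Raises ValueError if a required attribute is missing or empty.
    """
    missing = sorted(name for name in REQUIRED if not dataset.get(name))
    if missing:
        raise ValueError("missing required attribute(s): " + ", ".join(missing))
    return {key: value for key, value in dataset.items() if key not in READ_ONLY}
```

For instance, passing a dictionary that contains `display_name`, `score`, and `key_value` returns a dictionary with only `display_name` and `key_value`, because `score` is marked READ.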

Sample dataset model

Administrative metadata

Administrative metadata provides additional information about the dataset. Most of it is added by Enigma Public administrators. For example, source_update_schedule indicates how often the data is refreshed (such as weekly or monthly), and source_lag_time indicates the delay between the period being measured and publication of the data: if GDP data is not published until 3 months after the period being measured, then the lag time is 3 months.
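The lag-time example can be made concrete with a small calculation. This page does not specify the structure of the source_lag_time object, so the sketch below simply assumes the lag is already known as a whole number of months. `add_months` and `expected_publication` are hypothetical helpers.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date forward by whole months, clamping to the month's last day."""
    index = d.month - 1 + months
    year = d.year + index // 12
    month = index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def expected_publication(period_end: date, lag_months: int) -> date:
    """Earliest date on which data for the period ending `period_end` is expected."""
    return add_months(period_end, lag_months)


# GDP for the quarter ending 31 March 2023, assuming a 3-month lag.
print(expected_publication(date(2023, 3, 31), 3))  # 2023-06-30
```

Under this assumption, data for a period ending on 31 March would not be expected before 30 June.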