Returns the snapshot model for the snapshot specified by the {snapshot_id} parameter. This endpoint takes the same query parameters and returns the same data as the GET /datasets/{dataset_id}/snapshots/{snapshot_id} endpoint.

If you set row_limit > 0, the API also returns the specified number of data records (rows) from the snapshot [1]. The sum row_offset + row_limit must be less than 10,000. To obtain more rows than this, use GET /export/{snapshot_id}.
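The paging limit described above can be sketched as a small helper. This is only an illustration: the function name page_windows and the default page size of 1,000 are assumptions, and the only thing taken from the text is that row_offset + row_limit must stay below 10,000.

```python
MAX_WINDOW = 10_000  # documented cap: row_offset + row_limit must be less than this


def page_windows(total_rows, page_size=1000):
    """Yield (row_offset, row_limit) pairs that stay under the documented cap.

    Hypothetical helper: rows at or beyond the cap are not reachable through
    this endpoint and would need GET /export/{snapshot_id} instead.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    reachable = min(total_rows, MAX_WINDOW - 1)  # keeps offset + limit < 10,000
    offset = 0
    while offset < reachable:
        limit = min(page_size, reachable - offset)
        yield offset, limit
        offset += limit
```

Each pair can be sent as the row_offset and row_limit query parameters of one request. For a snapshot with more rows than the cap allows, the helper simply stops at the last reachable row.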

If you specify a query string, the API returns matching rows from the snapshot, as well as positional information indicating where instances of the specified query string occur (you can use this to highlight text in a search results page). You must set row_limit > 0 when specifying a query string [1].

Query parameters

Name | Type | Description | Required?
row_sort | string | Specifies the field used to sort the records (must set row_limit > 0). If you specify row_offset as well, the records are sorted first and then the offset is applied. Prepend the field name with a minus sign (-) to specify descending order (defaults to ascending). | false
highlight | boolean | When true, for rows returned based on a query string, the endpoint sets the returned highlights attribute with positional information indicating where instances of the specified string occur. Defaults to false (highlights = []). | false
row_limit | integer | Number of rows to return (defaults to 0; maximum 10,000) [1]. | false
ancestor | string | Not used. | false
row_offset | integer | Number of rows to skip at the beginning of the snapshot (for example, row_offset=10 skips the first 10 rows). If you specify row_sort as well, the records are sorted first and then the offset is applied. | false
query | string | Query string to return only rows that contain specific information (must set row_limit > 0) [1]. | false
stats | boolean | When true, the endpoint returns a stats attribute with statistical information about the snapshot [3]. Defaults to false. | false
query_mode | string | advanced [2] or simple. Defaults to simple. | false

[1] row_limit only works for snapshots designated as the “current snapshot” within the parent dataset.

[2] advanced query mode is not supported.

[3] For information about the stats parameter, see Snapshot statistics below.
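As a rough sketch of how the parameters above combine into a request URL, the helper below assembles one from keyword arguments. The function name build_snapshot_url and its sort_field/descending arguments are hypothetical, and serializing booleans as lowercase true/false is an assumption based on the stats=true form used in this document.

```python
from urllib.parse import quote, urlencode

# Host and path as used in this document's curl example.
BASE_URL = "https://public.enigma.com/api/snapshots"


def build_snapshot_url(snapshot_id, sort_field=None, descending=False, **params):
    """Assemble a request URL from the documented query parameters."""
    query = {}
    if sort_field is not None:
        # A leading minus sign requests descending order.
        query["row_sort"] = ("-" if descending else "") + sort_field
    for name, value in params.items():
        if value is None:
            continue  # omit parameters that were not set
        if isinstance(value, bool):
            # Assumed serialization, matching the stats=true form in the text.
            value = "true" if value else "false"
        query[name] = value
    url = f"{BASE_URL}/{snapshot_id}"
    if query:
        url += "?" + urlencode(query, quote_via=quote)
    return url
```

For example, build_snapshot_url("abc", sort_field="prevailing_wage", descending=True, row_limit=10) produces a URL ending in ?row_sort=-prevailing_wage&row_limit=10.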

GET https://public.enigma.com/api/snapshots/{snapshot_id}

Responses

Code | Returns
200 | The snapshot model plus any resulting data.
401 | Invalid login credentials
404 | Requested resource not found
405 | Method not allowed
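One way a client might act on these response codes is sketched below. The exception class SnapshotRequestError and the function check_status are hypothetical names introduced for illustration; only the code-to-meaning mapping comes from the table above.

```python
# Meanings copied from the response table above.
STATUS_MEANINGS = {
    200: "The snapshot model plus any resulting data.",
    401: "Invalid login credentials",
    404: "Requested resource not found",
    405: "Method not allowed",
}


class SnapshotRequestError(Exception):
    """Hypothetical exception type for non-200 responses."""


def check_status(status_code):
    """Return normally for 200; raise with the documented meaning otherwise."""
    if status_code == 200:
        return
    meaning = STATUS_MEANINGS.get(status_code, "Undocumented status code")
    raise SnapshotRequestError(f"HTTP {status_code}: {meaning}")
```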

Example

This example returns the snapshot attributes and the first 10 rows containing the string “Apple Inc” from the specified snapshot (from the “H-1B Visa Applications - 2015” dataset). The last parameter specifies that rows be sorted by the “Prevailing Wage” column (prevailing_wage) in descending order.

$ curl -X GET 'https://public.enigma.com/api/snapshots/df2d466d-8da4-46bd-bd15-8d5318889f2a?query=Apple%20Inc&row_limit=10&row_sort=-prevailing_wage'
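The same request could be issued from Python roughly as follows, using only the standard library. This is a sketch: the helper name fetch_snapshot is an assumption, it does no authentication or error handling, and it is defined but not called here because it performs network I/O.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

SNAPSHOT_ID = "df2d466d-8da4-46bd-bd15-8d5318889f2a"
params = {"query": "Apple Inc", "row_limit": 10, "row_sort": "-prevailing_wage"}

# quote_via=quote encodes the space as %20, as in the curl command.
url = (
    f"https://public.enigma.com/api/snapshots/{SNAPSHOT_ID}?"
    + urlencode(params, quote_via=quote)
)


def fetch_snapshot(request_url):
    """Send the GET request and decode the JSON body (performs network I/O)."""
    with urlopen(request_url) as response:
        return json.load(response)
```

Calling fetch_snapshot(url) would return the decoded response as a Python dictionary, provided the server accepts the request.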

Snapshot statistics

If you include the query parameter stats=true, the API returns statistical information about the snapshot. If you add query=<query_string>, you’ll get statistics for just the matching rows, as shown in the sample output below.

If you use query= together with stats=true, you don’t need to specify row_limit= unless you want the actual rows as well. If you do specify row_limit=, stats will be for all matching rows, not just the rows that are returned.
  "stats": {
    "api_number": {
      "buckets": [
        {
          "doc_count": 11, 
          "key": "608174119700"
        }
      ], 
      "doc_count_error_upper_bound": 0, 
      "fill_rate": 1.0, 
      "sum_other_doc_count": 0
    }, 
    "gas": {
      "avg": 536440.8181818182, 
      "count": 11, 
      "fill_rate": 1.0, 
      "max": 783719.0, 
      "min": 0.0, 
      "sum": 5900849.0
    }, 
    "month": {
      "avg": 1401604363636.3635, 
      "avg_as_string": "2014-06-01T06:32:43.636Z", 
      "buckets": [
        {
          "doc_count": 1, 
          "key": 1388534400000, 
          "key_as_string": "2014-01-01T00:00:00.000Z"
        }, 
        {
          "doc_count": 1, 
          "key": 1391212800000, 
          "key_as_string": "2014-02-01T00:00:00.000Z"
        }, 
        {
          "doc_count": 1, 
          "key": 1393632000000, 
          "key_as_string": "2014-03-01T00:00:00.000Z"
        }, 
        {
          "doc_count": 1, 
          "key": 1396310400000, 
          "key_as_string": "2014-04-01T00:00:00.000Z"
        }, 
        {
          "doc_count": 1, 
          "key": 1398902400000, 
          "key_as_string": "2014-05-01T00:00:00.000Z"
        }
      ], 
      "count": 11, 
      "doc_count_error_upper_bound": 0, 
      "fill_rate": 1.0, 
      "max": 1414800000000.0, 
      "max_as_string": "2014-11-01T00:00:00.000Z", 
      "min": 1388534400000.0, 
      "min_as_string": "2014-01-01T00:00:00.000Z", 
      "sum": 15417648000000.0, 
      "sum_as_string": "2458-07-26T00:00:00.000Z", 
      "sum_other_doc_count": 6
    }, 
    "oil": {
      "avg": 512950.7272727273, 
      "count": 11, 
      "fill_rate": 1.0, 
      "max": 755074.0, 
      "min": 0.0, 
      "sum": 5642458.0
    }, 
    "serial_ea8cd08a_f55d_4e12_bc35_aa45596c86dc": {
      "avg": 60157.0, 
      "count": 11, 
      "fill_rate": 1.0, 
      "max": 109953.0, 
      "min": 8466.0, 
      "sum": 661727.0
    }, 
    "serialid": {
      "avg": 60158.0, 
      "count": 11, 
      "fill_rate": 1.0, 
      "max": 109954.0, 
      "min": 8467.0, 
      "sum": 661738.0
    }, 
    "water": {
      "avg": 131.54545454545453, 
      "count": 11, 
      "fill_rate": 1.0, 
      "max": 595.0, 
      "min": 0.0, 
      "sum": 1447.0
    }
  }, 
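The figures in the sample output can be cross-checked with a few lines of Python. The sketch below copies values from the "gas" and "month" columns; the function names are hypothetical, and reading sum_other_doc_count as the number of rows that fall outside the listed buckets is an assumption that happens to agree with this sample.

```python
import math

# Values copied from the "gas" and "month" columns of the sample output.
gas = {"avg": 536440.8181818182, "count": 11, "sum": 5900849.0}
month = {
    "count": 11,
    "sum_other_doc_count": 6,
    "buckets": [{"doc_count": 1} for _ in range(5)],
}


def mean_matches(column):
    """True when avg equals sum / count, within floating-point tolerance."""
    return math.isclose(column["avg"], column["sum"] / column["count"])


def rows_outside_buckets(column):
    """Rows not covered by the listed buckets, derived from count."""
    return column["count"] - sum(b["doc_count"] for b in column["buckets"])
```

Here mean_matches(gas) holds, and rows_outside_buckets(month) equals the sample's sum_other_doc_count of 6.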

In the JSON response, each child of the stats object represents one column. The information returned for each column depends on the column type:

Type | Stats returned | Description
integer, decimal | avg | Mean of non-empty values
  | count | Number of non-empty values
  | fill_rate | Fraction of cells that are filled (1.0 = all cells are filled; 0.0 = all cells are empty)
  | max | Maximum value
  | min | Minimum value
  | sum | Sum of all non-empty values
string | buckets | Top five strings by frequency of occurrence, with the key (string) and doc_count (count) for each
  | fill_rate | Fraction of cells that are filled (1.0 = all cells are filled; 0.0 = all cells are empty)
datetime | buckets | Top five datetimes by frequency of occurrence, with the key (datetime), key_as_string (datetime), and doc_count (count) for each
  | count | Number of non-empty values
  | fill_rate | Fraction of cells that are filled (1.0 = all cells are filled; 0.0 = all cells are empty)
  | max | Latest date as milliseconds since January 1, 1970 (midnight UTC/GMT)
  | max_as_string | Latest date in ISO 8601 YYYY-MM-DDThh:mm:ss.sTZD format
  | min | Earliest date as milliseconds since January 1, 1970 (midnight UTC/GMT)
  | min_as_string | Earliest date in ISO 8601 YYYY-MM-DDThh:mm:ss.sTZD format
  | sum | Sum of dates as milliseconds since January 1, 1970 (midnight UTC/GMT)
boolean | avg | Mean of non-empty values (true = 1; false = 0)
  | avg_as_string | True if avg >= 0.5; false otherwise
  | buckets | Values by frequency of occurrence, with the key (numeric value where 1 = true; 0 = false), key_as_string (boolean value), and doc_count (count) for each
  | count | Number of non-empty values
  | fill_rate | Fraction of cells that are filled (1.0 = all cells are filled; 0.0 = all cells are empty)
  | max | Maximum value (1 or 0)
  | max_as_string | Maximum value as a string (true or false)
  | min | Minimum value (1 or 0)
  | min_as_string | Minimum value as a string (true or false)
  | sum | Sum of all non-empty values (true = 1; false = 0)
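For datetime columns, the numeric values and their *_as_string counterparts are related by a plain epoch conversion. The sketch below reproduces the string form from the millisecond value; the function name millis_to_iso is hypothetical, and the output format is modeled on the strings in the sample output rather than on any stated specification.

```python
from datetime import datetime, timezone


def millis_to_iso(millis):
    """Convert milliseconds since the Unix epoch (UTC) to an ISO 8601 string."""
    millis = int(millis)  # stats values arrive as floats; drop any fraction
    seconds, ms = divmod(millis, 1000)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%S") + f".{ms:03d}Z"
```

Applied to the sample's min value of 1388534400000.0, this gives 2014-01-01T00:00:00.000Z, which matches min_as_string.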