
Enigma Public API Reference

API endpoints

Returns

The collection model for each collection.

Examples

Returns

The collection model for the specified collection.

Examples

Examples

Using the filter parameter

The examples below show how to use the filter parameter. Note that display_name, created_at, and data_updated_at are the only attributes currently supported:

  • filter=display_name=PhilGEPS%20-%20Awards returns only datasets named “PhilGEPS - Awards.” It works for exact matches only.
  • filter=display_name<G returns only datasets whose names begin with a letter that sorts before ‘G’.
  • filter=data_updated_at>2017-11-29 returns datasets with snapshots updated since the specified date.
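
As a rough sketch, the filter expressions above could be assembled into request URLs as shown below. The endpoint path https://public.enigma.com/api/datasets/ and the helper name dataset_filter_url are assumptions made for illustration; only the attribute names, operators, and values come from the examples above.

```python
from urllib.parse import quote

# Assumed endpoint for listing datasets; adjust it to the endpoint you call.
DATASETS_URL = "https://public.enigma.com/api/datasets/"

# Attributes and operators that appear in the filter examples above.
SUPPORTED_ATTRIBUTES = {"display_name", "created_at", "data_updated_at"}
SUPPORTED_OPERATORS = {"=", "<", ">"}


def dataset_filter_url(attribute: str, operator: str, value: str) -> str:
    """Build a datasets URL with one filter expression (hypothetical helper)."""
    if attribute not in SUPPORTED_ATTRIBUTES:
        raise ValueError(f"filter does not support attribute {attribute!r}")
    if operator not in SUPPORTED_OPERATORS:
        raise ValueError(f"unexpected operator {operator!r}")
    # Percent-encode the value so that spaces become %20.
    return f"{DATASETS_URL}?filter={attribute}{operator}{quote(value, safe='')}"


print(dataset_filter_url("display_name", "=", "PhilGEPS - Awards"))
print(dataset_filter_url("data_updated_at", ">", "2017-11-29"))
```

Rejecting unsupported attributes before sending the request is a design choice of this sketch, not documented API behavior.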

Using the sort parameter

The sort parameter determines the order in which the API returns datasets. The default is display_name in ascending order, but you can also sort by created_at, data_updated_at, or schema_updated_at. To sort in descending order, prefix the attribute with a minus sign, for example, sort=-created_at.

For datasets that haven’t been updated, data_updated_at and schema_updated_at will be null. Null values are returned as “the most recent,” so if you want to sort by -data_updated_at or -schema_updated_at to obtain the most recently updated datasets, you’ll need to filter out those with null values. You can do this using the filter parameter with the value >1970, for example ?sort=-data_updated_at&filter=data_updated_at>1970.
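
The combination of a descending sort and a null-excluding filter can be sketched as follows. The helper names sort_param and recently_updated_query are hypothetical; the resulting query string is the one given in the paragraph above.

```python
# Attributes the text above lists as valid sort keys.
SORTABLE = {"display_name", "created_at", "data_updated_at", "schema_updated_at"}


def sort_param(field: str, descending: bool = False) -> str:
    """Return the value for the sort parameter (hypothetical helper)."""
    if field not in SORTABLE:
        raise ValueError(f"cannot sort by {field!r}")
    # A leading minus sign requests descending order.
    return f"-{field}" if descending else field


def recently_updated_query() -> str:
    """Most recently updated datasets first, skipping never-updated ones."""
    sort = sort_param("data_updated_at", descending=True)
    # The >1970 filter drops datasets whose data_updated_at is null.
    return f"?sort={sort}&filter=data_updated_at>1970"


print(recently_updated_query())
```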

Using the query parameter

The query parameter lets you specify a search string. If you do, the API returns only rows containing the words you specify. Note that you must set row_limit to a value greater than 0 to get data records in the response.

If you specify multiple words, the default operator is AND, meaning the API returns rows containing all of the words specified. You can include the same advanced search operators supported by the Enigma Public UI to perform other operations, for example, query=enigma||technoligies~1&row_limit=10.
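
A minimal sketch of building such a query string is shown below. The helper name search_query_string is an assumption, and which characters to leave unencoded is a guess based on the example above, where the | and ~ operators appear literally.

```python
from urllib.parse import quote


def search_query_string(search: str, row_limit: int = 10) -> str:
    """Build the query and row_limit parameters (hypothetical helper)."""
    if row_limit <= 0:
        # Without a positive row_limit the response contains no data records.
        raise ValueError("row_limit must be greater than 0 to get rows back")
    # Encode spaces as %20 but keep the | and ~ search operators readable.
    return f"?query={quote(search, safe='|~')}&row_limit={row_limit}"


print(search_query_string("enigma||technoligies~1"))
print(search_query_string("new york", row_limit=5))
```

The first call should reproduce the query string from the example above; the second shows two words joined by the implied AND operator.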

Returns

The dataset model for each dataset.

Examples

Returns

The dataset model for the specified dataset.

Examples

Returns

The snapshot model for each snapshot.

Examples

Returns

The snapshot model for each snapshot.

Examples

Examples

Returns

The snapshot model for the specified snapshot.

Examples

Location-based searches

The geo_query parameter lets you perform location-based searches on snapshots that include location information in geo_point format, where geo_point is a comma-separated latitude/longitude pair (for example, 40.848387, -73.456015).

In the example below, the “Geo Location” column (field name = geo_location) is defined in the snapshot schema as a geo_point field and the values are supplied in the required lat,lng format:

Since the geo_location field is of type geo_point, you can use the geo_query string to locate snapshot rows that relate to a specific location. In the geo_query string, you need to specify:

  • The column with the geocoded location information
  • The center point from which to start the search
  • The search radius, in one of the supported distance units:

    Unit           Abbreviation or name
    Mile           mi or miles
    Yard           yd or yards
    Feet           ft or feet
    Inch           in or inch
    Kilometer      km or kilometers
    Meter          m or meters
    Centimeter     cm or centimeters
    Millimeter     mm or millimeters
    Nautical mile  NM, nmi, or nauticalmiles

Grade  Violations   Food Type  Geo Location
A      06D 10F      Bakery     40.848387,-73.456015
A      04H 10F      French     40.662893,-73.361824
A      04H 10F      Italian    40.767745,-73.384869
A      06C 10B 10F  American   40.579554,-73.782198

  {
    "data_type": "geo_point", 
    "description": "", 
    "display_name": "Geo Location", 
    "is_serialid": false, 
    "name": "geo_location", 
    "visible_by_default": true
  }
  "rows": [
      [
          "57",
          "40364668",
          "Marchis Restaurant",
          "Manhattan",
          "251 E 31st St",
          "10016",
          ...
          "40.74341",
          "-73.97829",
          {
              "lat": 40.74341,
              "lng": -73.97829
          }
      ],
      ...
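
To illustrate the geo_point format described above, the sketch below parses a lat,lng string into the lat/lng object shape that appears in the sample rows. The function names are hypothetical, and the range checks are an added safeguard rather than documented API behavior.

```python
from typing import Dict, Tuple


def parse_geo_point(value: str) -> Tuple[float, float]:
    """Split a 'lat,lng' string into two floats (hypothetical helper)."""
    # Raises ValueError if there are not exactly two comma-separated parts.
    lat_text, lng_text = value.split(",")
    lat, lng = float(lat_text), float(lng_text)
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lng <= 180.0:
        raise ValueError(f"longitude out of range: {lng}")
    return lat, lng


def as_row_object(value: str) -> Dict[str, float]:
    """Return the point in the lat/lng object shape seen in the sample rows."""
    lat, lng = parse_geo_point(value)
    return {"lat": lat, "lng": lng}


print(as_row_object("40.848387,-73.456015"))
```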

Advanced searches

The advanced query mode lets you search specific columns, rather than entire rows. As with simple query mode, you must set row_limit greater than zero to get back any resulting rows. Search terms are not case sensitive.

To search a specific column, use the query parameter to specify the field name, followed by a colon (:) and the search term (see Example 1).

To search for a phrase, put the phrase in quotes and use URL encoded space characters (%20) within the quoted string (see Example 2).

To search for a partial match, use the * wildcard character (see Example 3).

To search for multiple terms within the same column, use parentheses and the appropriate logical operator (AND or OR) (see Example 4).

Within the parentheses, the default operator is AND, so in job_title:(software%20engineer) the AND operator is implied.

To exclude a term, use a minus (-) character (NOT) (see Example 5).

For numeric columns, you can specify a range of values using URL encoded square brackets (%5B %5D) and the keyword “TO” (uppercase). You must include a URL encoded space character (%20) before and after “TO” (see Example 6). This example specifies the range [2 TO 10].

For date columns, ranges use the same [<start> TO <end>] syntax, with dates specified in YYYY-MM-DD format (see Example 7).

When specifying a range, you can use * as the range start or range end to specify <= or >= respectively (see Example 8). This example specifies the range >= 5.

To search multiple columns, put the search criteria for each column within parentheses and include the appropriate logical operator (AND or OR) (see Examples 9 and 10).

Example 1
?query_mode=advanced&query=employer_name:apple&row_limit=10

Example 2
?query_mode=advanced&query=employer_name:"apple%20inc."&row_limit=10

Example 3
?query_mode=advanced&query=employer_name:goog*&row_limit=10

Example 4
?query_mode=advanced&query=job_title:(software%20AND%20engineer)&row_limit=10

Example 5
?query_mode=advanced&query=job_title:(software%20AND%20-engineer)&row_limit=10

Example 6
?query_mode=advanced&query=total_workers:%5B2%20TO%2010%5D&row_limit=10

Example 7
?query_mode=advanced&query=decision_date:%5B2014-11-18%20TO%202014-11-19%5D&row_limit=10

Example 8
?query_mode=advanced&query=total_workers:%5B5%20TO%20*%5D&row_limit=10

Example 9
?query_mode=advanced&query=(employer_name:apple)AND(job_title:"software%20engineer")&row_limit=10

Example 10
?query_mode=advanced&query=(employer_name:apple)OR(employer_city:cupertino)&row_limit=10
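
The URL encoding used in these examples can be produced programmatically. The sketch below assumes hypothetical helper names (numeric_range, advanced_query_string) and leaves unencoded the characters that the examples above show literally, while encoding spaces and square brackets.

```python
from urllib.parse import quote


def numeric_range(field: str, start="*", end="*") -> str:
    """Build a field:[start TO end] range expression (hypothetical helper)."""
    # "*" as start or end leaves that side of the range open.
    return f"{field}:[{start} TO {end}]"


def advanced_query_string(expression: str, row_limit: int = 10) -> str:
    """Encode an advanced-mode expression into a query string."""
    # Spaces become %20 and square brackets become %5B / %5D.
    encoded = quote(expression, safe=':*()"-')
    return f"?query_mode=advanced&query={encoded}&row_limit={row_limit}"


print(advanced_query_string(numeric_range("total_workers", 2, 10)))
print(advanced_query_string(numeric_range("total_workers", 5)))
print(advanced_query_string('employer_name:"apple inc."'))
```

The three calls should reproduce the query strings of Examples 6, 8, and 2 respectively.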

Snapshot statistics

If you include the query parameter stats=true, the API returns statistical information about the snapshot. If you add query=<query_string>, you’ll get statistics for just the matching rows, as shown in the sample output below.

Note: If you use query=, you don’t need to specify row_limit= unless you want the actual rows as well. If you do specify row_limit=, stats will be for all matching rows, not just the rows that are returned.

  "stats": {
    "api_number": {
      "buckets": [
        {
          "doc_count": 11, 
          "key": "608174119700"
        }
      ], 
      "doc_count_error_upper_bound": 0, 
      "fill_rate": 1.0, 
      "sum_other_doc_count": 0
    }, 
    "gas": {
      "avg": 536440.8181818182, 
      "count": 11, 
      "fill_rate": 1.0, 
      "max": 783719.0, 
      "min": 0.0, 
      "sum": 5900849.0
    }, 
    "month": {
      "avg": 1401604363636.3635, 
      "avg_as_string": "2014-06-01T06:32:43.636Z", 
      "buckets": [
        {
          "doc_count": 1, 
          "key": 1388534400000, 
          "key_as_string": "2014-01-01T00:00:00.000Z"
        }, 
        {
          "doc_count": 1, 
          "key": 1391212800000, 
          "key_as_string": "2014-02-01T00:00:00.000Z"
        }, 
        {
          "doc_count": 1, 
          "key": 1393632000000, 
          "key_as_string": "2014-03-01T00:00:00.000Z"
        }, 
        {
          "doc_count": 1, 
          "key": 1396310400000, 
          "key_as_string": "2014-04-01T00:00:00.000Z"
        }, 
        {
          "doc_count": 1, 
          "key": 1398902400000, 
          "key_as_string": "2014-05-01T00:00:00.000Z"
        }
      ], 
      "count": 11, 
      "doc_count_error_upper_bound": 0, 
      "fill_rate": 1.0, 
      "max": 1414800000000.0, 
      "max_as_string": "2014-11-01T00:00:00.000Z", 
      "min": 1388534400000.0, 
      "min_as_string": "2014-01-01T00:00:00.000Z", 
      "sum": 15417648000000.0, 
      "sum_as_string": "2458-07-26T00:00:00.000Z", 
      "sum_other_doc_count": 6
    }, 
    "oil": {
      "avg": 512950.7272727273, 
      "count": 11, 
      "fill_rate": 1.0, 
      "max": 755074.0, 
      "min": 0.0, 
      "sum": 5642458.0
    }, 
    "serial_ea8cd08a_f55d_4e12_bc35_aa45596c86dc": {
      "avg": 60157.0, 
      "count": 11, 
      "fill_rate": 1.0, 
      "max": 109953.0, 
      "min": 8466.0, 
      "sum": 661727.0
    }, 
    "serialid": {
      "avg": 60158.0, 
      "count": 11, 
      "fill_rate": 1.0, 
      "max": 109954.0, 
      "min": 8467.0, 
      "sum": 661738.0
    }, 
    "water": {
      "avg": 131.54545454545453, 
      "count": 11, 
      "fill_rate": 1.0, 
      "max": 595.0, 
      "min": 0.0, 
      "sum": 1447.0
    }
  }, 

In the JSON response, each child of the stats object represents one column. The information returned for each column depends on the column type (see the table below).

Type              Stats returned  Description
integer, decimal  avg             Mean of non-empty values
                  count           Number of non-empty values
                  fill_rate       Fraction of cells that are filled (1.0 = all cells are filled; 0.0 = all cells are empty)
                  max             Maximum value
                  min             Minimum value
                  sum             Sum of all non-empty values
string            buckets         Top five strings by frequency of occurrence, with the key (string) and doc_count (count) for each
                  fill_rate       Fraction of cells that are filled (1.0 = all cells are filled; 0.0 = all cells are empty)
datetime          buckets         Top five datetimes by frequency of occurrence, with the key (datetime), key_as_string (datetime), and doc_count (count) for each
                  count           Number of non-empty values
                  fill_rate       Fraction of cells that are filled (1.0 = all cells are filled; 0.0 = all cells are empty)
                  max             Latest date as milliseconds since January 1, 1970 (midnight UTC/GMT)
                  max_as_string   Latest date in ISO 8601 YYYY-MM-DDThh:mm:ss.sTZD format
                  min             Earliest date as milliseconds since January 1, 1970 (midnight UTC/GMT)
                  min_as_string   Earliest date in ISO 8601 YYYY-MM-DDThh:mm:ss.sTZD format
                  sum             Sum of dates as milliseconds since January 1, 1970 (midnight UTC/GMT)
boolean           avg             Mean of non-empty values (true = 1; false = 0)
                  avg_as_string   True if avg >= 0.5; false otherwise
                  buckets         Values by frequency of occurrence, with the key (numeric value where 1 = true; 0 = false), key_as_string (boolean value), and doc_count (count) for each
                  count           Number of non-empty values
                  fill_rate       Fraction of cells that are filled (1.0 = all cells are filled; 0.0 = all cells are empty)
                  max             Maximum value (1 or 0)
                  max_as_string   Maximum value as a string (true or false)
                  min             Minimum value (1 or 0)
                  min_as_string   Minimum value as a string (true or false)
                  sum             Sum of all non-empty values (true = 1; false = 0)
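
One way to read a stats object is to walk its children and pick out whichever fields are present for each column. The sketch below uses a shortened copy of the sample output above; the function name summarize_stats and the shape of its return value are assumptions for illustration.

```python
import json

# Shortened copy of the sample stats output shown earlier.
SAMPLE = json.loads("""
{
  "stats": {
    "api_number": {
      "buckets": [{"doc_count": 11, "key": "608174119700"}],
      "fill_rate": 1.0
    },
    "gas": {
      "avg": 536440.8181818182, "count": 11, "fill_rate": 1.0,
      "max": 783719.0, "min": 0.0, "sum": 5900849.0
    }
  }
}
""")


def summarize_stats(stats: dict) -> dict:
    """Collect fill rate, top bucket keys, and mean per column (hypothetical helper)."""
    summary = {}
    for column, info in stats.items():
        entry = {"fill_rate": info["fill_rate"]}
        if "buckets" in info:
            # Bucket keys are the most frequent values in the column.
            entry["top_keys"] = [bucket["key"] for bucket in info["buckets"]]
        if "sum" in info and info.get("count"):
            # Recompute the mean from sum and count as a consistency check.
            entry["mean"] = info["sum"] / info["count"]
        summary[column] = entry
    return summary


for name, entry in summarize_stats(SAMPLE["stats"]).items():
    print(name, entry)
```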

Returns

The user model.

Example

The example below writes the session cookie from the response to a file in the current directory.

curl -X POST 'https://assembly.company.com/api/account/login' -H 'Content-Type: application/json' -d '{"user":"test@enigma.com","password":"password123"}' -c cookies.txt

You can then use the stored cookie to authenticate future API requests, for example:

curl -X GET 'https://public.enigma.com/api/collections/' -b cookies.txt
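
The same cookie-based flow could be sketched with the Python standard library, as shown below. The function names are hypothetical, the URL and JSON body are copied from the curl example above, and the block only constructs the request and the cookie-aware opener without contacting the server.

```python
import json
from http.cookiejar import MozillaCookieJar
from urllib.request import HTTPCookieProcessor, Request, build_opener

# Login URL copied from the curl example above.
LOGIN_URL = "https://assembly.company.com/api/account/login"


def build_login_request(user: str, password: str) -> Request:
    """Create the POST request that the curl example sends (hypothetical helper)."""
    body = json.dumps({"user": user, "password": password}).encode("utf-8")
    return Request(
        LOGIN_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_session(cookie_file: str = "cookies.txt"):
    """Return an opener that stores cookies in a jar backed by cookie_file."""
    jar = MozillaCookieJar(cookie_file)
    return build_opener(HTTPCookieProcessor(jar)), jar


# Usage sketch (performs network requests, so it is left commented out):
# opener, jar = build_session()
# opener.open(build_login_request("test@enigma.com", "password123"))
# jar.save(ignore_discard=True)  # keep session cookies, like curl -c
# opener.open("https://public.enigma.com/api/collections/")
```

Passing ignore_discard=True when saving matters here because session cookies are otherwise dropped from the file.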

Returns

The new personal collection model.

Examples

Returns

The updated personal collection model.

Resource definitions