Data Feeds

Data Feed is a beta feature in active development. To enable it, select anywhere on the Chef Automate interface, type ‘feat’ to open the feature flags window, and then toggle “Chef Automate Data Feed” to the “ON” position.

The Data Feed service sends node data to a 3rd party service. This can be useful when updating configuration management databases, external security dashboards and IT service management platforms. The following types of information are sent:

  • Ohai data gathered from each managed node - This data includes hardware, operating system, and installed program information. The exact contents vary with the operating system of the managed node
  • Configuration information about each managed node - This information includes Chef Client Run status, Runlists, Cookbooks, and Recipes being run against each node
  • Compliance information about each node that shows the compliance state - This information includes passed and failed controls for each profile executed against that node

A Data Feed operates by doing the following:

  • Every 4 hours, the data-feed-service aggregates the client runs and compliance reports from the previous 4 hours and sends this information to the registered destinations. The 4-hour interval is the default and is configurable
  • If there are no destinations, aggregation does not occur
  • The data is aggregated and sent in batches of 50 nodes at a time. The batch size of 50 is the default and is configurable
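
The batching step above can be sketched in Python as follows. This is only an illustration of splitting nodes into groups of at most 50, not the data-feed-service implementation; the function name `batch_nodes` and the node IDs are hypothetical.

```python
from typing import Iterator, List, Sequence


def batch_nodes(node_ids: Sequence[str], batch_size: int = 50) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size node IDs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(node_ids), batch_size):
        yield list(node_ids[start:start + batch_size])


# Hypothetical example: 120 nodes with the default batch size of 50
nodes = [f"node-{i}" for i in range(120)]
batch_sizes = [len(batch) for batch in batch_nodes(nodes)]
print(batch_sizes)  # [50, 50, 20]
```

With 120 nodes, the sketch produces two full batches and one partial batch, so the endpoint would receive three payloads in that interval.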

By default, only Admin users of Chef Automate may create and manage Data Feeds.

Adding a Data Feed Instance

A single Data Feed instance connects to one 3rd party endpoint. Create as many Data Feed instances as needed.

To add a Data Feed instance in Chef Automate:

  1. In the Settings tab, navigate to Data Feeds in the sidebar
  2. Select Create Data Feed
  3. Enter a unique Data Feed name
  4. Enter the URL for your Data Feed endpoint, including any specific port details
  5. Enter the Username and Password that your 3rd party endpoint requires for authentication
  6. Select Test Data Feed to begin validating the connection details
  7. Once the test is successful, select Create Data Feed to save your Data Feed configuration

Edit a Data Feed Instance

To edit a Data Feed instance in Chef Automate:

  1. In Data Feeds, select the Data Feed name to open its detail page
  2. Edit the Data Feed name or URL
  3. Use the Save button to save your changes

Delete a Data Feed Instance

To delete a Data Feed instance in Chef Automate:

  1. In Data Feeds, select Delete Data Feed from the menu at the end of the table row
  2. Select Delete Data Feed to confirm permanent deletion of this Data Feed

Configuring Global Data Feed Behavior

The settings in config.toml apply across all configured Data Feed instances.

Modify Data Feed behavior with configuration settings in config.toml.

  1. Navigate to /hab/svc/data-feed-service/config/config.toml using the Chef Automate command-line tool.
  2. Change one or more configuration settings to reflect the desired global Data Feed behavior:
     • Update the feed_interval setting to change the interval for the Data Feed collection. The default value is four hours
     • Update the node_batch_size setting to change the number of sets of node data sent in each individual batch to your endpoint. The default value is 50 nodes
     • Use the updated_nodes_only setting to determine what data to include in each export. The default setting is true, which aggregates only the changed data of nodes updated since the last export. Set updated_nodes_only to false to aggregate all data of nodes updated since the last export
     • To restrict the collected and processed node data to a specific IP address range, set disable_cidr_filter to false and set cidr_filter to the required IP address range. For example, you may wish to send only production or test node traffic
     • Use the accepted_status_codes setting to define an array of HTTP status codes that the Data Feed Service will treat as success if returned by the 3rd party endpoint. If the status code is not in the accepted_status_codes list, then an error will be logged
  3. Apply your changes with the Chef Automate command-line tool:
    chef-automate config patch /hab/svc/data-feed-service/config/config.toml
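
To illustrate how disable_cidr_filter and cidr_filter interact, here is a minimal Python sketch using the standard `ipaddress` module. It assumes the filter is matched against each node's IP address; the function name `node_passes_cidr_filter` and the sample addresses are hypothetical, and the actual service may behave differently in edge cases.

```python
import ipaddress


def node_passes_cidr_filter(ip_address: str,
                            cidr_filter: str = "0.0.0.0/0",
                            disable_cidr_filter: bool = True) -> bool:
    """Return True if a node with this IP address would be included."""
    if disable_cidr_filter:
        return True  # filtering is off, so every node is included
    network = ipaddress.ip_network(cidr_filter, strict=False)
    return ipaddress.ip_address(ip_address) in network


# Hypothetical example: only nodes in 10.1.0.0/16 are sent
in_range = node_passes_cidr_filter("10.1.2.3", "10.1.0.0/16", disable_cidr_filter=False)
out_of_range = node_passes_cidr_filter("192.168.0.5", "10.1.0.0/16", disable_cidr_filter=False)
print(in_range, out_of_range)  # True False
```

Note that with the default disable_cidr_filter = true, the cidr_filter value has no effect in this sketch.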

Config.toml Example

    [service]

    host = "localhost"
    port = 14001
    feed_interval = "4h"
    asset_page_size = 100
    reports_page_size = 1000
    node_batch_size = 50
    updated_nodes_only = true
    disable_cidr_filter = true
    cidr_filter = "0.0.0.0/0"
    external_fqdn = ""
    accepted_status_codes = [ 200, 201, 202, 203, 204 ]
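
As a rough illustration of the accepted_status_codes setting in the example above, the following Python sketch treats a response as successful only when its status code is in the list, and logs an error otherwise. The logger name and the function `check_response` are hypothetical; this is not the service's own code.

```python
import logging

logger = logging.getLogger("data_feed_sketch")  # hypothetical logger name

# Values copied from the accepted_status_codes line in the example above
ACCEPTED_STATUS_CODES = [200, 201, 202, 203, 204]


def check_response(status_code: int, accepted=ACCEPTED_STATUS_CODES) -> bool:
    """Return True for an accepted status code; log an error otherwise."""
    if status_code in accepted:
        return True
    logger.error("endpoint returned unaccepted status code %d", status_code)
    return False


print(check_response(204), check_response(500))  # True False
```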

To debug any issues with the Data Feed Service in Chef Automate, change the log_level value from “info” to “debug” in the following section of config.toml:

    [log]
    log_format = "text"
    log_level = "info"

Data Feed Output Syntax and Details

Data Feed output consists of line-separated JSON strings. Each line represents the data for one node and contains the following properties (formatted here across multiple lines for readability):

    {
      "attributes": {
        "node_id": "",
        "name": "",
        "run_list": [],
        "chef_environment": "",
        "normal": {},
        "default": {},
        "override": {},
        "automatic": {},
        "normal_value_count": 0,
        "default_value_count": 1,
        "override_value_count": 1,
        "all_value_count": 10,
        "automatic_value_count": 8
      },
      "report": { ... },
      "client_run": { ... },
      "node": {
        "automate_fqdn": "",
        "ip_address": "",
        "mac_address": "",
        "description": "",
        "serial_number": "",
        "os_service_pack": ""
      }
    }
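
A receiving endpoint can read this output one line at a time. The Python sketch below shows one way to do so, assuming each non-empty line is a complete JSON object with the properties listed above; the function name `parse_data_feed` and the sample node values are hypothetical.

```python
import json
from typing import Dict, Iterable, List


def parse_data_feed(lines: Iterable[str]) -> List[Dict]:
    """Parse line-separated JSON, one node per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]


# Hypothetical two-node payload using a subset of the documented properties
payload = "\n".join([
    json.dumps({"attributes": {"name": "web-1", "run_list": []},
                "node": {"ip_address": "10.0.0.5"}}),
    json.dumps({"attributes": {"name": "db-1", "run_list": []},
                "node": {"ip_address": "10.0.0.6"}}),
])

records = parse_data_feed(payload.splitlines())
for record in records:
    print(record["attributes"]["name"], record["node"]["ip_address"])
```

Parsing line by line keeps memory use proportional to a single node record when the lines are streamed from the request body rather than loaded all at once.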