JSON Interface
All Heron Tracker endpoints return a JSON object with the following information:
- status - One of the following: success, failure.
- executiontime - The time taken to return the HTTP result, in seconds.
- message - Some endpoints return special messages in this field for certain requests. Often, this field will be an empty string. A failure status will always have a message.
- result - The result payload of the request. The contents will depend on the endpoint.
- version - The Tracker API version.
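As a rough illustration of how a client might consume this envelope, the sketch below decodes a response body and raises on a failure status. The helper name unwrap_tracker_response, the TrackerError class, and the sample payload are hypothetical; only the five field names come from the list above.

```python
import json


class TrackerError(Exception):
    """Raised when the Tracker reports a failure status (hypothetical helper type)."""


def unwrap_tracker_response(body):
    """Decode a Tracker response body and return its result payload."""
    envelope = json.loads(body)
    if envelope.get("status") != "success":
        # Per the field list above, a failure status always carries a message.
        raise TrackerError(envelope.get("message", ""))
    return envelope.get("result")


# Made-up sample body; real values depend on the endpoint and the deployment.
sample = json.dumps({
    "status": "success",
    "executiontime": 0.002,
    "message": "",
    "result": {"cluster1": {"devel": ["topologyName"]}},
    "version": "1.0.0",
})
print(unwrap_tracker_response(sample))
```

Raising an exception on failure keeps the message field close to the call site instead of letting an empty result propagate silently.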
Endpoints
/ (redirects to /topologies)
/clusters
/topologies
/topologies/states
/topologies/info
/topologies/logicalplan
/topologies/physicalplan
/topologies/executionstate
/topologies/schedulerlocation
/topologies/metrics
/topologies/metricstimeline
/topologies/metricsquery
/topologies/containerfiledata
/topologies/containerfilestats
/topologies/exceptions
/topologies/exceptionsummary
/topologies/pid
/topologies/jstack
/topologies/jmap
/topologies/histo
/machines
All of these endpoints are documented in the sections below.
/clusters
Returns a JSON list of all the clusters.
/topologies
Returns JSON describing all currently available topologies.
$ curl "http://heron-tracker-url/topologies?cluster=cluster1&environ=devel"
Parameters
- cluster (optional) - The cluster parameter can be used to filter topologies that are running in this cluster.
- environ (optional) - The environment parameter can be used to filter topologies that are running in this environment.
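Because both filters are optional, a client may want to include only the ones it actually has. The following is a minimal sketch of that idea; the function name topologies_url is hypothetical and the base URL is the placeholder from the curl example above.

```python
from urllib.parse import urlencode

# Placeholder base URL taken from the curl example above.
TRACKER_URL = "http://heron-tracker-url"


def topologies_url(cluster=None, environ=None, base=TRACKER_URL):
    """Build a /topologies URL, including only the filters that were given."""
    candidates = (("cluster", cluster), ("environ", environ))
    params = [(key, value) for key, value in candidates if value is not None]
    query = urlencode(params)
    return f"{base}/topologies" + (f"?{query}" if query else "")


print(topologies_url(cluster="cluster1", environ="devel"))
print(topologies_url())
```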
/topologies/logicalplan
Returns a JSON representation of the logical plan of a topology.
$ curl "http://heron-tracker-url/topologies/logicalplan?cluster=cluster1&environ=devel&topology=topologyName"
Parameters
- cluster (required) - The cluster in which the topology is running
- environ (required) - The environment in which the topology is running
- topology (required) - The name of the topology
The resulting JSON contains the following:
- spouts - A set of JSON objects representing each spout in the topology. The following information is listed for each spout:
  - source - The source of tuples for the spout.
  - type - The type of the spout, e.g. kafka, kestrel, etc.
  - outputs - A list of streams to which the spout outputs tuples.
- bolts - A set of JSON objects representing each bolt in the topology. The following information is listed for each bolt:
  - outputs - A list of streams to which the bolt outputs tuples.
  - inputs - A list of inputs for the bolt. Each input is represented by a JSON dictionary containing the following information:
    - component_name - Name of the component this bolt is receiving tuples from.
    - stream_name - Name of the stream from which the tuples are received.
    - grouping - Type of grouping used to receive tuples, for example SHUFFLE or FIELDS.
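The inputs of each bolt are enough to reconstruct the edges of the topology graph. The sketch below walks a logical plan and lists those edges. It assumes that spouts and bolts are keyed by component name, which the description above does not state, and the sample plan is entirely made up.

```python
def input_edges(logical_plan):
    """Return (source component, stream, grouping, bolt) for every bolt input."""
    edges = []
    for bolt_name, bolt in logical_plan.get("bolts", {}).items():
        for inp in bolt.get("inputs", []):
            edges.append(
                (inp["component_name"], inp["stream_name"], inp["grouping"], bolt_name)
            )
    return edges


# Hypothetical logical plan; keying by component name is an assumption.
plan = {
    "spouts": {"word": {"source": "NA", "type": "kafka", "outputs": []}},
    "bolts": {
        "count": {
            "outputs": [],
            "inputs": [
                {"component_name": "word", "stream_name": "default", "grouping": "FIELDS"}
            ],
        }
    },
}
for edge in input_edges(plan):
    print(edge)
```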
/topologies/physicalplan
Returns a JSON representation of the physical plan of a topology.
$ curl "http://heron-tracker-url/topologies/physicalplan?cluster=datacenter1&environ=prod&topology=topologyName"
Parameters
- cluster (required) - The cluster in which the topology is running
- environ (required) - The environment in which the topology is running
- topology (required) - The name of the topology
The resulting JSON contains the following information:
- All spout and bolt components, with lists of their instances.
- stmgrs - A list of JSON dictionaries containing the following information for each stream manager:
  - host - Hostname of the machine this container is running on.
  - pid - Process ID of the stream manager.
  - cwd - Absolute path to the directory from which the container was launched.
  - joburl - URL to browse the cwd through heron-shell.
  - shell_port - Port to access heron-shell.
  - logfiles - URL to browse instance log files through heron-shell.
  - id - ID for this stream manager.
  - port - Port at which this stream manager accepts connections from other stream managers.
  - instance_ids - List of instance IDs that constitute this container.
- instances - A list of JSON dictionaries containing the following information for each instance:
  - id - Instance ID.
  - name - Component name of this instance.
  - logfile - Link to the log file for this instance, which can be read through heron-shell.
  - stmgrId - ID of the stream manager for this instance.
- config - Various topology configs. Some examples are:
  - topology.message.timeout.secs - Time after which a tuple should be considered as failed.
  - topology.acking - Whether acking is enabled or not.
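The stmgrId on each instance links it to a stream manager entry, which in turn carries the host. A minimal sketch of that join is shown below. It takes the description above literally and treats stmgrs and instances as lists of dictionaries; the sample IDs and hostnames are invented for illustration.

```python
def instance_hosts(physical_plan):
    """Map each instance ID to the host of its stream manager."""
    host_by_stmgr = {stmgr["id"]: stmgr["host"] for stmgr in physical_plan["stmgrs"]}
    return {
        inst["id"]: host_by_stmgr[inst["stmgrId"]]
        for inst in physical_plan["instances"]
    }


# Hypothetical physical plan fragment with only the fields used here.
plan = {
    "stmgrs": [
        {"id": "stmgr-1", "host": "host-a.example.com"},
        {"id": "stmgr-2", "host": "host-b.example.com"},
    ],
    "instances": [
        {"id": "container_1_word_1", "name": "word", "stmgrId": "stmgr-1"},
        {"id": "container_2_count_2", "name": "count", "stmgrId": "stmgr-2"},
    ],
}
print(instance_hosts(plan))
```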
/topologies/schedulerlocation
Returns a JSON representation of the scheduler location of the topology.
$ curl "http://heron-tracker-url/topologies/schedulerlocation?cluster=datacenter1&environ=prod&topology=topologyName"
Parameters
- cluster (required) - The cluster in which the topology is running
- environ (required) - The environment in which the topology is running
- topology (required) - The name of the topology
The SchedulerLocation mainly contains the link to the job on the scheduler, for example, the Aurora page for the job.
/topologies/executionstate
Returns a JSON representation of the execution state of the topology.
$ curl "http://heron-tracker-url/topologies/executionstate?cluster=datacenter1&environ=prod&topology=topologyName"
Parameters
- cluster (required) - The cluster in which the topology is running
- environ (required) - The environment in which the topology is running
- topology (required) - The name of the topology
Each execution state object lists the following:
- cluster - The cluster in which the topology is running
- environ - The environment in which the topology is running
- role - The role with which the topology was launched
- jobname - Same as topology name
- submission_time - The time at which the topology was submitted
- submission_user - The user that submitted the topology (can be the same as role)
- release_username - The user that generated the Heron release for the topology
- release_version - Release version
- has_physical_plan - Whether the topology has a physical plan
- has_tmaster_location - Whether the topology has a Topology Master Location
- has_scheduler_location - Whether the topology has a Scheduler Location
- viz - Metric visualization UI URL for the topology, if it was configured
/topologies/states
Returns a JSON list of execution states of topologies in all the clusters.
$ curl "http://heron-tracker-url/topologies/states?cluster=cluster1&environ=devel"
Parameters
- cluster (optional) - The cluster parameter can be used to filter topologies that are running in this cluster.
- environ (optional) - The environment parameter can be used to filter topologies that are running in this environment.
/topologies/info
Returns a JSON representation of a dictionary containing the logical plan, physical plan, execution state, scheduler location, and TMaster location for a topology, as described above. TMasterLocation is the location of the TMaster, including its host, port, and the heron-shell port that it exposes.
Parameters
- cluster (required) - The cluster in which the topology is running
- environ (required) - The environment in which the topology is running
- topology (required) - The name of the topology
/topologies/containerfilestats
Returns the file stats for a container. This is the output of the command ls -lh when run in the directory where the heron-controller launched all the processes.
This endpoint is mainly used by the UI for exploring files in a container.
Parameters
- cluster (required) - The cluster in which the topology is running
- environ (required) - The environment in which the topology is running
- topology (required) - The name of the topology
- container (required) - Container ID
- path (optional) - Path relative to the directory where heron-controller is launched. Paths are not allowed to start with a / or contain a "..".
/topologies/containerfiledata
Returns the file data for a file of a container.
This endpoint is mainly used by the UI for exploring files in a container.
Parameters
- cluster (required) - The cluster in which the topology is running
- environ (required) - The environment in which the topology is running
- topology (required) - The name of the topology
- container (required) - Container ID
- path (required) - Path to the file relative to the directory where heron-controller is launched. Paths are not allowed to start with a / or contain a "..".
- offset (required) - Offset from the beginning of the file.
- length (required) - Number of bytes to be returned.
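A client can apply the two path rules before sending a request, so that obviously invalid paths never reach the Tracker. The sketch below does that and builds the query string. The function names, the container ID, and the file path are hypothetical, and treating "contain a .." as a plain substring check is an assumption; the server-side check may be stricter or looser.

```python
from urllib.parse import urlencode


def is_allowed_path(path):
    """Apply the two rules stated above: no leading "/" and no ".." anywhere."""
    return not path.startswith("/") and ".." not in path


def container_file_data_query(cluster, environ, topology, container, path, offset, length):
    """Build the query string for /topologies/containerfiledata."""
    if not is_allowed_path(path):
        raise ValueError(f"path not allowed: {path!r}")
    return urlencode({
        "cluster": cluster,
        "environ": environ,
        "topology": topology,
        "container": container,
        "path": path,
        "offset": offset,
        "length": length,
    })


# Hypothetical container ID and log path, reading the first 1024 bytes.
print(container_file_data_query(
    "cluster1", "devel", "topologyName", "1", "log-files/container_1.log", 0, 1024
))
```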
/topologies/metrics
Returns a JSON map of instances of the topology to their respective metrics. To filter the instances returned, use the instance parameter discussed below.
Note that these metrics come from TMaster, which only holds minutely data for the last 3 hours, as well as cumulative values. If the interval is greater than 10800 seconds, the values will be for all-time metrics.
Parameters
- cluster (required) - The cluster in which the topology is running
- environ (required) - The environment in which the topology is running
- topology (required) - The name of the topology
- component (required) - Component name
- metricname (required, repeated) - Names of metrics to fetch
- interval (optional) - Number of seconds for which the metrics should be fetched (max 10800 seconds)
- instance (optional) - IDs of the instances. If not present, metrics are returned for all the instances.
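Since metricname is a repeated parameter, it has to appear once per metric in the query string. The following sketch shows one way to build such a query; the function name metrics_query is hypothetical, and passing instance as a repeated parameter is an assumption, since the parameter list above only marks metricname as repeated.

```python
from urllib.parse import urlencode


def metrics_query(cluster, environ, topology, component, metricnames,
                  interval=None, instances=()):
    """Build the query string for /topologies/metrics."""
    params = [
        ("cluster", cluster),
        ("environ", environ),
        ("topology", topology),
        ("component", component),
    ]
    # One metricname entry per requested metric.
    params += [("metricname", name) for name in metricnames]
    if interval is not None:
        params.append(("interval", interval))
    # Assumption: multiple instance IDs are sent as repeated parameters.
    params += [("instance", inst) for inst in instances]
    return urlencode(params)


print(metrics_query(
    "cluster1", "devel", "topologyName", "component1",
    ["__emit-count/stream1", "__emit-count/default"], interval=600,
))
```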
/topologies/metricstimeline
Returns a JSON map of instances of the topology to their respective metrics timeline. To filter the instances returned, use the instance parameter discussed below.
The difference between this and the /metrics endpoint above is that /metrics reports a cumulative value over the interval provided, whereas /metricstimeline reports minutely values for each metricname for each instance.
Note that these metrics come from TMaster, which only holds minutely data for the last 3 hours, as well as cumulative all-time values. If the starttime is more than 3 hours ago, those minutes will not be part of the response.
Parameters
- cluster (required) - The cluster in which the topology is running
- environ (required) - The environment in which the topology is running
- topology (required) - The name of the topology
- component (required) - Component name
- metricname (required, repeated) - Names of metrics to fetch
- starttime (required) - Start time for the metrics (must be within the last 3 hours)
- endtime (required) - End time for the metrics (must be within the last 3 hours, and greater than starttime)
- instance (optional) - IDs of the instances. If not present, metrics are returned for all the instances.
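The starttime and endtime constraints can be checked on the client before a request is made. The sketch below computes a window ending now and rejects windows that reach further back than 3 hours. It assumes that both values are Unix timestamps in seconds, which the parameter list above does not state; the function name timeline_window is hypothetical.

```python
import time

# TMaster keeps minutely data for the last 3 hours, as noted above.
RETENTION_SECS = 3 * 60 * 60


def timeline_window(minutes_back, now=None):
    """Return (starttime, endtime) covering the last `minutes_back` minutes.

    Assumption: the Tracker expects Unix timestamps in seconds.
    """
    now = int(time.time()) if now is None else int(now)
    span = minutes_back * 60
    if not 0 < span <= RETENTION_SECS:
        raise ValueError("window must be positive and within the last 3 hours")
    return now - span, now


start, end = timeline_window(30)
print(start, end)
```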
/topologies/metricsquery
Executes the metrics query for the topology and returns the result in the form of a minutely timeseries. A detailed description of the query language is given below.
Note that these metrics come from TMaster, which only holds minutely data for the last 3 hours, as well as cumulative all-time values. If the starttime is more than 3 hours ago, those minutes will not be part of the response.
Parameters
- cluster (required) - The cluster in which the topology is running
- environ (required) - The environment in which the topology is running
- topology (required) - The name of the topology
- starttime (required) - Start time for the metrics (must be within the last 3 hours)
- endtime (required) - End time for the metrics (must be within the last 3 hours, and greater than starttime)
- query (required) - Query to be executed
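Query expressions contain characters such as parentheses, commas, "*", and "/", so they need to be URL-encoded when passed in the query parameter. The sketch below composes an expression from small helper functions and encodes it. The helpers ts and call are hypothetical conveniences, and the starttime and endtime values are arbitrary placeholders.

```python
from urllib.parse import parse_qs, urlencode


def ts(component, instance, metric):
    """Render a TS(...) expression in the metrics query language."""
    return f"TS({component}, {instance}, {metric})"


def call(operator, *args):
    """Render OPERATOR(arg1, arg2, ...) for any operator name."""
    return f"{operator}({', '.join(str(arg) for arg in args)})"


query = call("SUM", ts("component1", "*", "__emit-count/stream1"))
params = urlencode({
    "cluster": "cluster1",
    "environ": "devel",
    "topology": "topologyName",
    "starttime": 1000000,  # placeholder value
    "endtime": 1001800,    # placeholder value
    "query": query,
})
print(query)
print(params)
```

Decoding the encoded string with parse_qs should give back the original expression, which is a quick way to check that nothing was mangled.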
/topologies/exceptionsummary
Returns a summary of the exceptions for the component of the topology. Duplicate exceptions are combined, and the summary includes the number of occurrences, the first occurrence time, and the latest occurrence time.
Parameters
- cluster (required) - The cluster in which the topology is running
- environ (required) - The environment in which the topology is running
- topology (required) - The name of the topology
- component (required) - Component name
- instance (optional) - IDs of the instances. If not present, exceptions are returned for all the instances.
/topologies/exceptions
Returns all exceptions for the component of the topology.
Parameters
- cluster (required) - The cluster in which the topology is running
- environ (required) - The environment in which the topology is running
- topology (required) - The name of the topology
- component (required) - Component name
- instance (optional) - IDs of the instances. If not present, exceptions are returned for all the instances.
/topologies/pid
Returns the PID of the instance JVM process.
Parameters
- cluster (required) - The cluster in which the topology is running
- environ (required) - The environment in which the topology is running
- topology (required) - The name of the topology
- instance (required) - Instance ID
/topologies/jstack
Returns the thread dump of the instance JVM process.
Parameters
- cluster (required) - The cluster in which the topology is running
- environ (required) - The environment in which the topology is running
- topology (required) - The name of the topology
- instance (required) - Instance ID
/topologies/jmap
Issues the jmap command for the instance, and saves the result in a file. Returns the path to the file, which can be downloaded externally.
Parameters
- cluster (required) - The cluster in which the topology is running
- environ (required) - The environment in which the topology is running
- topology (required) - The name of the topology
- instance (required) - Instance ID
/topologies/histo
Returns a histogram for the instance JVM process.
Parameters
- cluster (required) - The cluster in which the topology is running
- environ (required) - The environment in which the topology is running
- topology (required) - The name of the topology
- instance (required) - Instance ID
/machines
Returns JSON describing all machines that topologies are running on.
$ curl "http://heron-tracker-url/machines?topology=mytopology1&cluster=cluster1&environ=prod"
Parameters
- cluster (optional) - The cluster parameter can be used to filter machines that are running the topologies in this cluster only.
- environ (optional) - The environment parameter can be used to filter machines that are running the topologies in this environment only.
- topology (optional, repeated) - Name of the topology. Both cluster and environ are required if the topology parameter is present.
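The dependency between topology and the other two parameters is easy to get wrong, so a client may want to enforce it before calling the endpoint. A minimal sketch follows; the function name machines_query is hypothetical and the second topology name is invented for the example.

```python
from urllib.parse import urlencode


def machines_query(cluster=None, environ=None, topologies=()):
    """Build the query string for /machines, enforcing the rule stated above."""
    if topologies and not (cluster and environ):
        raise ValueError("cluster and environ are required when topology is given")
    params = []
    if cluster:
        params.append(("cluster", cluster))
    if environ:
        params.append(("environ", environ))
    # topology is a repeated parameter: one entry per topology name.
    params += [("topology", name) for name in topologies]
    return urlencode(params)


print(machines_query("cluster1", "prod", ["mytopology1", "mytopology2"]))
```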
Metrics Query Language
Metrics queries are useful when some kind of aggregated value is required. For example, to find the total number of tuples emitted by a spout, the SUM operator can be used instead of fetching metrics for all the instances of the corresponding component and then summing them.
Terminology
- Univariate Timeseries — A timeseries is called univariate if there is only one set of minutely data. For example, a timeseries representing the sums of a number of timeseries would be a univariate timeseries.
- Multivariate Timeseries — A set of multiple timeseries is collectively called multivariate. Note that these timeseries are associated with their instances.
Operators
TS
TS(componentName, instance, metricName)
Example:
TS(component1, *, __emit-count/stream1)
Time Series Operator. This is the basic operator that is responsible for getting metrics from TMaster. Accepts a list of 3 elements:
- componentName
- instance - can be “*” for all instances, or a single instance ID
- metricName - Full metric name with stream id if applicable
Returns a univariate time series if a single instance ID is given; otherwise returns a multivariate time series.
DEFAULT
DEFAULT(0, TS(component1, *, __emit-count/stream1)) <-- If the second operator returns more than one timeline, so will the DEFAULT operator.
DEFAULT(100.0, SUM(TS(component2, *, __emit-count/default))) <-- Second operator can be any operator
Default Operator. This operator is responsible for filling missing values in the metrics timeline. Must have 2 arguments:
- The first argument is a numeric constant representing the number with which to fill the missing values
- The second argument must be one of the operators that return a metrics timeline
Returns a univariate or multivariate time series, based on what the second operator is.
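The filling behavior can be pictured with a small in-memory model. The sketch below is not the Tracker implementation; it assumes a timeline is a dictionary from minute-aligned timestamps (in seconds) to values, that a multivariate series is a dictionary from instance ID to such a timeline, and that the range to fill is inclusive at both ends. All names are hypothetical.

```python
def default_fill(constant, timeline, start, end, step=60):
    """Fill every missing step-aligned timestamp in [start, end] with `constant`."""
    return {t: timeline.get(t, constant) for t in range(start, end + 1, step)}


def default_fill_all(constant, timelines, start, end, step=60):
    """Apply default_fill to each instance of a multivariate series."""
    return {
        instance: default_fill(constant, timeline, start, end, step)
        for instance, timeline in timelines.items()
    }


# The minute at t=60 is missing and gets the constant 0.
print(default_fill(0, {0: 5, 120: 7}, 0, 120))
```

As in the description above, a multivariate input stays multivariate: one filled timeline comes back per instance.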
SUM
SUM(TS(component1, instance1, metric1), DEFAULT(0, TS(component1, *, metric2)))
Sum Operator. This operator is used to take the sum of all argument time series. It can have any number of arguments, each of which must be one of the following two types:
- Numeric constants, which will fill in the missing values as well, or
- Operator, which returns one or more timelines
Returns only a single timeline representing the sum of all time series for each timestamp. Note that the “instance” attribute is not included in the result.
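The per-timestamp summation can be sketched on in-memory data as follows. This is only an illustration under assumptions: each timeline is a dictionary from timestamp to value, a multivariate argument is passed as several separate timelines, and the handling of numeric constant arguments is left out. The function name sum_timelines is hypothetical.

```python
from collections import defaultdict


def sum_timelines(*timelines):
    """Add up the values of all timelines for each timestamp."""
    total = defaultdict(float)
    for timeline in timelines:
        for timestamp, value in timeline.items():
            total[timestamp] += value
    # The result is a single timeline; instance information is dropped.
    return dict(total)


instance1 = {0: 1, 60: 2}
instance2 = {0: 10}
print(sum_timelines(instance1, instance2))
```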
MAX
MAX(100, TS(component1, *, metric1))
Max Operator. This operator is used to find the max of all argument operators for each individual timestamp. Each argument must be one of the following types:
- Numeric constants, which will fill in the missing values as well, or
- Operator, which returns one or more timelines
Returns only a single timeline representing the max of all the time series for each timestamp. Note that the “instance” attribute is not included in the result.
PERCENTILE
PERCENTILE(99, TS(component1, *, metric1))
Percentile Operator. This operator is used to find a quantile of all timelines returned by the arguments, for each timestamp. This is a more general type of query similar to MAX. Note that PERCENTILE(100, TS...) is equivalent to MAX(TS...).
Each argument must be either a constant or an operator. The first argument must always be the required quantile.
- Quantile (first argument) - Required quantile. 100 percentile = max, 0 percentile = min.
- Numeric constants, which will fill in the missing values as well
- Operator, which returns one or more timelines
Returns only a single timeline representing the quantile of all the time series for each timestamp. Note that the “instance” attribute is not included in the result.
DIVIDE
DIVIDE(TS(component1, *, metrics1), 100)
Divide Operator. Accepts two arguments, both of which can be univariate or multivariate. Each can be one of the following types:
- Numeric constant, which is treated as a constant time series for all applicable timestamps; it will not fill in the missing values
- Operator, which returns one or more timelines
Three main cases are:
- When both operands are multivariate
  - The divide operation will be done on matching data, that is, data with the same instance ID.
  - If the instances in the two operands do not match, an error is thrown.
  - Returns multivariate time series, each representing the result of division on the two corresponding time series.
- When one operand is univariate and the other is multivariate
  - This includes division by constants as well.
  - The univariate operand will participate with all time series in the multivariate operand.
  - The instance information of the multivariate time series will be preserved in the result.
  - Returns multivariate time series.
- When both operands are univariate
  - Instance information is ignored in this case.
  - Returns a univariate time series which is the result of the division operation.
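The three cases can be made concrete with a small in-memory model. The sketch below is not the Tracker implementation. It assumes a univariate series is a dictionary from timestamp to value, a multivariate series is a dictionary from instance ID to such a dictionary, and a constant is a plain number. Skipping timestamps where the divisor is zero is an additional assumption, as are all function names.

```python
def _is_multi(series):
    """True if `series` maps instance IDs to timelines."""
    return isinstance(series, dict) and any(isinstance(v, dict) for v in series.values())


def _divide_uni(a, b):
    """Divide two univariate operands; either one (not both) may be a constant."""
    if isinstance(a, dict) and isinstance(b, dict):
        stamps = a.keys() & b.keys()
    elif isinstance(a, dict):
        stamps = a.keys()
    elif isinstance(b, dict):
        stamps = b.keys()
    else:
        raise ValueError("at least one operand must be a timeline")

    def at(x, t):
        return x[t] if isinstance(x, dict) else x

    # Constants only apply at timestamps that exist; they do not fill gaps.
    # Assumption: timestamps with a zero divisor are skipped.
    return {t: at(a, t) / at(b, t) for t in sorted(stamps) if at(b, t) != 0}


def divide(a, b):
    if _is_multi(a) and _is_multi(b):
        if a.keys() != b.keys():
            raise ValueError("instances in the two operands do not match")
        return {inst: _divide_uni(a[inst], b[inst]) for inst in a}
    if _is_multi(a):
        return {inst: _divide_uni(series, b) for inst, series in a.items()}
    if _is_multi(b):
        return {inst: _divide_uni(a, series) for inst, series in b.items()}
    return _divide_uni(a, b)


multi = {"i1": {0: 10.0, 60: 20.0}, "i2": {0: 30.0}}
print(divide(multi, 10))                      # multivariate / constant
print(divide({0: 8.0}, {0: 2.0, 60: 1.0}))    # univariate / univariate
```

Replacing the division in _divide_uni with another binary operation would give the same case analysis for any operator that shares these conditions.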
MULTIPLY
MULTIPLY(10, TS(component1, *, metrics1))
Multiply Operator. Has the same conditions as the division operator, to keep the API simple. Accepts two arguments, both of which can be univariate or multivariate. Each can be one of the following types:
- Numeric constant, which is treated as a constant time series for all applicable timestamps; it will not fill in the missing values
- Operator, which returns one or more timelines
Three main cases are:
- When both operands are multivariate
  - The multiply operation will be done on matching data, that is, data with the same instance ID.
  - If the instances in the two operands do not match, an error is thrown.
  - Returns multivariate time series, each representing the result of multiplication on the two corresponding time series.
- When one operand is univariate and the other is multivariate
  - This includes multiplication by constants as well.
  - The univariate operand will participate with all time series in the multivariate operand.
  - The instance information of the multivariate time series will be preserved in the result.
  - Returns multivariate time series.
- When both operands are univariate
  - Instance information is ignored in this case.
  - Returns a univariate time series which is the result of the multiplication operation.
SUBTRACT
SUBTRACT(TS(component1, instance1, metrics1), TS(component1, instance1, metrics2))
SUBTRACT(TS(component1, instance1, metrics1), 100)
Subtract Operator. Has the same conditions as the division operator, to keep the API simple. Accepts two arguments, both of which can be univariate or multivariate. Each can be one of the following types:
- Numeric constant, which is treated as a constant time series for all applicable timestamps; it will not fill in the missing values
- Operator, which returns one or more timelines
Three main cases are:
- When both operands are multivariate
  - The subtract operation will be done on matching data, that is, data with the same instance ID.
  - If the instances in the two operands do not match, an error is thrown.
  - Returns multivariate time series, each representing the result of subtraction on the two corresponding time series.
- When one operand is univariate and the other is multivariate
  - This includes subtraction of constants as well.
  - The univariate operand will participate with all time series in the multivariate operand.
  - The instance information of the multivariate time series will be preserved in the result.
  - Returns multivariate time series.
- When both operands are univariate
  - Instance information is ignored in this case.
  - Returns a univariate time series which is the result of the subtraction operation.
RATE
RATE(SUM(TS(component1, *, metrics1)))
RATE(TS(component1, *, metrics2))
Rate Operator. This operator is used to find the rate of change for all timeseries. Accepts only a single argument, which must be an operator that returns a univariate or multivariate time series. Returns a univariate or multivariate time series based on the argument, with each timestamp value corresponding to the rate of change for that timestamp.