Monitoring
To ensure that your instance of Horizon is performing correctly, we encourage you to monitor it. Horizon provides both logs and metrics for this purpose.
Horizon writes logs to standard output. Incoming requests are logged and, more importantly, warnings and errors are also emitted by default. A correctly running Horizon instance should not output any warning or error log entries.
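Since a healthy instance should produce no warnings or errors, a simple first check is to scan the log stream for such entries. The following is a minimal sketch, not part of Horizon itself. It assumes that each log line carries a logfmt-style `level=` field; `find_problem_lines` is a hypothetical helper name, and the pattern should be adjusted to whatever your Horizon version actually prints.

```python
import re
from typing import Iterable, List

# Assumption: log lines contain a logfmt-style field such as level=warning
# or level=error. Adjust this pattern if your log format differs.
LEVEL_RE = re.compile(
    r'\blevel="?(warn|warning|error|fatal|panic)"?\b', re.IGNORECASE
)


def find_problem_lines(lines: Iterable[str]) -> List[str]:
    """Return the log lines whose level is warning or more severe."""
    return [line.rstrip("\n") for line in lines if LEVEL_RE.search(line)]


# Example with made-up log lines (not real Horizon output).
sample = [
    'time="2020-01-01T00:00:00Z" level=info msg="example info entry"',
    'time="2020-01-01T00:00:01Z" level=warning msg="example warning entry"',
    'time="2020-01-01T00:00:02Z" level=error msg="example error entry"',
]
for problem in find_problem_lines(sample):
    print(problem)
```

To use this on a live instance, you could pass it the lines of a captured log file (for example `find_problem_lines(open(path))`) and raise an alert whenever the result is not empty.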
Metrics are collected while a Horizon process is running and they are exposed at the /metrics path. You can see an example at https://horizon-testnet.stellar.org/metrics.
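As an illustration of how the /metrics path could be consumed, here is a minimal sketch that downloads the page and turns it into a dictionary. It assumes the endpoint serves the Prometheus text exposition format; if your Horizon version returns a different format, the parser does not apply. The function names `fetch_metrics` and `parse_metrics` are hypothetical, and the metric names in the sample are invented for the example.

```python
import urllib.request
from typing import Dict


def parse_metrics(text: str) -> Dict[str, float]:
    """Parse Prometheus text exposition format into {sample_name: value}.

    The key includes the label block, e.g. 'name{label="x"}'.
    """
    samples: Dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "}" in line:
            # Split after the label block so spaces inside labels survive.
            end = line.rindex("}") + 1
            name, rest = line[:end], line[end:].split()
        else:
            parts = line.split()
            name, rest = parts[0], parts[1:]
        if not rest:
            continue
        try:
            samples[name] = float(rest[0])
        except ValueError:
            continue  # ignore lines that do not carry a numeric value
    return samples


def fetch_metrics(url: str, timeout: float = 10.0) -> Dict[str, float]:
    """Download a metrics page and parse it."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_metrics(response.read().decode("utf-8"))


# Offline example with invented metric names.
sample = """\
# TYPE example_requests_total counter
example_requests_total{status="200"} 42
example_up 1
"""
print(parse_metrics(sample))
```

Against a running instance you would call `fetch_metrics` with the URL of your own /metrics path. In practice a Prometheus server usually scrapes this endpoint directly, so a script like this is mainly useful for quick checks.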
Below we present a few standard log entries with associated fields. You can use them to build metrics and alerts. Please note that these cover the Horizon application only. You should also monitor your hardware metrics, such as CPU or RAM utilization.
Starting HTTP request
Key | Value |
|
|
| Value of |
| Value of |
| Value of |
| Value of |
| First value of |
| Value of |
| IP of a client sending HTTP request |
| IP and port of a client sending HTTP request |
| HTTP method ( |
| Full request path, including query string (ex. |
| Boolean, |
| Value of |
| Random value that uniquely identifies a request, attached to all logs within this HTTP request |
Finished HTTP request
Key | Value |
|
|
| Number of response bytes sent |
| Value of |
| Value of |
| Value of |
| Value of |
| Duration of request in seconds |
| First value of |
| Value of |
| IP of a client sending HTTP request |
| IP and port of a client sending HTTP request |
| HTTP method ( |
| Full request path, including query string (ex. |
| Route pattern without query string (ex. |
| HTTP status code (ex. |
| Boolean, |
| Value of |
| Random value that uniquely identifies a request, attached to all logs within this HTTP request |
Metrics
Using the entries above, you can build metrics that will help you understand the performance of a given Horizon node. For example:
Number of requests per minute.
Number of requests per route (the most popular routes).
Average response time per route.
Maximum response time for non-streaming requests.
Number of streaming vs. non-streaming requests.
Number of rate-limited requests.
List of rate-limited IPs.
Unique IPs.
The most popular SDKs/apps sending requests to a given Horizon node.
Average ingestion time of a ledger.
Average ingestion time of a transaction.
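As a rough illustration of the per-route metrics in this list, the sketch below counts requests and averages response time per route from log lines. It is a sketch under stated assumptions, not Horizon tooling: it assumes logfmt-formatted lines, and the key names `msg`, `route`, and `duration`, as well as the message text `Finished request`, are assumptions that you should check against your own log output. `parse_logfmt` and `route_stats` are hypothetical helper names.

```python
import shlex
from collections import defaultdict
from typing import Dict, Iterable


def parse_logfmt(line: str) -> Dict[str, str]:
    """Split a logfmt line (key=value key="quoted value") into a dict."""
    try:
        tokens = shlex.split(line)
    except ValueError:
        return {}  # unbalanced quotes: treat the line as unparseable
    fields: Dict[str, str] = {}
    for token in tokens:
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def route_stats(lines: Iterable[str]) -> Dict[str, Dict[str, float]]:
    """Return request count and average duration (seconds) per route."""
    counts: Dict[str, int] = defaultdict(int)
    totals: Dict[str, float] = defaultdict(float)
    for line in lines:
        fields = parse_logfmt(line)
        # Assumed message text for the "Finished HTTP request" entry.
        if fields.get("msg") != "Finished request":
            continue
        try:
            duration = float(fields.get("duration", ""))
        except ValueError:
            continue  # skip entries without a numeric duration
        route = fields.get("route", "unknown")
        counts[route] += 1
        totals[route] += duration
    return {
        route: {"requests": counts[route], "avg_seconds": totals[route] / counts[route]}
        for route in counts
    }


# Example with made-up log lines.
sample = [
    'level=info msg="Finished request" route=/accounts/{id} duration=0.1',
    'level=info msg="Finished request" route=/accounts/{id} duration=0.3',
    'level=info msg="Starting request" path=/accounts/abc',
]
print(route_stats(sample))
```

The same loop can be extended to the other metrics in the list, for example by grouping on the client IP field or by bucketing entries per minute.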
Alerts
Below are example alerts with potential causes and solutions. Feel free to add more alerts using your metrics:
Alert | Cause | Solution |
Spike in number of requests | Potential DoS attack | Lower rate-limiting threshold |
Large number of rate-limited requests | Rate-limiting threshold too low | Increase rate-limiting threshold |
Ingestion is slow | Horizon server spec too low | Increase hardware spec |
Spike in average response time of a single route | Possible bug in the code responsible for rendering that route | Report an issue in the Horizon repository |
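The first two alerts in the table can be expressed as simple threshold checks once you have the underlying counts. The sketch below is only one possible way to do it. The function name `check_alerts` and both default thresholds (`spike_factor` and `rate_limited_ratio`) are hypothetical values chosen for illustration, and you should tune them to your own traffic.

```python
from statistics import mean
from typing import List


def check_alerts(
    requests_per_minute: List[int],
    rate_limited: int,
    total: int,
    spike_factor: float = 3.0,        # hypothetical threshold
    rate_limited_ratio: float = 0.05,  # hypothetical threshold
) -> List[str]:
    """Return the names of the alerts triggered by the given counts.

    requests_per_minute: request counts per minute, oldest first.
    rate_limited: number of rate-limited requests in the window.
    total: total number of requests in the window.
    """
    alerts: List[str] = []
    if len(requests_per_minute) >= 2:
        # Compare the latest minute against the average of earlier minutes.
        *history, latest = requests_per_minute
        baseline = mean(history)
        if baseline > 0 and latest > spike_factor * baseline:
            alerts.append("Spike in number of requests")
    if total > 0 and rate_limited / total > rate_limited_ratio:
        alerts.append("Large number of rate-limited requests")
    return alerts


# Example with made-up numbers.
print(check_alerts([100, 110, 90, 500], rate_limited=2, total=800))
```

Comparing against a moving baseline rather than a fixed number keeps the spike alert meaningful as normal traffic grows.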
I'm Stuck! Help!
If any of the above steps don't work or you are otherwise prevented from correctly setting up Horizon, please join our community and let us know. Either post a question at our Stack Exchange or chat with us on Keybase in #dev_discussion to ask for help.