How to monitor your UltiHash test installation
Discover all the metrics, logs and traces you can track when testing UltiHash

Training a model, say, a speech-to-text system, often means processing hours of audio data. Your storage needs to keep up with high-throughput reads, or else GPU utilization drops and training slows down. If things start lagging, you need to figure out whether the problem is in preprocessing, data loading, or storage I/O. That’s where monitoring platforms like Uptrace or Datadog come in. These platforms are built on observability: the ability to collect and analyze metrics, logs, and traces to understand how your system behaves under load. They help surface what’s actually slowing things down.
UltiHash supports observability through native support for OpenTelemetry, an open-source framework that captures and transports telemetry data across your stack. It integrates smoothly with systems like Prometheus, Grafana, or Datadog. Metrics and logs flow through a standardized protocol, with no need for custom instrumentation.
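To make this concrete, here is a minimal sketch of an OpenTelemetry Collector configuration that receives telemetry over OTLP and exposes metrics for Prometheus to scrape. The ports and pipeline layout are illustrative assumptions, not UltiHash's shipped configuration (the Docker test setup described below already comes prewired):
# Illustrative OpenTelemetry Collector config: OTLP in, Prometheus out
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
exporters:
  prometheus:
    endpoint: 0.0.0.0:9464   # Prometheus scrapes this port
service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]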
With these metrics, teams can set up alerts when free space runs low or when data egress exceeds expected thresholds, trace slowdowns in model inference or RAG retrieval back to object-level read latency, and, finally, connect storage performance to application behavior and act before things fall over.
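For example, if you scrape these metrics into Prometheus, a low-free-space alert could look roughly like the sketch below. The rule name, the threshold, and the assumption that the gauge is exported in bytes are all illustrative:
groups:
  - name: ultihash-storage
    rules:
      - alert: UltiHashLowFreeSpace
        # Illustrative: assumes storage_available_space_gauge reaches Prometheus in bytes
        expr: storage_available_space_gauge < 10e9
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: UltiHash cluster has less than ~10 GB of free space left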
Next up: a step-by-step guide to enabling observability when testing UltiHash locally with our Docker setup.
When you're testing UltiHash locally with Docker, you don’t need to fly blind. The test setup comes prewired with OpenTelemetry, so you can connect an observability platform and start streaming real metrics, logs, and traces from the get-go. In this guide, we’ll show you how to connect it to Uptrace, a platform many of our users have found easy to get started with. No extra setup, no custom tweaks: it all works out of the box.
Start by setting up your credentials in the terminal.
docker login registry.ultihash.io -u <registry-login>
You’ll find <registry-login> in your UltiHash dashboard. After that, you’ll need to export your environment variables, also available in your dashboard.
# UH_CUSTOMER_ID and UH_ACCESS_TOKEN grant you access to your UltiHash license
export UH_CUSTOMER_ID="<customer-id>"
export UH_ACCESS_TOKEN="<access-token>"
# AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY grant you access to your UltiHash cluster
export AWS_ACCESS_KEY_ID="FANCY-ROOT-KEY"
export AWS_SECRET_ACCESS_KEY="SECRET"
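Before starting the cluster, it's worth confirming the variables are actually set in your current shell:
# Each variable should print with a non-empty value
env | grep -E 'UH_CUSTOMER_ID|UH_ACCESS_TOKEN|AWS_ACCESS_KEY_ID|AWS_SECRET_ACCESS_KEY'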
We’ve prepared the configuration YAML files you’ll need for the setup; you can download them from the folder linked below.
Now, you can run the following command to start the UltiHash cluster with the Docker configuration needed to export traces, logs and metrics to Uptrace.
docker compose up -d
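To confirm everything came up cleanly, you can inspect the services (their names depend on the compose file provided above):
# List the running services and their status
docker compose ps
# Follow the logs if a container is restarting or unhealthy
docker compose logs -f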
Next, create a bucket and write some data to your UltiHash cluster. To do so, follow these commands:
# Create a bucket
aws s3api create-bucket --bucket <bucket-name> --endpoint-url http://127.0.0.1:8080
# Upload your dataset (if your dataset is a folder)
aws --endpoint-url http://127.0.0.1:8080 s3 cp <path-to-your-data> s3://<bucket-name>/ --recursive
# Upload your dataset (if your dataset is a file)
aws --endpoint-url http://127.0.0.1:8080 s3 cp <path-to-your-data> s3://<bucket-name>/
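You can verify the upload by listing the bucket contents:
# Everything you copied should appear here
aws --endpoint-url http://127.0.0.1:8080 s3 ls s3://<bucket-name>/ --recursive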
In your browser, go to localhost:14318/login. This takes you to the Uptrace login page, where you’ll find generic credentials. The address is private and can only be accessed by you (unless you grant someone access or expose it publicly).
Once logged in, you’ll land on the Uptrace overview dashboard.
From this view, access the UltiHash environment, as shown in Image 2: click the Uptrace tab, and a menu will appear with the options Uptrace and UltiHash. Select UltiHash.
After this step, you’ll have access to the UltiHash environment on Uptrace and will see a dashboard similar to the one below, showing the system overview.
To dig further into what has happened, go to Traces & Logs, which displays all the traces and logs of your system.
Finally, you can deep-dive into your system’s metrics in the Metrics tab (in the Metrics Explorer). All the metrics UltiHash tracks can be found there.
You can go a step further and focus on one metric, or even see how several metrics behave together over time. Image 5 displays how the write requests to UltiHash (storage_write_req) and the storage used (storage_used_space_gauge) behave over time, in this case over the past 30 minutes. As expected, storage_used_space_gauge increases as more write requests (storage_write_req) are made.
Voilà! You’re all set to test UltiHash and monitor your operations. For more detailed setup instructions, refer to our documentation. Don’t hesitate to reach out if you’re unsure about the setup or have questions.
UltiHash emits detailed, low-level metrics categorized across services. Here’s a sample breakdown of what you’ll be able to see:
Storage Service Requests
These metrics track how the storage layer is being accessed and how often. Monitoring these gives visibility into the system’s I/O behavior, essential when your performance depends on streaming thousands of files quickly and in parallel.
storage_read_fragment_req: requests to read a fragment
storage_write_req: data write requests
storage_sync_req: calls to persist data to disk
storage_remove_fragment_req: deletions
storage_used_req: requests to check space usage
Entrypoint (S3 API) Requests
These capture every S3-compatible API call handled by UltiHash. It’s where you see what your applications are doing (creating buckets, uploading objects, listing contents) and how they’re interacting with the storage system.
entrypoint_get_object_req, put_object_req, list_objects_req, etc.: full visibility into every S3-compatible call hitting your cluster
Cache Utilization and Efficiency
These metrics provide insight into how well the system is using its caching layers, both for reducing I/O and speeding up hot-path queries.
gdv_l1_cache_hit_counter / miss_counter: performance of the L1 in-memory cache
gdv_l2_cache_hit_counter / miss_counter: L2 cache stats for a deeper look
I/O and Resource Monitoring
These give you an overview of system health: how much data is flowing through, how full the system is, and how many connections are being handled at a time.
entrypoint_ingested_data_counter: volume of uploaded data
entrypoint_egressed_data_counter: data served out of the system
storage_available_space_gauge, storage_used_space_gauge: real-time storage usage
active_connections: number of concurrent connections handled

This gives you full visibility into the system’s behavior, from core I/O and API traffic to storage health and caching efficiency.
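If you want to see several of these counters move at once, one option is to generate some write traffic against your test bucket. A rough sketch, reusing the local endpoint from the guide above; the object sizes, count, and key names are arbitrary:
# Write 50 distinct 1 MiB objects of random data to drive storage_write_req
# (distinct payloads also make storage_used_space_gauge climb)
for i in $(seq 1 50); do
  dd if=/dev/urandom of=/tmp/obj-$i.bin bs=1024 count=1024 2>/dev/null
  aws --endpoint-url http://127.0.0.1:8080 s3 cp /tmp/obj-$i.bin s3://<bucket-name>/load-test/obj-$i.bin
done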