

Server Monitoring

Your website can be up, but your server might be struggling.

Uptime checks tell you whether your site responds. They don't tell you that your server is running at 95% memory, that MySQL has been restarting every few minutes, or that your disk is two days from full. SiteVitals server monitoring fills that gap.


Available on Marketer from £36.00/mo

🖥️

Resource metrics every minute

CPU, memory, swap, disk, and load average collected every 60 seconds. Under 0.3% CPU per run. No persistent daemon — the agent runs, reports, and exits.

⚙️

Service health checks

Monitor individual services — nginx, Apache, MySQL, Redis, PHP-FPM, Supervisor and others — discovered automatically on install.

⏱️

Sustained breach detection

CPU and memory alerts only fire after the breach has been sustained for a configurable period. A two-second CPU spike never wakes anyone up.

📈

Escalation channels

Configure multiple alert channels with different delays — notify the on-call developer after 5 minutes, the manager after 30.

🔴

Agent offline detection

If the agent stops reporting for more than 10 minutes, SiteVitals opens an offline incident and alerts your channels automatically.

The Gap Uptime Monitoring Leaves

Your site returned 200.
Everything else was on fire.

An uptime check confirms your server responded. It doesn't know that the response took three times longer than usual because your database is under pressure. It doesn't know that your disk is 97% full and the next log rotation will bring everything down. It doesn't know that PHP-FPM has restarted eleven times this hour.

Server monitoring gives you the view from inside — the resource utilisation, the process health, the service status — that external checks fundamentally cannot provide. Uptime monitoring tells you when your site goes down. Server monitoring tells you why — and warns you before it happens.

And when something does go wrong, having a minute-by-minute record of what was happening — CPU at 94%, MySQL restarting, load spiking — means you spend time fixing the problem rather than reconstructing what caused it.

💾

The disk that filled silently

Log files grow. Backup scripts run. Uploads accumulate. Disk fills gradually and then all at once — taking down databases, mail servers, and application writes simultaneously. Disk alerts fire immediately with no sustain delay, giving you time to act.

🗄️

The database that kept restarting

MySQL or PostgreSQL restarting repeatedly is often the first sign of a memory exhaustion problem, a corrupted table, or a runaway query. The site may still respond — slowly. Service monitoring catches this pattern before visitors notice.

📈

The load spike that wasn't a spike

A brief high CPU reading is noise. Ten minutes of sustained high load is a problem. Sustained breach detection means you only get alerted when something is genuinely wrong — not every time a cron job runs.

🔧

The service nobody noticed had stopped

A queue worker stops processing. A cache server goes down and requests start hitting the database directly. A mail relay stops accepting connections. None of these prevent your site returning 200 — but all of them cause real problems.

What We Monitor

Every metric that matters.
Checked every minute.

A lightweight agent collects data every 60 seconds over HTTPS. Under 0.3% CPU per run and approximately 1.5 MB of network traffic per day.

🧠

CPU Usage

Average utilisation across all cores. Alerts only fire after a sustained breach — not during normal spikes.

75% warning · 90% critical · 5 min sustain

💾

Memory & Swap

RAM as a percentage and in GB, with swap tracked separately. Catches slow memory leaks before they become critical.

75% warning · 90% critical · 5 min sustain

🗄️

Disk Usage

Root filesystem usage as a percentage and in GB. Disk alerts fire immediately — a full disk doesn't fix itself.

75% warning · 90% critical · Immediate

📊

Load Average

1, 5, and 15-minute load averages normalised against CPU core count — consistent thresholds across server sizes.

1m / 5m / 15m · Per-core normalised · Configurable

🔍

Top Processes

Top 10 processes by CPU captured with every check, including name, PID, CPU %, and memory %. Answers "what caused this?" without guesswork.

Captured per check · Shown at incident open · No config needed

⚙️

Service Health

Individual service monitoring for nginx, MySQL, Redis, PHP-FPM, Supervisor and others, auto-discovered on install.

Auto-discovery · Crash-loop detection · Immediate alerts

🔴

Agent Offline

If the agent stops reporting for more than 10 minutes, an offline incident opens and your channels are notified.

10 min threshold · Auto incident · Recovery alert

All thresholds and sustain delays are configurable per server.


How It Works

One command to install.
Everything else is automatic.

A shell script runs every minute via systemd or cron. No open ports, no inbound firewall changes, no persistent daemon.

Add a server in SiteVitals

Create a server record in your dashboard and copy the generated install command — a single line with a unique API key scoped to that server.

Takes under a minute. Optionally link to your monitored sites so they appear in server alert emails.

Run the install command as root

Paste the command on your server. The installer auto-configures systemd or cron, runs an initial metrics check, and discovers your running services.

Tested on Ubuntu, Debian, Amazon Linux, CentOS, and AlmaLinux. Installs jq automatically if not present.

Select services to monitor

A checklist of discovered services appears in your dashboard. Recommended services are pre-ticked. Save your selection — the agent picks it up within 60 seconds.

Re-run discovery any time after installing new software.

Configure thresholds and alert channels

Set warning and critical thresholds per metric, configure the sustain delay, and add alert channels. Multiple channels with different delays route the right alert to the right person.

Email, Slack, webhook, or in-app. Escalate to a manager after 30 minutes if unacknowledged.
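As a rough illustration of how delayed channels could route an incident, here is a minimal Python sketch. It is not SiteVitals code: `AlertChannel`, `channels_due`, and the rule that acknowledging an incident stops further escalation are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class AlertChannel:
    """A hypothetical alert channel with an escalation delay."""
    name: str
    delay_minutes: int


def channels_due(
    channels: Iterable[AlertChannel],
    minutes_open: int,
    acknowledged: bool,
    already_notified: Set[str],
) -> List[AlertChannel]:
    """Return the channels that should be notified now.

    Assumption: once an incident is acknowledged, no further
    channels are notified.
    """
    if acknowledged:
        return []
    return [
        channel
        for channel in channels
        if channel.delay_minutes <= minutes_open
        and channel.name not in already_notified
    ]


channels = [
    AlertChannel("on-call developer", delay_minutes=5),
    AlertChannel("manager", delay_minutes=30),
]

# Five minutes in, only the on-call developer is due.
print([c.name for c in channels_due(channels, 5, False, set())])
```

With these example delays, the manager is only notified if the incident is still unacknowledged 30 minutes after it opened.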


Sustained Breach Detection

Alerts that fire when
something is actually wrong.

CPU spikes. Every server has them. A backup job runs, a cron fires, a burst of traffic arrives. These are normal and brief. An alert every time CPU exceeds 75% for thirty seconds is noise, not signal.

SiteVitals only opens an incident when a metric has been continuously above its threshold for the configured sustain period — default five minutes for CPU and memory. Disk alerts fire immediately because a full disk doesn't self-correct. Service alerts also fire immediately.

The result is alerts you can trust. When SiteVitals wakes you up, something is genuinely and persistently wrong, not a momentary blip that resolved itself before you reached for your phone.

🧠

CPU · warn 75% / crit 90% · 5 min sustain

Ignores deploy spikes and cron bursts

💾

Memory · warn 75% / crit 90% · 5 min sustain

Catches slow memory leaks before they become critical

📊

Load · warn 75% / crit 90% · 5 min sustain

Normalised against core count — consistent across server sizes

🗄️

Disk · warn 75% / crit 90% · Immediate

No delay — a full disk doesn't fix itself

⚙️

Service · warn Down / crit Crash-loop · Immediate

Service stopped or crash-looping alerts fire straight away

🔴

Agent · warn Offline / crit > 10 min · 10 min

Agent stops reporting — offline incident opens automatically

All thresholds and sustain delays are configurable per server in Settings → Alert Thresholds.

Server monitoring that costs less than an hour of downtime.

Get started with our free tools today. When you're ready for continuous server monitoring, our plans are simple, transparent, and built for sites of all sizes.

Plans start from

£36.00/mo


Questions

Things people often ask us.

If something isn't covered here, we're genuinely happy to answer it. We're a real team and we actually respond.

What does the agent install on my server?

A single shell script at /usr/local/bin/sitevitals-agent, a systemd timer (or cron job on systems without systemd), and a config file at /etc/sitevitals/agent.conf. No persistent daemon, no open ports, no inbound firewall changes required.

Which Linux distributions are supported?

The agent is a plain bash script with no dependencies beyond standard Linux tools and jq, which the installer adds automatically if it is missing. It has been tested on Ubuntu 20.04 and 24.04, Debian 10 and 12, Amazon Linux 2, CentOS 7, and AlmaLinux 8. Any modern Linux distribution should work.

How does service discovery work?

On install, the agent scans your running systemd services and sends the list to SiteVitals. A checklist appears in your dashboard with recommended services pre-selected. You confirm what you want to monitor and save — the agent picks up the change within 60 seconds.

What is sustained breach detection?

Rather than opening an incident the moment a metric crosses a threshold, SiteVitals waits until it has been continuously above the threshold for a configurable period — default five minutes for CPU and memory. This eliminates false alarms from deployment spikes or traffic bursts. Disk and service alerts fire immediately.
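To make the idea concrete, here is a minimal Python sketch of a sustain check. It is an illustration under stated assumptions rather than the actual SiteVitals logic: the class name `SustainedBreachDetector` and its fields are hypothetical, and samples are assumed to arrive with a timestamp in seconds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SustainedBreachDetector:
    """Hypothetical helper: signals only after a continuous breach."""
    threshold: float        # e.g. 90.0 for a critical CPU level
    sustain_seconds: int    # e.g. 300 for a five-minute sustain, 0 for immediate
    breach_started_at: Optional[float] = None

    def observe(self, timestamp: float, value: float) -> bool:
        """Record one sample; return True once the breach has been sustained."""
        if value <= self.threshold:
            # Any sample at or below the threshold resets the clock.
            self.breach_started_at = None
            return False
        if self.breach_started_at is None:
            self.breach_started_at = timestamp
        return timestamp - self.breach_started_at >= self.sustain_seconds


cpu = SustainedBreachDetector(threshold=90.0, sustain_seconds=300)
samples = [(0, 95), (60, 96), (120, 50), (180, 95), (240, 95),
           (300, 95), (360, 95), (420, 95), (480, 95)]
for timestamp, value in samples:
    if cpu.observe(timestamp, value):
        print(f"incident would open at t={timestamp}s")
```

In this run the dip at 120 seconds resets the clock, so the sketch only signals at 480 seconds, five minutes after the breach that began at 180 seconds. A sustain of zero, as with disk alerts, signals on the first sample above the threshold.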

What happens if my server goes completely offline?

If the agent stops reporting for more than 10 minutes, SiteVitals opens an offline incident and alerts your configured channels. When the server comes back online and the agent resumes reporting, a recovery notification is sent automatically.
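The offline rule can be sketched in a few lines of Python. This is an illustrative model only: the function name `agent_state_change` and the returned labels are hypothetical, and the 10-minute threshold is the one quoted above.

```python
from typing import Optional

OFFLINE_THRESHOLD_SECONDS = 10 * 60  # "more than 10 minutes" without a report


def agent_state_change(
    last_report_at: float, now: float, incident_open: bool
) -> Optional[str]:
    """Decide what, if anything, should happen for one agent.

    Returns "open_offline_incident", "send_recovery", or None.
    Timestamps are in seconds.
    """
    silent_for = now - last_report_at
    if silent_for > OFFLINE_THRESHOLD_SECONDS and not incident_open:
        return "open_offline_incident"
    if silent_for <= OFFLINE_THRESHOLD_SECONDS and incident_open:
        # The agent has reported again while an incident was open.
        return "send_recovery"
    return None


print(agent_state_change(last_report_at=0, now=601, incident_open=False))
print(agent_state_change(last_report_at=1000, now=1030, incident_open=True))
```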

How is load average shown — what does a percentage mean?

Load average is normalised as a percentage of available CPU cores. A load_5 of 1.0 on a single-core server equals 100% — fully utilised. On a 4-core server, a load_5 of 4.0 also equals 100%. This makes thresholds consistent across different server sizes.
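The normalisation described above is a single division. The Python sketch below reproduces the two examples from this answer; the function names are hypothetical, and `current_load_percent` simply reads the 5-minute load average and core count from the local machine using the standard library.

```python
import os


def normalised_load_percent(load_average: float, core_count: int) -> float:
    """Express a load average as a percentage of available CPU cores."""
    if core_count < 1:
        raise ValueError("core_count must be at least 1")
    return load_average / core_count * 100.0


def current_load_percent() -> float:
    """Normalise this machine's 5-minute load average (Unix-like systems only)."""
    _, load_5, _ = os.getloadavg()
    return normalised_load_percent(load_5, os.cpu_count() or 1)


print(normalised_load_percent(1.0, 1))  # single-core server at load 1.0
print(normalised_load_percent(4.0, 4))  # 4-core server at load 4.0
print(normalised_load_percent(3.0, 4))  # 4-core server at load 3.0
```

The first two calls both give 100.0, and the third gives 75.0, which would sit exactly at a 75% warning level.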

Does the agent need internet access?

Yes — the agent makes outbound HTTPS requests to the SiteVitals API every minute. No inbound connections are required. You need to allow outbound port 443 to sitevitals.co.uk if your server sits behind a firewall.

How much resource does the agent use?

Very little. The agent consumes under 0.3% CPU per run and approximately 1.5 MB of network traffic per day. It exits completely between runs — there is no persistent process in memory.
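For a sense of scale, the daily traffic figure can be turned into a per-run estimate. The short Python sketch below assumes traffic is spread evenly across one run per minute and that 1 MB means 1,000,000 bytes; both are assumptions for illustration, and the function name is hypothetical.

```python
def bytes_per_run(daily_mb: float = 1.5, interval_seconds: int = 60) -> float:
    """Estimate average network traffic per agent run, in bytes."""
    runs_per_day = 24 * 60 * 60 // interval_seconds  # 1440 runs at one per minute
    return daily_mb * 1_000_000 / runs_per_day


print(f"{bytes_per_run():.0f} bytes per run")  # prints "1042 bytes per run"
```

On those assumptions, each run sends roughly 1 KB.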

How many servers can I monitor?

This depends on your plan. The Starter plan includes one server, Marketer includes two, and the Agency plan includes ten. Contact us if you need more.

One command. Your server, properly watched.

CPU, memory, disk, load, and every service you care about, collected every minute, with alerts that only fire when something is genuinely and persistently wrong.
