ErrSight Blog

Introducing ErrSight: see every error before your users do

2026-06-09T10:00:00+05:30

Your users should never be the ones who tell you production is broken. Today we’re launching ErrSight: real-time error tracking and log management in a single tool, built to catch problems before a support ticket ever lands.

The problem: errors reach users first

Most error tracking setups fail in one of two predictable ways.

The first is silence. An exception fires deep in a background job, gets swallowed by a generic rescue, and the only signal you get is a confused user emailing support three days later. By then the stack trace is gone and you’re reverse-engineering a bug from a screenshot.

The second is noise, and cost. The incumbents will happily capture everything, then bill you for the privilege across a tangle of event categories, reserved volumes, and pay-as-you-go overages. You tune sample rates not because it’s good engineering, but because you’re scared of the invoice. The dashboard fills with duplicate alerts for the same root cause until nobody reads them.

We wanted error tracking that is loud about real problems, quiet about duplicates, and honest about the bill. So we built one.

What ErrSight is

ErrSight is real-time error tracking and log management, together, with a tagline we actually mean: see every error before your users do.

It does two jobs that usually require two vendors:

Exception tracking: automatic capture, grouped into actionable issues you can triage.
Log management: a live, searchable stream of everything your app is saying, in the same place.

You install it in under two minutes, ship it where you already ship, and stop bouncing between a logging tool and an error tool to reconstruct a single incident.

The core, in plain terms

Automatic capture and fingerprinting

ErrSight captures exceptions automatically, with no manual try/catch plumbing around every call site. The important part is what happens next: identical errors are collapsed via fingerprinting into one issue. A bug that throws ten thousand times is one line in your dashboard with a count of ten thousand, not ten thousand lines. That’s the difference between a signal and a denial-of-service attack on your own attention.
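To make the grouping idea concrete, here is a minimal sketch of fingerprinting in plain Ruby. It is an illustration under stated assumptions, not ErrSight's actual algorithm: the `fingerprint` and `group_into_issues` helpers are hypothetical, and the choice to hash the exception class plus the top three stack frames (with line numbers stripped) is one common approach among several.

```ruby
require "digest"

# Hypothetical fingerprint: exception class plus the top stack frames,
# with line numbers stripped so small code shifts still group together.
def fingerprint(error_class, backtrace, frames: 3)
  normalized = backtrace.first(frames).map { |frame| frame.sub(/:\d+/, "") }
  Digest::SHA256.hexdigest([error_class, *normalized].join("|"))[0, 16]
end

# Collapse raw events into issues keyed by fingerprint, keeping a count.
def group_into_issues(events)
  events.each_with_object(Hash.new(0)) do |event, issues|
    issues[fingerprint(event[:class], event[:backtrace])] += 1
  end
end

noisy = {
  class: "PG::ConnectionBad",
  backtrace: ["app/models/order.rb:42:in `save'", "app/jobs/sync_job.rb:10:in `perform'"]
}
events = Array.new(10_000) { noisy }
events << { class: "NoMethodError", backtrace: ["app/models/user.rb:7:in `name'"] }

issues = group_into_issues(events)
puts issues.size        # number of distinct issues
puts issues.values.max  # occurrences of the noisiest issue
```

Ten thousand and one raw events become two issues, one of them carrying a count of ten thousand.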

A live log viewer that feels like a terminal

The live log viewer is a terminal-style stream with infinite scroll and keyword search. It’s the view you reach for at 2 a.m.: watch events arrive in real time, search for a request ID, and follow the story across services without grepping ten boxes by hand.

Automatic user context

Every event carries user context automatically: id, email, session, and plan. When an error lands, you already know who hit it and what tier they’re on, so you can tell “one flaky test account” apart from “every Business customer at once” instantly.

Triage that respects your time

Issues aren’t just a feed. You can mark, assign, and snooze them, so the on-call rotation stays focused on what’s actually actionable instead of drowning in a wall of red.

Sub-millisecond overhead

Observability shouldn’t tax the thing it observes. ErrSight batches events off the request path, so per-request overhead is sub-millisecond, and it flushes the buffer on process exit, so nothing is dropped at shutdown. We break down the mechanics in the Rails guide.

Ship where you already ship

ErrSight meets your stack where it lives. Here’s what’s shipping today and what’s on the way.

Platform	Package	Status
Ruby on Rails	`errsight`	Available
Python (3.8+)	`pip install errsight`	Available
Rust (1.85+)	`errsight = "0.1"`	Available
React / JavaScript	`errsight`	Available
React Native	`errsight-rn`	Available
REST API	`POST /api/v1/events`	Available
Node, Go, PHP/Laravel, Elixir/Phoenix	n/a	Coming soon

A Rails setup is the canonical “two-minute install”, with one gem and one initializer:

# config/initializers/errsight.rb
Errsight.configure { |c| c.api_key = ENV["ERRSIGHT_KEY"] }

That auto-captures every Rails exception and routing error, broadcasts from Rails.logger at all levels, and attaches request, user, controller action, and URL context. It has first-class Devise and ActiveAdmin support, and it respects config.filter_parameters so sensitive data stays local. We walk through the whole thing in error tracking in Rails in 2 minutes.

The Python SDK ships middleware for Django, Flask, FastAPI, and Starlette (sync and async), plus Celery, RQ, and AWS Lambda, with ContextVar scope isolation and a logging.Handler drop-in:

pip install errsight

The React/JS package has zero dependencies, is ESM-first, and gives you a drop-in that captures window.onerror and unhandledrejection:

import { init } from "errsight";
init({ apiKey: "elp_live_…", env: "production" });

And when you need raw access, the REST API takes single events or batches of up to 100, authed with an X-API-Key header, CORS enabled, with idempotency keys:

curl -X POST https://errsight.com/api/v1/events -H "X-API-Key: elp_…" -H "Content-Type: application/json" \
  -d '{"level":"error","message":"Payment failed"}'
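For larger volumes, the batch limit suggests slicing events client-side. The sketch below builds (but does not send) one request per batch of up to 100 events using Ruby's standard library. Only the endpoint, the `X-API-Key` header, and the 100-event ceiling come from the text above. The `{"events": [...]}` body shape, the `Idempotency-Key` header name, and the `build_batch_requests` helper are assumptions made for illustration, so check the docs for the real names before relying on them.

```ruby
require "json"
require "digest"
require "net/http"
require "uri"

MAX_BATCH = 100 # the documented per-request batch ceiling

# Build one POST per batch of up to 100 events, without sending anything.
# ASSUMPTIONS: the {"events": [...]} body shape and the "Idempotency-Key"
# header name are hypothetical placeholders.
def build_batch_requests(events, api_key:, url: "https://errsight.com/api/v1/events")
  uri = URI(url)
  events.each_slice(MAX_BATCH).map do |batch|
    body = JSON.generate(events: batch)
    request = Net::HTTP::Post.new(uri)
    request["X-API-Key"] = api_key
    request["Content-Type"] = "application/json"
    # Deriving the key from the body means a retried batch reuses the same key.
    request["Idempotency-Key"] = Digest::SHA256.hexdigest(body)
    request.body = body
    request
  end
end

events = Array.new(250) { |i| { level: "error", message: "Payment failed, attempt #{i}" } }
requests = build_batch_requests(events, api_key: ENV.fetch("ERRSIGHT_KEY", "elp_placeholder"))
puts requests.size # 250 events split into batches of at most 100
```

Deriving the idempotency key from the payload is a design choice for the sketch: it lets a client retry a failed batch safely without tracking extra state.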

See the full lineup on the integrations page and the docs.

Our philosophy: predictable bills, no lock-in

Two principles shaped ErrSight more than any feature.

Flat, predictable pricing. Pick a tier, know your bill. No overage tiers, no asterisks, no “contact sales for volume.” When you approach your monthly event limit, ErrSight notifies you rather than silently dropping data, so you can upgrade or grab an add-on pack. The contrast with usage-and-quota billing is the whole point, and we lay out the full model honestly on our pricing page.

And there’s a free tier at $0/month, forever, with no credit card, genuinely useful for side projects and small apps, not a teaser that expires.

No lock-in by design. ErrSight is open source under AGPLv3. The OSS edition runs the same ingestion, fingerprinting, real-time logs, and alerting engine as the SaaS, just with billing and quotas stripped out. Self-host it on Docker Compose plus Postgres. Your data, your call.

Try it today

You can be watching live errors in the next two minutes. Spin up the free tier on errsight.com (no credit card, no sales call) and see every error before your users do. When you’re ready to dig in, the pricing is right there in the open, exactly where pricing should be.

Error tracking in Rails in 2 minutes with ErrSight

2026-06-06T10:00:00+05:30

Rails error tracking should not be a weekend project. With the errsight gem you add one dependency, set one environment variable, and write a one-line initializer. Then every exception, routing error, and log line shows up in real time.

This is a practical how-to. By the end you will have production-grade Rails error tracking wired up, a test error confirmed in the live log viewer, and rich user context attached to every event. Total hands-on time: under two minutes.

Step 1: Add the gem

Drop the gem into your Gemfile:

# Gemfile
gem "errsight"

Then install it:

bundle install

That is the only dependency you add. The gem hooks into Rails through standard middleware and Rails.logger, so there is nothing to monkey-patch and nothing to fix up later.

Step 2: Grab an API key and set the environment

Create a project in ErrSight and copy its API key. Keys look like elp_live_… for production traffic (generic keys use the elp_… prefix). Keep the key out of source control by setting it as an environment variable:

export ERRSIGHT_KEY="elp_live_your_key_here"

In development you would put this in your .env, your shell profile, or your secrets manager. In production, set it through your platform’s config (Heroku config vars, a Kubernetes secret, your CI/CD environment, wherever you already manage env). The point of ErrSight is that you ship where you already ship; the key travels with the rest of your environment.

Step 3: The initializer

Create a single initializer. This is the whole configuration:

# config/initializers/errsight.rb
Errsight.configure { |c| c.api_key = ENV["ERRSIGHT_KEY"] }

Boot your app. That is the full setup: one gem, one env var, one initializer. You are now tracking errors.

Step 4: What you get for free

The default install is deliberately generous. Without writing another line, the gem:

Captures every Rails exception and routing error automatically, including the 404s and ActionController::RoutingErrors that usually slip past.
Broadcasts from Rails.logger at all levels (debug through fatal) into a terminal-style live log viewer with infinite scroll and keyword search.
Attaches request context on every event: the request URL, the controller and action, request metadata, and the current user.
Speaks Devise and ActiveAdmin out of the box, with first-class support, so the authenticated user and admin context come through without glue code.
Respects config.filter_parameters: anything you already mask (passwords, tokens, card numbers) stays local and never leaves your app.

That last point matters: ErrSight reuses the parameter filtering you have already configured, so sensitive fields are redacted before they are ever sent.
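If you want a feel for what "redacted before they are ever sent" means, here is a dependency-free sketch of key-based scrubbing. Rails does the real work with `ActiveSupport::ParameterFilter`. The `scrub` helper and the `FILTERED_KEYS` list below are hypothetical stand-ins written in plain Ruby, not the gem's code.

```ruby
# Keys to mask, in the spirit of config.filter_parameters (illustrative list).
FILTERED_KEYS = [/passw/i, /token/i, /card/i].freeze
MASK = "[FILTERED]"

# Recursively mask matching keys in nested hashes and arrays.
# Returns a new structure and leaves the original params untouched.
def scrub(value, filters = FILTERED_KEYS)
  case value
  when Hash
    value.each_with_object({}) do |(key, inner), out|
      masked = filters.any? { |filter| filter.match?(key.to_s) }
      out[key] = masked ? MASK : scrub(inner, filters)
    end
  when Array
    value.map { |item| scrub(item, filters) }
  else
    value
  end
end

params = {
  "email"    => "a@example.com",
  "password" => "hunter2",
  "payment"  => { "card_number" => "4111111111111111", "amount" => 1200 }
}
safe = scrub(params)
puts safe.inspect
```

Only `safe` would be attached to an outgoing event. The raw `params` hash never leaves the process.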

Duplicate exceptions are grouped through fingerprinting, so a failing query that fires on every request in a hot loop becomes one actionable issue you can triage (mark, assign, or snooze) instead of a wall of identical notifications.

Step 5: Trigger a test error to verify

Let’s confirm it works. Add a throwaway route and action that raises:

# config/routes.rb
get "/boom", to: "diagnostics#boom"

# app/controllers/diagnostics_controller.rb
class DiagnosticsController < ApplicationController
  def boom
    raise "ErrSight smoke test. If you can read this in the log viewer, it works"
  end
end

Start the server and hit the route:

bin/rails server
curl -i http://localhost:3000/boom

Open the live log viewer in your ErrSight dashboard. Within a moment you will see the exception, grouped into an issue, complete with the stack trace, the /boom URL, the DiagnosticsController#boom action, and (if a user is signed in) their identity. Search the log viewer for smoke test to jump straight to the broadcast line.

Once you have confirmed it, delete the route and controller. They were only there to prove the wiring.

Step 6: Adding richer user context

You already get the current user automatically (id, email, session, plan). When you want to attach more (a tenant id, a feature-flag cohort, the plan tier at the moment of failure), set it explicitly. A common pattern is a before_action:

# app/controllers/application_controller.rb
class ApplicationController < ActionController::Base
  before_action :set_errsight_context

  private

  def set_errsight_context
    return unless current_user

    Errsight.set_user(
      id:    current_user.id,
      email: current_user.email,
      plan:  current_user.plan,
      org:   current_user.organization_id
    )
  end
end

Now every event captured during that request carries the context you care about. When a 500 lands at 3 a.m., you already know who hit it and what plan they were on. No log spelunking required.

Is this safe for production?

Yes: that is the whole design. Events are batched on a background thread and flushed on a timer, so the work never touches your request cycle. The async overhead per request is sub-millisecond, and nothing blocks the response. On process exit the buffer is flushed, so you do not drop the errors that happen during a deploy or a crash.

Concern	How ErrSight handles it
Request latency	Sub-millisecond async overhead; nothing blocks the request
Throughput	Events batched on a separate thread, flushed on a timer
Dropped events	Buffer flushed on process exit, nothing lost at shutdown
Sensitive data	`config.filter_parameters` respected; filtered fields stay local
Noise	Duplicate exceptions grouped via fingerprinting into one issue

In other words, you can leave it on at full volume in production without worrying that error tracking is the thing that slows down your app.
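The batching pattern itself is small enough to sketch with Ruby's standard library. This is an illustration of the general technique (queue on the request path, background flush on a timer, final flush at exit), not the gem's implementation. The `EventBuffer` class and its `flush_interval` parameter are hypothetical names.

```ruby
# Minimal sketch of an off-request-path event buffer (not the gem's code).
class EventBuffer
  def initialize(flush_interval: 1.0, &sender)
    @sender = sender   # receives an Array of events
    @queue = Queue.new # thread-safe FIFO from the standard library
    @lock = Mutex.new  # only one flusher drains at a time
    @worker = Thread.new do
      loop do
        sleep flush_interval
        flush
      end
    end
    at_exit { flush }  # drain whatever is left at shutdown
  end

  # Called on the request path: a thread-safe enqueue and nothing else.
  def push(event)
    @queue << event
  end

  # Drain everything currently queued and hand it over as one batch.
  def flush
    batch = []
    @lock.synchronize do
      batch << @queue.pop until @queue.empty?
    end
    @sender.call(batch) unless batch.empty?
  end
end

sent = []
buffer = EventBuffer.new(flush_interval: 60) { |batch| sent << batch }
3.times { |i| buffer.push({ message: "error #{i}" }) }
buffer.flush
puts sent.first.size # events delivered in one batch
```

The request thread only pays for the enqueue. Serialization and network I/O would happen inside the sender block, on the background thread.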

Where to go next

That is Rails error tracking in two minutes: one gem, one key, one initializer, and a live view of every exception and log line. From here you can:

Browse the ErrSight documentation for advanced configuration and triage workflows.
See the other SDKs and platforms on the integrations page: Python, Rust, React/JavaScript, and React Native are shipping today, with Node, Go, PHP/Laravel, and Elixir on the way.
Read why we built ErrSight for the philosophy behind native logs plus exception tracking.
Review the pricing: flat monthly tiers, no overage fees.

Want to self-host instead? The engine is open source under AGPLv3 at ErrSight-OSS: same ingestion, fingerprinting, and alerting, no lock-in by design.

Start tracking today

Spin up a free project ($0/month, forever, no credit card) and point your Rails app at it. Add the gem, set ERRSIGHT_KEY, and watch your first error land in the live viewer. Get started at errsight.com.

Building an Error Monitoring Tool Without Pricing Overages

2026-05-26T10:00:00+05:30

Picture the worst version of a Tuesday. You ship a deploy, a downstream API starts timing out, and your retry logic turns one failure into forty. A single broken code path is now throwing the same exception in a hot loop. By the time you have rolled back, your app has emitted two million error events in ninety minutes.

Your error monitoring tool ingested every single one of them. It was very good at its job.

Then, a few days later, the second incident arrives: an invoice. The $26/month plan you signed up for has quietly become a $390 bill, because you blew through your included event volume and the meter kept running at some fraction of a cent per event. Nobody asked you. Nobody could ask you, because the events arrived faster than any human could approve them.

This is the part of usage-based monitoring that I find genuinely backwards. The pricing is anti-correlated with your wellbeing. The tool charges you the most at the exact moment you are already having your worst day. A traffic spike, a bad release, a noisy dependency, a retry storm: every one of these is both an operational emergency and a billing event. The product that is supposed to help you through the incident is, at the same time, metering you for the privilege.

I am building an error tracker called ErrSight, and early on I decided it would not work this way. No overage charges. Not “low overage charges,” not “overage charges with a generous buffer.” None. The ceiling on your plan is a real ceiling, not the starting line for a surprise invoice.

That turns out to be a more interesting engineering problem than it sounds, so let me walk through how it actually works.

Overage billing is a choice, not a law of physics

Before the architecture, it is worth being honest about why overage pricing is so common. It is not because it is the only way. It is because it is the easiest and most profitable way.

It is the easiest because the implementation is trivial: count what arrives, multiply by a rate, send the total at the end of the month. You never have to make a decision in the hot path. You never have to tell a customer “no.” You just let everything in and reconcile later.

It is the most profitable because the meter runs before the bill arrives. By the time the customer sees the number, the spend already happened. They cannot decline it. The asymmetry is the entire business model.

Once you see it that way, “no overages” stops being a pricing gimmick and becomes a design constraint. It means moving the decision to the front, into the request path, while the customer can still be protected. Concretely, I wrote down three rules:

When an account is out of quota, stop ingesting and say so clearly. Do not silently accept the data and invoice for it.
Make it architecturally impossible to overshoot the cap, even under a burst of concurrent requests, because bursts are exactly when this matters.
If a customer genuinely needs more capacity, make getting it a deliberate, opt-in decision with a known price, not a default that happens to them.

Everything below is in service of those three rules.

Mechanic 1: stop, do not bill

The ingestion endpoint is the front door. Before it does any real work, it asks one question: is this project allowed to ingest right now? The answer comes from a single method that collapses every “no” reason into one place.

def drop_reason
  return "ingestion_paused"        if ingestion_paused?
  return "events_over_limit"       if organization.over_events_limit?
  return "storage_limit_exceeded"  if storage_limit_exceeded?
  nil
end

If there is a reason to drop, the controller returns an HTTP 429 Too Many Requests with a machine-readable code, and that is the end of it.

when "events_over_limit"
  notify_once(@project.organization_id, "events")   # one email, debounced
  render json: {
    error: "Monthly event limit reached",
    code:  "EVENTS_LIMIT_EXCEEDED"
  }, status: :too_many_requests

Notice what is not here. There is no branch that says “over limit, so accept the event and tack it onto the overage counter.” Going over your limit is a 429, not a bigger invoice. The client SDK receives a clear, documented code (EVENTS_LIMIT_EXCEEDED) and can back off, buffer, or surface a warning in your own dashboards. The signal is honest: your data is being dropped, here is exactly why, and your bill is not moving.
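On the client side, that contract is easy to act on. Here is a rough sketch of how an SDK or a hand-rolled client might branch on the response. Only the 429 status and the `EVENTS_LIMIT_EXCEEDED` code come from the controller above. The `ingestion_action` helper and the action symbols it returns are hypothetical.

```ruby
require "json"

# Pull the machine-readable code out of an error body, or nil if absent.
def error_code(body)
  JSON.parse(body)["code"]
rescue JSON::ParserError, TypeError
  nil
end

# Decide what a client should do with an ingestion response.
# The returned symbols are illustrative, not part of any SDK.
def ingestion_action(status, body)
  return :ok if status.between?(200, 299)
  return :retry_later unless status == 429

  if error_code(body) == "EVENTS_LIMIT_EXCEEDED"
    :pause_until_quota_changes # retrying will not help until the limit moves
  else
    :retry_later
  end
end

over_limit = '{"error":"Monthly event limit reached","code":"EVENTS_LIMIT_EXCEEDED"}'
puts ingestion_action(429, over_limit)
```

The useful distinction is between a 429 that will clear on its own and one that will not: a monthly limit only moves when the period rolls over or capacity is added, so hammering the endpoint just wastes requests.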

The customer also gets one email when they hit the wall. Exactly one. A client hammering the endpoint while over quota could otherwise enqueue thousands of identical “you are over your limit” notifications per minute, so the notification is debounced with an atomic cache write that only the first caller in a one-hour window wins:

def notify_once(org_id, kind)
  key = "quota_notified:#{org_id}:#{kind}"
  if Rails.cache.write(key, true, unless_exist: true, expires_in: 1.hour)
    NotifyQuotaOverageJob.perform_later(org_id, kind)
  end
end

The customer gets told. The customer does not get charged.
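To see why the first-writer-wins write is enough, here is an in-memory stand-in for the cache semantics that `notify_once` relies on. It is a sketch for intuition only: the `DebounceStore` class is hypothetical, and a real deployment needs a shared cache store, since a per-process hash cannot debounce across workers.

```ruby
# In-memory stand-in for a cache write with unless_exist + expires_in.
class DebounceStore
  def initialize
    @expiry = {}
    @lock = Mutex.new
  end

  # Returns true only for the first writer of `key` within the TTL window.
  def write_unless_exist(key, ttl:, now: Time.now)
    @lock.synchronize do
      return false if @expiry[key] && @expiry[key] > now
      @expiry[key] = now + ttl
      true
    end
  end
end

store = DebounceStore.new
key = "quota_notified:42:events"
t0 = Time.at(0)
results = [
  store.write_unless_exist(key, ttl: 3600, now: t0),        # first caller wins
  store.write_unless_exist(key, ttl: 3600, now: t0 + 10),   # inside the window
  store.write_unless_exist(key, ttl: 3600, now: t0 + 3601)  # window has expired
]
puts results.inspect
```

Every caller attempts the write, but only the one that gets `true` back enqueues the email.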

Mechanic 2: you literally cannot overshoot the cap

Here is the part that took the most care, and it is the reason “no overages” is harder to build than overage billing.

The naive version of a quota check has a race condition that bursts will find immediately:

# WRONG: two concurrent requests can both pass this check
if organization.total_events_this_month + count <= organization.events_limit
  accept(events)
end

Imagine a project sitting at 49,950 events against a 50,000 limit, so there are 50 events of real headroom left. Two batches of 40 events arrive at the same millisecond, handled by two different Puma workers, possibly on two different replicas. Each batch on its own fits comfortably. But both workers read the same starting count of 49,950, both compute 49,950 + 40 = 49,990, both see that as under the limit, and both commit. The project lands at 50,030. The two batches were 80 events against 50 events of headroom, and the cap leaked by 30. Multiply that by a real burst across many workers and your “hard” limit leaks by thousands of events. Each leaked event is either free (you eat the cost) or billed (the customer eats it). There is no version of the leak that is fair.
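The bad schedule is easy to replay by hand. The snippet below is a deterministic simulation of that interleaving in plain Ruby (no threads, no database), using the numbers from the example. The variable names are illustrative.

```ruby
LIMIT = 50_000

# Two workers each run the naive check-then-write, interleaved so that both
# read the counter before either writes (the schedule a burst produces).
stored  = 49_950
batches = [40, 40]

reads  = batches.map { stored } # both workers read 49,950
passes = batches.zip(reads).map { |count, seen| seen + count <= LIMIT }
passes.each_with_index { |ok, i| stored += batches[i] if ok } # both commit

overshoot = stored - LIMIT
puts "final count: #{stored}, leaked past the cap: #{overshoot}"
```

Each check is correct against what it read. The bug is that the read and the write are not one atomic step.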

A guarantee has to actually be a guarantee, so the reservation happens inside a transaction, serialized by a Postgres advisory lock keyed to the organization and billing period:

def reserve_events!(count:)
  org      = organization
  month    = org.quota_period_start
  lock_key = Zlib.crc32("errsight:quota:#{org.id}:#{month}") % 2**31

  transaction do
    # Every project in this org/period serializes through this lock,
    # so two concurrent bursts cannot both read "under limit" and
    # both commit. The lock releases automatically at transaction end.
    connection.execute("SELECT pg_advisory_xact_lock(#{lock_key})")

    current = Usage.where(organization_id: org.id, month: month).sum(:events_count)
    return false if current + count > org.events_limit   # the ceiling holds

    # reserve `count` against this month's usage, then return true
    bump_usage!(count)
    true
  end
end

pg_advisory_xact_lock gives me a mutex that lives in the database, not in any single Ruby process, which is the only place it can live if the limit is going to hold across many workers and replicas. Two bursts hitting the same account at the same instant now line up behind the lock. The first one reserves its quota and commits. The second one reads the post-commit total, sees there is no room, and gets false. The controller turns that false into a 429. The cap is exact, even at the millisecond boundary, even during the spike that an overage model would have cashed in on.
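For intuition, here is the same lock-and-reserve shape as an in-process analog, with a Ruby `Mutex` standing in for the advisory lock. It is a sketch only: the `QuotaLedger` class is hypothetical, and a Mutex protects a single process, which is exactly why the real lock has to live in the database.

```ruby
# In-process analog: a Mutex plays the role of the advisory lock.
class QuotaLedger
  attr_reader :used

  def initialize(limit:, used: 0)
    @limit = limit
    @used = used
    @lock = Mutex.new
  end

  # The check and the increment happen under one lock, so no two callers
  # can both see "under limit" for the same headroom.
  def reserve(count)
    @lock.synchronize do
      return false if @used + count > @limit
      @used += count
      true
    end
  end
end

ledger  = QuotaLedger.new(limit: 50_000, used: 49_950)
results = Array.new(8) { Thread.new { ledger.reserve(40) } }.map(&:value)
puts "accepted: #{results.count(true)}, final count: #{ledger.used}"
```

With 50 events of headroom and eight concurrent batches of 40, exactly one reservation succeeds and the count stops at 49,990.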

This is the trade at the heart of “no overages.” Overage billing never needs this lock, because it never needs to say no. Choosing to say no means choosing to build the machinery that can say it correctly under load.

Mechanic 3: more capacity is a decision, not an accident

A hard cap with no escape hatch is just a worse product. The point is not to punish growth, it is to make growth a choice the customer makes on purpose, with the price known in advance.

So the limit a project is actually checked against is never just the plan limit. It is the plan limit plus any capacity the customer has deliberately added:

def events_limit
  plan_record.events_limit + active_pack_event_credit
end

def active_pack_event_credit(at: Time.current)
  purchased_packs.where(status: "active")
                 .where("expires_at > ?", at)
                 .sum(:events_credit)
end

There are two ways to add capacity, and both are opt-in:

Upgrade the plan. The tiers are flat monthly prices with included volume: Free is 5,000 events a month, Pro is $29 for 50,000, Growth is $79 for 200,000, Business is $199 for 750,000. You always know what the next step costs before you take it.
Buy an add-on pack. If you are mostly fine but had one heavy month, a $9 pack adds 50,000 events and 2 GB of storage on a 30-day rolling window. It is a one-time purchase, not a recurring commitment, and it stacks if you need a few.

The crucial difference from an overage line item is when the decision happens. An overage charge is a decision the system makes for you, after the spend, that you discover on an invoice. A pack or an upgrade is a decision you make for yourself, before the spend, at a price you agreed to. Same outcome of “you needed more and you paid for more,” opposite relationship with the customer.
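A worked example makes the stacking concrete. The sketch below mirrors `events_limit` in plain Ruby with the plan numbers from the list above. The `Pack` struct, the `PLAN_LIMITS` table, and the sample dates are illustrative, not the production schema.

```ruby
Pack = Struct.new(:status, :expires_at, :events_credit)

PLAN_LIMITS = { free: 5_000, pro: 50_000, growth: 200_000, business: 750_000 }.freeze

# Plan limit plus credit from packs that are active and not yet expired.
def events_limit(plan, packs, at: Time.now)
  credit = packs.select { |pack| pack.status == "active" && pack.expires_at > at }
                .sum(&:events_credit)
  PLAN_LIMITS.fetch(plan) + credit
end

now = Time.at(1_700_000_000)
day = 86_400
packs = [
  Pack.new("active", now + 10 * day, 50_000), # bought 20 days ago
  Pack.new("active", now + 29 * day, 50_000), # bought yesterday, stacks
  Pack.new("active", now - 1 * day, 50_000)   # past its 30 days, no longer counts
]

limit = events_limit(:pro, packs, at: now)
puts limit # Pro's 50,000 plus two live packs
```

Because expiry is checked at read time, an expired pack simply stops contributing. Nothing has to run at the moment it lapses.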

A second dial: capping the burn rate, not just the total

A hard monthly ceiling solves the billing problem, but go back to the retry storm from the top of this post: two million events in ninety minutes. Even with overage charges off the table, a spike like that can burn through an entire month of quota before lunch, and then ingestion is capped for the rest of the month and you are flying blind through the part of the incident that matters most. A ceiling on the total is not the same as a ceiling on the rate.

So every project also has a per-minute rate limit, and on paid plans the customer sets it themselves. The plan defines the maximum you are allowed to choose, and you pick any number underneath it. It is enforced by a fixed-window limiter that lives in Postgres rather than in process memory, because a per-worker counter cannot hold a real limit once you are running several Puma workers across replicas:

rate = IngestionRateLimiter.check!(@project, count: events_data.length)
unless rate.allowed
  response.headers["Retry-After"]       = rate.retry_after.to_s
  response.headers["X-RateLimit-Limit"] = rate.limit.to_s
  return render json: {
    error:       "Rate limit exceeded, retry in #{rate.retry_after}s",
    code:        "RATE_LIMIT_EXCEEDED",
    retry_after: rate.retry_after
  }, status: :too_many_requests
end

Now a runaway loop can spend at most the configured number of events per minute. The bad deploy still hurts, but it cannot vaporize your whole month in the first ninety minutes, and the Retry-After header tells a well-behaved SDK exactly how long to back off. The customer ends up inside two ceilings at once: the monthly total they are billed against, and the per-minute rate they chose. As a side effect, it also shields my ingestion path from a single misbehaving client, which is the first thing standing between a customer’s spike and my own infrastructure bill. Which brings me to the cost side.
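The fixed-window idea can be sketched in a few lines. This is an in-memory illustration of the technique, assuming 60-second windows aligned to the clock. The `FixedWindowLimiter` class and `RateResult` struct are hypothetical, and unlike the limiter described above it keeps its counter in process memory, so it would not hold across workers.

```ruby
RateResult = Struct.new(:allowed, :limit, :retry_after)

# In-memory fixed-window limiter (single process only).
class FixedWindowLimiter
  WINDOW = 60 # seconds

  def initialize(limit_per_minute)
    @limit = limit_per_minute
    @counts = Hash.new(0)
    @lock = Mutex.new
  end

  def check(count:, now: Time.now)
    window = now.to_i / WINDOW
    @lock.synchronize do
      if @counts[window] + count > @limit
        # Seconds until the current window rolls over.
        retry_after = WINDOW - (now.to_i % WINDOW)
        return RateResult.new(false, @limit, retry_after)
      end
      @counts[window] += count
      @counts.delete_if { |w, _| w < window } # forget finished windows
      RateResult.new(true, @limit, 0)
    end
  end
end

limiter = FixedWindowLimiter.new(100)
first  = limiter.check(count: 80, now: Time.at(165)) # 45s into a window
second = limiter.check(count: 40, now: Time.at(165)) # would make 120 > 100
third  = limiter.check(count: 40, now: Time.at(180)) # next window starts
puts [first.allowed, second.allowed, second.retry_after, third.allowed].inspect
```

A rejected batch is refused whole rather than partially accepted, which keeps the accounting simple and gives the client one clear number to wait on.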

The economics that let me say yes to this

There is a reason a lot of founders would call “no overages” financially reckless, and they would be right if you ignore the cost side. If your own costs scale linearly and without bound, then capping the customer’s bill while your infrastructure bill runs free is a great way to go broke on your most successful day. “No overages” only works if you have first made your costs predictable.

For an error tracker, the dominant cost driver is storage. Error events are write-heavy, append-mostly, time-ordered, and they pile up fast. So the events table is a TimescaleDB hypertable partitioned on time, with columnar compression that kicks in automatically after a week:

SELECT create_hypertable('events', 'occurred_at', migrate_data => true);

ALTER TABLE events SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'project_id',
  timescaledb.compress_orderby   = 'occurred_at DESC, id'
);

-- Compress any chunk older than 7 days.
SELECT add_compression_policy('events', INTERVAL '7 days');

Segmenting by project_id and ordering by time means the recent, hot data stays fast to query for the dashboard, while everything older than a week gets squeezed into compressed columnar chunks. Error events compress extremely well, because they are full of repeating values: the same fingerprints, the same stack frames, the same environment strings, over and over. That repetition is exactly what columnar compression eats for breakfast.

The second lever is retention. Every plan has a retention window (7 days on Free, up to 90 on the higher tiers), and a background job prunes anything past it and re-derives usage so the numbers stay honest:

cutoff = org.retention_days.days.ago
count, bytes = EventRepository.prune_older_than!(project_id: id, cutoff: cutoff)

Compression bounds the cost of the data you keep. Retention bounds how much data you keep at all. Together they turn storage from an unbounded liability into a known, modeled number per plan. Once I can predict my cost per account, I can confidently promise a fixed price to the account.

I extended the same logic to hosting. The app runs on a platform with a hard spending cap and per-second billing, which suits a workload that is quiet most of the time and spiky during incidents. I am not going to ask customers to live with a predictable bill while I refuse to give myself one. The predictability has to go all the way down, or the promise on the pricing page is just optimism.

What this costs me, honestly

I want to be straight about the trade-offs, because “no overages” is not free for the person offering it.

I leave money on the table. Every overage charge I do not send is revenue I did not collect. The spiky months that would have been the most lucrative under metered billing are exactly the months I am choosing to cap. That is real money, and pretending otherwise would be dishonest.

Hitting the wall is a worse short-term outcome for a customer than being billed silently. Dropping events during an incident is a genuinely bad moment. I mitigate it with clear 429 codes, an immediate email, and one-click add-on packs, but the honest version is that a hard cap can bite. The bet is that being told “you are out of room, here is the button” is more respectful than being billed for data you never agreed to pay for, and that developers, of all customers, would rather have the explicit signal.

I had to build the hard version. The advisory lock, the atomic reservation, the debounced notifications, the usage reconciliation after pruning: none of that exists in a system that just counts and multiplies at month end. Saying “no” correctly is more code than never saying it at all.

I think it is worth every bit of that, because of one rule of thumb I keep coming back to:

Your billing model should never be anti-correlated with your customer’s worst day.

If the only way your pricing makes its best money is by charging customers more during their outages, their spikes, and their emergencies, then your incentives are quietly pointed away from theirs. I would rather have a model where my best day and my customer’s calm month are the same thing, and where their disaster does not show up as a line item on my invoice to them.

What a bad deploy actually costs

It helps to put real numbers on this. Take the retry storm from the top of this post: a bad deploy turns one failing code path into a flood, and your app emits roughly two million extra error events in ninety minutes. Here is how that single incident lands on the bill under two pricing models.

First, the plans. I am comparing ErrSight’s Growth tier against Sentry because the base prices line up closely. Sentry is a broader platform than ErrSight, with performance tracing, session replay, cron monitoring, and more under one roof, so treat this strictly as a comparison of error-event overage economics, not of everything the two products do.

	ErrSight Growth	Sentry Team	Sentry Business
Base price	$79 / month	$26 / month	$80 / month
Included errors	200,000 / month	50,000 / month	50,000 / month
Past the quota (default)	Ingestion stops with an HTTP `429`; the bill does not move	Pay-as-you-go keeps ingesting	Pay-as-you-go keeps ingesting
Extra capacity	Opt-in $9 pack adds 50,000 events (30-day rolling)	Raise reserved volume or on-demand budget	Raise reserved volume or on-demand budget
On-demand overage rate	none	about $0.00025 per error event*	about $0.00025 per error event*
Hard spend cap available?	Always; it is the only mode	Yes, set a pay-as-you-go budget (events then drop at the cap)	Yes, set a pay-as-you-go budget (events then drop at the cap)

*Representative published on-demand rate; recent reporting puts it around $0.00025 to $0.00029 per error event. Pricing changes, so check sentry.io/pricing for current numbers.

Now the incident itself. Assume normal traffic has already consumed the included quota, and the storm adds two million events on top of that.

The bad deploy (about 2,000,000 extra events)	ErrSight Growth	Sentry, pay-as-you-go left on (the default)
Events billed beyond quota	0 (capped)	about 2,000,000
Extra charge for the incident	$0	2,000,000 × $0.00025 ≈ $500
The month’s total	$79	about $580 (Business base plus on-demand)
What you captured during the storm	nothing past the month’s 200,000-event cap, unless you add a pack	all 2,000,000

On ErrSight the storm costs nothing extra: the Growth cap holds at $79, and if you want to keep capturing through the incident you opt into a $9 pack or two on purpose. On Sentry with on-demand left on, the same storm adds roughly $500 to the month. That is about a $500 swing from one bad afternoon, and it is the kind of swing you discover after the fact, on an invoice, for traffic you did not choose.
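The arithmetic behind the table is short enough to check directly. The snippet uses the representative $0.00025 rate from the footnote and exact rationals so that floating-point rounding does not blur the result. The constant names are illustrative.

```ruby
RATE         = Rational("0.00025") # dollars per on-demand error event (representative)
STORM_EVENTS = 2_000_000

overage       = STORM_EVENTS * RATE # extra charge under metered billing
metered_total = 80 + overage        # Sentry Business base plus on-demand
capped_total  = 79                  # ErrSight Growth; the storm adds nothing

puts format("overage: $%.2f", overage)
puts format("metered month: $%.2f, capped month: $%.2f", metered_total, capped_total)
```

At the upper end of the quoted range ($0.00029) the same storm would cost more, so treat the $500 figure as an estimate that moves with the published rate.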

Two honest caveats, because the comparison only means anything if it is fair:

Sentry can be capped too. If you set Sentry’s pay-as-you-go budget to $0, Sentry also stops at your reserved limit and you pay no overage, exactly like ErrSight. The real difference is the default and the failure mode it chooses: Sentry defaults to protecting your data and billing for it, ErrSight defaults to protecting your bill and shedding the surplus. ErrSight just makes the wallet-safe behavior the only mode, so there is nothing to remember to configure before the storm hits.
The dropped events are mostly duplicates. During a retry storm those two million events are overwhelmingly the same exception, which both tools fingerprint down to a single issue. So what ErrSight sheds at the cap is mostly redundant copies of something you have already seen, not two million distinct bugs. It is still a genuine trade-off, since a brand-new error raised mid-storm could be among the dropped events, and that is the cost I accept in exchange for a bill that never surprises you.

The point is not that one model is universally correct. It is that under usage-based on-demand, one bad deploy can quietly turn an $80 month into a roughly $580 one, and under a hard cap it cannot. If a predictable bill matters to you more than capturing every duplicate during an outage, the savings on your worst day are real, and they are roughly the price of the incident itself.

Wrapping up

“No overages” sounded like a marketing decision when I started. It turned into an architecture: a hard quota ceiling enforced by a database-level lock so it cannot leak under load, an honest 429 instead of a silent meter, opt-in capacity for the people who genuinely need it, and TimescaleDB compression plus retention to keep my own costs bounded enough that I can afford the promise.

If you have built quota or billing systems that try to stay on the customer’s side, I would love to hear how you handled the boundary cases. The lock-and-reserve pattern is the cleanest answer I found, but I doubt it is the only one.

If you would rather see the result than the plumbing, the tool is live at errsight.com.