Observability for Side Projects: What I Actually Use

Last year I audited the monitoring setup on one of my side projects and found three overlapping tools all doing roughly the same thing. Sentry was capturing errors. The hosting platform was capturing errors. A custom webhook was also capturing errors and posting them to a Slack channel I had muted months ago.

None of them were talking to each other. I had traces in one place, logs in another, and error alerts firing from a third. When something broke, I would open four browser tabs and try to reconstruct what happened by correlating timestamps manually.

That is not observability. That is just noise with extra steps.


What Observability Actually Means at 2am

Forget the textbook definition. Here is what observability means in practice: you get woken up, something is broken, and you need to figure out what went wrong before you fully understand what went right.

The three pillars of observability (logs, metrics, traces) are most useful when you frame them around the questions they answer rather than what they technically are.

Logs answer: "What happened, in what order, on which request?" They are your audit trail. Without them you are guessing.

Metrics answer: "Is this system behaving normally right now?" Error rate spiking? Response times degrading? Metrics tell you something is wrong. They rarely tell you why.

Traces answer: "Where did this specific request spend its time, and where did it fail?" For distributed systems, traces are the only practical way to follow a single request across multiple services. For a monolith or a small Next.js app, they are genuinely optional, at least early on.

Most side projects need good logs and error tracking first. Traces come later, when you actually have multiple moving parts.


Error Tracking: What I Use and Why

Sentry is the practical default for error tracking, and I think that reputation is earned. It does a few things really well.

Stack traces come with breadcrumbs showing what the user clicked, what network requests fired, and what console messages appeared before the crash. That context is what turns a cryptic error into something you can actually fix. Release tracking lets you see which deploy introduced a regression. The free tier is generous enough for most side projects.

But Sentry is not a full observability platform and you will confuse yourself if you treat it like one. It is an error tracker with some performance monitoring bolted on. The moment you start relying on it for traces across your whole system, the friction shows.

It also gets noisy fast. Default Sentry on a production web app will flood you with browser errors that are not your fault. ResizeObserver warnings, network failures from flaky mobile connections, third-party script exceptions. If you do not tune it, you will start ignoring the alerts, which defeats the entire purpose.

Here is the config I settled on for a Next.js app:

// sentry.client.config.ts
import * as Sentry from "@sentry/nextjs";

Sentry.init({
  dsn: process.env.NEXT_PUBLIC_SENTRY_DSN,
  tracesSampleRate: 0.1, // 10% of transactions
  environment: process.env.NODE_ENV,
  // Browser noise that is not actionable
  ignoreErrors: [
    "ResizeObserver loop limit exceeded",
    "ResizeObserver loop completed with undelivered notifications", // newer Chrome wording
    /^Network request failed/,
  ],
  beforeSend(event) {
    if (event.user) {
      delete event.user.email; // strip PII before sending
    }
    return event;
  },
});

A few things worth explaining here.

tracesSampleRate: 0.1 means you are capturing traces for 10% of transactions. Setting this to 1.0 in production will hurt you. On any project with real traffic, capturing every transaction balloons your Sentry bill and creates so much data that finding the useful signal becomes its own problem. Ten percent is a reasonable starting point. You can always turn it up temporarily when you are debugging something specific.
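If a flat 10% turns out to be too blunt, the rate can depend on the transaction. The sketch below is a hypothetical helper: chooseSampleRate, the route names, and the debugTracing flag are my own illustrative assumptions, not Sentry API. Sentry's SDKs accept a tracesSampler callback that returns a rate, but the shape of its argument differs between SDK versions, so check the docs for yours before wiring this in.

```typescript
// Hypothetical helper: pick a trace sample rate per transaction.
// Route names and the debugTracing flag are illustrative assumptions.
type SamplingInput = {
  name: string; // transaction name, e.g. "GET /api/health"
  debugTracing?: boolean; // true while actively debugging something
};

function chooseSampleRate({ name, debugTracing = false }: SamplingInput): number {
  // Health checks are high volume and low value: never trace them.
  if (name.includes("/api/health")) return 0;
  // Temporarily capture everything while chasing a specific bug.
  if (debugTracing) return 1.0;
  // A low-volume, high-stakes route can afford a higher rate.
  if (name.includes("/api/checkout")) return 0.5;
  // Everything else gets the 10% baseline.
  return 0.1;
}

console.log(chooseSampleRate({ name: "POST /api/checkout" }));
```

The point of keeping this as a plain function is that the sampling policy can be unit tested without a Sentry account.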

The ignoreErrors list is not cheating. Browser environments produce a lot of errors that originate from extensions, ad blockers, and OS-level network interruptions. You did not cause them and you cannot fix them. Filtering them out is the responsible thing to do.
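Before shipping an ignore list, it helps to run a few real error messages through it locally. The helper below is a rough approximation of the matching rule as I understand it from Sentry's documentation (strings match as substrings, regular expressions are tested against the message). isIgnored is a hypothetical function written for this check, not part of the SDK, and the real SDK may match against more than the bare message.

```typescript
type IgnorePattern = string | RegExp;

// Approximates ignoreErrors matching: substring for strings, test() for regexes.
function isIgnored(message: string, patterns: IgnorePattern[]): boolean {
  return patterns.some((pattern) =>
    typeof pattern === "string" ? message.includes(pattern) : pattern.test(message),
  );
}

const ignoreErrors: IgnorePattern[] = [
  "ResizeObserver loop limit exceeded",
  /^Network request failed/,
];

const samples = [
  "ResizeObserver loop limit exceeded",
  "Network request failed: timeout",
  "TypeError: Cannot read properties of undefined (reading 'id')",
];

for (const message of samples) {
  console.log(isIgnored(message, ignoreErrors) ? "ignored " : "reported", "-", message);
}
```

If a message you care about shows up as ignored, the pattern is too broad.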

Stripping email from event.user before send is not optional if you are handling any kind of user data. Sentry's servers are outside your infrastructure. Personal data should not leave your system in error reports. Strip PII by default and add it back intentionally if you ever have a compliance reason to include it.
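Email is rarely the only personal field on an event. A slightly stricter variant of the same idea is an allowlist: keep the opaque user ID and drop everything else, then redact email addresses that leaked into the message text. This is a minimal sketch under my own assumptions. ReportedEvent is a simplified stand-in for the real event type, and scrubEvent and the email pattern are hypothetical, deliberately loose, and not a complete PII filter.

```typescript
// Simplified stand-in for an error event; real Sentry events have many more fields.
type ReportedEvent = {
  message?: string;
  user?: { id?: string; email?: string; username?: string; ip_address?: string };
};

// Loose pattern: good enough to redact obvious addresses, not a validator.
const EMAIL_PATTERN = /[^\s@]+@[^\s@]+\.[^\s@]+/g;

function scrubEvent(event: ReportedEvent): ReportedEvent {
  // Allowlist: keep only the opaque user ID, drop every other user field.
  const user = event.user?.id ? { id: event.user.id } : undefined;
  // Redact email addresses that ended up inside the message text.
  const message = event.message?.replace(EMAIL_PATTERN, "[email]");
  return { ...event, user, message };
}

console.log(
  scrubEvent({
    message: "Login failed for jane@example.com",
    user: { id: "u_123", email: "jane@example.com", ip_address: "203.0.113.7" },
  }),
);
```

An allowlist fails safe: a new user field added later is dropped unless someone decides to keep it.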


The OpenTelemetry Angle

OpenTelemetry is not a product. It is a vendor-neutral instrumentation standard, a set of APIs, SDKs, and conventions that define how to produce telemetry data (traces, metrics, logs) in a way that any backend can consume.

Why does that matter? Because if you instrument your app with the Datadog SDK, you are coupled to Datadog. If you instrument with the Sentry SDK for traces, you are coupled to Sentry. OpenTelemetry lets you instrument once and route that data anywhere: Grafana Cloud, Honeycomb, a self-hosted Jaeger instance, or whatever comes next.

The honest trade-off is setup time. Sentry has a Next.js wizard that gets you started in five minutes. OpenTelemetry takes longer to configure properly. But if a project grows past one service, or if you ever want to switch backends without rewriting instrumentation code, the upfront investment pays off.

A minimal Next.js setup looks like this:

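// instrumentation.ts (Next.js)
// Next.js calls the exported register() once when the server starts.
export async function register() {
  // The Node SDK cannot run in the Edge runtime, so load it only under Node.js.
  if (process.env.NEXT_RUNTIME === "nodejs") {
    const { NodeSDK } = await import("@opentelemetry/sdk-node");
    const { OTLPTraceExporter } = await import("@opentelemetry/exporter-trace-otlp-http");
    const { getNodeAutoInstrumentations } = await import("@opentelemetry/auto-instrumentations-node");

    const sdk = new NodeSDK({
      // With no url option, the exporter reads OTEL_EXPORTER_OTLP_ENDPOINT
      // from the environment and appends /v1/traces itself.
      traceExporter: new OTLPTraceExporter(),
      instrumentations: [getNodeAutoInstrumentations()],
    });

    sdk.start();
  }
}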

getNodeAutoInstrumentations() automatically instruments fetch, HTTP, and database calls. You get traces for free on most of the network activity in your app without writing a single span manually.

The OTLP endpoint in that config can point to anything. Change one environment variable and your traces go to a different backend. Your application code never changes. That is the whole promise of OpenTelemetry, and it works.
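One detail about that environment variable trips people up. In the OpenTelemetry exporter specification, OTEL_EXPORTER_OTLP_ENDPOINT is a base URL and the HTTP exporter appends /v1/traces to it, while the signal-specific OTEL_EXPORTER_OTLP_TRACES_ENDPOINT is used exactly as given. As far as I know, a url passed explicitly in code is also used as-is, so it needs the full path. The function below is a hypothetical illustration of that rule for checking your own configuration, not code from the SDK.

```typescript
type Env = Record<string, string | undefined>;

// Mirrors the spec's resolution order for the OTLP/HTTP traces endpoint.
function resolveTracesUrl(env: Env): string {
  // A signal-specific endpoint wins and is used exactly as given.
  const tracesEndpoint = env.OTEL_EXPORTER_OTLP_TRACES_ENDPOINT;
  if (tracesEndpoint) return tracesEndpoint;
  // The generic endpoint is a base URL; the traces path is appended to it.
  const base = env.OTEL_EXPORTER_OTLP_ENDPOINT || "http://localhost:4318";
  return `${base.replace(/\/+$/, "")}/v1/traces`;
}

// Placeholder hostname for illustration only.
console.log(resolveTracesUrl({ OTEL_EXPORTER_OTLP_ENDPOINT: "https://otel.example.com" }));
```

If traces silently fail to arrive after switching backends, a missing or doubled /v1/traces path is one of the first things I would check.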


Logs: The Thing People Undervalue

Structured logs have saved me more times than traces. More than metrics too, honestly.

console.log("something happened") is fine during development. In production it is nearly useless. Log aggregators need structure to query logs efficiently. If your logs are unstructured strings, you can grep for keywords but you cannot filter by user ID, correlate by request ID, or aggregate by error type. You end up reading logs like a text file, which scales to approximately zero.

Structured logging is not complicated. Here is the pattern I use on most projects:

// lib/logger.ts
type LogContext = Record<string, unknown>;

const log = (level: string, message: string, context?: LogContext) => {
  console.log(
    JSON.stringify({
      // Spread context first so it cannot overwrite the core fields below
      ...context,
      level,
      message,
      timestamp: new Date().toISOString(),
    }),
  );
};

export const logger = {
  info: (msg: string, ctx?: LogContext) => log("info", msg, ctx),
  warn: (msg: string, ctx?: LogContext) => log("warn", msg, ctx),
  error: (msg: string, ctx?: LogContext) => log("error", msg, ctx),
};

// Usage
logger.error("Payment failed", {
  userId: "u_123",
  amount: 4900,
  currency: "usd",
});

This outputs one JSON object per line. Most log aggregators, from Datadog to Papertrail to Vercel's runtime logs, handle JSON lines well, so you can filter, query, and alert on individual fields without writing custom parsing rules.
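To make the difference from grep concrete, here is a small sketch of field-level filtering over raw log lines, the kind of query an aggregator runs for you. filterLogs and the sample lines are hypothetical, written only to illustrate the idea.

```typescript
type LogEntry = { level: string; message: string; timestamp: string; [key: string]: unknown };

// Return every entry whose given field equals the given value.
function filterLogs(lines: string[], field: string, value: unknown): LogEntry[] {
  const matches: LogEntry[] = [];
  for (const line of lines) {
    try {
      const entry = JSON.parse(line) as LogEntry;
      if (entry[field] === value) matches.push(entry);
    } catch {
      // Skip lines that are not usable JSON, e.g. stray console.log output.
    }
  }
  return matches;
}

const lines = [
  '{"level":"info","message":"Checkout started","timestamp":"2024-01-01T10:00:00.000Z","userId":"u_123"}',
  "something happened",
  '{"level":"error","message":"Payment failed","timestamp":"2024-01-01T10:00:02.000Z","userId":"u_123","amount":4900}',
  '{"level":"info","message":"Checkout started","timestamp":"2024-01-01T10:00:03.000Z","userId":"u_456"}',
];

console.log(filterLogs(lines, "userId", "u_123").map((entry) => entry.message));
```

The unstructured line in the middle is simply invisible to this kind of query, which is the practical cost of leaving plain strings in production logs.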

The timestamp field should be in every log entry. Platform timestamps are unreliable when logs are batched or arrive out of order. Include your own.

The context object is where the value lives. Log the user ID, the request ID, the relevant entity IDs, the amounts, the status codes. When something breaks, you want to be able to filter logs by userId: "u_123" and see every action that user took. Without structured context, you are reading a novel trying to find one sentence.
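Passing the same IDs to every call gets tedious, and the one call where you forget is the one you will need. One common pattern is a child logger that binds context once per request. This is a minimal sketch under my own assumptions: createLogger, child, and the injectable sink are hypothetical names, not the API of any particular logging library.

```typescript
type Context = Record<string, unknown>;
type Sink = (line: string) => void;

type Logger = {
  info: (msg: string, ctx?: Context) => void;
  error: (msg: string, ctx?: Context) => void;
  child: (extra: Context) => Logger;
};

function createLogger(bound: Context = {}, sink: Sink = console.log): Logger {
  const write = (level: string, message: string, context: Context = {}) =>
    sink(
      JSON.stringify({
        ...bound, // fields bound once, e.g. requestId
        ...context, // fields for this specific entry
        level,
        message,
        timestamp: new Date().toISOString(),
      }),
    );
  return {
    info: (msg, ctx) => write("info", msg, ctx),
    error: (msg, ctx) => write("error", msg, ctx),
    // Returns a new logger that adds more bound fields to every entry.
    child: (extra) => createLogger({ ...bound, ...extra }, sink),
  };
}

// Usage: bind the IDs once at the top of a request, then log as usual.
const requestLogger = createLogger().child({ requestId: "req_abc", userId: "u_123" });
requestLogger.error("Payment failed", { amount: 4900, currency: "usd" });
```

The sink parameter exists mainly so the logger can be tested without capturing stdout.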


My Actual Stack for Side Projects

I keep this simple on purpose.

Sentry for error tracking. The free tier covers anything up to moderate traffic. I configure it once, tune the ignore list after the first week of noise, and then mostly forget about it until it tells me something is broken.

Structured logs to stdout for small projects. Vercel captures stdout and makes the logs searchable in its dashboard. For anything with more complexity, I ship logs to Axiom on their free tier, which gives better querying. JSON in, fast queries out.

UptimeRobot (free tier) for uptime monitoring. It pings a /api/health endpoint every five minutes, like a hosted cron job, and sends an email if the check fails. Not glamorous, but it catches the obvious things.

OpenTelemetry only when I have multiple services. A single Next.js app does not need distributed tracing. When I add a background worker, a separate API, or a third-party integration that I actually control, that is when traces start earning their complexity cost.
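The /api/health endpoint that the uptime monitor pings can be very small. Here is a sketch for the Next.js App Router, with assumptions labeled: checkDatabase is a hypothetical placeholder for whatever cheap dependency check fits your app, and toHealthResult is a helper I split out so the status logic can be tested without a server.

```typescript
// app/api/health/route.ts (Next.js App Router)
type HealthResult = { status: number; body: { ok: boolean; checkedAt: string } };

// Hypothetical dependency check; replace with a cheap query against your own database.
async function checkDatabase(): Promise<boolean> {
  return true;
}

function toHealthResult(dbOk: boolean, now: Date = new Date()): HealthResult {
  // 503 makes the uptime monitor treat a broken dependency as downtime.
  return { status: dbOk ? 200 : 503, body: { ok: dbOk, checkedAt: now.toISOString() } };
}

export async function GET(): Promise<Response> {
  // A thrown error from the check counts as unhealthy rather than crashing the route.
  const dbOk = await checkDatabase().catch(() => false);
  const { status, body } = toHealthResult(dbOk);
  return new Response(JSON.stringify(body), {
    status,
    headers: { "content-type": "application/json" },
  });
}
```

Returning a non-200 status matters more than the body, since a basic uptime check looks at the status code.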

The principle behind this: observability is not one-size-fits-all. A weekend project with a dozen users needs basic error visibility and a way to know when it is down. A production SaaS with paying customers needs error tracking, structured logs with retention, traces across services, and alerting that someone actually responds to. Reaching for the full enterprise stack on a project with twelve users is a way to spend a weekend configuring dashboards instead of shipping features.

Match the instrumentation to the stakes.


What I Would Tell Myself Two Years Ago

Set up error tracking before you launch, not after the first incident. Sentry takes thirty minutes. Not having it when your first production bug happens costs much more than thirty minutes.

Sample your traces. Capturing everything feels thorough. In practice it creates noise, inflates costs, and makes the data less useful because you are searching through too much of it.

Strip PII from error events by default. Build the habit early. Retrofitting data handling onto a growing app is painful.

Do not alert on every error. Alert fatigue is real. Set alerts on error rate changes, not raw error counts. An alert you have learned to ignore is worse than no alert, because it gives you false confidence that you would know if something was wrong.
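The difference between a rate and a count is easiest to see in code. This is a minimal sketch of a rate-based alert rule. shouldAlert, TrafficWindow, and the default thresholds are hypothetical values I picked for illustration, so tune them against your own traffic.

```typescript
type TrafficWindow = { requests: number; errors: number };

function errorRate({ requests, errors }: TrafficWindow): number {
  return requests === 0 ? 0 : errors / requests;
}

function shouldAlert(
  current: TrafficWindow,
  baseline: TrafficWindow,
  { multiplier = 3, minRequests = 50, minRate = 0.01 } = {},
): boolean {
  // Too little traffic: one failed request would look like a spike.
  if (current.requests < minRequests) return false;
  const currentRate = errorRate(current);
  // Alert only when the rate is meaningfully high and well above normal.
  return currentRate >= minRate && currentRate > errorRate(baseline) * multiplier;
}

const baseline = { requests: 10_000, errors: 100 }; // 1% is normal for this app

// 120 errors sounds alarming as a raw count, but at 1.2% it is business as usual.
console.log(shouldAlert({ requests: 10_000, errors: 120 }, baseline));
// 50 errors is a smaller count, but at 5% it is five times the normal rate.
console.log(shouldAlert({ requests: 1_000, errors: 50 }, baseline));
```

A raw-count threshold would have fired on the first window and possibly missed the second.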

And console.log in production is not a strategy. I shipped it anyway for longer than I should have. You probably will too. But the sooner you replace it with structured logging, the faster you can debug the things that actually matter.