At some point, every serious product company faces the same decision: do we keep paying $100-500k/year for Mixpanel/Amplitude, or do we build our own?
The build-vs-buy tradeoff shifts around $50-100k/year in analytics spend. Below that, a vendor is almost always right. Above it, building becomes economically viable — and gives you capabilities vendors can't match.
This is the playbook we use for internal analytics builds. 14 weeks from "we're going to do this" to a production-grade platform.
When to build
Build when:
- Your annual analytics vendor spend is > $100k and growing
- You need data ownership (compliance, privacy, sovereignty)
- You need custom queries vendors don't support
- You already have data engineering capability
- Your event volume is growing fast (costs will escalate)
Buy when:
- You're below $50k/year and stable
- You have no data engineering bandwidth
- You need advanced vendor features (session replay, heatmaps) you can't easily build
- The engineering time to build and maintain it would consume whatever you save on the vendor contract
What we're building
A stack with these components:
- Event ingestion — SDK + collector
- Event pipeline — streaming from collector to storage
- Storage — ClickHouse for event facts
- Transformation — dbt for derived tables (funnels, cohorts, segments)
- Query layer — BI tool + API for custom integrations
- Governance — tracking plan, schema registry, PII policies
Total headcount needed: 2-3 engineers part-time for 14 weeks.
Week 1-2: Tracking plan + schema
Before writing any code, design the event schema. This is the most important phase of the whole build.
- List every event you actually use today (pull top 50 from vendor)
- Identify standard context properties (user_id, session_id, platform, etc.)
- Design event-specific property schemas
- Define naming convention (see event schema mistakes post)
- Identify PII and classification policies
Deliverable: a YAML or markdown file with 20-50 events fully specified. Reviewed by the consuming teams. No code yet.
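If the plan lives in YAML, it helps to give it a typed representation early, because the same definitions feed schema enforcement in weeks 3-4. A minimal sketch in Go, assuming a `gopkg.in/yaml.v3` dependency; the field names (`owner`, `pii`, and so on) are illustrative, not prescriptive:

```go
// Package trackingplan sketches one way to represent tracking-plan entries
// in code so they can be validated and reused by the collector.
package trackingplan

import (
	"fmt"
	"os"

	"gopkg.in/yaml.v3"
)

// EventSpec describes a single event in the tracking plan.
type EventSpec struct {
	Name        string              `yaml:"name"` // e.g. "checkout_completed"
	Description string              `yaml:"description"`
	Owner       string              `yaml:"owner"` // team accountable for the event
	PII         []string            `yaml:"pii"`   // properties that need masking
	Properties  map[string]Property `yaml:"properties"`
}

// Property defines type and requiredness for one event property.
type Property struct {
	Type     string `yaml:"type"` // "string", "number", "boolean", "timestamp"
	Required bool   `yaml:"required"`
}

// Load reads the tracking plan file and performs minimal sanity checks.
func Load(path string) ([]EventSpec, error) {
	raw, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	var specs []EventSpec
	if err := yaml.Unmarshal(raw, &specs); err != nil {
		return nil, fmt.Errorf("parse tracking plan: %w", err)
	}
	for _, s := range specs {
		if s.Name == "" || s.Owner == "" {
			return nil, fmt.Errorf("event %q missing name or owner", s.Name)
		}
	}
	return specs, nil
}
```

Keeping the plan machine-readable means the collector can load it at startup and reject events that drift from it, instead of relying on convention.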
Week 3-4: Ingestion collector
Build the collector that receives events from clients.
Stack
- Language: Go (our default for high-throughput networking)
- Framework: plain net/http or chi
- Deploy: Kubernetes, 3+ replicas behind ALB
- Protocol: HTTPS, JSON body, API key auth
Core responsibilities
- validate API keys and source identity
- enforce tracking-plan schema contracts
- enrich events with ingestion metadata
- handle idempotency and retries safely
- route bad payloads to a quarantine stream
Do not skip schema enforcement. Garbage accepted at ingestion always becomes expensive downstream.
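A minimal sketch of the handler in Go with plain net/http, covering key validation, enrichment, schema enforcement, and quarantine routing. The endpoint path, the X-API-Key header, and the validKeys/plan/quarantine/produce stand-ins are assumptions; a real collector also needs batching, rate limiting, and retry-safe publishing:

```go
// collector.go: minimal /track handler sketch.
package main

import (
	"encoding/json"
	"errors"
	"net/http"
	"time"
)

// Event is the wire format plus server-side ingestion metadata.
type Event struct {
	Name       string         `json:"name"`
	UserID     string         `json:"user_id"`
	Properties map[string]any `json:"properties"`
	ReceivedAt time.Time      `json:"received_at"` // set by the collector
	Source     string         `json:"source"`      // derived from the API key
}

var validKeys = map[string]string{"demo-key": "web"} // stand-in for a real key store

type trackingPlan struct{}

// Validate stands in for schema enforcement against the week 1-2 tracking plan.
func (trackingPlan) Validate(name string, props map[string]any) error {
	if name == "" {
		return errors.New("missing event name")
	}
	return nil
}

var plan trackingPlan

func quarantine(ev Event, reason error) { /* publish to the quarantine topic */ }
func produce(ev Event)                  { /* publish to the main events topic */ }

func handleTrack(w http.ResponseWriter, r *http.Request) {
	source, ok := validKeys[r.Header.Get("X-API-Key")]
	if !ok {
		http.Error(w, "unauthorized", http.StatusUnauthorized)
		return
	}

	var ev Event
	if err := json.NewDecoder(r.Body).Decode(&ev); err != nil {
		http.Error(w, "bad json", http.StatusBadRequest)
		return
	}

	// Enrich with ingestion metadata before any routing decision.
	ev.ReceivedAt = time.Now().UTC()
	ev.Source = source

	// Enforce the tracking plan: bad payloads are diverted, never silently dropped.
	if err := plan.Validate(ev.Name, ev.Properties); err != nil {
		quarantine(ev, err)
		w.WriteHeader(http.StatusAccepted) // acknowledged so clients stop retrying
		return
	}

	produce(ev)
	w.WriteHeader(http.StatusAccepted)
}

func main() {
	http.HandleFunc("/v1/track", handleTrack)
	http.ListenAndServe(":8080", nil)
}
```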
Week 5-6: Streaming and storage
Once ingestion is reliable, pipe events into analytical storage.
Recommended baseline:
- Kafka or Redpanda as transport
- ClickHouse MergeTree for event facts
- partitioning by event date and an optional tenant key
- sort keys aligned to real query patterns (for example: tenant_id, event_name, timestamp)
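For concreteness, here is what that table layout might look like, created through the ClickHouse database/sql driver (github.com/ClickHouse/clickhouse-go/v2). The database name, column set, and JSON-blob properties column are assumptions to adapt, not a recommendation:

```go
// schema.go: creates the event fact table sketched above.
// Assumes the analytics database already exists.
package main

import (
	"database/sql"
	"log"

	_ "github.com/ClickHouse/clickhouse-go/v2"
)

const createEvents = `
CREATE TABLE IF NOT EXISTS analytics.events (
    tenant_id   LowCardinality(String),
    event_name  LowCardinality(String),
    user_id     String,
    timestamp   DateTime64(3, 'UTC'),
    received_at DateTime64(3, 'UTC'),
    properties  String  -- JSON blob; promote hot properties to real columns later
)
ENGINE = MergeTree
PARTITION BY toDate(timestamp)
ORDER BY (tenant_id, event_name, timestamp)
`

func main() {
	db, err := sql.Open("clickhouse", "clickhouse://localhost:9000/analytics")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
	if _, err := db.Exec(createEvents); err != nil {
		log.Fatal(err)
	}
	log.Println("events table ready")
}
```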
Key delivery goals:
- predictable ingestion latency
- replay-safe consumer design
- clear dead-letter handling and runbooks
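A sketch of the replay-safe consumer pattern: offsets are committed only after the batch lands in ClickHouse, so failures cause replays rather than data loss, which in turn means the insert path must be idempotent (for example via insert deduplication or a ReplacingMergeTree keyed on event id). It assumes github.com/segmentio/kafka-go; broker addresses, topic names, and the count-only batching are placeholders, and real consumers also flush on a timer:

```go
// consumer.go: replay-safe Kafka -> ClickHouse consumer loop sketch.
package main

import (
	"context"
	"log"

	"github.com/segmentio/kafka-go"
)

func consume(ctx context.Context, writeBatch func([]kafka.Message) error) error {
	r := kafka.NewReader(kafka.ReaderConfig{
		Brokers:  []string{"localhost:9092"},
		GroupID:  "events-to-clickhouse",
		Topic:    "events",
		MinBytes: 1,
		MaxBytes: 10e6,
	})
	defer r.Close()

	batch := make([]kafka.Message, 0, 1000)
	for {
		m, err := r.FetchMessage(ctx) // fetch without committing
		if err != nil {
			return err
		}
		batch = append(batch, m)
		if len(batch) < cap(batch) {
			continue
		}
		if err := writeBatch(batch); err != nil {
			log.Printf("insert failed, will replay from last commit: %v", err)
			return err
		}
		// Commit only after a successful write: at-least-once delivery,
		// made effectively exactly-once by idempotent inserts.
		if err := r.CommitMessages(ctx, batch...); err != nil {
			return err
		}
		batch = batch[:0]
	}
}

func main() {
	writeBatch := func(batch []kafka.Message) error {
		log.Printf("writing %d events", len(batch)) // stand-in for the ClickHouse insert
		return nil
	}
	if err := consume(context.Background(), writeBatch); err != nil {
		log.Fatal(err)
	}
}
```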
Week 7-8: Transformations and semantic model
This phase is where trust is built.
- implement dbt models for canonical metrics (DAU, activation, retention, conversion)
- create tested dimensions and fact tables for common product questions
- standardize metric definitions with explicit owners
- publish docs for each core metric and model
If teams disagree on definitions, adoption stalls even if the platform is technically sound.
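One lightweight way to make definitions explicit is a small metric registry in the query/API layer, so every consumer reads the same SQL against the dbt-built table rather than re-deriving it from raw events. The model name (fct_daily_active_users) and the owner handle below are hypothetical:

```go
// metrics.go: a sketch of a canonical-metric registry.
package main

import "fmt"

type Metric struct {
	Name  string
	Owner string // team accountable for the definition
	Query string // runs against the dbt-built table, so the definition lives in one place
}

var canonicalMetrics = map[string]Metric{
	"dau": {
		Name:  "Daily Active Users",
		Owner: "growth-analytics",
		Query: `
SELECT activity_date, uniqExact(user_id) AS dau
FROM analytics.fct_daily_active_users
WHERE activity_date >= today() - 30
GROUP BY activity_date
ORDER BY activity_date`,
	},
}

func main() {
	m := canonicalMetrics["dau"]
	fmt.Printf("%s (owner: %s)\n%s\n", m.Name, m.Owner, m.Query)
}
```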
Week 9-10: Dashboard parity migration
Move existing critical dashboards with side-by-side validation.
Migration checklist:
- identify top 10 executive and product dashboards
- validate number parity within agreed tolerance
- document known intentional differences
- decommission old dashboards only after stakeholder signoff
This is a change-management phase, not just a technical phase.
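The parity check itself can be mechanical. A small sketch of the relative-tolerance comparison behind "agreed tolerance"; the 2% figure in the usage is illustrative, not a recommendation:

```go
// parity.go: tolerance check used during side-by-side dashboard validation.
package main

import (
	"fmt"
	"math"
)

// withinTolerance reports whether the internal number matches the vendor
// number within a relative tolerance (e.g. 0.02 for 2%).
func withinTolerance(vendor, internal, tol float64) bool {
	if vendor == 0 {
		return internal == 0
	}
	return math.Abs(vendor-internal)/math.Abs(vendor) <= tol
}

func main() {
	// Example: vendor reports 10,000 weekly signups, internal platform reports 9,850.
	fmt.Println(withinTolerance(10000, 9850, 0.02)) // true: within 2%
}
```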
Week 11-12: Self-serve and access governance
Enable analytics consumers without compromising safety.
- role-based access (analyst, PM, exec, support)
- PII masking and sensitive-table controls
- query cost guardrails for ad-hoc exploration
- analyst starter templates for common analyses
A platform that only data engineers can use will not deliver ROI.
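As one illustration of PII masking in the query/API layer, here is a column-level redaction sketch. The role list mirrors the bullet above, but which roles may see which columns is entirely an assumption to replace with your own policy:

```go
// masking.go: column-level PII redaction sketch for the query/API layer.
package main

import "fmt"

// piiColumns would normally come from the tracking plan's PII classification.
var piiColumns = map[string]bool{"email": true, "ip_address": true}

// canSeePII is an illustrative policy, not a recommendation.
var canSeePII = map[string]bool{
	"analyst": false,
	"pm":      false,
	"exec":    false,
	"support": true, // e.g. support may need contact details for a ticket
}

// maskRow redacts PII columns for roles that are not allowed to see them.
func maskRow(role string, row map[string]any) map[string]any {
	if canSeePII[role] {
		return row
	}
	out := make(map[string]any, len(row))
	for col, val := range row {
		if piiColumns[col] {
			out[col] = "REDACTED"
		} else {
			out[col] = val
		}
	}
	return out
}

func main() {
	row := map[string]any{"user_id": "u_123", "email": "a@example.com", "plan": "pro"}
	fmt.Println(maskRow("analyst", row)) // email redacted
}
```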
Week 13-14: Operational hardening
Finalize production readiness:
- freshness SLOs and lag alerts
- schema drift detection
- backup and restore drills
- ownership map and on-call runbooks
- roadmap for incremental enhancements
At this stage, the platform should be a maintained product, not a one-off project.
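Freshness checks can start very simple: compare the newest received_at in the fact table against the SLO and alert if the lag exceeds it. A sketch assuming the ClickHouse database/sql driver; the table name, 15-minute SLO, and log-based alert are placeholders:

```go
// freshness.go: minimal freshness SLO check for the events table.
package main

import (
	"database/sql"
	"log"
	"time"

	_ "github.com/ClickHouse/clickhouse-go/v2"
)

func checkFreshness(db *sql.DB, slo time.Duration, alert func(string)) error {
	var latest time.Time
	if err := db.QueryRow(`SELECT max(received_at) FROM analytics.events`).Scan(&latest); err != nil {
		return err
	}
	if lag := time.Since(latest); lag > slo {
		alert("events table is stale: lag " + lag.Round(time.Second).String())
	}
	return nil
}

func main() {
	db, err := sql.Open("clickhouse", "clickhouse://localhost:9000/analytics")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
	alert := func(msg string) { log.Println("ALERT:", msg) } // swap for your pager
	if err := checkFreshness(db, 15*time.Minute, alert); err != nil {
		log.Fatal(err)
	}
}
```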
Build vs buy: practical decision rule
Use this as a quick decision heuristic:
- stay with vendor when annual spend is low and needs are standard.
- build internally when spend is high, governance requirements are strict, or custom analyses are core to product advantage.
- run hybrid when vendor supports long-tail users but internal platform serves high-value workloads.
The right answer can change as your volume and organizational maturity evolve.
Common mistakes to avoid
- migrating everything at once
- delaying schema governance until after ingestion
- treating dashboard parity as optional
- underestimating internal enablement and documentation needs
- leaving platform ownership unclear after launch
Closing
Internal analytics platforms succeed when teams treat delivery as both engineering and organizational adoption. The technical stack matters, but metric trust, ownership, and enablement are what create long-term leverage.
Related resources
- Capabilities: Product Analytics and Data Platform
- Case study: Consumer app analytics platform
- Deep dive: ClickHouse vs Postgres analytics