Observability
The Observability feature offers a suite of tools that give you visibility into your application's performance and usage. It is designed to be easy to set up: each feature can be enabled with a single click, so you can add monitoring to your workflow without extra integration work.
The ComUnity Platform provides comprehensive observability tools that help you understand your application's health, performance, and user experience. With integrated monitoring, logging, and analytics, you can quickly identify issues, optimise performance, and understand how users interact with your applications.
What makes our observability different: Everything is integrated out of the box. Enable observability once, and you immediately get metrics dashboards, log search, distributed tracing, and user analytics, all connected and correlated for faster troubleshooting.
The Three Pillars of Observability
Observability in the ComUnity Platform is built on three complementary data sources (metrics, logs, and traces) that work together to give you complete visibility. Client Analytics adds a fourth view focused on user behaviour.
Metrics: System Performance
Monitor your application's health with real-time performance data.
What you'll track:
Request rates, error rates, and latency (P99, P95)
Resource usage (CPU, memory, database connections)
Service health and availability
Custom business metrics
When to use: Daily health checks, performance optimisation, capacity planning
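To make the P95 and P99 figures concrete, here is a minimal sketch of how a latency percentile can be computed from raw request timings. It uses the nearest-rank method, which is an assumption for illustration; the platform's dashboards may compute percentiles differently (for example from histogram buckets). The sample values are made up.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank into the sorted list
    return ordered[max(rank, 1) - 1]

# Illustrative request latencies in milliseconds; one slow outlier.
latencies_ms = [120, 95, 110, 4800, 130, 105, 98, 115, 102, 125]

p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)
```

In this sample the median stays low while P95 and P99 land on the outlier, which is why tail percentiles are the ones to watch for worst-case user experience.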
Logs: Detailed Event Records
Search and analyse detailed logs to understand what happened and why.
What you'll track:
Error messages and stack traces
User actions and system events
API requests and responses
Application behaviour and debugging info
When to use: Troubleshooting errors, debugging issues, audit trails
Traces: Request Flow Visualisation
Follow individual requests through your entire system to identify bottlenecks and failures.
What you'll track:
End-to-end request flows across services
Duration of each operation
Service dependencies and call chains
Performance bottlenecks in distributed systems
When to use: Debugging slow requests, understanding service interactions, optimising workflows
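As a rough illustration of what a trace contains, the sketch below records nested spans with their parent and duration. The `Tracer` class and the span names are hypothetical and exist only for this example; they are not the platform's tracing API, which collects spans automatically.

```python
import time
from contextlib import contextmanager

class Tracer:
    """Hypothetical in-memory tracer, for illustration only."""

    def __init__(self):
        self.spans = []    # finished spans, in completion order
        self._stack = []   # names of the spans that are currently open

    @contextmanager
    def span(self, name):
        parent = self._stack[-1] if self._stack else None
        self._stack.append(name)
        start = time.perf_counter()
        try:
            yield
        finally:
            duration = time.perf_counter() - start
            self._stack.pop()
            self.spans.append({"name": name, "parent": parent, "duration_s": duration})

tracer = Tracer()
with tracer.span("GET /orders"):        # root span: the whole request
    with tracer.span("auth.check"):     # child operation
        time.sleep(0.01)
    with tracer.span("db.query"):       # child operation
        time.sleep(0.03)
```

Each finished span knows its parent, so a viewer can rebuild the call chain and show how much of the request each operation took.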
Client Analytics: User Behaviour
Understand how users interact with your application through privacy-first analytics.
What you'll track:
User engagement and session duration
Most visited pages and features
Geographic distribution and device types
User flows and drop-off points
When to use: Feature adoption analysis, UX optimisation, understanding user behaviour
How These Tools Work Together
The power of the ComUnity Platform's observability comes from how these tools integrate:
Scenario: Users report slow page loads
Start with Metrics → Notice P99 latency spike at 14:30
Check Logs → Find error messages during that time period
View Traces → See complete request flow and identify slow database query
Correlate → All three tools share trace IDs for seamless navigation
Every log entry includes a trace ID. Every trace links to logs. Metrics dashboards link to both. This correlation eliminates the need to manually piece together data from different systems.
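The correlation described above can be sketched with Python's standard logging module: each log line is emitted as JSON with a trace ID, and all entries for one request are then found by filtering on that ID. The field names (`trace_id`, `level`, `message`) and the logger name are assumptions for illustration; the platform's actual log schema is not specified here.

```python
import io
import json
import logging
import uuid

class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object so fields stay searchable."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            "trace_id": getattr(record, "trace_id", None),
        })

def build_logger(stream):
    logger = logging.getLogger("checkout-demo")  # hypothetical service name
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger

def entries_for_trace(log_text, trace_id):
    """Return the parsed log entries that belong to one trace."""
    entries = [json.loads(line) for line in log_text.splitlines() if line]
    return [entry for entry in entries if entry["trace_id"] == trace_id]

stream = io.StringIO()
logger = build_logger(stream)
trace_a, trace_b = uuid.uuid4().hex, uuid.uuid4().hex

logger.info("request received", extra={"trace_id": trace_a})
logger.error("payment failed", extra={"trace_id": trace_a})
logger.info("request received", extra={"trace_id": trace_b})

matches = entries_for_trace(stream.getvalue(), trace_a)
```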
Quick Start Guide
Step 1: Access Observability
Observability is segmented per environment, with Metrics, Traces, and Logs supported by default in each active environment. Client Analytics requires manual enablement. Follow the steps below to enable Client Analytics for each deployment environment.
Enabling Client Analytics
Prerequisites:
Access to the ComUnity Developer Toolkit
Project with at least one active environment
Steps:
Log into the Toolkit with your credentials
Open your project from the project list
Navigate to Observability in the main menu
Go to Project Settings > Environment tab > Observability tab

Click "Activate Page Analytics" and wait for the background process to complete
Access your dashboards from the Observability menu
Client Analytics is enabled for the selected environment. Your observability dashboards are now accessible from the Observability menu.
You'll need to enable Client Analytics separately for Development, QA, and Production environments.
Time to enable: Approximately 2-3 minutes per environment
Step 2: Access Your Dashboards
Once enabled, you can access four integrated dashboards:



Metrics Dashboard
View real-time performance data
Monitor service health
Track custom metrics
Create alerts for issues
Logs Search
Search error messages
Filter by time and service
Find trace IDs for correlation
Debug production issues
Traces Viewer
Visualise request flows
Identify bottlenecks
Debug slow operations
Understand service dependencies
Client Analytics
Track user engagement
Understand feature adoption
Analyse user flows
Monitor traffic sources
Step 3: Instrument Your Application (Optional)
The platform automatically collects infrastructure metrics, logs, and traces. For deeper insights, you can add custom instrumentation:
Add custom metrics for business logic (e.g., payment success rate, user signups)
Add structured logging for better searchability
Add custom trace spans for specific operations
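As a sketch of what a custom business metric might look like, the example below counts payment outcomes and derives a success rate. The `PaymentMetrics` class is hypothetical; in a real setup the counters would be exported to the metrics backend rather than held in memory, and the platform's instrumentation API may look quite different.

```python
from collections import Counter

class PaymentMetrics:
    """Hypothetical counter pair for payment outcomes, for illustration only."""

    def __init__(self):
        self.counts = Counter()

    def record(self, succeeded):
        # Count each payment attempt under its outcome.
        self.counts["success" if succeeded else "failure"] += 1

    def success_rate(self):
        total = self.counts["success"] + self.counts["failure"]
        return self.counts["success"] / total if total else None

metrics = PaymentMetrics()
for outcome in [True, True, False, True]:
    metrics.record(outcome)

rate = metrics.success_rate()
```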
Learn about instrumentation → (coming soon)
Common Use Cases
Use Case 1: Troubleshooting Production Errors
Problem: Users reporting "payment failed" errors
Investigation workflow:
Logs: Search for "payment failed" errors
Find trace ID in the error log entry
Traces: Open the trace to see full request flow
Identify: Payment gateway timeout after 30 seconds
Action: Increase timeout or add retry logic
Time to resolution: Minutes instead of hours
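The "add retry logic" action could look something like the sketch below, which retries on timeout with exponential backoff. The function names, attempt count, and delays are assumptions for illustration, and `flaky_gateway` is a stand-in rather than a real payment client. Retrying a payment call is only safe if the gateway request is idempotent.

```python
import time

def call_with_retry(operation, attempts=3, base_delay_s=0.5, sleep=time.sleep):
    """Call operation(); on TimeoutError wait base_delay_s * 2**attempt and try again."""
    for attempt in range(attempts):
        try:
            return operation()
        except TimeoutError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the timeout to the caller
            sleep(base_delay_s * 2 ** attempt)

# Stand-in for a gateway call that times out twice, then succeeds.
calls = {"count": 0}

def flaky_gateway():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("gateway timed out")
    return "approved"

delays = []  # capture the backoff delays instead of actually sleeping
result = call_with_retry(flaky_gateway, sleep=delays.append)
```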
Use Case 2: Optimising Slow Endpoints
Problem: API endpoint taking 5 seconds (should be under 500ms)
Investigation workflow:
Metrics: Notice P99 latency spike in dashboard
Traces: Filter to slow requests (>4 seconds)
Identify: Database query taking 4.8 out of 5 seconds
Logs: Find the actual SQL query in log details
Action: Add database index or optimise query
Result: 10x performance improvement
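The "Identify" step amounts to finding which operation accounts for most of the request time. A minimal sketch, using the 4.8 of 5 seconds from this use case and hypothetical operation names for the remainder:

```python
def dominant_operation(total_s, operations):
    """operations maps operation name -> seconds spent; returns (name, share of total)."""
    name = max(operations, key=operations.get)
    return name, operations[name] / total_s

# Only the 4.8 s query and the 5 s total come from the use case; the rest is illustrative.
name, share = dominant_operation(
    5.0,
    {"auth.check": 0.05, "db.query": 4.8, "render": 0.15},
)
```

Here the query accounts for roughly 96% of the request, so optimising anything else first would barely move the total.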
Use Case 3: Understanding Feature Adoption
Problem: New feature launched but unsure if users are using it
Investigation workflow:
Client Analytics: Check page visits to feature screen
Compare: Feature page visits ÷ total visits = adoption rate
Analyse: Check bounce rate and time on page
Result: 15% adoption, high bounce rate = users trying but not engaging
Action: Improve feature onboarding
Insight: Data-driven feature development
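The arithmetic behind this workflow is simple enough to sketch. The visit counts below are made-up figures chosen to reproduce the 15% adoption in the example; the bounce rate is taken here as single-page visits divided by all visits to the feature page, which is the usual definition but an assumption about how the analytics dashboard reports it.

```python
def adoption_rate(feature_visits, total_visits):
    """Share of all visits that reached the feature page."""
    if total_visits <= 0:
        raise ValueError("total_visits must be positive")
    return feature_visits / total_visits

def bounce_rate(single_page_visits, visits):
    """Share of visits that left after viewing only one page."""
    if visits <= 0:
        raise ValueError("visits must be positive")
    return single_page_visits / visits

# Illustrative numbers, not real data.
adoption = adoption_rate(feature_visits=1_500, total_visits=10_000)
bounce = bounce_rate(single_page_visits=1_050, visits=1_500)
```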
Use Case 4: Capacity Planning
Problem: Need to prepare for traffic surge during campaign
Investigation workflow:
Metrics: Review historical peak traffic patterns
Identify: Current capacity handles 1,000 req/sec
Calculate: Expected campaign traffic is 3,000 req/sec
Traces: Check if any services have bottlenecks under load
Action: Scale infrastructure proactively
Result: Zero downtime during campaign
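A back-of-the-envelope version of the "Calculate" step: going from 1,000 to 3,000 req/sec is a 3x increase, and the sketch below turns an expected load into an instance count with some headroom. The per-instance throughput (250 req/sec) and the 20% headroom are hypothetical values for illustration, not platform figures.

```python
import math

def instances_needed(expected_rps, per_instance_rps, headroom=0.2):
    """Instances required to serve expected_rps with spare headroom on top."""
    return math.ceil(expected_rps * (1 + headroom) / per_instance_rps)

scale_factor = 3_000 / 1_000  # expected campaign traffic vs current capacity

# Assumes each instance sustains 250 req/sec (hypothetical).
current = instances_needed(1_000, per_instance_rps=250, headroom=0.0)
campaign = instances_needed(3_000, per_instance_rps=250)
```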
Getting Help
Documentation
Metrics Guide - Understanding dashboards and creating alerts
Logs Guide - Searching logs and debugging with LogQL
Traces Guide - Reading trace visualisations and finding bottlenecks
Client Analytics Guide - Understanding user behaviour and analytics
Troubleshooting Guide - Common issues and solutions (coming soon)
Quick Reference - Query cheat sheets and glossary (coming soon)
Best Practices
For Daily Monitoring
Check metrics dashboards daily
Review error rates and latency
Look for unusual patterns
Verify no alerts are firing
Use logs for investigation
Start with time period when issue occurred
Search for errors or specific events
Follow trace IDs to see full context
Review analytics weekly
Track user engagement trends
Identify popular features
Monitor mobile vs desktop usage
For Troubleshooting
Follow the investigation pattern:
Metrics → Identify when problem started
Logs → Find specific error messages
Traces → See complete request flow
Correlate → Use trace IDs to connect data
Ask the right questions:
What happened? (Logs)
When did it happen? (Metrics)
Where in the system? (Traces)
Who was affected? (Analytics)
For Performance Optimisation
Focus on user-facing metrics first:
P99 latency (worst-case user experience)
Error rate (user frustration)
Page load time (user engagement)
Use traces to find bottlenecks:
Identify longest operations
Optimise database queries
Cache expensive operations
Consider async processing
Measure the impact:
Compare before/after metrics
Check if user engagement improved
Verify error rates decreased
Security and Privacy
Data Ownership
All observability data stays on your infrastructure. No data is sent to third-party services.
Privacy Compliance
GDPR compliant: IP anonymisation, consent management
CCPA compliant: User opt-out and data deletion
HIPAA compatible: Can be configured for healthcare applications
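To show what IP anonymisation means in practice, the sketch below zeroes the host portion of an address before it is stored. The prefix lengths (/24 for IPv4, /48 for IPv6) are assumptions for illustration; how many bits the platform actually masks is a configuration detail not stated here.

```python
import ipaddress

def anonymise_ip(address, ipv4_prefix=24, ipv6_prefix=48):
    """Zero the host bits so the stored address no longer identifies a single device."""
    ip = ipaddress.ip_address(address)
    prefix = ipv4_prefix if ip.version == 4 else ipv6_prefix
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)

masked_v4 = anonymise_ip("203.0.113.57")
masked_v6 = anonymise_ip("2001:db8:abcd:12:34:56:78:9a")
```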
Access Control
Observability data access is controlled through ComUnity Platform permissions. Users only see data for projects and environments they have access to.
Technical Details
Technology Stack
Metrics: Prometheus + Grafana + Thanos
Logs: Loki with LogQL query language
Traces: Jaeger + OpenTelemetry + Tempo
Analytics: Matomo (open-source, privacy-first)
Data Retention
Metrics: High resolution for 30 days, downsampled for 1 year
Logs: 30 days by default (configurable)
Traces: Sampled storage for 30 days
Analytics: Unlimited retention
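"Downsampled" means older metrics are kept at a coarser resolution. As a rough sketch of the idea, the function below averages raw samples into fixed-width time buckets. The bucket width and the use of a plain average are assumptions for illustration; the resolutions and aggregates the platform actually keeps are not stated here.

```python
from statistics import mean

def downsample(samples, bucket_s):
    """Average (timestamp_s, value) samples into fixed-width time buckets."""
    buckets = {}
    for ts, value in samples:
        # Align each sample to the start of its bucket.
        buckets.setdefault(ts - ts % bucket_s, []).append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]

# Six samples at 15-second resolution, reduced to one point per minute.
raw = [(0, 10.0), (15, 20.0), (30, 30.0), (45, 40.0), (60, 50.0), (75, 70.0)]
coarse = downsample(raw, bucket_s=60)
```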
Performance Impact
Minimal overhead on applications (< 1% CPU, < 50 MB memory)
Automatic sampling for traces reduces data volume
Asynchronous logging prevents blocking
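To illustrate why asynchronous logging does not block the application, here is a minimal sketch using Python's standard `QueueHandler` and `QueueListener`. The application thread only puts records on a queue; a background thread hands them to the slow destination. `ListHandler` and the logger name are hypothetical stand-ins, and this is not a description of how the platform's own agents are implemented.

```python
import logging
import logging.handlers
import queue

collected = []

class ListHandler(logging.Handler):
    """Stand-in for a slow destination (file, network); here it appends to a list."""

    def emit(self, record):
        collected.append(record.getMessage())

log_queue = queue.Queue()

logger = logging.getLogger("async-demo")
logger.handlers.clear()
logger.setLevel(logging.INFO)
logger.propagate = False
# The application side only enqueues records, which is fast.
logger.addHandler(logging.handlers.QueueHandler(log_queue))

# A background thread drains the queue and calls the real handler.
listener = logging.handlers.QueueListener(log_queue, ListHandler())
listener.start()

logger.info("order %s created", 42)
logger.info("order %s paid", 42)

listener.stop()  # processes the remaining queued records before returning
```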
Next Steps
Just Enabled Observability?
Understand Metrics → Start with your service health dashboard
Learn Log Searching → Find and debug errors quickly
Explore Traces → Visualise request flows
Ready for Advanced Features?
Set Up Alerts → Get notified when issues occur (coming soon)
Add Custom Instrumentation → Track business metrics (coming soon)
Need Help?
Review the troubleshooting guides for common issues
Contact support at support@comunityplatform.com
Check the quick reference for query syntax