RSS Elastic Blog - Elasticsearch, Kibana, and ELK Stack

Safely sample production data into pre-production environments with Logstash

In a well-architected system, pre-production and production environments are kept separate so that issues in one cannot affect the other. Maintaining fully separate end-to-end environments, however, can be impractical for organizations with limited resources. This post presents a solution that routes a random subset of data to a pre-production cluster without disrupting the flow of data to the production cluster. It combines Logstash with UDP and is a lightweight, low-risk alternative to more complex patterns.

The solution consists of two pipelines: a common pipeline that generates the data and samples a subset of it, and a pre-production pipeline that receives the sampled data. The common pipeline uses a Ruby filter to randomly select events for forwarding, and it sends the selected events to the pre-production pipeline over UDP, which keeps the production data flow non-blocking. The pre-production pipeline adds a field to each event to indicate that it was routed to the pre-production cluster.

The benefits of this approach include a non-blocking production flow, efficiency, and a simplified architecture: it avoids the complexity, operational risk, and performance overhead that come with managing queues.
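As a rough illustration, a minimal sketch of the common pipeline might look like the following. The input plugin, sampling rate, host names, port, and index names are assumptions for illustration, not details from the original post.

# Hypothetical common pipeline: index everything into production and
# forward a random sample of events over UDP to the pre-production pipeline.
input {
  beats { port => 5044 }                             # example input; any source works
}

filter {
  ruby {
    code => '
      # Mark roughly 1% of events for sampling (rate is an assumption)
      event.set("[@metadata][sampled]", true) if rand < 0.01
    '
  }
}

output {
  # Production output is unconditional, so sampling never interferes with it
  elasticsearch {
    hosts => ["https://production-cluster:9200"]     # placeholder host
    index => "logs-production"                       # placeholder index
  }
  # Sampled events are also sent, fire-and-forget, to the pre-production pipeline
  if [@metadata][sampled] {
    udp {
      host  => "127.0.0.1"                           # pre-production listener (assumed)
      port  => 9999                                  # placeholder port
      codec => json                                  # one JSON-encoded event per datagram
    }
  }
}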
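The pre-production side can be sketched as a second pipeline that listens on the same UDP port, adds a field marking the event as a pre-production copy, and indexes it into the pre-production cluster. The field name, host, and index are again assumptions.

# Hypothetical pre-production pipeline: receive sampled events over UDP.
input {
  udp {
    port  => 9999                                    # must match the common pipeline's UDP output
    codec => json                                    # decode the JSON-encoded sampled events
  }
}

filter {
  mutate {
    # Mark the event so it is identifiable downstream (field name assumed)
    add_field => { "environment" => "pre-production" }
  }
}

output {
  elasticsearch {
    hosts => ["https://pre-production-cluster:9200"] # placeholder host
    index => "logs-preproduction"                    # placeholder index
  }
}

Because UDP is fire-and-forget, a slow or unavailable pre-production listener simply drops sampled datagrams instead of applying back-pressure, which is what keeps the production flow non-blocking.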
www.elastic.co