timberio/vector
A lightweight and ultra-fast tool for building observability pipelines
repo name | timberio/vector |
repo link | https://github.com/timberio/vector |
homepage | https://vector.dev |
language | Rust |
size (curr.) | 14603 kB |
stars (curr.) | 3522 |
created | 2018-08-27 |
license | Apache License 2.0 |
What is Vector?
Vector is a lightweight and ultra-fast tool for building observability pipelines. Compared to Logstash and friends, Vector improves throughput by ~10X while significantly reducing CPU and memory usage.
Principles
- Reliability First - Built in Rust, Vector’s primary design goal is reliability.
- One Tool, All Data - One simple tool gets your logs, metrics, and traces (coming soon) from A to B.
- Single Responsibility - Vector is a data router; it does not plan to become a distributed processing framework.
Who should use Vector?
- You SHOULD use Vector to replace Logstash, Fluent*, Telegraf, Beats, or similar tools.
- You SHOULD use Vector as an agent or sidecar.
- You SHOULD use Vector as a Kafka consumer/producer for observability data.
- You SHOULD use Vector in resource constrained environments (such as devices).
- You SHOULD NOT use Vector if you need an advanced distributed stream processing framework.
- You SHOULD NOT use Vector to replace Kafka. Vector is designed to work with Kafka!
- You SHOULD NOT use Vector for non-observability data such as analytics data.
Community
- Vector is downloaded over 100,000 times per day.
- Vector’s largest user processes over 10TB daily.
- Vector is used by multiple Fortune 500 companies with stringent production requirements.
- Vector has over 15 active contributors and growing.
Documentation
About
Setup
- Installation - containers, operating systems, package managers, from archives, from source
- Configuration
- Deployment - topologies, roles
- Guides - getting started, unit testing
Reference
- Sources - docker, file, http, journald, kafka, socket, and 7 more…
- Transforms - json_parser, log_to_metric, logfmt_parser, lua, regex_parser, sampler, and 18 more…
- Sinks - aws_cloudwatch_logs, aws_s3, clickhouse, elasticsearch, gcp_cloud_storage, gcp_pubsub, and 23 more…
Administration
Resources
- Community - chat, @vectordotdev, mailing list
- Releases - v0.8.2 (latest)
- Roadmap - vote on new features
- Policies - Security, Privacy, Code of Conduct
Performance
The following performance tests demonstrate baseline performance between common protocols with the exception of the Regex Parsing test.
Test | Vector | Filebeat | FluentBit | FluentD | Logstash | Splunk UF | Splunk HF |
---|---|---|---|---|---|---|---|
TCP to Blackhole | 86 MiB/s | n/a | 64.4 MiB/s | 27.7 MiB/s | 40.6 MiB/s | n/a | n/a |
File to TCP | 76.7 MiB/s | 7.8 MiB/s | 35 MiB/s | 26.1 MiB/s | 3.1 MiB/s | 40.1 MiB/s | 39 MiB/s |
Regex Parsing | 13.2 MiB/s | n/a | 20.5 MiB/s | 2.6 MiB/s | 4.6 MiB/s | n/a | 7.8 MiB/s |
TCP to HTTP | 26.7 MiB/s | n/a | 19.6 MiB/s | <1 MiB/s | 2.7 MiB/s | n/a | n/a |
TCP to TCP | 69.9 MiB/s | 5 MiB/s | 67.1 MiB/s | 3.9 MiB/s | 10 MiB/s | 70.4 MiB/s | 7.6 MiB/s |
To learn more about our performance tests, please see the Vector test harness.
Correctness
The following correctness tests are not exhaustive, but they demonstrate fundamental differences in quality and attention to detail:
Test | Vector | Filebeat | FluentBit | FluentD | Logstash | Splunk UF | Splunk HF |
---|---|---|---|---|---|---|---|
Disk Buffer Persistence | ✅ | ✅ | ❌ | ❌ | ⚠️ | ✅ | ✅ |
File Rotate (create) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
File Rotate (copytruncate) | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | ✅ |
File Truncation | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
Process (SIGHUP) | ✅ | ❌ | ❌ | ❌ | ⚠️ | ✅ | ✅ |
JSON (wrapped) | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ |
To learn more about our correctness tests, please see the Vector test harness.
The Little Details
Data Model
- All data types - Logs, metrics, and traces (coming soon).
- Customizable log schema - Change Vector’s log schema to anything you like.
- Rich type system - Support for JSON primitive types and timestamps.
- Metrics interoperability - A sophisticated metrics data model ensures correct interoperability between systems.
- Metrics aggregation - Aggregated histograms and summaries reduce volume without loss of precision.
Control Flow
- Pipelining - A directed acyclic graph processing model allows for flexible topologies.
- Control-flow - Transforms like the swimlanes transform allow for complex control-flow logic.
- Dynamic partitioning - Create dynamic partitions on the fly with Vector’s templating syntax.
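To make the directed acyclic graph model in the Pipelining bullet concrete, here is a minimal Rust sketch, assuming each component lists the names of the components it reads from. It is not Vector's implementation. The function name `topological_order` and the component names used below are hypothetical. The function returns an order in which every component comes after its inputs, and returns `None` when the topology has a cycle or names an unknown input.

```rust
use std::collections::{HashMap, VecDeque};

/// Maps each component name to the names of its inputs and returns an order
/// in which every component appears after all of its inputs (Kahn's
/// algorithm). Returns `None` if an input is unknown or a cycle exists.
pub fn topological_order(inputs: &HashMap<&str, Vec<&str>>) -> Option<Vec<String>> {
    // Number of inputs each component is still waiting on.
    let mut pending: HashMap<&str, usize> = HashMap::new();
    // Reverse edges: for each component, the components that read from it.
    let mut downstream: HashMap<&str, Vec<&str>> = HashMap::new();

    for (&name, upstreams) in inputs {
        pending.insert(name, upstreams.len());
        for &up in upstreams {
            if !inputs.contains_key(up) {
                return None; // input refers to a component that is not defined
            }
            downstream.entry(up).or_default().push(name);
        }
    }

    // Components with no inputs (sources) can run first.
    let mut ready: VecDeque<&str> = pending
        .iter()
        .filter(|(_, n)| **n == 0)
        .map(|(k, _)| *k)
        .collect();

    let mut order = Vec::new();
    while let Some(name) = ready.pop_front() {
        order.push(name.to_string());
        if let Some(readers) = downstream.get(name) {
            for &reader in readers {
                if let Some(n) = pending.get_mut(reader) {
                    *n -= 1;
                    if *n == 0 {
                        ready.push_back(reader);
                    }
                }
            }
        }
    }

    // Any component left unordered is part of a cycle.
    if order.len() == inputs.len() {
        Some(order)
    } else {
        None
    }
}
```

For a chain where a hypothetical `parse` reads from `in` and `out` reads from `parse`, the function yields `in`, `parse`, `out`. Two components that read from each other yield `None`.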
Data Processing
- Programmable transforms - Lua, JavaScript (coming soon), and WASM (coming soon) transforms.
- Rich parsing - Regex, Grok, and more allow for rich parsing.
- Context enrichment - Enrich data with environment context.
- Metrics derivation - Derive metrics from logs.
- Multi-line merging - Merge multi-line logs into one event, such as stacktraces.
Operations
- Hot reload - Reload configuration on the fly without disrupting data flow.
- Zero delay start - Starts and restarts without a delay.
- Multi-platform - Linux, macOS, Windows, x86_64, ARM64, and ARMv7.
- CI friendly - Config linting, dry runs, and unit tests make Vector CI friendly.
- Configurable concurrency - All CPU cores (service) or just one (agent) via the --threads flag.
- Custom DNS - Custom DNS makes service discovery possible.
- Optional static binary - Optional MUSL static binaries mean zero required dependencies.
- TLS support - All relevant Vector components offer TLS options for secure communication.
Reliability
- Memory safety - Vector is built in Rust and is memory safe, avoiding a large class of memory-related errors.
- Decoupled buffer design - Buffers are per-sink; a bad sink won’t bring the entire pipeline to a halt.
- Intelligent retries - A Fibonacci backoff algorithm with jitter makes Vector a good citizen during outages.
- Backpressure & load shedding - Buffers can be configured to provide backpressure or shed load.
- Rate-limited internal logging - Vector’s internal logging is rate-limited, avoiding IO saturation if errors occur.
- Sink healthchecks - Healthchecks provide startup safety and prevent deploys with bad configuration.
- Robust disk buffering - Vector uses leveldb for robust data durability across restarts.
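The Intelligent retries bullet names a Fibonacci backoff with jitter. The Rust sketch below shows one way such a schedule could be computed. It is an illustration under stated assumptions, not Vector's actual retry code: the names `FibonacciBackoff` and `with_jitter`, the delay cap, and the symmetric jitter range are all hypothetical choices. To keep the block free of external crates, the caller supplies the random number.

```rust
use std::time::Duration;

/// Yields retry delays of 1, 1, 2, 3, 5, 8, ... times `base`, capped at `max`.
/// (Hypothetical type for illustration only.)
pub struct FibonacciBackoff {
    current: u32,
    next: u32,
    base: Duration,
    max: Duration,
}

impl FibonacciBackoff {
    pub fn new(base: Duration, max: Duration) -> Self {
        Self { current: 1, next: 1, base, max }
    }
}

impl Iterator for FibonacciBackoff {
    type Item = Duration;

    fn next(&mut self) -> Option<Duration> {
        // Multiply the base delay by the current Fibonacci number and cap it;
        // an overflowing multiplication also falls back to the cap.
        let delay = self
            .base
            .checked_mul(self.current)
            .unwrap_or(self.max)
            .min(self.max);

        // Advance the sequence, saturating so the counters cannot overflow.
        let following = self.current.saturating_add(self.next);
        self.current = self.next;
        self.next = following;

        Some(delay)
    }
}

/// Scales `delay` by a factor in [1 - spread, 1 + spread).
/// `unit` is a random number in [0, 1) supplied by the caller's RNG.
pub fn with_jitter(delay: Duration, spread: f64, unit: f64) -> Duration {
    let spread = spread.clamp(0.0, 1.0);
    let unit = unit.clamp(0.0, 1.0);
    let factor = 1.0 + spread * (2.0 * unit - 1.0);
    delay.mul_f64(factor)
}
```

With a 100 ms base and a 1 s cap, the first seven delays are 100, 100, 200, 300, 500, 800, and 1000 ms. Jitter spreads retries from many clients over time, so they do not all hit a recovering service at the same instant.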
UX
- Clear Guarantees - A guarantee support matrix helps you make the appropriate tradeoffs with components.
- Config unit tests - Develop Vector config files like code. Avoid the frustrating dev style required by other tools.
- Config linting - Quickly lint Vector config files to spot errors and prevent bad configs in CI.
- Thoughtful docs - Quality documentation that respects your time and reduces communication overhead.
Installation
Run the following in your terminal, then follow the on-screen instructions.
curl --proto '=https' --tlsv1.2 -sSf https://sh.vector.dev | sh
Or use your own preferred method.
Latest Posts & Announcements
- Prometheus Source
- EC2 Metadata Enrichments
- Alpha Kubernetes Source
- Use Custom DNS Servers
- Unit Testing Your Vector Config Files