How Devo works
Here's a high-level view of how data flows into Devo, the principal internal components responsible for managing and storing data, and how data is queried and retrieved.
Ingesting and storing data
Devo ingests data sent from any number of varied data sources. A data source can be configured to send events directly to Devo if it can apply the necessary Devo tag to its events and establish a secure channel. Otherwise, it can send its events to the Devo relay. The relay, installed within the customer's secure network, can apply rules to associate Devo tags with the inbound events it receives, then compress them and forward them to Devo over a secure, encrypted channel.
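To make the relay's role more concrete, here is a minimal sketch of rule-based tagging followed by compression. It is not the relay's actual implementation or rule syntax: the `TagRule` class, the `tag_event` and `pack_batch` functions, the example tags, and the tab-separated batch format are all hypothetical, and the encrypted channel is left out entirely.

```python
import re
import zlib
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TagRule:
    """Hypothetical relay rule: events matching `pattern` receive `tag`."""
    pattern: "re.Pattern"
    tag: str


def tag_event(raw_event: str, rules: List[TagRule],
              default_tag: str = "my.app.unknown.events") -> Tuple[str, str]:
    """Return (tag, raw_event) using the first rule whose pattern matches.

    The event text itself is left untouched; only a tag is associated with it.
    """
    for rule in rules:
        if rule.pattern.search(raw_event):
            return rule.tag, raw_event
    return default_tag, raw_event


def pack_batch(tagged_events: List[Tuple[str, str]]) -> bytes:
    """Serialize events as 'tag<TAB>event' lines and compress the batch.

    The wire format is an assumption made for this sketch only.
    """
    payload = "\n".join(f"{tag}\t{event}" for tag, event in tagged_events)
    return zlib.compress(payload.encode("utf-8"))


# Example rules; the tag names are illustrative, not taken from a real deployment.
rules = [
    TagRule(re.compile(r"sshd"), "box.unix.auth"),
    TagRule(re.compile(r"GET|POST"), "web.apache.access"),
]
```

In this sketch the first matching rule wins and unmatched events fall back to a default tag; a real relay may resolve rule conflicts differently.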
Devo's event load balancer receives the events, decrypts them, and distributes them across the available data nodes. Indexing and parsing cause no ingest delay: indexing occurs at frequent fixed intervals rather than before the data is saved, and because data is always saved in the format in which it was received, parsing occurs only at query time. Each data node contains a collector that receives the event and saves it in a file that resides within a directory defined, in part, by the Devo domain name, the date, and the event's Devo tag. The file is kept open to accept subsequent events from the same data source; every 24 hours the file is archived and a new one is created to accept the next 24 hours of events from that same data source. All event data in the data nodes is compressed at an average ratio of 10:1.
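The directory scheme described above can be sketched as a function that maps an event to its storage location. The root directory, the path order (domain, then date, then tag), the file name, and the use of the UTC calendar date as the 24-hour boundary are assumptions made for illustration; the source only states that the directory is defined in part by these three values.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def event_file_path(root: str, domain: str, tag: str,
                    received_at: datetime) -> PurePosixPath:
    """Build a hypothetical storage path from domain, date, and tag.

    `received_at` must be timezone-aware. Events for the same domain and tag
    land in the same file until the (assumed) UTC date changes, at which
    point a new path, and therefore a new file, is used.
    """
    if received_at.tzinfo is None:
        raise ValueError("received_at must be timezone-aware")
    day = received_at.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return PurePosixPath(root) / domain / day / tag / "events.log"


late = datetime(2024, 3, 1, 23, 59, tzinfo=timezone.utc)
next_day = datetime(2024, 3, 2, 0, 1, tzinfo=timezone.utc)
path_a = event_file_path("/data", "acme", "firewall.cisco.asa", late)
path_b = event_file_path("/data", "acme", "firewall.cisco.asa", next_day)
```

Because the location is derived purely from values known at ingest time, a collector in this sketch never needs to parse the event body before writing it.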
Querying data
There are essentially two ways to query data stored in Devo:
- The Devo web application is the main tool used for accessing and querying data. Data is primarily accessed on-screen but can also be downloaded in raw format from within the search window. Read more about querying data in the Devo UI.
- The Devo REST API offers programmatic access to data in Devo and is even capable of forwarding query results to other data storage platforms like Amazon S3, Apache Kafka, or Hadoop. Read more about the REST API.
All queries are sent to Devo's meta node, which acts as a balancer and distributes the query across the data nodes. The query is then executed in each data node and the results are returned to the meta node, where they are collated. For queries made using the web application, the meta node uses an algorithm to return the most relevant events first, which avoids overloading the browser's memory.
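The scatter-and-collate pattern described above can be sketched as follows. This is an illustration under stated assumptions, not Devo's query engine: each data node is modeled as a list of raw event strings of the form `"<epoch-seconds> <message>"`, parsing happens only when the query runs, and "relevance" is approximated by newest-first ordering because the source does not specify the actual algorithm. All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Optional


def parse(raw: str) -> Dict:
    """Parse a raw event at query time; data is stored unparsed."""
    ts, _, message = raw.partition(" ")
    return {"ts": int(ts), "message": message}


def run_on_node(node_events: List[str],
                predicate: Callable[[Dict], bool]) -> List[Dict]:
    """Execute the query on one data node and return its matching events."""
    parsed = (parse(raw) for raw in node_events)
    return [event for event in parsed if predicate(event)]


def meta_query(nodes: List[List[str]],
               predicate: Callable[[Dict], bool],
               limit: Optional[int] = None) -> List[Dict]:
    """Fan the query out to every node, then collate the partial results."""
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        partials = list(pool.map(lambda n: run_on_node(n, predicate), nodes))
    merged = [event for part in partials for event in part]
    # Stand-in for the relevance ordering: newest events first (an assumption).
    merged.sort(key=lambda event: event["ts"], reverse=True)
    return merged if limit is None else merged[:limit]


nodes = [
    ["10 login ok", "30 login failed"],
    ["20 login failed", "40 disk full"],
]
failed = meta_query(nodes, lambda e: "failed" in e["message"])
```

Passing `limit` mimics returning only a first slice of results to a browser client rather than the full result set.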
Key platform benefits
Devo's architecture was designed to deliver some important overall benefits as a data operations solution.
- Devo scales on every component of the architecture to ensure optimal performance.
- Scalability is automatic in SaaS deployments in public cloud environments. In on-premises environments, scale up by adding more data nodes and meta nodes.
- There's no limit to the number of data sources that can send data to your Devo deployment.
- No resources are required to build, maintain, and update large, unwieldy indexes.
- Storage is optimized by compressing data at a ratio of 10:1 on average, making better use of the space available.
- The Devo relay can be used to filter out unwanted events to optimize ingestion value.
- The unique practice of storing data in directories that identify customer domains, dates, and event tag information streamlines and speeds data retrieval.
- Even as ingest volumes increase, performance remains fast and data stays available in real time.
- Data is compressed and decompressed in real time.
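As a rough illustration of what the 10:1 average compression ratio means for capacity planning, the following sketch estimates stored volume from raw ingest volume. The ingest rate and retention period are made-up inputs, and the function name is hypothetical; actual ratios vary with the data.

```python
def estimated_stored_gb(daily_raw_gb: float, retention_days: int,
                        compression_ratio: float = 10.0) -> float:
    """Estimate on-disk volume: raw volume divided by the compression ratio."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return daily_raw_gb * retention_days / compression_ratio


# Hypothetical deployment: 500 GB of raw events per day, kept for 90 days.
stored = estimated_stored_gb(500, 90)  # 500 * 90 / 10 = 4500 GB
```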