Best practices for data search
We've collected a number of tips to help you optimize performance and get the most out of what Devo data search offers.
When dealing with large amounts of data, you need to consider your browser's memory restrictions and the processing requirements of different query operations. We have several recommendations to make sure you get the best possible performance:
- Manage browser memory
- Consider cardinality when grouping events
- Reduce the number of columns in a table
Manage browser memory
Restart your browser to free up memory.
Minimize the number of open tabs to maximize available memory.
Limit concurrent queries
As a general rule, you should minimize the number of concurrent queries in order to maximize available memory.
If you need to keep several large queries open at the same time, create a second session by opening another browser window in incognito mode to better handle the memory requirements.
Use a brief time range when building a new query
Building a query can involve quite a bit of trial and error with the operations you apply to the data. Before you start building and refining your query, select a short time range and make sure that real-time event flow is off. This gives better performance because there are fewer events to process each time you apply filters, create columns, and so on. Once you are satisfied with your query, you can set the time range and real-time event flow as you require.
Follow the order of operations
All filters should be applied early in the query, and certainly before grouping and aggregation. This reduces the memory and computation required for the later operations. The following describes the recommended order of operations:
- Create columns (data enrichment)
- Filters on new columns
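The effect of filtering before grouping can be sketched outside the search window. The following is a minimal Python illustration, not Devo query syntax; the sample events and the field names (method, status, bytes, is_large) are assumptions made up for this example.

```python
from collections import Counter

# Hypothetical sample events; the field names are illustrative only.
events = [
    {"method": "GET", "status": 200, "bytes": 512},
    {"method": "POST", "status": 500, "bytes": 2048},
    {"method": "GET", "status": 404, "bytes": 128},
    {"method": "GET", "status": 500, "bytes": 4096},
    {"method": "POST", "status": 200, "bytes": 256},
]

# 1. Filter on an existing field first, so later steps see fewer events.
errors = [e for e in events if e["status"] >= 400]

# 2. Create a column (data enrichment) on the reduced set only.
enriched = [{**e, "is_large": e["bytes"] > 1000} for e in errors]

# 3. Filter on the new column.
large_errors = [e for e in enriched if e["is_large"]]

# 4. Group and aggregate last, on the smallest possible set.
counts_by_method = Counter(e["method"] for e in large_errors)

print(f"{len(events)} events in, {len(large_errors)} events grouped")
print(dict(counts_by_method))
```

Only the events that survive both filters reach the grouping step, which is the reason for applying filters as early as possible.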
Consider cardinality when grouping events
Avoid grouping by fields with a very large number of different values (high cardinality). This can be resource-intensive and produce results that are harder to read and analyze. Here are some tips for grouping by fields with high cardinality:
- Consider applying a filter to the field before grouping to limit the cardinality.
- If the field contains numeric values, enrich the data with a new column that identifies a numeric range to which the event belongs, then group by the numeric range instead of the individual values.
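The numeric-range tip can be sketched as follows. This is a minimal Python illustration rather than Devo query syntax; the sample values, the bucket width of 100, and the helper name to_range are assumptions chosen for the example.

```python
from collections import Counter

# Hypothetical response times in milliseconds (illustrative data).
response_ms = [12, 17, 95, 103, 148, 151, 204, 260, 999, 1003]

BUCKET_WIDTH = 100  # assumed range size; choose one that suits your data


def to_range(value, width=BUCKET_WIDTH):
    """Return a label for the numeric range that contains the value."""
    low = (value // width) * width
    return f"{low}-{low + width - 1}"


# Grouping by the raw value gives one group per distinct value.
by_value = Counter(response_ms)

# Grouping by the range column gives far fewer groups.
by_range = Counter(to_range(v) for v in response_ms)

print(len(by_value), "groups when grouping by value")
print(len(by_range), "groups when grouping by range")
```

A wider bucket lowers the cardinality further at the cost of detail, so the width is a trade-off between performance and resolution.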
Reduce the number of columns in a table
If you use a Finder to open a data table, you can pre-select only the columns that contain data of interest to you. This reduces the amount of data that your browser needs to load into memory.
If you already have a query open in the search window, you can use the Search Column Layout tool to pick the columns you want to show or hide.
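As a rough illustration of why fewer columns means less data to hold, the following Python sketch compares the serialized size of a few rows before and after keeping only two columns. The rows and field names are hypothetical, and serialized JSON length is used only as a simple stand-in for memory use; it does not measure what the browser actually allocates.

```python
import json

# Hypothetical rows; the field names are illustrative only.
rows = [
    {
        "eventdate": "2024-01-01T00:00:00Z",
        "srcIp": "10.0.0.1",
        "method": "GET",
        "url": "/index.html",
        "userAgent": "Mozilla/5.0",
        "status": 200,
    },
    {
        "eventdate": "2024-01-01T00:00:05Z",
        "srcIp": "10.0.0.2",
        "method": "POST",
        "url": "/login",
        "userAgent": "curl/8.0",
        "status": 401,
    },
]

# Keep only the columns of interest.
wanted = ["eventdate", "status"]
projected = [{key: row[key] for key in wanted} for row in rows]

# Serialized length as a rough proxy for the amount of data to load.
full_size = len(json.dumps(rows))
projected_size = len(json.dumps(projected))

print(f"all columns: {full_size} characters")
print(f"selected columns: {projected_size} characters")
```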
Tips and bonus features
There are some great tools available in the search window that you might overlook. Here we list a few that can really come in handy.
Toggle Execution Info
Find this option in the Additional tools → Query info menu. This gives you useful information about the current query and can tell you:
- How many rows the query has in total, and how many have been loaded so far.
- How much memory is currently being used, and what is the maximum memory you will be able to use.
View selected events
It can be difficult to examine an event's data visually, especially when the number of columns necessitates a horizontal scrollbar. This is where this tool comes in handy.
Click to select the event or events that you want to examine more closely, then select the View Selected Events tool on the toolbar. If it's not in the toolbar, you'll find it in the Additional Tools menu.
This window lists each event's fields and values on its own page so that you can thoroughly examine the events one at a time. The Rich views toggle is on by default; when it is on, the window correctly reads and displays fields whose values are in JSON format. You can also copy an event's data to paste it elsewhere, or download the event in CSV, JSON, or TXT format.
Time Interval History
When building and running queries to generate a periodic report, you may be working in multiple data tables concurrently and applying the same time range to each table.
This feature offers you a simple way to apply a recently used time range to other queries without having to repeatedly use the time range selector.