“Data is cheap, but getting value out of data is expensive.” That statement, from CHAOSSEARCH cofounder and CEO Les Yetton, sums up the goal of the company’s flagship product: a search and analytics managed services platform for cloud storage that promises to be faster and cheaper.
The platform, which uses object storage such as Amazon’s S3 as the underlying data layer, is based on the company’s own patented universal file format and compression algorithm. It exposes multiple APIs, including REST, Elastic, Mongo and SQL, and lets users store, search and query all data within their own Amazon S3 environment, including all historical log data.
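To make the Elastic-style interface more concrete, here is a minimal sketch of the kind of request body an Elasticsearch-compatible API typically accepts. It uses the standard Elasticsearch query DSL (`bool`, `match`, `range`). The helper name `build_log_query` and the field names `message` and `@timestamp` are assumptions chosen for illustration, and the article does not confirm which parts of the DSL CHAOSSEARCH accepts.

```python
import json


def build_log_query(term, start, end, size=50):
    """Build an Elasticsearch-style query body for log search.

    Hypothetical helper: the field names "message" and "@timestamp"
    are illustrative assumptions, not taken from the article.
    """
    return {
        "size": size,
        "query": {
            "bool": {
                # Full-text match on the log message.
                "must": [{"match": {"message": term}}],
                # Restrict results to a time window.
                "filter": [
                    {"range": {"@timestamp": {"gte": start, "lte": end}}}
                ],
            }
        },
        # Newest entries first.
        "sort": [{"@timestamp": {"order": "desc"}}],
    }


query = build_log_query(
    "timeout", "2019-01-01T00:00:00Z", "2019-01-31T23:59:59Z"
)
print(json.dumps(query, indent=2))
```

The sketch only builds the request body. Where it would be sent, and how the request is authenticated, depends on the service and is not covered in the article.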
“Essentially, CHAOSSEARCH lets you search and query data without having to manually ingest and process those different sources of data,” explained Owen Rogers, a research director at 451 Research. “So instead of having to format, normalize and process your log, JSON and CSV files, you can just upload them to AWS’s S3 object storage and CHAOSSEARCH should do a lot of the legwork to allow them to be searched and queried.”
Under the hood, CHAOSSEARCH uses a new distributed database based on the company’s Data Edge technology. By fully indexing data sets with built-in schema detection, normalization and compression, it eliminates the need to move or transform data into separate, siloed databases. The Data Edge format supports structured, semi-structured and unstructured data sources: schema is auto-detected, and irregular or disparate data is normalized automatically. The company says the new indexing technology also allows for virtually limitless scale.
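The general idea of schema detection and normalization can be shown with a small, generic sketch. It is not the Data Edge algorithm, which the article does not describe in detail. The code reads JSON-lines and CSV log samples, infers a type for each field, and gives every record the same set of fields. All function names and sample data are hypothetical.

```python
import csv
import io
import json


def parse_records(text, fmt):
    """Parse raw text into a list of dicts; fmt is 'jsonl' or 'csv'."""
    if fmt == "jsonl":
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(text)))
    raise ValueError(f"unsupported format: {fmt}")


def coerce(value):
    """Turn numeric-looking strings into int or float; leave the rest as-is."""
    if isinstance(value, str):
        for cast in (int, float):
            try:
                return cast(value)
            except ValueError:
                pass
    return value


def infer_schema(records):
    """Map each field name to the sorted list of type names observed."""
    schema = {}
    for record in records:
        for key, value in record.items():
            schema.setdefault(key, set()).add(type(coerce(value)).__name__)
    return {key: sorted(types) for key, types in schema.items()}


def normalize(records, schema):
    """Give every record the same fields, filling missing ones with None."""
    return [{key: coerce(record.get(key)) for key in schema} for record in records]


# Illustrative sample data (made up for this sketch).
json_logs = (
    '{"host": "web-1", "status": 200}\n'
    '{"host": "web-2", "status": 500, "error": "timeout"}'
)
csv_logs = "host,status,latency_ms\nweb-3,404,12.5\n"

records = parse_records(json_logs, "jsonl") + parse_records(csv_logs, "csv")
schema = infer_schema(records)
rows = normalize(records, schema)

print(schema)
for row in rows:
    print(row)
```

After normalization, records from both sources share one set of fields and can be filtered together without hand-written conversion for each format. A production system would also need to handle nested fields, conflicting types and far larger volumes.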
The goal, says CTO Thomas Hazel, was to find a way to bypass the limitations of other information indexing algorithms. Most search and relational databases still use the same data structures and algorithms invented nearly 50 years ago, he said, but they bump up against real limits when data gets big and chaotic. Data Edge was designed to address those limits.
Because CHAOSSEARCH is offered as a fully managed service, there are no databases to provision or manage. That should also reduce much of the complexity that can be associated with cloud search, Rogers said. Users should be able to query and analyze data without having to worry about how to get the sources to work together, he added.
One of the most popular uses of this technology will probably be application and security analytics. Typically, businesses keep a limited amount of historical log data on hand, and then back it up or archive it to cheaper storage. “The problem with that is that you can’t threat hunt tape,” Yetton said. “Wouldn’t it be great to always have access to all your data, live, without having to restore backups or manually bring in other data sources that weren’t being logged?”