A startup that disrupted the IoT market: crate.io

November 4, 2021

This article introduces an IoT startup that, in a short time, attracted several rounds of investment, built a large client base, and established itself on the international market.

The company is CRATE Technology (crate.io), and the focus here is its product CrateDB, which is widely used in IoT. Read on to learn how it works and to judge how well it suits your project or company.

Introduction

CrateDB is a distributed SQL (Structured Query Language) database management system that integrates a fully searchable document-oriented data store. It is open source, written in Java, and built on a shared-nothing architecture designed for high scalability. It includes components from Facebook Presto, Apache Lucene, Elasticsearch and Netty.

History

The CrateDB project was started by Jodok Batlogg, an open source contributor and creator who contributed to OSIV (Open Source Initiative Vorarlberg) while at Lovely Systems in Dornbirn. The software is an open source clustered database used for fast text search and analytics. The company, now called Crate.io, raised its first round of funding in April 2014, a $4 million round in March 2016, and $2.5 million in January 2017 from Dawn Capital, Draper Esprit, Speedinvest and Sunstone Capital.

In June 2014, Crate.io won the judges' choice award at the GigaOm Structure Launchpad competition, and in October 2014 it won TechCrunch Disrupt Europe in London.

CrateDB 1.0 was released in December 2016 and reportedly had over a million downloads. CrateDB 2.0 and the Enterprise Edition were released in May 2017.

Overview

CrateDB's query language is SQL, but it stores data in a document-oriented, NoSQL-style way. The software uses the SQL parser from Facebook Presto together with its own query analyzer and distributed query engine. Elasticsearch and Lucene are used for the transport protocol and cluster discovery, and Netty serves as the asynchronous network application framework.

CrateDB offers automatic data replication and self-healing clusters for high availability.

CrateDB includes a built-in administration interface. Its command-line interface, the Crate Shell (CraSh), allows interactive queries. Of its client libraries, the Python client is the most mature and includes SQLAlchemy integration.
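
As a rough sketch of what an interactive query looks like from code, the snippet below builds a request for CrateDB's HTTP SQL endpoint using only the Python standard library. The node address (`localhost:4200`) and the helper names `build_sql_request` and `run_query` are assumptions made for illustration; in practice the official Python client would usually be the better choice.

```python
import json
import urllib.request

# Assumed address of a local CrateDB node; adjust for your cluster.
CRATE_URL = "http://localhost:4200/_sql"


def build_sql_request(stmt, args=None, url=CRATE_URL):
    """Build (but do not send) a POST request carrying one SQL statement."""
    payload = {"stmt": stmt}
    if args is not None:
        payload["args"] = list(args)  # values for the ? placeholders
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(stmt, args=None, url=CRATE_URL):
    """Send the statement to a reachable node and return the decoded JSON reply."""
    with urllib.request.urlopen(build_sql_request(stmt, args, url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a node running, `run_query("SELECT name FROM sys.cluster")["rows"]` should return the cluster name. Nothing is sent over the network until `run_query` is called.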

In June 2016, Kyle Kingsbury tested the concurrency and consistency behavior of CrateDB 0.54 and identified several fault tolerance issues stemming from its Elasticsearch dependencies. He recommended against using Crate as the primary store when every record matters; instead, keep the records in a separate database and use Crate for fast queries. He presented his results again in April 2017 during a keynote at the Scala Days conference.

Information Security

Protecting IT systems from cybersecurity threats is one of CrateDB's most popular applications. For these workloads CrateDB provides a real-time SQL engine built on a NoSQL foundation, which gives it the scalability, performance, and analytical flexibility of a NoSQL DBMS without sacrificing the ease of use and integration of SQL.

CrateDB lets SQL developers process logs and network traffic in real time and at high volume to support a wide range of cybersecurity use cases. Here are a few of the cybersecurity systems that are powered by CrateDB:

• Skyhigh Networks - Cloud Services Access Security Broker (CASB)
• StackRox - adaptive threat protection for containers
• Kryptos Logic - threat intelligence and threat protection

Solutions like these rely on the following CrateDB features:

• Millions of data points per second - elastic scaling lets CrateDB ingest data at high speed on clusters of low-cost commodity servers
• Real-time query speeds - columnar indexes and field caches provide in-memory SQL performance on network or log data streams
• Text search, IP fields, AI, time series - dynamic schemas and query optimization handle a wide range of data and cybersecurity analytics
• Continuous availability - built-in data replication, data distribution and cluster balancing provide nonstop threat detection and protection
• ANSI SQL - easy to use for any developer, and integrates with standard data sources and visualization tools

CrateDB comparison

CrateDB is a distributed SQL database built on a NoSQL foundation (for storage, indexing and networking). It is a good fit if you need:

• Large data volumes - millions of inserts per second
• Query versatility - real-time, time series, geospatial, text search, AI
• Data versatility - dynamic schemas for structured or unstructured data
• SQL - for ease of use and integration without lock-in
• Easy scalability - easy to grow the database to handle more data or users

Key Features

Scalable

Expanding a database should be simple, and with CrateDB it is. Automatic data rebalancing and a shared-nothing architecture make it easy to scale: just add new machines to create and grow your CrateDB cluster. You do not need to know how to redistribute data in the cluster, because CrateDB does it for you.
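
To make the idea of automatic rebalancing concrete, here is a toy model in Python. It is not CrateDB's actual allocation algorithm: the CRC32 hash, the round-robin placement, and the function names `shard_for`, `allocate` and `rebalance` are all assumptions chosen to keep the sketch short. It only illustrates the principle that rows map to a fixed set of shards and that shards, not individual rows, move when a node joins.

```python
import zlib


def shard_for(routing_value, num_shards):
    """Map a routing value (e.g. a primary key) to a shard with a stable hash."""
    return zlib.crc32(str(routing_value).encode("utf-8")) % num_shards


def allocate(num_shards, nodes):
    """Initial round-robin placement: shard id -> node name."""
    return {shard: nodes[shard % len(nodes)] for shard in range(num_shards)}


def rebalance(assignment, nodes):
    """Return a new placement where node loads differ by at most one shard.

    Every node that currently holds a shard must be listed in `nodes`.
    """
    placement = dict(assignment)
    per_node = {node: [] for node in nodes}
    for shard, node in sorted(placement.items()):
        per_node[node].append(shard)
    while True:
        fullest = max(nodes, key=lambda n: len(per_node[n]))
        emptiest = min(nodes, key=lambda n: len(per_node[n]))
        if len(per_node[fullest]) - len(per_node[emptiest]) <= 1:
            return placement
        # Move one shard from the fullest node to the emptiest node.
        shard = per_node[fullest].pop()
        per_node[emptiest].append(shard)
        placement[shard] = emptiest
```

In this model, growing a two-node cluster with six shards to three nodes moves only two shards, and a row's shard never changes because it depends only on the routing value and the shard count.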


Distributed SQL queries, aggregations and search

CrateDB's distributed SQL query engine includes columnar field caches and an advanced query planner. This lets CrateDB perform aggregations, JOINs, sub-selects, and ad hoc queries at in-memory speed. CrateDB also has built-in full-text search, so you can store and query structured and unstructured data together and no longer need separate SQL and search databases to manage tabular and non-tabular data.

High availability

Even if things go wrong in the data center, CrateDB continues to work. Automatic data replication across the cluster and rolling software updates let you ride out hardware failures and scheduled maintenance without interrupting data access. In addition, CrateDB clusters are self-healing: when nodes are added to the cluster, CrateDB automatically populates them with data.


Real-time data ingestion

Analytical databases often load data in batches and incur transactional locking and other overhead. In contrast, CrateDB avoids locking overhead to deliver high bulk write performance (e.g., 40,000+ inserts per second per node on commodity hardware). In addition, CrateDB can deliver millisecond query performance even while writes are in progress.
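
High insert rates usually depend on sending many rows per request. The sketch below, which makes no claim about how any particular deployment reaches the figures above, splits rows into batches and builds one JSON payload per batch in the `bulk_args` form accepted by CrateDB's HTTP SQL endpoint. The table name, column names, batch size and helper names are illustrative assumptions.

```python
import json
from itertools import islice


def chunked(rows, size):
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def bulk_insert_payloads(table, columns, rows, batch_size=1000):
    """Yield one JSON payload per batch, each carrying many rows.

    `table` and `columns` are interpolated directly, so they must come
    from trusted code, never from user input.
    """
    placeholders = ", ".join("?" for _ in columns)
    stmt = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    for batch in chunked(rows, batch_size):
        yield json.dumps({"stmt": stmt, "bulk_args": batch})
```

Each payload would then be POSTed to the `/_sql` endpoint of a node. The best batch size depends on row width and hardware, so it is worth measuring rather than assuming.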


Any data and BLOBs

CrateDB supports both relational data and nested JSON documents. All nested JSON attributes can be used in any SQL statement. CrateDB also provides BLOB storage for binary objects such as images, videos, or large unstructured files, giving you a fully distributed BLOB storage cluster.
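
In CrateDB's SQL dialect, nested attributes are addressed with subscript notation, for example `payload['sensor']['temp']`. The small helper below is a sketch that builds such expressions from a path; the table and column names are made up for illustration, and the helper does not quote identifiers, so it assumes simple, trusted names.

```python
def nested_column(column, *path):
    """Build a subscript expression such as payload['sensor']['temp']."""
    parts = []
    for key in path:
        escaped = str(key).replace("'", "''")  # double any single quote in the key
        parts.append(f"['{escaped}']")
    return column + "".join(parts)


def select_nested(table, column, *path):
    """Build a SELECT statement that reads one nested attribute."""
    return f"SELECT {nested_column(column, *path)} FROM {table}"
```

For example, `select_nested("readings", "payload", "sensor", "temp")` produces `SELECT payload['sensor']['temp'] FROM readings`.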


Time Series Analysis

Time series data is important for identifying trends and anomalies. CrateDB makes time series analysis fast and easy with automatic table partitioning. Partitions behave like virtual tables that can be queried, moved or deleted. Partitioning the data by time interval keeps time-range queries very fast.
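
The benefit of partitioning by time can be shown with a small in-memory model. This is only a sketch of the concept, not of CrateDB's storage engine: the class name `DailyPartitionedTable` and the choice of one partition per calendar day are assumptions. In CrateDB itself, partitioning is declared with the `PARTITIONED BY` clause when the table is created.

```python
from collections import defaultdict
from datetime import datetime


class DailyPartitionedTable:
    """Toy table that stores rows in one bucket (partition) per calendar day."""

    def __init__(self):
        self.partitions = defaultdict(list)  # date -> list of (ts, value)

    def insert(self, ts, value):
        self.partitions[ts.date()].append((ts, value))

    def query(self, start, end):
        """Return (rows, partitions_scanned) for start <= ts < end."""
        # Only partitions whose day overlaps the range are touched at all.
        days = [d for d in self.partitions if start.date() <= d <= end.date()]
        rows = [
            row
            for day in days
            for row in self.partitions[day]
            if start <= row[0] < end
        ]
        return sorted(rows), len(days)

    def drop_partition(self, day):
        """Discard a whole day of data at once; True if it existed."""
        return self.partitions.pop(day, None) is not None
```

A query for a single day scans one partition regardless of how many days the table holds, and expiring old data becomes a single partition drop instead of a row-by-row delete.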


Geospatial queries

Location is important for many kinds of machine data analysis. For this reason, CrateDB can store and query geographic information using the geo_point and geo_shape types. You can control the precision and resolution of the geographic index to get faster query results, and you can run exact queries with scalar functions such as intersects, within and distance.
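
To show what a distance-based geospatial filter computes, here is a plain Python sketch using the haversine formula. It is an approximation on a spherical Earth and is not CrateDB's implementation; the function names and the radius constant are assumptions. Points are written as (longitude, latitude), matching the order CrateDB uses for geo_point values given as arrays.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (approximation)


def distance_m(p1, p2):
    """Great-circle distance in meters between two (lon, lat) points in degrees."""
    lon1, lat1 = map(math.radians, p1)
    lon2, lat2 = map(math.radians, p2)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = (
        math.sin(dlat / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_radius(points, center, radius_m):
    """Keep the points that lie at most `radius_m` meters from `center`."""
    return [p for p in points if distance_m(p, center) <= radius_m]
```

A database evaluates the same kind of predicate, but with a geographic index it can skip most rows instead of checking every point as this sketch does.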


Dynamic Schemas

Unlike many other SQL databases, CrateDB schemas are completely flexible. You can add columns at any time without degrading performance or incurring downtime. This is well suited to agile development and fast deployment.


Transaction Support

CrateDB is eventually consistent but offers transactional semantics. It is consistent at the row level, so each row is either fully written or not written at all. Read-after-write consistency allows synchronous, real-time access to single records immediately after they are written. Although CrateDB does not support ACID transactions with rollbacks, it offers optimistic concurrency control through internal versioning, which allows write conflicts to be detected and resolved.
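
Optimistic concurrency control is easiest to see in a small model. The sketch below is a generic in-memory illustration, not CrateDB code: `VersionedStore`, `VersionConflict` and `update_with_retry` are hypothetical names. Each row carries a version number, a write succeeds only if the writer saw the current version, and a writer that loses the race re-reads and tries again.

```python
class VersionConflict(Exception):
    """Raised when a row changed between the read and the write."""


class VersionedStore:
    def __init__(self):
        self._rows = {}  # key -> (version, value)

    def insert(self, key, value):
        self._rows[key] = (1, value)

    def read(self, key):
        """Return (version, value) for the row."""
        return self._rows[key]

    def update(self, key, new_value, expected_version):
        """Write only if the row is still at `expected_version`."""
        version, _ = self._rows[key]
        if version != expected_version:
            raise VersionConflict(f"{key}: expected v{expected_version}, found v{version}")
        self._rows[key] = (version + 1, new_value)
        return version + 1


def update_with_retry(store, key, fn, max_attempts=3):
    """Read, apply `fn`, and write back, retrying if another writer got in first."""
    for _ in range(max_attempts):
        version, value = store.read(key)
        try:
            return store.update(key, fn(value), version)
        except VersionConflict:
            continue  # re-read the latest version and try again
    raise VersionConflict(f"{key}: gave up after {max_attempts} attempts")
```

No locks are held between the read and the write, which is why this approach suits write-heavy workloads where conflicts on the same row are rare.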


Backups

CrateDB can create incremental snapshots of the database for backup. A snapshot contains the state of the tables in the CrateDB cluster at the time it was created and can be restored to the cluster at any time.


Openness and flexibility

It is possible to run CrateDB anywhere, either in the data center or in the cloud.

Conclusion

Most CrateDB customers use it for operational analytics workloads: fast time series, text search, and machine learning queries on streaming data and data at rest. Typical domains are Industrial IoT, corporate cybersecurity and systems monitoring across industries, smart cities and building infrastructure, fleet tracking and management, and marketing analytics.

CrateDB, the open-source SQL database from Crate.io, has a built-in search engine for storing and analyzing machine learning data in real time. The company was founded in 2013 to provide developers with an open-source SQL database for collecting, analyzing and managing machine learning and artificial intelligence data.

If you have any questions about using this technology in your project, we can help you with the implementation. Our developers have extensive experience with Azure, AWS and DevOps, as well as with database development.

If you're interested, book a consultation through our calendar or contact us via the email address in the contact information.

Let's build something great together!
