Clickhouse is an open source analytics-oriented database system composed of 1.5 million lines of code, mostly in C++, a notoriously unsafe language in terms of hiding bugs that could be exploited by malicious attackers.
Often it has been written, even here at The New Stack, that the Rust programming language could replace C/C++ with its superior handling of memory and thread safety. And there are many large code bases, such as the Linux kernel (C) and Windows (C++), written in some decades-old language, so many are asking themselves the same questions.
The maintainers of Clickhouse started down that path, with the aim of converting functionality of Clickhouse written in C++ to Rust, and perhaps even rewriting the entire code base itself.
The operational question was, “if we started today, would we write Clickhouse in Rust?” asked Alexey Milovidov, CTO and cofounder of Clickhouse, who discussed the results in a talk at ScyllaDB’s virtual P99 Conf.
In the end, the core developers took a more incremental path to migration, Milovidov explained. They first integrated Rust into the build system, then they built out modules for various functionalities.
Along the way, they encountered many challenges, including ensuring reproducible builds and managing dependencies.
“Rust may be perfect, but when you use C++ and Rust together, it could be problematic,” Milovidov advised.
The Price of C++ Is a Humongous Build System
Writing a mission-critical app in C++ still comes with many advantages, as Milovidov pointed out. “It is well established. It is well recognized. It is quite popular. It is easy to hire people with C++ knowledge. Universities still teach C,” he said.
But using C++ requires “so many efforts,” he lamented. It is “almost inevitable” that you will run into issues with memory corruption, segmentation faults, or race conditions, he added.
In fact, Clickhouse ended up building a “gargantuan” CMake-based continuous integration system just to ensure all these types of bugs were caught and fixed.
From an average of 70 pull requests and 145 commits a day, the CI system produces “10s of billions of tests,” which is really “10s of millions” of individual tests run in varying combinations, all to ensure the new code doesn’t come with any new bugs.

Is C++ a pain? Milovidov had a whole slide on the subject…
The Rust Journey
“So maybe it’s time to rewrite in Rust,” the core dev team wondered. The language offered both memory and thread safety. It also offered more libraries, especially around the emerging data standards such as Apache Iceberg. And the language seemed to be attracting all the young, eager software engineers.
Yet, a full rewrite of Clickhouse into Rust would take years.
Instead, the team decided on an iterative approach, where various pieces of the Clickhouse system could be redone in Rust. They’d use Corrosion to integrate with CMake.
First, they added a small function to SQL, one for BLAKE3 hashing, written in Rust and wrapped for C++. Then they augmented the command line interface, clickhouse-client, with better history and navigation, thanks to an outside contributor. They also accepted a pull request for an alternative to SQL, a library called PRQL (Pipelined Relational Query Language), written in Rust.
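To make the "written in Rust and wrapped for C++" step concrete, here is a minimal sketch of what a C-callable hashing entry point can look like. It is not Clickhouse's code: the names `fnv1a_64` and `sketch_hash` are hypothetical, and a simple FNV-1a hash stands in for BLAKE3 so the example needs nothing beyond the standard library.

```rust
/// Stand-in 64-bit FNV-1a hash. This is NOT BLAKE3; it is used here only
/// so the sketch has no external dependencies.
pub fn fnv1a_64(data: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV-1a 64-bit offset basis
    for &byte in data {
        hash ^= byte as u64;
        hash = hash.wrapping_mul(0x100000001b3); // FNV 64-bit prime
    }
    hash
}

/// Hypothetical C-ABI wrapper around the safe Rust function.
/// Returns 0 on success and -1 if either pointer is null.
/// A real export would also carry the `no_mangle` attribute so the C++
/// side can link against a stable symbol name.
///
/// # Safety
/// `data` must point to `len` readable bytes and `out` must point to a
/// writable `u64`.
pub unsafe extern "C" fn sketch_hash(data: *const u8, len: usize, out: *mut u64) -> i32 {
    if data.is_null() || out.is_null() {
        return -1;
    }
    // Borrow the caller's buffer for the duration of this call only.
    let bytes = unsafe { std::slice::from_raw_parts(data, len) };
    let digest = fnv1a_64(bytes);
    unsafe { *out = digest };
    0
}
```

The design point is that all unsafe pointer handling sits in the thin wrapper, while the hashing logic itself stays in ordinary safe Rust.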

With their confidence in Rust growing, the projects got larger. The next Rust test was to integrate a Rust-based library for the emerging Delta Lake format, the delta-kernel-rs library. Here was a case where a library is available in Rust before one would be available, if ever, in C++.
Clickhouse could have written a library in-house, in C++, for Delta Lake, but the work would have been “pointless,” Milovidov said. Pity the poor programmer who would spend time writing code for parsing JSON files and redirecting HTTP requests. It was just easier to use the official Databricks Rust-based release.
Dangers With Rust and C++
Through these experiments, the Clickhouse devs learned about some shortcomings with Rust, especially when used in conjunction with C++.
One challenge was reproducible builds, which are essential to ensure the code is safe, and not accidentally downloaded from the internet somewhere. Clickhouse had gone through the process of ensuring reproducible builds in C++, but with Rust, they had to think through the process of how to ensure hermetic builds again.
Writing C++ wrappers for Rust programs is also a challenge. Sussing out whether to allocate memory in C++ or Rust can be tricky. Fuzzing tools and Clickhouse’s CI system help a lot in finding errors here.
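One common way to keep the allocation question manageable is to have whichever side allocates a buffer also expose the function that frees it. The sketch below assumes that convention; the function names and the version string are hypothetical, and it is not taken from Clickhouse.

```rust
use std::ffi::CString;
use std::os::raw::c_char;

/// Hypothetical: returns a NUL-terminated string allocated by Rust.
/// Ownership passes to the caller, who must hand the pointer back to
/// `sketch_string_free` rather than calling C's `free` or C++'s `delete`.
pub extern "C" fn sketch_version_string() -> *mut c_char {
    // CString::new fails only if the input contains an interior NUL byte.
    match CString::new("sketch-0.1") {
        Ok(s) => s.into_raw(), // leak to the caller on purpose
        Err(_) => std::ptr::null_mut(),
    }
}

/// Hypothetical: releases a string obtained from `sketch_version_string`.
///
/// # Safety
/// `ptr` must be null or a pointer returned by `sketch_version_string`
/// that has not been freed yet.
pub unsafe extern "C" fn sketch_string_free(ptr: *mut c_char) {
    if ptr.is_null() {
        return; // freeing null is a no-op, mirroring C's free()
    }
    // Rebuild the CString so Rust's allocator releases the memory.
    drop(unsafe { CString::from_raw(ptr) });
}
```

Mixing allocators (for example, freeing this pointer from C++) is the kind of mistake that fuzzing and sanitizer runs tend to surface.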
There were differences in how each language performed under duress.
Compared to C++, Rust programs and libraries tend to panic too much for Milovidov’s liking. The panics may be due to a bug (indicating that the library needed better testing). Or the code’s author used a panic to terminate in place of returning a recoverable error (a practice that tends to be frowned upon by Rustaceans).
Panic is fine for batch jobs, but much less so for server and interactive applications that are running live in real time.
“Libraries in Rust tend to use panic too much, even when it is not appropriate, and we have to find and fix all these cases just to avoid abrupt termination of our server application,” Milovidov said.
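A minimal sketch of one way to contain such panics at the language boundary, using the standard library's `std::panic::catch_unwind`. The function and constant names are hypothetical, the talk does not say Clickhouse uses this exact approach, and the technique only works when the crate is built with the default unwinding panic strategy rather than `panic=abort`.

```rust
use std::panic;

pub const SKETCH_OK: i32 = 0;
pub const SKETCH_PANICKED: i32 = 1;
pub const SKETCH_NULL_OUT: i32 = 2;

/// Hypothetical library routine that panics on bad input instead of
/// returning an error: integer division panics when `b` is zero.
fn risky_divide(a: i32, b: i32) -> i32 {
    a / b
}

/// Hypothetical C-ABI wrapper that turns a panic into a status code so
/// the panic cannot unwind into (and abort) the calling server process.
///
/// # Safety
/// `out` must be null or point to a writable `i32`.
pub unsafe extern "C" fn sketch_divide(a: i32, b: i32, out: *mut i32) -> i32 {
    if out.is_null() {
        return SKETCH_NULL_OUT;
    }
    // Run the risky code inside catch_unwind; a panic becomes Err(_).
    match panic::catch_unwind(move || risky_divide(a, b)) {
        Ok(value) => {
            unsafe { *out = value };
            SKETCH_OK
        }
        // `out` is left untouched when the call panicked.
        Err(_) => SKETCH_PANICKED,
    }
}
```

The default panic hook still prints a message to stderr, but the caller receives `SKETCH_PANICKED` and can carry on serving other requests.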

Rust code shouldn’t panic so much, Milovidov asserted.
And just as the Clickhouse team has found bugs that needed fixing in C++ libraries, so too have they found plenty of bugs in Rust libraries.
Milovidov also delved into the many peculiarities that come about specifically by intermingling C++ and Rust in the same environment, with specific issues around using sanitizers, managing cross-dependencies and cross-compilation, and code composability for the developer. Many of these issues stem from the complexity of the required build system.
One other unexpected side effect of moving to Rust: more dependencies. For the entire codebase, Clickhouse had only 156 dependencies. When the Rust modules were brought in, they found themselves managing an additional 672 transitive Rust dependencies (still not as many as they would have with NPM, Milovidov quipped).
Clickhouse’s Takeaway
For now, anyway, Clickhouse has decided against rewriting the entire database system in Rust. But it is confident enough with the language that it is allowing third-party contributors to submit their own Clickhouse add-ons in the language.
“Rust is a great language,” Milovidov said.
To hear the entire presentation, sign up for the P99 Conf.