Waters Wrap: After Symphony, David Gurle’s new opus looks to distributed cloud storage, compute

The creator of Perzo/Symphony has started a new company called Hive. Anthony chats with David Gurle to see how the startup will look to disrupt the cloud space.

Say what you will about Symphony—and I’ve said (well, written) plenty—there’s no denying that the idea behind the company was an ambitious one. And while it was downright silly to call the startup a “Bloomberg Killer”, and while it may not have lived up to the promise that some saw in it, a $1.4 billion valuation (in 2019) of the company is proof that it’s a successful startup. And the company is still growing and evolving.

The brain behind the creation of Symphony is David Gurle, who created a tiny startup called Perzo out of Silicon Valley, the predecessor of Symphony.

Well, in 2018, Gurle and his wife began having conversations about work-life balance. This is a conversation that has happened with many technologists and their spouses or partners.

“Me and my wife were trying to reconcile Symphony in our lives,” Gurle told me recently. “It was very hard for me to continue that rhythm of work and have a balanced family life—it was impossible. I was traveling almost 260 days a year. We came to the conclusion that at some point I would have to step down.”

That conversation set in motion a plan for Gurle to leave Symphony in 2021, a plan that came to fruition in April last year. But he wasn’t ready to retire. As his next move, Gurle is taking aim at the worlds of distributed cloud storage and distributed cloud compute with the launch of Hive.

For my column today, I’ll do my best to break down Hive’s plans for cloud domination, such as it is. I don’t do PR—I bristle at that idea. But I do dig ambitious companies and people. Gurle’s track record, in my opinion, establishes his bona fides as someone to be taken seriously and, thus, makes Hive worthy of examination, even though its seed-funding round raised a modest €7 million ($7.3m)—a relative drop in the bucket of the $1.3 trillion cloud storage/compute markets. (And let it be noted now: this is not a financial services play, though two banks have agreed to give it a whirl. It will be open to companies of all stripes and sizes.)

[CORRECTION: The original article said €7 billion; it is actually million.]

At the end of the day, I won’t decide if Hive and Gurle are successful—the market will. If you’re interested, keep reading to see what Gurle is trying to tackle.

Building a hive

The inspiration for Hive came thanks to Symphony hitting something of a tipping point. Around 2018, Symphony was growing its user base, and a $165 million funding round that would value the vendor at $1.4 billion was on the horizon.

The problem was that as Symphony grew, its hosting costs were exploding, Gurle recalls. (All quotes in this story are from Gurle. Establishing this allows me, the writer, the freedom to not have to constantly write, “Gurle says”, and yes, I’ll write in present tense since this conversation happened yesterday. I’ll write “Gurle says” when I paraphrase. The fourth wall has been broken.)

“At peak, peak usage, we were only using five percent of the capacity that we were paying for at Amazon and Google. One day, they came to me and said, ‘David, our storage costs are going through the roof.’ I said, ‘Why? It’s so cheap, how could this be?’ Well, we keep every bit of data—terabytes and terabytes and terabytes of data. When you add it up, it’s no longer $10 a month—it’s thousands of dollars a month. There had to be an alternative.”

The seed was planted.

The way Hive is tackling the idea of cloud computing is relatively easily understood at face value, but it gets complex when you dig into it. Rather than building datacenters in the manner of Amazon Web Services (AWS), Google Cloud, Microsoft Azure or IBM Cloud (to name the big ones weaving their way through financial services), Hive relies on incentivizing (and convincing) users to rent out a portion of their computer storage/compute and share that with other users in Hive’s network. (No small feat, and this will determine if Hive is another Symphony, something bigger or a … I don’t know, fill in the name of a failed startup here.)

To start, the company will focus more on distributed storage and will spread into compute.

On storage, Hive has three independent—but it believes complementary—use cases. First, it will provide a “safe” to store things like cryptocurrency keys or important documents. Second, it will have file sharing capabilities—free to use—that will allow users to see who has accessed, opened, downloaded, and forwarded a file (“and prevent that, to the extent it’s possible”). Third, it will provide secure back-up services.

On the compute side—which, again, will come after storage—all these programs will run on a peer-to-peer (P2P) network. Eventually, Hive will release a systems toolkit (STK) that will allow users to build applications on the Hive network.

Hive has built this P2P network using two open-source protocols: BitTorrent, a decentralized communication protocol for P2P file sharing, and the InterPlanetary File System (IPFS), a P2P network for storing and sharing data in a distributed-file system.

Now, we’re about 800 words into this “column” about a new cloud option and we haven’t brought up security. AWS, Google, Microsoft, and IBM spend billions on this subject every year—Hive currently has nine full-stack engineers. This will be a major question that will be brought up every single time Hive enters a pitch meeting. We know this. Obviously, so does Gurle. You don’t start a company like Perzo/Symphony and not know about cybersecurity.

Gurle says that Hive will deploy the “far edge of encryption”, including homomorphic, zero-knowledge, and quantum encryptions. “Everything will be encrypted, even metadata. We’re going with the most state-of-the-art encryption that you could ever imagine.”

And, of course, if we’re going to talk encryption and anything distributed, the word “blockchain” will be on the table. (Again, I’m not a fan of the tech in the capital markets as has been currently sold.) But Gurle says that Hive is using blockchain—but with a twist.

“Unlike blockchain, which grows linearly, this will not. It will arrive to a point that it will auto-determine to be sufficient in terms of its performance. When it reaches that threshold of what I call the no-performance zone of blockchain, it will spawn a new blockchain on its own, automatically.”

The way Gurle describes it is when there are too many people/too much information on a particular blockchain—thus making it onerous for the computers participating in the network to provide “the right proof of trust”—the system will “spawn” a new blockchain. This is being done to help solve blockchain’s scaling/capacity issues. Gurle calls it “fractal blockchain”.
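Hive has not published how its “fractal blockchain” works, so the following is only a toy Python sketch of the spawning idea as described above. Everything in it is an assumption made for illustration: the class names (`Chain`, `SpawningLedger`), the use of a fixed block count as the trigger (Gurle describes a performance-based threshold), and the choice to anchor each new chain to the last hash of the previous one.

```python
import hashlib
from dataclasses import dataclass, field


def block_hash(prev_hash: str, payload: str) -> str:
    """Link a block to its predecessor by hashing both together."""
    return hashlib.sha256(f"{prev_hash}|{payload}".encode()).hexdigest()


@dataclass
class Chain:
    anchor: str  # hash this chain starts from
    hashes: list[str] = field(default_factory=list)

    @property
    def tip(self) -> str:
        """Most recent hash, or the anchor if the chain is still empty."""
        return self.hashes[-1] if self.hashes else self.anchor

    def append(self, payload: str) -> None:
        self.hashes.append(block_hash(self.tip, payload))


class SpawningLedger:
    """Hypothetical ledger that starts a new chain when the current one is full."""

    def __init__(self, max_blocks_per_chain: int) -> None:
        self.max_blocks = max_blocks_per_chain
        self.chains = [Chain(anchor="genesis")]

    def record(self, payload: str) -> None:
        current = self.chains[-1]
        if len(current.hashes) >= self.max_blocks:
            # Threshold reached: spawn a new chain anchored to the old tip.
            current = Chain(anchor=current.tip)
            self.chains.append(current)
        current.append(payload)


ledger = SpawningLedger(max_blocks_per_chain=3)
for i in range(7):
    ledger.record(f"transfer-{i}")

chain_sizes = [len(c.hashes) for c in ledger.chains]
print(chain_sizes)
```

In this sketch no single chain ever grows past the cap, which is the property Gurle is after; whether Hive links its chains this way is not something the conversation covered.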

So, you have the open-source protocols, the peer-to-peer network, the “far edge” encryption tools, and blockchain … they come together to create the Hive network. That was the original thought, anyway, but as Gurle learned, something was missing.

“During the research phase, I discovered one of the biggest problems that we had is, how do we ensure network availability? One thing is to put the files into a network, but what if a portion of the network becomes unavailable?”

To solve this mission-critical problem, Hive will use error-correction technology that basically takes a signature of a shard (a chunk of a file), so that if a piece of the network dies, the system can reconstruct the file to a certain degree—right now, Gurle says they can reconstruct a lost file up to 50%. But—and, again, this is crucial to the company’s survival—as more nodes are added to the network, that percentage will be improved.
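Gurle did not name the error-correction scheme Hive uses. To show the general idea only, here is a minimal Python sketch built on single XOR parity, the simplest erasure code: it can rebuild any one lost shard from the survivors. The function names (`split_into_shards`, `xor_parity`, `recover_missing`) are hypothetical, and a real network would more likely use a stronger code, such as Reed-Solomon, to survive several losses at once.

```python
from functools import reduce
from typing import Optional


def split_into_shards(data: bytes, n: int) -> list[bytes]:
    """Split data into n equal-length shards, zero-padding the tail."""
    size = -(-len(data) // n)  # ceiling division
    padded = data.ljust(size * n, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(n)]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def xor_parity(shards: list[bytes]) -> bytes:
    """Parity shard: the XOR of all data shards."""
    return reduce(xor_bytes, shards)


def recover_missing(shards: list[Optional[bytes]], parity: bytes) -> list[bytes]:
    """Rebuild at most one missing shard (marked None) from the parity."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity can rebuild only one lost shard")
    if not missing:
        return list(shards)
    survivors = [s for s in shards if s is not None]
    # XOR-ing the parity with every surviving shard leaves the lost one.
    rebuilt = reduce(xor_bytes, survivors, parity)
    out = list(shards)
    out[missing[0]] = rebuilt
    return out


original = b"terabytes and terabytes of data"
shards = split_into_shards(original, 4)
parity = xor_parity(shards)

damaged = list(shards)
damaged[2] = None  # pretend the node holding shard 2 went offline
restored = recover_missing(damaged, parity)
recovered = b"".join(restored)[:len(original)]
print(recovered == original)
```

The trade-off is the same one Gurle describes: the more redundancy spread across more nodes, the larger the share of the network that can disappear before a file becomes unrecoverable.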

“As such, it’s important to have a critical mass from which, statistically speaking, only a very small portion of the network’s nodes will go down. I can’t tell you what that number is at this stage because we are in the process [of figuring out] that formula, but we will come up with that technology to provide that.”

While Hive is not financial-specific, the first two enterprise customers are large banks, Gurle says: one based in Europe, the other in the US.

For banks, a prime use-case for Hive to solve is that of batch computing. Gurle provides an example: Monte Carlo risk calculations, which can take hours to run and require high-compute power. “For Hive, the golden nugget is that instead of using a dedicated datacenter to run those calculations, we can use our existing computers to run it.”
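Monte Carlo work suits a pool of idle machines because each batch of simulated scenarios is independent of the others. The Python sketch below illustrates that split and is not a bank's risk model: it assumes normally distributed one-day returns with hypothetical parameters, gives each batch its own seed, and runs the batches in a plain loop as a stand-in for separate computers. The function names are made up for the example.

```python
import random


def simulate_batch(seed: int, n_paths: int, mean: float, stdev: float) -> list[float]:
    """One independent work unit: simulated one-day portfolio returns."""
    rng = random.Random(seed)  # private generator, so batches do not interfere
    return [rng.gauss(mean, stdev) for _ in range(n_paths)]


def value_at_risk(returns: list[float], confidence: float) -> float:
    """Loss threshold exceeded in only (1 - confidence) of the scenarios."""
    ordered = sorted(returns)
    index = int((1 - confidence) * len(ordered))
    return -ordered[index]


# Hypothetical setup: 8 batches of 25,000 paths, 0% mean, 1% daily volatility.
batches = [simulate_batch(seed, 25_000, 0.0, 0.01) for seed in range(8)]

# Merging is trivial: the coordinator just pools the results.
all_returns = [r for batch in batches for r in batch]
var_99 = value_at_risk(all_returns, 0.99)
print(f"99% one-day VaR: {var_99:.4f}")
```

With these assumptions the estimate should land near 2.33 standard deviations, or roughly 2.3% of portfolio value. The point is that nothing in `simulate_batch` cares which machine runs it.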

I’ll keep harping on this, but for Hive to succeed, it needs as clients a fair number of large companies based across the globe to allow for a follow-the-sun model. But Gurle says that smaller players—say a 20-person alt data shop that does 3D geographical mapping using drones—will prove valuable as they are “very more price and capacity sensitive, but redundancy friendly—as long as the cost of redundant storage isn’t expensive and fits their criteria, they are very open to using Hive for storage capacity.”

The cost

Which brings us to pricing: Why would a large enterprise want to provide its computing power as storage for others … potentially competitors?

A Hive user (they’re apparently called “Hivers” … sorry) can be a consumer, a producer, or both. If a company contributes, say, 50 gigabytes to the Hive network, in exchange it gets 50 gigs for free. If that company consumes less than it contributed, it gets paid.

I compared this to the carbon-offset model we see in ESG; Gurle agreed (sorry to switch to past-tense—had to): If you produce more and consume less, you get paid for that; if you consume more and produce less, you get charged for that. Hive takes a five percent commission from these transactions—that’s the startup’s business model.
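The netting logic can be worked through with a small Python sketch. The five percent commission comes from Gurle; everything else is an assumption for illustration, including the price per gigabyte, the function names, and the choice to deduct the commission from what the producer receives.

```python
from decimal import Decimal

COMMISSION_RATE = Decimal("0.05")  # the five percent cited in the column


def net_position_gb(produced_gb: int, consumed_gb: int) -> int:
    """Positive: net producer (gets paid). Negative: net consumer (gets charged)."""
    return produced_gb - consumed_gb


def settle_trade(gb: int, price_per_gb: Decimal) -> dict[str, Decimal]:
    """Consumer pays the gross amount; Hive keeps its cut; producer gets the rest."""
    gross = price_per_gb * gb
    fee = gross * COMMISSION_RATE
    return {
        "consumer_pays": gross,
        "hive_keeps": fee,
        "producer_receives": gross - fee,
    }


PRICE = Decimal("0.02")  # hypothetical price per gigabyte

company_a = net_position_gb(produced_gb=50, consumed_gb=30)  # surplus of 20 GB
company_b = net_position_gb(produced_gb=10, consumed_gb=30)  # deficit of 20 GB

# Only the overlap between A's surplus and B's deficit changes hands.
traded_gb = min(company_a, -company_b)
result = settle_trade(traded_gb, PRICE)
print(result)
```

A company that consumes exactly what it contributes has a net position of zero and, under this reading, neither pays nor gets paid.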

Now, there’s still some discussion as to how exactly to implement this pricing strategy. One way is to use a bid/ask market model—as the spread comes together between a producer and a consumer, they agree, and Hive takes five percent. Or, Hive serves as a market maker, where the vendor buys the storage and then offers it via “a competitive pricing model,” with a five percent mark-up.

“They [the two banks] have thousands of computers on VMware or Citrix-based applications. From 5pm until 8am, they are idle, but that spare capacity is there and [un]used. So they want us to help them to take advantage of those virtual desktops and pool those resources for us to run those programs. For them, that’d be a huge win. That’s the first thing they want us to do.”

What’s next?

As noted before, Hive recently announced a €7 million seed-funding round. Its current stable of nine database engineers has built the desktop interface; the company is currently in the process of hiring engineers to build out the mobile application. It is also looking to add a team of five in India, which will be followed by a team in Vietnam. By the end of the year, Hive will have 20 to 25 employees, Gurle says. (He also notes that Hive will have “no real headquarters”—in a post-Covid world, he’ll look for the best talent at the best prices.)

“We have a number of PhDs in our company, so you see we’ll push the edge of distributed computing to its real limits with the program that we’ve announced. It’s going to be [an] intensely engineering-oriented culture to begin with until we crack all the unknowns successfully.”

In August, Hive will conduct its first “test release”, which will be reserved to 1,000 users (in this case, nodes connecting to the Hive network) to prove the storage and secure backup use cases—“No bells and whistles.”

In October, they will look to bump it up to 50,000 users and add sharing capabilities; “all the bells and whistles around [the] security framework and user interface will be ready for the October release.” But it will still be a controlled release.

In January 2023, Hive will provide a public release with no limit on users.

And in July 2023, Hive will issue its STK, which will lead into its Series-A round of funding.

“For me, road for success into the A round is that we’ve achieved the SLAs of this product, we have 50,000 daily active users—and that’s sustainable, meaning they are not dropping and are continuously using—and we’ve delivered on the STK, which will give us enough proof points for the next round of investors to come in and help us to scale this up to the next use cases and more and more marketing.”

He continues: “Our goal is to be as competitive as possible with existing [cloud] platforms.”

Let’s just hope no one refers to Hive as an “AWS killer”.

The image accompanying this column is “Straw Bee Hive” by Frank Gray, courtesy of the National Gallery of Art’s open-access program.
