Exploring Decentralized Databases: WeaveDB, Polybase and Others

The Web3 ecosystem needs efficient, secure, and interoperable data storage solutions that can power web3 apps as seamlessly as web2 databases, but in a permissionless manner. In this article, we will explore some promising decentralized databases, including WeaveDB, Polybase, and others, looking at their key features, benefits, and applications.
Additionally, we will delve into the differences between classic finite-state and infinite-state DLTs (Distributed Ledger Technology) that help power seamless web3 applications via decentralized databases.
Finite-State vs. Infinite-State DLTs
In comparing Finite-State and Infinite-State DLTs (Distributed Ledger Technology), we discuss general characteristics, not chain specifics. We’ll later examine ecosystem maturity for application development on Infinite-State DLTs.
Finite-State DLTs, such as EVM layer 1s, normally have limited, predefined data structures within smart contracts and keep only enough state data for storage-light applications. This global state is maintained by every node in the network and stays consistent across them as each transaction is applied.
They are usually less scalable and can struggle with increased demand: because they prioritise decentralisation and security, transaction processing slows down and fees rise. Using smart contract storage as a database can be expensive. Storing 1 MB on Ethereum can cost up to $10k, which is unfeasible for decentralized social media, IoT, or other data-intensive applications native to web3. Indexing is not included either, so you need to use a service like The Graph.
However, they can be easier to manage, and they do not require the more complex data structures or side chains that may introduce additional security challenges. Currently, typical storage-light applications such as DeFi securely manage account balances and smart contract state, and are well tested for account transactions: each node in the network maintains a copy of the entire state, and transactions update this state.
For financial applications, users may opt for the tested reliability and security offered by more mature Finite-State DLTs and then utilise layer 2s for scaling.
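To put the storage cost mentioned above in perspective, here is a rough back-of-the-envelope sketch. It assumes 20,000 gas for each fresh 32-byte storage slot (the base cost of a zero-to-non-zero SSTORE, ignoring cold-access surcharges and per-transaction overhead). The gas price and ETH price are illustrative inputs I picked, not current market figures.

```python
# Rough estimate of what it costs to write raw data into EVM contract storage.
SSTORE_NEW_SLOT_GAS = 20_000   # assumption: base cost of writing a fresh slot, no surcharges
SLOT_BYTES = 32                # EVM storage is addressed in 32-byte words
GWEI_PER_ETH = 1_000_000_000


def storage_gas(num_bytes: int) -> int:
    """Gas needed to fill enough 32-byte slots to hold num_bytes."""
    slots = -(-num_bytes // SLOT_BYTES)  # ceiling division
    return slots * SSTORE_NEW_SLOT_GAS


def storage_cost_usd(num_bytes: int, gas_price_gwei: float, eth_price_usd: float) -> float:
    """Convert the gas estimate into USD for a given gas price and ETH price."""
    eth = storage_gas(num_bytes) * gas_price_gwei / GWEI_PER_ETH
    return eth * eth_price_usd


# Illustrative inputs only: 1 MB at 10 gwei with ETH at $1,800.
print(storage_gas(1_000_000))
print(round(storage_cost_usd(1_000_000, 10, 1_800)))
```

With these made-up inputs the estimate comes out at 625,000,000 gas, or roughly $11k, which is in the same range as the figure quoted above. The result scales linearly with both the gas price and the ETH price, so the real number moves around a lot.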
Infinite-State DLTs differ in that they have an expandable, dynamic data structure that can handle large sets of data, which requires indexing and storing ever-increasing amounts of data.
These types of decentralised databases normally split the storage and compute layers. The storage layer typically uses Arweave, IPFS or Filecoin for unstructured file storage. Different architectures are then implemented on top of these decentralised networks to provide NoSQL or relational databases.
Many infinite-state DLT solutions are still in the experimental or early adoption stages. They also tend to be more complex, utilising side chains, bundling, sequencers, zk-rollups or sharding. The complexities involved in developing, maintaining, and securing these types of chains could lead to adjustments in consensus mechanisms or the node validation process. They may not yet have been battle tested for maturity, security, or stability in the way established finite-state DLTs have.
However, infinite-state DLTs are rapidly advancing and show great potential in integrating new technologies, such as Arweave storage with lazy SmartWeave contracts and Warp evaluating node networks, or off-chain ZK-SQL compute that can be rolled up and validated on-chain. These innovations enable more complex, scalable applications and offer decentralized databases for permissionless use.
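The idea of lazy evaluation mentioned above can be illustrated with a toy model: the storage layer only records interactions, and anyone reading the database replays them through a deterministic function to derive the current state. This is a minimal sketch of the general pattern, not SmartWeave or Warp code. The `handle` and `evaluate` functions, the action format and the in-memory log are all hypothetical.

```python
# Toy model of lazy evaluation: the "chain" only stores interactions;
# readers derive the current state by replaying them through a pure function.
from typing import Any

State = dict[str, Any]
Action = dict[str, Any]


def handle(state: State, action: Action) -> State:
    """Deterministic contract logic for a hypothetical key-value contract."""
    new_state = dict(state)
    if action["op"] == "set":
        new_state[action["key"]] = action["value"]
    elif action["op"] == "delete":
        new_state.pop(action["key"], None)
    else:
        # Unknown interactions are ignored at read time rather than rejected at write time.
        return state
    return new_state


def evaluate(initial_state: State, interactions: list[Action]) -> State:
    """Replay every stored interaction in order to compute the latest state."""
    state = initial_state
    for action in interactions:
        state = handle(state, action)
    return state


# Append-only log standing in for permanent storage.
log: list[Action] = [
    {"op": "set", "key": "alice", "value": 1},
    {"op": "set", "key": "bob", "value": 2},
    {"op": "unknown"},
    {"op": "delete", "key": "alice"},
]
print(evaluate({}, log))
```

Because `handle` is deterministic, every reader that replays the same log ends up with the same state, which is why the heavy computation can live outside the storage layer.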
Decentralised databases
WeaveDB
WeaveDB aims to be a Firestore on smart contracts, offering a user and developer experience reminiscent of Web2 applications.
WeaveDB is currently at the pre-seed stage and recently raised $900k (Jan 2023). It combines the ease of use and scalability of traditional NoSQL databases with the benefits of smart contracts, using Warp smart contracts on the SmartWeave platform, which is built on the Arweave network. These contracts offer optimized execution, adaptive scaling, and modular design, making them well suited to scalable and cost-effective web3 apps.
It is currently very cost effective because the Bundlr network is subsidising transaction costs under 100b, which covers most database transactions unless you are storing blob data.
The decentralised database works in tandem with other chains, so you can still sign in to your web3 application with a popular EVM wallet like MetaMask. It offers a seamless web2-like experience, as transactions are auto-signed by a disposable key without wallet popups as long as they are under 100b.
WeaveDB allows for simple and efficient data storage. Data indexing is built into the smart contract, so querying and data retrieval are fast while security and decentralization are maintained.
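To show why an index kept alongside the data makes queries fast, here is a small in-memory sketch of a document collection that updates a per-field index on every write. It is only an illustration of the general idea and not WeaveDB's API or implementation. The `IndexedCollection` class and its method names are hypothetical.

```python
from collections import defaultdict
from typing import Any


class IndexedCollection:
    """Documents keyed by id, plus one lookup table per indexed field."""

    def __init__(self, indexed_fields: list[str]) -> None:
        self.docs: dict[str, dict[str, Any]] = {}
        # field name -> field value -> ids of the documents holding that value
        self.indexes: dict[str, dict[Any, set[str]]] = {
            field: defaultdict(set) for field in indexed_fields
        }

    def set(self, doc_id: str, doc: dict[str, Any]) -> None:
        """Insert or overwrite a document and keep every index in sync."""
        self.delete(doc_id)
        self.docs[doc_id] = dict(doc)
        for field, index in self.indexes.items():
            if field in doc:
                index[doc[field]].add(doc_id)

    def delete(self, doc_id: str) -> None:
        """Remove a document and its index entries, if it exists."""
        old = self.docs.pop(doc_id, None)
        if old is None:
            return
        for field, index in self.indexes.items():
            if field in old:
                index[old[field]].discard(doc_id)

    def where(self, field: str, value: Any) -> list[dict[str, Any]]:
        """Equality query answered from the index instead of scanning all documents."""
        ids = self.indexes[field].get(value, set())
        return [self.docs[doc_id] for doc_id in sorted(ids)]


people = IndexedCollection(["city"])
people.set("alice", {"name": "Alice", "city": "Berlin"})
people.set("bob", {"name": "Bob", "city": "Lisbon"})
print(people.where("city", "Berlin"))
```

The trade-off is that every write does a little extra work to maintain the index, in exchange for reads that do not need to scan the whole collection.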
They are worth checking out here, as there are other features I haven’t covered in detail and their documentation is very good, with great examples. I have also used WeaveDB to build a sample application.
Coming soon: Sample application blog post.
Recent announcements include partnerships with Lit Protocol for encrypted data and Lens handle authentication here, and a deep dive on the overall architecture here.
Polybase
Polybase aims to be a Web3 Replacement for Firebase, Postgres, and Supabase.
Polybase is currently in alpha and recently raised $2M. I have used it to build a web3 loyalty app at work, and it performed well for a proof of concept.
Polybase uses ZK (zero-knowledge) proofs because it is difficult to build decentralized databases fully on-chain. In the previous example, WeaveDB gets there by combining bundled transactions, SmartWeave lazy-evaluation consensus and cheap on-chain Arweave storage.
Polybase’s ZK approach stores data off-chain. It then creates zero-knowledge proofs, which are cryptographic confirmations of the data and its permissions that do not reveal the data itself. These proofs are then added to Layer 1 (L1) blockchains. As a result, this method enjoys the security and decentralization benefits of L1 blockchains while also improving user experience and cost-efficiency by storing data off-chain.
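The split between off-chain data and on-chain proofs can be sketched with something much simpler than a real zero-knowledge proof. The example below uses plain SHA-256 hash commitments, one per batch of writes, chained together. To be clear, this is not zero-knowledge and not how Polybase implements it: a hash only makes tampering detectable, whereas a ZK proof also shows that the writes followed the rules. The class names and record format are hypothetical.

```python
import hashlib
import json


def digest(obj) -> str:
    """SHA-256 over a canonical JSON encoding of obj."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()


class OffChainStore:
    """Holds the records themselves; nothing here touches the chain."""

    def __init__(self) -> None:
        self.batches: list[list[dict]] = []

    def write_batch(self, records: list[dict]) -> None:
        self.batches.append(list(records))


class OnChainLedger:
    """Stand-in for an L1 contract that stores one small commitment per batch."""

    def __init__(self) -> None:
        self.commitments: list[str] = []

    def post(self, records: list[dict]) -> None:
        # Each commitment also covers the previous one, so batches cannot be reordered.
        prev = self.commitments[-1] if self.commitments else ""
        self.commitments.append(digest({"prev": prev, "records": records}))


def verify(store: OffChainStore, ledger: OnChainLedger) -> bool:
    """Recompute every commitment from the off-chain data and compare with the ledger."""
    if len(store.batches) != len(ledger.commitments):
        return False
    prev = ""
    for records, commitment in zip(store.batches, ledger.commitments):
        if digest({"prev": prev, "records": records}) != commitment:
            return False
        prev = commitment
    return True


store, ledger = OffChainStore(), OnChainLedger()
for batch in ([{"user": "alice", "points": 10}], [{"user": "bob", "points": 5}]):
    store.write_batch(batch)
    ledger.post(batch)
print(verify(store, ledger))
```

Only the fixed-size commitments would need to live on the L1, however large the batches are, which is where the cost saving comes from.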
Polybase has a concept of pluggable modules that allows it to be agnostic to both L1s and storage layers. You can select the appropriate storage layer, such as Arweave, for storing the indexes or aggregated data. I believe (but have not validated yet) this could also enable you to keep secure enterprise data that cannot be stored on-chain, even in encrypted form, within a private network while providing tamper-proof verification publicly, which is a nice feature.
Polybase has a utility in its SDK for encrypting data, supporting privacy and access control on-chain while still providing the scalability and ease of use developers expect from traditional database solutions.
One thing to note is that Polybase is at an early stage and not currently decentralised, but it will be when the beta rolls out. I’ve had a play around building a loyalty application; it was very easy to get up and running, and the team is responsive on the Discord channel. There is no explorer like those of WeaveDB and Glacier, but roll-ups will be done in 1 second and then be viewable on-chain, so it is very efficient for most use cases.
For further reading on Polybase, take a look at their whitepaper and blog.
Some other interesting decentralised databases that I’ve been looking at but have not had the chance to build with yet:
Glacier Network: Resilient and Immutable NoSQL Database
This is a fairly new product, even by the standards of the fast-growing decentralised database space. It is another database that sits on top of Arweave as the data store, and it also leverages zk-rollups to facilitate large-scale on-chain data validation.
At first glance, it looks to be a bit of a combo between Polybase and WeaveDB, but that’s a naive comment and I will need to spend some time looking into it. Their whitepaper gives some more information about their market position, along with a comparison guide.
FirstBatch: Offers two types of decentralised database: HollowDB, a key-value database, and DANNY, a decentralized vector database.
See here for HollowDB
DANNY has caught my attention for its potential applications in semantic search and recommendation engines. These engines could search through vast amounts of unstructured data stored on Arweave, thereby maintaining transparency and decentralization. See a sample application for personalized content here.
Chia Data Layer: Chia DataLayer is a general-purpose decentralized database.
Chia Data Layer enables users to publish tables of data that others can subscribe to. While the data itself is not stored on the blockchain, proofs of the data and URLs to fetch it are, ensuring data accuracy and immutability. This method allows the data layer to be flexible enough for relational or document data, which can then be indexed using SQL or NoSQL standards.
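One common way to keep only a proof of the data on-chain is a Merkle tree: the publisher puts a single root hash on the blockchain, and a subscriber who fetches a row from a URL can check it against that root. The sketch below shows the general technique with SHA-256. I have not checked how Chia DataLayer structures its proofs, so treat the tree layout and the function names as assumptions made for illustration.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Return every tree level, from the hashed leaves up to the single root.

    Expects at least one leaf.
    """
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # duplicate the last node on odd-sized levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_root(leaves: list[bytes]) -> bytes:
    """The single hash that would be published on-chain."""
    return build_levels(leaves)[-1][0]


def prove(leaves: list[bytes], index: int) -> list[tuple[bytes, bool]]:
    """Sibling hashes needed to rebuild the root from the leaf at index."""
    proof = []
    for level in build_levels(leaves)[:-1]:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        sibling = index ^ 1  # the neighbour in the same pair
        proof.append((level[sibling], sibling < index))  # (hash, sibling is on the left)
        index //= 2
    return proof


def verify(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Check that leaf belongs to the data set committed to by root."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


rows = [b"row-1", b"row-2", b"row-3"]  # hypothetical table rows served off-chain
root = merkle_root(rows)
print(verify(rows[2], prove(rows, 2), root))
```

The proof grows with the logarithm of the number of rows, so a subscriber can check a single row without downloading the whole table.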
One notable application of the Chia DataLayer is in the World Bank’s Climate Warehouse project, which aims to enhance transparency in carbon markets. Each participant in the project publishes data on their DataLayer tables, which are run on their own infrastructure.
More info here
Space and Time: a decentralized replacement for your blockchain indexing service, database, data warehouse, and API servers
Space and Time is a decentralized platform designed to replace traditional blockchain services such as indexing, database management, data warehousing, and API servers. It enables real-time access to data from major blockchain networks, integrates on-chain and off-chain data, and supports secure, tamperproof analytics.
Space and Time offers a “Proof of SQL” feature. This connects data and analytics directly to smart contracts with cryptographic guarantees, enabling smart contracts to execute tamperproof SQL queries. It also allows enterprises to combine public on-chain data with sensitive, private off-chain data they load in, ensuring the privacy and security of data.
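To give a feel for the kind of query described above, here is a small sketch that joins a table of public on-chain transfers with a private customer table. It uses Python's built-in sqlite3 module purely as a stand-in: it does not use Space and Time's API and does not produce any cryptographic proof. The table names, columns and rows are all made up.

```python
import sqlite3

# In-memory database standing in for a data warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE onchain_transfers (wallet TEXT, amount REAL);      -- public, indexed from a chain
    CREATE TABLE private_customers (wallet TEXT, customer_name TEXT); -- sensitive, loaded off-chain
    INSERT INTO onchain_transfers VALUES ('0xabc', 10.0), ('0xabc', 5.0), ('0xdef', 7.5);
    INSERT INTO private_customers VALUES ('0xabc', 'Alice'), ('0xdef', 'Bob');
    """
)


def totals_by_customer(connection: sqlite3.Connection) -> list[tuple[str, float]]:
    """Total transferred amount per customer, joining public and private data."""
    cursor = connection.execute(
        """
        SELECT c.customer_name, SUM(t.amount)
        FROM onchain_transfers AS t
        JOIN private_customers AS c ON c.wallet = t.wallet
        GROUP BY c.customer_name
        ORDER BY c.customer_name
        """
    )
    return cursor.fetchall()


print(totals_by_customer(conn))
```

In the model described above, the interesting part is what this sketch leaves out: a proof attached to the result so that a smart contract can trust the query ran over untampered data.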
For more information check out their website here.
Conclusion
As the Web3 ecosystem continues to evolve, the need for efficient, secure, and scalable decentralized databases becomes increasingly important. WeaveDB, Polybase & others each offer unique advantages and features that cater to different aspects of dApp development. By understanding the differences between finite-state and infinite-state DLTs, developers can make informed decisions about the underlying technology that best suits their application’s needs. With these innovative solutions, the future of decentralized databases and dApps looks more promising than ever.
As I’ve been writing this, Messari has released a report on decentralised databases, which has a good overview slide. I’m unable to read the report as it’s for enterprise customers only, but the tweet is useful.
6/ Decentralized databases provide verifiable/tamperproof data, reliability/availability, & censorship resistance.
Users have complete control over their data, enabling easy migration & monetization, reducing risk of exploitation by centralized entities. pic.twitter.com/Sv8TMAR7av
— Messari (@MessariCrypto) May 12, 2023
Others to consider
- Orbit
- Kwil
- DeSo
- Lens (Not exactly a database but worth looking at for social data)
- Ceramic
- DB3
- TableLand
Please let me know if anything here is not correct and I will do my best to update it. These ramblings are my experience looking into decentralised databases for personal projects.