Mind the data gap: DeAI requires more diverse datasets | Opinion

Disclosure: The views and opinions expressed here belong solely to the author and do not represent the views and opinions of crypto.news’ editorial.

Artificial intelligence is all the rage. Yet beneath the hype surrounding decentralized AI (DeAI) lies a critical flaw: a dearth of diverse, secure, verifiable data. Onchain datasets are simply too limited to train truly powerful models. This risks ceding the AI future to centralized behemoths, which have unfettered access to the vast data troves of the web.

DeAI’s promise—democratized, transparent, and robust AI—hinges on bridging this data gap. Clever cryptography offers a route.

The beauty of conventional AI lies in its gluttony. The more data it devours, the smarter it becomes. But this advantage is also its Achilles’ heel. Centralized AI models are trained on data often harvested without explicit consent, raising thorny questions of privacy and control.

DeAI, built on blockchain’s principles of decentralization and transparency, offers an appealing alternative. Yet most onchain data comes from financial transactions or DeFi activity. Fine-tuning, especially of small language models, requires more precise data. This leaves DeAI models starved of the rich and varied datasets needed to bring them up to the competitive level of the latest models.

Such datasets are available outside web3: corpora such as The Pile and Common Crawl draw on an enormous range of sources. The depth of existing verified web2 data sources, as much as the volume of data, is what has enabled centralized AI providers to refine their GPT-style models as far and as fast as they have.

Recreating the same level of data onchain is not feasible on a competitive timescale. And while some AI firms have run afoul of data creators who accuse them of stealing exactly the type of nuanced data discussed here, there is another way to get more data onchain—make it safer.

Building bridges

This is where cryptography comes in. Zero-knowledge proofs, already making waves in blockchain scalability and privacy, offer a potent solution. Two techniques in particular—zero-knowledge fully homomorphic encryption (zkFHE) and zero-knowledge TLS (zkTLS)—hold the key to unlocking web2’s data for DeAI.

zkFHE allows computations to be performed on encrypted data without decrypting it. Imagine training an AI model on sensitive medical records without ever exposing the raw patient data. This is the power of zkFHE. It allows DeAI models to learn from large, privacy-protected datasets, greatly expanding their training possibilities.
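To make "computing on encrypted data" concrete, here is a minimal sketch. It does not implement zkFHE. As a simpler stand-in it uses a toy version of the Paillier cryptosystem, which is only additively homomorphic and includes no zero-knowledge proof. The tiny primes, the function names, and the blood-pressure figures are illustrative assumptions, not anything a production system would use.

```python
import math
import secrets

# Toy Paillier cryptosystem (additively homomorphic only).
# NOT zkFHE: no multiplication of ciphertexts, no zero-knowledge proof.
# The primes below are far too small for real use; they are assumptions
# chosen only to keep the example readable.

def keygen(p, q):
    """Return (public, private) keys from two distinct primes."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                       # standard simple choice of generator
    mu = pow(lam, -1, n)            # modular inverse of lam mod n (Python 3.8+)
    return (n, g), (lam, mu)

def encrypt(pub, m):
    """Encrypt an integer 0 <= m < n under the public key."""
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1   # random r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    """Recover the plaintext integer from ciphertext c."""
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

def add_encrypted(pub, c1, c2):
    """Combine two ciphertexts so the result decrypts to the sum."""
    n, _ = pub
    return (c1 * c2) % (n * n)

# Hypothetical scenario: two clinics each encrypt a patient reading.
pub, priv = keygen(1009, 1013)
c_a = encrypt(pub, 120)
c_b = encrypt(pub, 135)

# An untrusted party adds the readings without seeing either value.
c_sum = add_encrypted(pub, c_a, c_b)

# Only the private-key holder can read the aggregate.
print(decrypt(pub, priv, c_sum))  # prints 255
```

The party performing the addition handles only ciphertexts, which is the property the paragraph above describes. A fully homomorphic scheme extends the same idea to arbitrary computations, at a much higher computational cost.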

zkTLS extends this principle to internet communication. It allows users to prove possession of certain data from a website—say, a credit score or social media activity—without revealing the underlying information. This is crucial for integrating the wealth of data residing in web2’s silos into DeAI systems. For instance, a decentralized credit scoring model could leverage zkTLS to access authenticated financial data from traditional institutions without compromising their confidentiality.

Advantage, DeAI?

The implications are profound. By combining zkFHE and zkTLS, DeAI can tap into the vastness of web2’s data while preserving the core tenets of privacy and decentralization. This could level the playing field, allowing DeAI to compete with and perhaps even surpass centralized AI.

Consider the development of large language models, currently dominated by well-funded tech giants. These models require colossal amounts of text data for training. By leveraging zkTLS, DeAI developers could access and use publicly available web data in a privacy-preserving manner, creating more democratic and transparent LLMs.

There are, of course, challenges. Implementing zkFHE and zkTLS is computationally intensive, requiring significant advances in hardware and software. Standardization and interoperability are also crucial for widespread adoption. But the potential rewards are immense.

In the race for AI supremacy, data is the ultimate fuel. By embracing cryptographic solutions like zkFHE and zkTLS, DeAI can access the fuel it needs to perform. This is not just about building smarter AI; it’s about building a more democratic and equitable AI future.

Xiang Xie

Xiang Xie is the CEO and co-founder of Primus. He has devoted much of his career to cryptography, spanning theoretical research and practical implementation in both academic and industrial settings. His focus has been on privacy-preserving machine learning, using multiparty computation and zero-knowledge proofs to safeguard user data and model privacy.