Web3 Backend Infrastructure: Nodes, RPCs, and Indexers Explained for 2026
Web3 backend infrastructure powers every DApp, from wallet interactions to complex DeFi dashboards. This guide covers node providers, RPC endpoints, blockchain indexers, and data APIs -- the invisible layer that determines your application's performance, reliability, and cost.
Web3 backend infrastructure is the critical middleware layer that connects your decentralized application to blockchain networks. It consists of three core components: nodes (the blockchain software that validates and stores the chain state), RPC endpoints (the API interfaces that let your application read from and write to the blockchain), and indexers (the data processing systems that transform raw blockchain data into queryable, application-specific databases). Without robust backend infrastructure, your DApp cannot read wallet balances, submit transactions, display historical data, or respond to on-chain events. In 2026, the Web3 infrastructure market is valued at over $3 billion annually, with providers like Alchemy, Infura, QuickNode, and The Graph processing billions of API calls per day across 50+ blockchain networks. Understanding these components is essential for any team building on-chain applications, whether you are building a simple wallet, a complex DeFi dashboard, or a real-time trading platform.
The infrastructure layer is often invisible to end users but determines the quality of their experience. A DApp that takes 3 seconds to load a wallet balance because of slow RPC calls will lose users. A DeFi dashboard that shows stale portfolio data because of poor indexing will destroy trust. A trading bot that misses opportunities because of unreliable WebSocket connections will cost money. Getting the infrastructure right is as important as getting the smart contracts right.
Understanding the Web3 Infrastructure Stack
The Three Pillars
Your DApp Frontend
|
v
[RPC Provider / Node] ---- Read/write blockchain state
|
v
[Indexer / Data API] ----- Query historical & aggregated data
|
v
[Blockchain Network] ----- Source of truth (Ethereum, Solana, etc.)
Pillar 1: Nodes -- Software clients (Geth, Erigon, Reth for Ethereum; Agave, Firedancer for Solana) that participate in the blockchain network, validate transactions, and maintain the chain state.
Pillar 2: RPC Endpoints -- JSON-RPC or REST API interfaces that expose node functionality to applications. Methods like eth_getBalance, eth_sendRawTransaction, and eth_getLogs are the fundamental building blocks of Web3 interactions.
Pillar 3: Indexers -- Systems that process raw blockchain events and transaction data, transforming it into structured, queryable databases optimized for specific application needs.
Blockchain Nodes: The Foundation Layer
What Is a Blockchain Node?
A node is a computer running blockchain client software that:
•Connects to other nodes in the peer-to-peer network
•Validates new transactions and blocks
•Stores blockchain state and history
•Responds to data queries via RPC interface
Node Types
| Node Type | Storage | Sync Time | Use Case | Cost (Ethereum) |
| --- | --- | --- | --- | --- |
| Full node | ~1 TB | 2-7 days | Standard operations | $200-$500/month (cloud) |
| Archive node | ~15 TB+ | 2-4 weeks | Historical queries | $1,000-$3,000/month (cloud) |
| Light node | ~1 GB | Minutes | Mobile/resource-constrained | Minimal |
| Validator node | ~1 TB + 32 ETH | 2-7 days | Network validation + staking | $200-$500/month + 32 ETH |
Full nodes store the complete current state but prune historical state data. They can answer questions about the current blockchain state but cannot look up historical storage values at arbitrary block heights.
Archive nodes store every historical state, enabling queries like "what was this address's balance at block 15,000,000?" They are essential for block explorers, analytics platforms, and DeFi dashboards that show historical data.
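To make the full-versus-archive distinction concrete, here is a minimal sketch of the two JSON-RPC payloads involved. The helper name `balance_request` and the placeholder address are illustrative assumptions; the only protocol detail relied on is that `eth_getBalance` takes an address and a block parameter (a tag such as "latest" or a hex-encoded block number).

```python
import json
from typing import Union


def balance_request(address: str, block: Union[int, str] = "latest", request_id: int = 1) -> dict:
    """Build an eth_getBalance JSON-RPC payload.

    `block` is either a tag such as "latest" or an integer block height,
    which JSON-RPC expects as a 0x-prefixed hex string.
    """
    block_param = hex(block) if isinstance(block, int) else block
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_getBalance",
        "params": [address, block_param],
    }


# Placeholder address, used only for illustration.
ADDRESS = "0x0000000000000000000000000000000000000000"

current = balance_request(ADDRESS)                 # current state: a full node can answer
historical = balance_request(ADDRESS, 15_000_000)  # old state: typically needs an archive node

print(json.dumps(current))
print(json.dumps(historical))
```

The two payloads differ only in the block parameter, which is why the same application code can work against a full node in development and then fail on historical queries in production.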
Key insight for builders: Most DApps do not need to run their own nodes. Node-as-a-service providers (Alchemy, Infura, QuickNode) offer the same functionality at lower cost and higher reliability than self-hosted nodes, unless you have specific requirements for customization, data privacy, or latency.
Ethereum Client Software
| Client | Language | Type | Market Share | Notes |
| --- | --- | --- | --- | --- |
| Geth | Go | Execution | ~55% | The original, most battle-tested |
| Erigon | Go | Execution | ~10% | Optimized for archive nodes, lower storage |
| Nethermind | C# | Execution | ~10% | Enterprise-focused, good for Gnosis Chain |
| Besu | Java | Execution | ~5% | Enterprise, Hyperledger project |
| Reth | Rust | Execution | ~15% (growing) | Paradigm-backed, fast sync, modern architecture |
| Prysm | Go | Consensus | ~35% | Most popular consensus client |
| Lighthouse | Rust | Consensus | ~30% | Sigma Prime, high performance |
| Teku | Java | Consensus | ~15% | Consensys, enterprise-grade |
| Nimbus | Nim | Consensus | ~10% | Status, resource-efficient |
| Lodestar | TypeScript | Consensus | ~5% | ChainSafe, unique language choice |
Client diversity matters: If one client has a bug, it only affects the share of the network running that client. A bug in a client that the majority of nodes run is far more dangerous, which is why the Ethereum community actively promotes client diversity to prevent single points of failure. Running minority clients (Nethermind + Lodestar, for example) contributes to network health and may earn you social capital in the ecosystem.
Solana Node Infrastructure
Solana nodes have significantly higher hardware requirements:
| Requirement | Minimum | Recommended |
| --- | --- | --- |
| CPU | 12+ cores, 2.8+ GHz | 16+ cores, 3.0+ GHz |
| RAM | 128 GB | 256 GB |
| Storage | 2 TB NVMe SSD | 4+ TB NVMe SSD |
| Bandwidth | 1 Gbps | 10 Gbps |
| Monthly cost (bare metal) | $500-$800 | $800-$1,500 |
| Monthly cost (cloud) | $1,500-$3,000 | $3,000-$5,000+ |
This high hardware bar is a direct consequence of Solana's monolithic, high-throughput architecture. It is also why most Solana DApp builders use RPC providers rather than running their own nodes.
RPC Providers: The API Layer
What Is an RPC Provider?
RPC (Remote Procedure Call) providers run blockchain nodes and expose them as API endpoints that your application can call. Instead of running your own node, you send requests to the provider's infrastructure.
Common Ethereum RPC methods:
| Method | Purpose | Example Response |
| --- | --- | --- |
| eth_blockNumber | Get latest block number | "0x1234567" |
| eth_getBalance | Get address ETH balance | "0x1bc16d674ec80000" (2 ETH) |
| eth_getTransactionReceipt | Get the receipt of a mined transaction (status, gas used, logs) | Receipt object with logs |
| eth_call | Read-only contract call | Encoded return data |
| eth_sendRawTransaction | Submit signed transaction | Transaction hash |
| eth_getLogs | Get filtered event logs | Array of log objects |
| eth_estimateGas | Estimate gas for a transaction | Gas amount |
| eth_subscribe | WebSocket subscription | Stream of events |
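The example responses above are hex-encoded quantities, which trips up many first-time integrators. A minimal sketch of decoding them follows; the helper names `hex_to_int` and `wei_hex_to_eth` are illustrative, not part of any library.

```python
from decimal import Decimal

WEI_PER_ETH = Decimal(10) ** 18  # 1 ETH = 10^18 wei


def hex_to_int(quantity: str) -> int:
    """Decode a 0x-prefixed JSON-RPC quantity into an integer."""
    return int(quantity, 16)


def wei_hex_to_eth(quantity: str) -> Decimal:
    """Convert a hex-encoded wei amount into ETH without float rounding."""
    return Decimal(hex_to_int(quantity)) / WEI_PER_ETH


block_number = hex_to_int("0x1234567")               # eth_blockNumber result
balance_eth = wei_hex_to_eth("0x1bc16d674ec80000")   # eth_getBalance result

print(block_number)
print(balance_eth)
```

Using `Decimal` rather than `float` is a deliberate choice here: balances are integers in wei, and floating-point conversion can silently lose precision on large values.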
Major RPC Providers Compared
| Provider | Chains Supported | Free Tier | Paid Plans | Unique Features |
| --- | --- | --- | --- | --- |
| Alchemy | 30+ | 300M CU/month | $49-$399+/month | Enhanced APIs, webhooks, Notify |
| Infura | 20+ | 100K requests/day | $50-$1,000+/month | MetaMask default, IPFS gateway |
| QuickNode | 50+ | 10M API credits/month | $49-$299+/month | Marketplace add-ons, streams |
| Chainstack | 30+ | 3M requests/month | $29-$349+/month | Elastic nodes, dedicated nodes |
| Ankr | 45+ | 30M requests/month | $49-$499+/month | Decentralized RPC network |
| Helius | Solana only | 100K credits/day | $49-$499+/month | DAS API, webhooks, Priority Fee API |
| Triton | Solana only | N/A | Custom pricing | Dedicated infrastructure, lowest latency |
| Blast API (Bware Labs) | 50+ | 12M API calls/month | $12-$100+/month | Decentralized, multi-region |
Choosing an RPC Provider: Decision Framework
| Factor | Questions to Ask | Impact |
| --- | --- | --- |
| Chain support | Does the provider support all chains you deploy on? | Eliminates 50%+ of options |
| Reliability (uptime SLA) | What is the guaranteed uptime? Is there a refund policy? | Critical for production DApps |
| Latency | What is the p50/p99 response time? Where are their nodes located? | Critical for trading/MEV apps |
| Rate limits | What are the requests/second and daily limits? | Determines if free tier is sufficient |
| Archive node access | Do they provide archive data for historical queries? | Required for analytics, block explorers |
| Enhanced APIs | Do they offer token balance APIs, NFT APIs, transaction history? | Reduces indexing needs |
| WebSocket support | Do they support eth_subscribe and stable WebSocket connections? | Required for real-time apps |
| Pricing model | Compute units vs. flat rate vs. per-request? How does scaling work? | Determines cost predictability |
| Debug/trace support | Do they support debug_traceTransaction and trace_block? | Required for debugging and analytics |
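When you benchmark providers yourself, p50 and p99 are easy to compute from raw timings. The sketch below uses the nearest-rank method, which is one common definition among several; the sample latencies are made up for illustration.

```python
import math


def percentile(samples: list, pct: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[rank - 1]


# Hypothetical response times in milliseconds from ten test requests.
latencies_ms = [42, 45, 47, 51, 55, 58, 63, 70, 120, 950]

p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
print(f"p50={p50} ms, p99={p99} ms")
```

In this made-up sample a single slow request dominates p99 while leaving p50 untouched, which is why the table asks about both figures rather than an average.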
Self-Hosted vs. Provider: Cost Analysis
| Scenario | Self-Hosted (Ethereum Archive) | RPC Provider (Alchemy Growth) |
| --- | --- | --- |
| Monthly cost | $1,500-$3,000 (cloud) | $399/month |
| Setup time | 2-4 weeks (sync + config) | 5 minutes |
| Maintenance | 4-8 hours/month (updates, monitoring) | Zero |
| Uptime | 99.5-99.9% (without redundancy) | 99.99% (SLA-backed) |
| Multi-chain | Separate node per chain | Single account, all chains |
| Archive data | Full control | Included in plan |
| Custom methods | Full flexibility | Limited to provider's API |
Recommendation for most teams: Use a managed RPC provider for development and early production. Consider self-hosted nodes only when you need custom RPC methods, data privacy, or have reached a scale where provider costs exceed self-hosting costs (typically >100M requests/month).
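A rough way to find your own break-even point is to model both options as functions of monthly request volume. In the sketch below, the $399 plan fee, $1,500 infrastructure cost, and 8 maintenance hours come from the comparison above; the included quota, overage price, and hourly engineering rate are made-up placeholders that you should replace with your provider's actual pricing.

```python
def provider_monthly_cost(requests_m: float, base_fee: float,
                          included_m: float, overage_per_m: float) -> float:
    """Plan fee plus overage for requests beyond the included quota.

    Volumes are expressed in millions of requests per month.
    """
    extra = max(0.0, requests_m - included_m)
    return base_fee + extra * overage_per_m


def self_hosted_monthly_cost(infra: float, maintenance_hours: float,
                             hourly_rate: float) -> float:
    """Infrastructure spend plus the engineering time spent on upkeep."""
    return infra + maintenance_hours * hourly_rate


# Assumed: $100/hour engineering rate (placeholder).
self_hosted = self_hosted_monthly_cost(infra=1500, maintenance_hours=8, hourly_rate=100)

for requests_m in (10, 50, 100, 200):
    # Assumed: 40M included requests and $20 per extra million (placeholders).
    managed = provider_monthly_cost(requests_m, base_fee=399, included_m=40, overage_per_m=20)
    cheaper = "managed" if managed <= self_hosted else "self-hosted"
    print(f"{requests_m}M req/month: managed ${managed:,.0f} "
          f"vs self-hosted ${self_hosted:,.0f} -> {cheaper}")
```

With these placeholder numbers the crossover falls between 100M and 200M requests per month; your own figure depends entirely on the pricing inputs.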
Blockchain Indexers: The Data Layer
Raw blockchain data is not application-friendly. To answer a simple question like "show me all NFTs owned by this address," you would need to scan every block, every transaction, and every event log from the beginning of the chain -- billions of operations. Indexers solve this by pre-processing blockchain data into structured databases that your application can query efficiently.
Without an indexer:
Query: "Get all Uniswap V3 swaps for USDC/ETH in the last 24 hours"
Process: Scan ~7,200 blocks, filter ~50,000 transactions, decode ~10,000 event logs
Time: 30-120 seconds
Cost: Thousands to tens of thousands of RPC calls
With an indexer:
Query: "Get all Uniswap V3 swaps for USDC/ETH in the last 24 hours"
Process: Single GraphQL/SQL query against indexed database
Time: 50-200 milliseconds
Cost: One API call
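Without an indexer, a scan like this is usually paged over block ranges, because providers commonly cap how many blocks a single eth_getLogs call may cover. The sketch below splits a 7,200-block window into chunks and builds one request per chunk. The 1,000-block cap, the chain head value, the contract address, and the event topic are all placeholder assumptions.

```python
def block_ranges(start: int, end: int, chunk_size: int):
    """Yield inclusive (from_block, to_block) pairs covering start..end."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be positive")
    current = start
    while current <= end:
        upper = min(current + chunk_size - 1, end)
        yield current, upper
        current = upper + 1


def get_logs_request(address: str, topic0: str, from_block: int,
                     to_block: int, request_id: int) -> dict:
    """Build an eth_getLogs payload for one block range."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_getLogs",
        "params": [{
            "address": address,
            "topics": [topic0],
            "fromBlock": hex(from_block),
            "toBlock": hex(to_block),
        }],
    }


LATEST = 20_000_000                 # hypothetical chain head
POOL = "0x" + "00" * 20             # placeholder contract address
SWAP_TOPIC = "0x" + "00" * 32       # placeholder event signature hash

ranges = list(block_ranges(LATEST - 7_199, LATEST, 1_000))
requests = [get_logs_request(POOL, SWAP_TOPIC, lo, hi, i)
            for i, (lo, hi) in enumerate(ranges)]
print(f"{len(requests)} eth_getLogs calls for {LATEST - ranges[0][0] + 1} blocks")
```

Every one of those calls still returns raw, ABI-encoded logs that your application must decode, which is the work an indexer does once, ahead of time.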
Major Indexing Solutions Compared
| Solution | Approach | Query Language | Chains | Pricing | Best For |
| --- | --- | --- | --- | --- | --- |
| The Graph | Subgraphs (AssemblyScript) | GraphQL | 40+ | Free tier, then pay-per-query (decentralized network) | DeFi protocols, broad chain support |
| Goldsky | Mirror pipelines + subgraphs | GraphQL + SQL | 30+ | $0-$500+/month | Real-time data, custom pipelines |
| Ponder | TypeScript framework | GraphQL | EVM chains | Self-hosted (free) | Developer experience, type safety |
| Envio | HyperIndex (Rust/TS) | GraphQL | 30+ | $0-$300+/month | Fast sync, real-time processing |
| Subsquid | Archives + processors | GraphQL | 100+ | $0-$500+/month | Multi-chain, raw data access |
| Dune Analytics | SQL-based analytics | SQL (DuneSQL) | 20+ | $0-$399+/month | Analytics dashboards, community |
| Covalent (GoldRush) | Unified API | REST/GraphQL | 200+ | $0-$499+/month | Multi-chain data, unified schema |
| Moralis | Streams + API | REST | 20+ | $0-$299+/month | Rapid development, Web2 developers |
The Graph: Deep Dive
The Graph is the most widely used decentralized indexing protocol. It processes over 1 billion queries per month and indexes data for protocols like Uniswap, Aave, Compound, and Synthetix.
How The Graph works:
•Define a subgraph: Write a schema (GraphQL), data sources (contract addresses and ABIs), and mapping handlers (AssemblyScript functions that process events)
•Deploy: Publish your subgraph to The Graph Network (decentralized); the original hosted service has been sunset
•Index: Indexer nodes process historical and new blockchain data according to your mappings
•Query: Your application queries the subgraph via GraphQL
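The final step, querying, comes down to POSTing a GraphQL document to the subgraph's endpoint. The sketch below builds such a request body. The `swaps` entity and its fields are assumptions modeled on a typical DEX subgraph schema; your own schema defines what is actually queryable. The `first`, `orderBy`, `orderDirection`, and `_gte` filter suffix follow The Graph's query conventions.

```python
import json

RECENT_SWAPS_QUERY = """
query RecentSwaps($pool: String!, $since: BigInt!, $first: Int!) {
  swaps(
    first: $first
    orderBy: timestamp
    orderDirection: desc
    where: { pool: $pool, timestamp_gte: $since }
  ) {
    id
    timestamp
    amountUSD
  }
}
"""


def build_subgraph_request(pool_id: str, since_timestamp: int, first: int = 100) -> dict:
    """Build the JSON body for a GraphQL POST to a subgraph endpoint.

    BigInt variables are passed as strings here to avoid precision loss.
    """
    return {
        "query": RECENT_SWAPS_QUERY,
        "variables": {
            "pool": pool_id,
            "since": str(since_timestamp),
            "first": first,
        },
    }


# Placeholder pool id and timestamp, for illustration only.
payload = build_subgraph_request("0x" + "00" * 20, 1_700_000_000)
body = json.dumps(payload)
print(body[:80], "...")
```

Sending `body` with any HTTP client and a `Content-Type: application/json` header completes the round trip; the endpoint URL comes from your subgraph deployment.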
Performance Optimization
1. RPC Optimization
•Batch requests: Use JSON-RPC batching to combine multiple calls into a single HTTP request
•Multicall contracts: Use Multicall3 to batch on-chain reads into a single RPC call (reading 100 token balances in one call instead of 100 separate calls)
•Caching: Cache RPC responses in Redis with appropriate TTLs (block data: cache until new block; static data: cache indefinitely)
•Connection pooling: Reuse HTTP/WebSocket connections to reduce latency overhead
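As an illustration of the batching technique above, here is a minimal sketch that builds a JSON-RPC 2.0 batch and matches responses back to requests by id. The spec allows a server to return batch responses in any order, so matching by id matters. The helper names and the simulated responses are illustrative assumptions.

```python
def build_batch(calls):
    """Turn [(method, params), ...] into a JSON-RPC 2.0 batch.

    Each request id is its position in the list.
    """
    return [
        {"jsonrpc": "2.0", "id": i, "method": method, "params": params}
        for i, (method, params) in enumerate(calls)
    ]


def match_responses(batch, responses):
    """Return results in request order, raising on missing or failed calls."""
    by_id = {response["id"]: response for response in responses}
    results = []
    for request in batch:
        response = by_id.get(request["id"])
        if response is None:
            raise KeyError(f"no response for request id {request['id']}")
        if "error" in response:
            raise RuntimeError(f"{request['method']} failed: {response['error']}")
        results.append(response["result"])
    return results


ADDRESS = "0x" + "00" * 20  # placeholder address

batch = build_batch([
    ("eth_blockNumber", []),
    ("eth_getBalance", [ADDRESS, "latest"]),
])

# Simulated server reply, deliberately out of order.
simulated = [
    {"jsonrpc": "2.0", "id": 1, "result": "0x1bc16d674ec80000"},
    {"jsonrpc": "2.0", "id": 0, "result": "0x1234567"},
]

results = match_responses(batch, simulated)
print(results)
```

Note that some providers bill each call inside a batch individually, so batching mainly reduces HTTP round trips rather than metered usage.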
2. Indexer Performance
•Selective indexing: Only index the events and contracts your application needs
•Batch processing: Process events in batches rather than one at a time
•Read replicas: Use PostgreSQL read replicas to scale read-heavy workloads
•Materialized views: Pre-compute complex aggregations for frequently accessed data
3. Frontend Optimization
•Optimistic updates: Update the UI immediately on transaction submission, then reconcile with on-chain state
•Progressive loading: Load critical data first (balances, positions), then secondary data (history, analytics)
•WebSocket for real-time: Use WebSocket subscriptions for price feeds and transaction confirmations instead of polling
•Smart caching: Cache static data (token metadata, contract ABIs) aggressively; cache dynamic data (balances, prices) with short TTLs
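The caching policy above boils down to one cache with a different TTL per kind of data. The sketch below uses an in-memory dictionary as a stand-in for whatever store you actually use (browser storage, Redis, and so on); the class name, keys, and TTL values are illustrative assumptions.

```python
import time


class TTLCache:
    """A tiny in-memory cache where every entry carries its own expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # key -> (expires_at, value)

    def set(self, key, value, ttl_seconds):
        self._entries[key] = (self._clock() + ttl_seconds, value)

    def get(self, key, default=None):
        entry = self._entries.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]   # drop expired entries lazily
            return default
        return value


# A fake clock makes the expiry behaviour visible without sleeping.
now = [0.0]
cache = TTLCache(clock=lambda: now[0])

cache.set("balance:0xabc", "0x1bc16d674ec80000", ttl_seconds=12)   # dynamic: about one block
cache.set("token-meta:USDC", {"decimals": 6}, ttl_seconds=86_400)  # static: a full day

now[0] = 13.0  # thirteen seconds later
print(cache.get("balance:0xabc"))     # expired
print(cache.get("token-meta:USDC"))   # still cached
```

Tying the short TTL to the block time is one reasonable choice: a balance cannot change more often than blocks are produced.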
Cost Optimization Strategies
1. RPC Cost Reduction
| Strategy | Savings | Implementation |
| --- | --- | --- |
| Response caching (Redis) | 40-60% fewer RPC calls | Cache balances, prices with TTL |
| Multicall batching | 80-95% fewer calls | Use Multicall3 for read operations |
| WebSocket subscriptions | 50-70% vs polling | Replace polling with subscriptions for real-time data |
| Provider tiering | 20-40% cost reduction | Use free tier for non-critical calls, paid tier for production |
| Request deduplication | 10-30% fewer calls | Deduplicate identical concurrent requests |
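Request deduplication is the least familiar strategy in the table, so here is a minimal asyncio sketch of it. Concurrent callers asking for the same key share one upstream call instead of each issuing their own. The class name, the key format, and the fake RPC function are illustrative assumptions.

```python
import asyncio


class InFlightDeduplicator:
    """Share one upstream call among concurrent callers with the same key."""

    def __init__(self):
        self._in_flight = {}  # key -> asyncio.Task

    async def fetch(self, key, fetch_fn):
        task = self._in_flight.get(key)
        if task is None:
            # First caller for this key starts the real request.
            task = asyncio.create_task(fetch_fn())
            self._in_flight[key] = task
            # Forget the task once it finishes so later calls fetch fresh data.
            task.add_done_callback(lambda _t: self._in_flight.pop(key, None))
        # shield() keeps one cancelled caller from cancelling the shared task.
        return await asyncio.shield(task)


async def demo():
    upstream_calls = 0

    async def fake_rpc():
        nonlocal upstream_calls
        upstream_calls += 1
        await asyncio.sleep(0.01)  # stand-in for network latency
        return "0x1bc16d674ec80000"

    dedup = InFlightDeduplicator()
    key = "eth_getBalance:0xabc:latest"
    results = await asyncio.gather(*(dedup.fetch(key, fake_rpc) for _ in range(5)))
    return upstream_calls, results


upstream_calls, results = asyncio.run(demo())
print(f"{len(results)} callers served by {upstream_calls} upstream call(s)")
```

This only coalesces requests that overlap in time; pair it with a TTL cache if you also want to reuse results after the call completes.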
2. Indexing Cost Reduction
| Strategy | Savings | Implementation |
| --- | --- | --- |
| Selective event indexing | 50-80% less data processed | Only index events your app actually queries |
| Ponder (self-hosted) | Eliminates hosted indexing fees (you still pay for compute and the database) | Run alongside your app, PostgreSQL backend |
| Shared subgraphs | Split costs across multiple apps | Use community-deployed subgraphs for common protocols |
| Time-bounded queries | 30-50% less data stored | Only store recent data, archive older data to cold storage |
Security Considerations for Web3 Infrastructure
RPC Security
•Keep privileged RPC keys off the client: Proxy RPC calls through your backend wherever possible to prevent API key leakage and rate limit abuse
•Use separate keys for frontend and backend: If the frontend must call a provider directly, give it its own restricted key with different rate limits, monitoring, and access controls
•Implement request validation: Validate that incoming RPC requests are well-formed and within expected parameters
•Monitor for anomalies: Track RPC call patterns and alert on unusual spikes (may indicate an attack)
•Use HTTPS and WSS exclusively: Never send transactions or API keys over unencrypted connections
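Request validation in a backend proxy can start as a simple allowlist check before anything is forwarded upstream. The sketch below is one possible shape; the allowed methods and the parameter limit are assumptions you would tune to what your application actually calls.

```python
# Assumed allowlist: only the methods this hypothetical app needs.
ALLOWED_METHODS = {
    "eth_blockNumber",
    "eth_getBalance",
    "eth_call",
    "eth_getLogs",
    "eth_getTransactionReceipt",
    "eth_estimateGas",
    "eth_sendRawTransaction",
}
MAX_PARAMS = 4  # assumed upper bound on positional parameters


def validate_rpc_request(payload) -> list:
    """Return a list of problems; an empty list means safe to forward."""
    if not isinstance(payload, dict):
        return ["payload must be a JSON object"]
    problems = []
    if payload.get("jsonrpc") != "2.0":
        problems.append('jsonrpc must be "2.0"')
    method = payload.get("method")
    if not isinstance(method, str) or method not in ALLOWED_METHODS:
        problems.append(f"method not allowed: {method!r}")
    params = payload.get("params", [])
    if not isinstance(params, list) or len(params) > MAX_PARAMS:
        problems.append(f"params must be a list of at most {MAX_PARAMS} items")
    if not isinstance(payload.get("id"), (int, str)):
        problems.append("id must be a string or number")
    return problems


ok = validate_rpc_request(
    {"jsonrpc": "2.0", "id": 1, "method": "eth_blockNumber", "params": []}
)
blocked = validate_rpc_request(
    {"jsonrpc": "2.0", "id": 2, "method": "debug_traceTransaction", "params": ["0x00"]}
)
print(ok)
print(blocked)
```

Blocking expensive methods such as debug_traceTransaction at the proxy also protects your compute-unit budget, not just your security posture.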
Indexer Security
•Validate indexed data: Cross-reference critical indexed data with direct RPC calls to prevent data poisoning
•Implement access controls: Not all indexed data should be publicly queryable
•Monitor indexer health: Alert if indexing falls behind the chain head by more than a few blocks
•Handle chain reorganizations: Your indexer must correctly handle reorgs (reverted blocks) to prevent displaying incorrect data
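Reorg handling usually rests on one check: does each new block's parent hash match the last block you indexed? The sketch below keeps a window of recent headers and reports which indexed blocks a new header orphans. The class names, window size, and short fake hashes are illustrative assumptions, not any particular indexer's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockHeader:
    number: int
    block_hash: str
    parent_hash: str


class ReorgTracker:
    """Track recent headers and report blocks invalidated by a new header."""

    def __init__(self, max_depth: int = 64):
        self._max_depth = max_depth
        self._recent = []  # oldest first

    @property
    def head(self):
        return self._recent[-1] if self._recent else None

    def apply(self, header: BlockHeader) -> list:
        """Append header; return orphaned blocks (newest first) to roll back."""
        if not self._recent:
            self._recent.append(header)
            return []
        # Walk backwards to find the stored block the new header builds on.
        for index in range(len(self._recent) - 1, -1, -1):
            if self._recent[index].block_hash == header.parent_hash:
                break
        else:
            raise LookupError("parent not in recent window; resync from a finalized block")
        orphaned = self._recent[index + 1:][::-1]
        del self._recent[index + 1:]
        self._recent.append(header)
        del self._recent[:-self._max_depth]  # keep only the newest headers
        return orphaned


tracker = ReorgTracker()
a = BlockHeader(100, "0xa", "0x9")
b = BlockHeader(101, "0xb", "0xa")
c = BlockHeader(102, "0xc", "0xb")
c2 = BlockHeader(102, "0xc2", "0xb")  # competing block at the same height

for header in (a, b, c):
    tracker.apply(header)

orphaned = tracker.apply(c2)
print([block.block_hash for block in orphaned])
```

Whatever your indexer wrote while processing the orphaned blocks must be rolled back before the replacement block is processed, which is why many indexers store the originating block hash alongside each row.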
The Future of Web3 Infrastructure (2026 and Beyond)
Emerging Trends
•Decentralized RPC networks: Projects like Lava Network, Pocket Network, and dRPC are building decentralized alternatives to centralized RPC providers, offering censorship resistance and geographic distribution.
•Real-time indexing: The gap between block production and indexed data availability is shrinking. Solutions like Goldsky Streams and Envio HyperIndex offer sub-second indexing for many use cases.
•ZK-proven data: Zero-knowledge proofs are being applied to historical blockchain data, enabling trustless verification without re-executing transactions. Axiom and Herodotus are leading this space.
•Account abstraction infrastructure: ERC-4337 bundlers, paymasters, and entry point contracts are creating a new infrastructure layer for smart account wallets.
•Modular data availability: As Ethereum's rollup-centric roadmap matures, data availability layers (Celestia, EigenDA, Avail) are becoming critical infrastructure components that DApp builders need to understand.
•AI-powered infrastructure: Machine learning is being applied to RPC load balancing, MEV protection, and anomaly detection in blockchain data.
Frequently Asked Questions
Do I need to run my own blockchain node?
For most DApp builders, no. Managed RPC providers (Alchemy, Infura, QuickNode) offer better uptime, lower cost, and zero maintenance compared to self-hosted nodes. Consider self-hosting only if you need custom RPC methods, extreme data privacy, or process more than 100 million requests per month where self-hosting becomes cost-effective.
What is the difference between an RPC provider and an indexer?
An RPC provider gives you real-time access to the current blockchain state -- like checking a balance or submitting a transaction. An indexer processes historical blockchain data into structured databases, enabling complex queries like "show me all swaps on Uniswap in the last 24 hours sorted by volume." RPC calls are for real-time operations; indexers are for historical queries and aggregations.
How much does Web3 backend infrastructure cost?
For an early-stage DApp, $0-$100 per month using free tiers of RPC providers and enhanced APIs. For a growth-stage application, $100-$1,000 per month adding custom indexing (The Graph or Ponder) and caching (Redis). For a production DeFi protocol, $2,000-$10,000+ per month with dedicated RPC tiers, custom indexing pipelines, monitoring, and automation.
Which indexing solution should I choose?
The Graph is the default choice for broad chain support and a decentralized data layer. Ponder is best for TypeScript developers who want local development with hot reloading. Goldsky and Envio are best for real-time data requirements. For complex, multi-source analytics, a custom indexing pipeline with PostgreSQL or ClickHouse provides the most flexibility.
How do I handle multiple blockchain networks?
Use a single RPC provider that supports all your target chains (Alchemy and QuickNode each support 30-50+ chains). For indexing, deploy chain-specific subgraphs or use a unified API like Covalent that normalizes data across chains. Build a chain abstraction layer in your backend that routes requests to the appropriate chain-specific services.
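A chain abstraction layer can start as nothing more than a lookup table keyed by chain ID. In the sketch below the chain IDs are the real EIP-155 identifiers, while every URL is a placeholder and the `ChainConfig` shape is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChainConfig:
    name: str
    rpc_url: str       # placeholder URLs; substitute your provider's endpoints
    indexer_url: str


CHAINS = {
    1: ChainConfig("ethereum", "https://rpc.example.com/eth", "https://indexer.example.com/eth"),
    10: ChainConfig("optimism", "https://rpc.example.com/op", "https://indexer.example.com/op"),
    137: ChainConfig("polygon", "https://rpc.example.com/polygon", "https://indexer.example.com/polygon"),
    8453: ChainConfig("base", "https://rpc.example.com/base", "https://indexer.example.com/base"),
    42161: ChainConfig("arbitrum", "https://rpc.example.com/arb", "https://indexer.example.com/arb"),
}


def resolve_chain(chain_id: int) -> ChainConfig:
    """Look up the services for a chain, failing loudly on unknown ids."""
    try:
        return CHAINS[chain_id]
    except KeyError:
        raise ValueError(f"unsupported chain id: {chain_id}") from None


config = resolve_chain(8453)
print(config.name, config.rpc_url)
```

Keeping this mapping in one place means that adding a chain or swapping a provider is a configuration change rather than a code change scattered across handlers.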
What is the best RPC provider for Solana?
Helius and Triton are the top Solana-specific providers. Helius offers enhanced APIs (DAS API for NFTs, priority fee API, webhooks) with generous free tiers. Triton provides dedicated infrastructure for applications requiring the lowest latency. General providers like Alchemy and QuickNode also support Solana but with less specialized features.
How do I ensure my infrastructure is reliable?
Use multiple RPC providers with automatic failover (primary + backup). Implement health checks that verify response accuracy, not just availability. Set up monitoring and alerting for response times, error rates, and indexing lag. For critical applications, deploy infrastructure across multiple geographic regions and cloud providers.
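Failover with an accuracy check can be sketched in a few lines. Here each provider is modeled as a plain function so the logic runs without network access; the provider names, the expected chain head, and the five-block lag tolerance are all illustrative assumptions.

```python
def call_with_failover(providers, request, is_healthy=lambda response: True):
    """Try providers in order until one returns a response that passes the check.

    `providers` is a list of (name, send) pairs where send(request) returns
    a decoded JSON-RPC response or raises on failure.
    """
    errors = []
    for name, send in providers:
        try:
            response = send(request)
        except Exception as exc:  # timeouts, connection errors, bad JSON, ...
            errors.append(f"{name}: {exc}")
            continue
        if is_healthy(response):
            return name, response
        errors.append(f"{name}: failed health check")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


EXPECTED_HEAD = 20_000_000  # hypothetical head, e.g. from a trusted reference
MAX_LAG_BLOCKS = 5          # assumed tolerance


def head_is_fresh(response) -> bool:
    """Reject providers whose reported head lags too far behind."""
    return EXPECTED_HEAD - int(response["result"], 16) <= MAX_LAG_BLOCKS


def primary(_request):
    raise TimeoutError("timed out")


def stale_backup(_request):
    return {"jsonrpc": "2.0", "id": 1, "result": hex(19_999_000)}


def healthy_backup(_request):
    return {"jsonrpc": "2.0", "id": 1, "result": hex(20_000_000)}


request = {"jsonrpc": "2.0", "id": 1, "method": "eth_blockNumber", "params": []}
provider_name, response = call_with_failover(
    [("primary", primary), ("backup-a", stale_backup), ("backup-b", healthy_backup)],
    request,
    head_is_fresh,
)
print(provider_name, int(response["result"], 16))
```

The stale backup responds successfully yet is skipped, which is the difference between checking availability and checking accuracy.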
What is a subgraph and how do I build one?
A subgraph is a custom data indexing specification for The Graph protocol. It defines which smart contract events to index and how to transform them into queryable entities. You write a GraphQL schema (data model), a manifest (which contracts and events to index), and mapping handlers (AssemblyScript functions that process events). The Graph Network then indexes your data and serves it via a GraphQL API.
The infrastructure layer is often invisible to end users but determines the quality of their experience. A DApp that takes 3 seconds to load a wallet balance because of slow RPC calls will lose users. A DeFi dashboard that shows stale portfolio data because of poor indexing will destroy trust. A trading bot that misses opportunities because of unreliable WebSocket connections will cost money. Getting the infrastructure right is as important as getting the smart contracts right.
Understanding the Web3 Infrastructure Stack
The Three Pillars
Your DApp Frontend
|
v
[RPC Provider / Node] ---- Read/write blockchain state
|
v
[Indexer / Data API] ----- Query historical & aggregated data
|
v
[Blockchain Network] ----- Source of truth (Ethereum, Solana, etc.)
Pillar 1: Nodes -- Software clients (Geth, Erigon, Reth for Ethereum; Agave, Firedancer for Solana) that participate in the blockchain network, validate transactions, and maintain the chain state.
Pillar 2: RPC Endpoints -- JSON-RPC or REST API interfaces that expose node functionality to applications. Methods like eth_getBalance, eth_sendRawTransaction, and eth_getLogs are the fundamental building blocks of Web3 interactions.
Pillar 3: Indexers -- Systems that process raw blockchain events and transaction data, transforming it into structured, queryable databases optimized for specific application needs.
Blockchain Nodes: The Foundation Layer
What Is a Blockchain Node?
A node is a computer running blockchain client software that:
•Connects to other nodes in the peer-to-peer network
•Validates new transactions and blocks
•Stores blockchain state and history
•Responds to data queries via RPC interface
Node Types
Node Type
Storage
Sync Time
Use Case
Cost (Ethereum)
Full node
~1 TB
2-7 days
Standard operations
$200-$500/month (cloud)
Archive node
~15 TB+
2-4 weeks
Historical queries
$1,000-$3,000/month (cloud)
Light node
~1 GB
Minutes
Mobile/resource-constrained
Minimal
Validator node
~1 TB + 32 ETH
2-7 days
Network validation + staking
$200-$500/month + 32 ETH
Full nodes store the complete current state but prune historical state data. They can answer questions about the current blockchain state but cannot look up historical storage values at arbitrary block heights.
Archive nodes store every historical state, enabling queries like "what was this address's balance at block 15,000,000?" They are essential for block explorers, analytics platforms, and DeFi dashboards that show historical data.
Key insight for builders: Most DApps do not need to run their own nodes. Node-as-a-service providers (Alchemy, Infura, QuickNode) offer the same functionality at lower cost and higher reliability than self-hosted nodes, unless you have specific requirements for customization, data privacy, or latency.
Ethereum Client Software
Client
Language
Type
Market Share
Notes
Geth
Go
Execution
~55%
The original, most battle-tested
Erigon
Go
Execution
~10%
Optimized for archive nodes, lower storage
Nethermind
C#
Execution
~10%
Enterprise-focused, good for Gnosis Chain
Besu
Java
Execution
~5%
Enterprise, Hyperledger project
Reth
Rust
Execution
~15% (growing)
Paradigm-backed, fast sync, modern architecture
Prysm
Go
Consensus
~35%
Most popular consensus client
Lighthouse
Rust
Consensus
~30%
Sigma Prime, high performance
Teku
Java
Consensus
~15%
Consensys, enterprise-grade
Nimbus
Nim
Consensus
~10%
Status, resource-efficient
Lodestar
TypeScript
Consensus
~5%
ChainSafe, unique language choice
Client diversity matters: If one client has a bug, it only affects a subset of the network. The Ethereum community actively promotes client diversity to prevent single points of failure. Running minority clients (Nethermind + Lodestar, for example) contributes to network health and may earn you social capital in the ecosystem.
Solana Node Infrastructure
Solana nodes have significantly higher hardware requirements:
Requirement
Minimum
Recommended
CPU
12+ cores, 2.8+ GHz
16+ cores, 3.0+ GHz
RAM
128 GB
256 GB
Storage
2 TB NVMe SSD
4+ TB NVMe SSD
Bandwidth
1 Gbps
10 Gbps
Monthly cost (bare metal)
$500-$800
$800-$1,500
Monthly cost (cloud)
$1,500-$3,000
$3,000-$5,000+
This high hardware bar is a direct consequence of Solana's monolithic, high-throughput architecture. It is also why most Solana DApp builders use RPC providers rather than running their own nodes.
RPC Providers: The API Layer
What Is an RPC Provider?
RPC (Remote Procedure Call) providers run blockchain nodes and expose them as API endpoints that your application can call. Instead of running your own node, you send requests to the provider's infrastructure.
Common Ethereum RPC methods:
Method
Purpose
Example Response
eth_blockNumber
Get latest block number
"0x1234567"
eth_getBalance
Get address ETH balance
"0x1bc16d674ec80000" (2 ETH)
eth_getTransactionReceipt
Get transaction details
Receipt object with logs
eth_call
Read-only contract call
Encoded return data
eth_sendRawTransaction
Submit signed transaction
Transaction hash
eth_getLogs
Get filtered event logs
Array of log objects
eth_estimateGas
Estimate gas for a transaction
Gas amount
eth_subscribe
WebSocket subscription
Stream of events
Major RPC Providers Compared
Provider
Chains Supported
Free Tier
Paid Plans
Unique Features
Alchemy
30+
300M CU/month
$49-$399+/month
Enhanced APIs, webhooks, Notify
Infura
20+
100K requests/day
$50-$1,000+/month
MetaMask default, IPFS gateway
QuickNode
50+
10M API credits/month
$49-$299+/month
Marketplace add-ons, streams
Chainstack
30+
3M requests/month
$29-$349+/month
Elastic nodes, dedicated nodes
Ankr
45+
30M requests/month
$49-$499+/month
Decentralized RPC network
Helius
Solana only
100K credits/day
$49-$499+/month
DAS API, webhooks, Priority Fee API
Triton
Solana only
N/A
Custom pricing
Dedicated infrastructure, lowest latency
Blast API (Bware Labs)
50+
12M API calls/month
$12-$100+/month
Decentralized, multi-region
Choosing an RPC Provider: Decision Framework
Factor
Questions to Ask
Impact
Chain support
Does the provider support all chains you deploy on?
Eliminates 50%+ of options
Reliability (uptime SLA)
What is the guaranteed uptime? Is there a refund policy?
Critical for production DApps
Latency
What is the p50/p99 response time? Where are their nodes located?
Critical for trading/MEV apps
Rate limits
What are the requests/second and daily limits?
Determines if free tier is sufficient
Archive node access
Do they provide archive data for historical queries?
Required for analytics, block explorers
Enhanced APIs
Do they offer token balance APIs, NFT APIs, transaction history?
Reduces indexing needs
WebSocket support
Do they support eth_subscribe and stable WebSocket connections?
Required for real-time apps
Pricing model
Compute units vs. flat rate vs. per-request? How does scaling work?
Determines cost predictability
Debug/trace support
Do they support debug_traceTransaction and trace_block?
Required for debugging and analytics
Self-Hosted vs. Provider: Cost Analysis
Scenario
Self-Hosted (Ethereum Archive)
RPC Provider (Alchemy Growth)
Monthly cost
$1,500-$3,000 (cloud)
$399/month
Setup time
2-4 weeks (sync + config)
5 minutes
Maintenance
4-8 hours/month (updates, monitoring)
Zero
Uptime
99.5-99.9% (without redundancy)
99.99% (SLA-backed)
Multi-chain
Separate node per chain
Single account, all chains
Archive data
Full control
Included in plan
Custom methods
Full flexibility
Limited to provider's API
Recommendation for most teams: Use a managed RPC provider for development and early production. Consider self-hosted nodes only when you need custom RPC methods, data privacy, or have reached a scale where provider costs exceed self-hosting costs (typically >100M requests/month).
Raw blockchain data is not application-friendly. To answer a simple question like "show me all NFTs owned by this address," you would need to scan every block, every transaction, and every event log from the beginning of the chain -- billions of operations. Indexers solve this by pre-processing blockchain data into structured databases that your application can query efficiently.
Without an indexer:
Query: "Get all Uniswap V3 swaps for USDC/ETH in the last 24 hours"
Process: Scan ~7,200 blocks, filter ~50,000 transactions, decode ~10,000 event logs
Time: 30-120 seconds
Cost: Millions of RPC calls
With an indexer:
Query: "Get all Uniswap V3 swaps for USDC/ETH in the last 24 hours"
Process: Single GraphQL/SQL query against indexed database
Time: 50-200 milliseconds
Cost: One API call
Major Indexing Solutions Compared
Solution
Approach
Query Language
Chains
Pricing
Best For
The Graph
Subgraphs (AssemblyScript)
GraphQL
40+
Free (decentralized) or hosted
DeFi protocols, broad chain support
Goldsky
Mirror pipelines + subgraphs
GraphQL + SQL
30+
$0-$500+/month
Real-time data, custom pipelines
Ponder
TypeScript framework
GraphQL
EVM chains
Self-hosted (free)
Developer experience, type safety
Envio
HyperIndex (Rust/TS)
GraphQL
30+
$0-$300+/month
Fast sync, real-time processing
Subsquid
Archives + processors
GraphQL
100+
$0-$500+/month
Multi-chain, raw data access
Dune Analytics
SQL-based analytics
SQL (DuneSQL)
20+
$0-$399+/month
Analytics dashboards, community
Covalent (GoldRush)
Unified API
REST/GraphQL
200+
$0-$499+/month
Multi-chain data, unified schema
Moralis
Streams + API
REST
20+
$0-$299+/month
Rapid development, Web2 developers
The Graph: Deep Dive
The Graph is the most widely used decentralized indexing protocol. It processes over 1 billion queries per month and indexes data for protocols like Uniswap, Aave, Compound, and Synthetix.
How The Graph works:
•Define a subgraph: Write a schema (GraphQL), data sources (contract addresses and ABIs), and mapping handlers (AssemblyScript functions that process events)
•Deploy: Publish your subgraph to The Graph Network (decentralized) or hosted service
•Index: Indexer nodes process historical and new blockchain data according to your mappings
•Query: Your application queries the subgraph via GraphQL
•Batch requests: Use JSON-RPC batching to combine multiple calls into a single HTTP request
•Multicall contracts: Use Multicall3 to batch on-chain reads into a single RPC call (reading 100 token balances in one call instead of 100 separate calls)
•Caching: Cache RPC responses in Redis with appropriate TTLs (block data: cache until new block; static data: cache indefinitely)
•Connection pooling: Reuse HTTP/WebSocket connections to reduce latency overhead
2. Indexer Performance
•Selective indexing: Only index the events and contracts your application needs
•Batch processing: Process events in batches rather than one at a time
•Read replicas: Use PostgreSQL read replicas to scale read-heavy workloads
•Materialized views: Pre-compute complex aggregations for frequently accessed data
3. Frontend Optimization
•Optimistic updates: Update the UI immediately on transaction submission, then reconcile with on-chain state
•Progressive loading: Load critical data first (balances, positions), then secondary data (history, analytics)
•WebSocket for real-time: Use WebSocket subscriptions for price feeds and transaction confirmations instead of polling
•Smart caching: Cache static data (token metadata, contract ABIs) aggressively; cache dynamic data (balances, prices) with short TTLs
Cost Optimization Strategies
1. RPC Cost Reduction
| Strategy | Savings | Implementation |
|---|---|---|
| Response caching (Redis) | 40-60% fewer RPC calls | Cache balances, prices with TTL |
| Multicall batching | 80-95% fewer calls | Use Multicall3 for read operations |
| WebSocket subscriptions | 50-70% vs polling | Replace polling with subscriptions for real-time data |
| Provider tiering | 20-40% cost reduction | Use free tier for non-critical calls, paid tier for production |
| Request deduplication | 10-30% fewer calls | Deduplicate identical concurrent requests |
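Request deduplication, the last strategy above, is often implemented as a "single-flight" wrapper: concurrent callers asking for the same key share one in-flight request. The following is a minimal asyncio sketch; the class name and the key format are assumptions.

```python
import asyncio
from typing import Any, Awaitable, Callable


class RequestDeduplicator:
    """Share one in-flight request among concurrent callers that use the same key."""

    def __init__(self) -> None:
        self._in_flight: dict[str, "asyncio.Task[Any]"] = {}

    async def run(self, key: str, fetch: Callable[[], Awaitable[Any]]) -> Any:
        existing = self._in_flight.get(key)
        if existing is not None:
            # Another caller already started this request: wait for its result.
            return await existing
        task = asyncio.ensure_future(fetch())
        self._in_flight[key] = task
        try:
            return await task
        finally:
            # Forget the request once it settles so later calls fetch fresh data.
            self._in_flight.pop(key, None)
```

A reasonable key is the method name plus its serialized parameters, for example `"eth_getBalance:0xabc...:latest"`. Unlike a cache, this keeps nothing after the request completes, so it cannot serve stale data.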
2. Indexing Cost Reduction
| Strategy | Savings | Implementation |
|---|---|---|
| Selective event indexing | 50-80% less data processed | Only index events your app actually queries |
| Ponder (self-hosted) | 100% vs hosted indexing fees | Run alongside your app, PostgreSQL backend |
| Shared subgraphs | Split costs across multiple apps | Use community-deployed subgraphs for common protocols |
| Time-bounded queries | 30-50% less data stored | Only store recent data, archive older data to cold storage |
Security Considerations for Web3 Infrastructure
RPC Security
•Keep privileged RPC keys off the client: Proxy RPC calls through your backend wherever possible to prevent API key leakage and rate limit abuse
•Use separate keys for frontend and backend: Where the browser must call a provider directly, give it its own key with tighter rate limits, monitoring, and access controls
•Implement request validation: Validate that incoming RPC requests are well-formed and within expected parameters
•Monitor for anomalies: Track RPC call patterns and alert on unusual spikes (may indicate an attack)
•Use HTTPS exclusively: Never send transactions over unencrypted connections
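Request validation in an RPC proxy can start with a method allowlist and a batch-size cap. The sketch below is one possible shape: the allowlist contents and the `MAX_BATCH_SIZE` value are assumptions to tune for your application.

```python
from typing import Any

# Methods the proxy is willing to forward (assumed list; adjust per application).
ALLOWED_METHODS = frozenset({
    "eth_blockNumber",
    "eth_call",
    "eth_getBalance",
    "eth_getTransactionReceipt",
    "eth_sendRawTransaction",
})
MAX_BATCH_SIZE = 20  # assumed cap to limit abuse through oversized batches


def validate_rpc_request(payload: Any) -> list[str]:
    """Return a list of problems; an empty list means the request may be forwarded."""
    calls = payload if isinstance(payload, list) else [payload]
    if not calls:
        return ["empty batch"]
    if len(calls) > MAX_BATCH_SIZE:
        return [f"batch of {len(calls)} exceeds limit of {MAX_BATCH_SIZE}"]
    problems: list[str] = []
    for index, call in enumerate(calls):
        if not isinstance(call, dict):
            problems.append(f"call {index}: not a JSON object")
            continue
        if call.get("jsonrpc") != "2.0":
            problems.append(f"call {index}: jsonrpc must be '2.0'")
        method = call.get("method")
        if not isinstance(method, str) or method not in ALLOWED_METHODS:
            problems.append(f"call {index}: method {method!r} is not allowed")
        # Ethereum JSON-RPC methods take positional parameters.
        if not isinstance(call.get("params", []), list):
            problems.append(f"call {index}: params must be a list")
    return problems
```

Rejecting anything outside the allowlist keeps expensive or sensitive methods (tracing, admin, and debug namespaces) from being reached through your key.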
Indexer Security
•Validate indexed data: Cross-reference critical indexed data with direct RPC calls to prevent data poisoning
•Implement access controls: Not all indexed data should be publicly queryable
•Monitor indexer health: Alert if indexing falls behind the chain head by more than a few blocks
•Handle chain reorganizations: Your indexer must correctly handle reorgs (reverted blocks) to prevent displaying incorrect data
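The last two checks can be sketched with plain data structures. The example assumes the indexer keeps a map of block number to block hash for recently processed blocks and that block numbers are already decoded to integers; the function names and the five-block lag threshold are illustrative.

```python
from typing import Callable


def is_reorg(stored_hashes: dict[int, str], new_block: dict) -> bool:
    """True when new_block does not build on the block we stored at number - 1."""
    parent = stored_hashes.get(new_block["number"] - 1)
    return parent is not None and parent != new_block["parentHash"]


def rollback_to_common_ancestor(
    stored_hashes: dict[int, str],
    canonical_hash_at: Callable[[int], str],
    head: int,
) -> int:
    """Drop stored blocks from `head` downward until one matches the canonical chain.

    Returns the height of the last block that is still valid; data derived from
    the dropped blocks must be reverted and those heights re-indexed.
    """
    height = head
    while height in stored_hashes and stored_hashes[height] != canonical_hash_at(height):
        del stored_hashes[height]
        height -= 1
    return height


def is_lagging(indexed_head: int, chain_head: int, max_lag_blocks: int = 5) -> bool:
    """True when the indexer trails the chain head by more than the allowed lag."""
    return chain_head - indexed_head > max_lag_blocks
```

In practice `canonical_hash_at` would wrap an RPC call such as `eth_getBlockByNumber`, and only as many recent hashes as the deepest reorg you expect to handle need to be kept.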
The Future of Web3 Infrastructure (2026 and Beyond)
Emerging Trends
•Decentralized RPC networks: Projects like Lava Network, Pocket Network, and dRPC are building decentralized alternatives to centralized RPC providers, offering censorship resistance and geographic distribution.
•Real-time indexing: The gap between block production and indexed data availability is shrinking. Solutions like Goldsky Streams and Envio HyperIndex offer sub-second indexing for many use cases.
•ZK-proven data: Zero-knowledge proofs are being applied to historical blockchain data, enabling trustless verification without re-executing transactions. Axiom and Herodotus are leading this space.
•Account abstraction infrastructure: ERC-4337 bundlers, paymasters, and entry point contracts are creating a new infrastructure layer for smart account wallets.
•Modular data availability: As Ethereum's rollup-centric roadmap matures, data availability layers (Celestia, EigenDA, Avail) are becoming critical infrastructure components that DApp builders need to understand.
•AI-powered infrastructure: Machine learning is being applied to RPC load balancing, MEV protection, and anomaly detection in blockchain data.
Frequently Asked Questions
Do I need to run my own blockchain node?
For most DApp builders, no. Managed RPC providers (Alchemy, Infura, QuickNode) offer better uptime, lower cost, and zero maintenance compared to self-hosted nodes. Consider self-hosting only if you need custom RPC methods, require extreme data privacy, or process more than 100 million requests per month, the volume at which self-hosting becomes cost-effective.
What is the difference between an RPC provider and an indexer?
An RPC provider gives you real-time access to the current blockchain state -- like checking a balance or submitting a transaction. An indexer processes historical blockchain data into structured databases, enabling complex queries like "show me all swaps on Uniswap in the last 24 hours sorted by volume." RPC calls are for real-time operations; indexers are for historical queries and aggregations.
How much does Web3 backend infrastructure cost?
It depends on the stage of the project. An early-stage DApp typically spends $0-$100 per month using the free tiers of RPC providers and enhanced APIs. A growth-stage application spends $100-$1,000 per month once it adds custom indexing (The Graph or Ponder) and caching (Redis). A production DeFi protocol spends $2,000-$10,000+ per month on dedicated RPC tiers, custom indexing pipelines, monitoring, and automation.
Which indexing solution should I choose?
The Graph is the default choice for broad chain support and a decentralized data layer. Ponder is best for TypeScript developers who want local development with hot reloading. Goldsky and Envio are best for real-time data requirements. For complex, multi-source analytics, a custom indexing pipeline with PostgreSQL or ClickHouse provides the most flexibility.
How do I handle multiple blockchain networks?
Use a single RPC provider that supports all your target chains (Alchemy and QuickNode each support 30-50+ chains). For indexing, deploy chain-specific subgraphs or use a unified API like Covalent that normalizes data across chains. Build a chain abstraction layer in your backend that routes requests to the appropriate chain-specific services.
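A chain abstraction layer can be as simple as a registry keyed by chain ID that the rest of the backend consults. The sketch below uses real chain IDs (1 for Ethereum mainnet, 8453 for Base, 42161 for Arbitrum One), but every URL is a placeholder and the `ChainConfig` shape is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChainConfig:
    name: str
    rpc_url: str
    subgraph_url: str


# All URLs are placeholders (assumptions); load real ones from configuration or secrets.
CHAINS: dict[int, ChainConfig] = {
    1: ChainConfig("ethereum", "https://rpc.example.invalid/eth",
                   "https://graph.example.invalid/eth"),
    8453: ChainConfig("base", "https://rpc.example.invalid/base",
                      "https://graph.example.invalid/base"),
    42161: ChainConfig("arbitrum", "https://rpc.example.invalid/arb",
                       "https://graph.example.invalid/arb"),
}


def resolve_chain(chain_id: int) -> ChainConfig:
    """Look up the services for a chain, failing loudly on unsupported chains."""
    try:
        return CHAINS[chain_id]
    except KeyError:
        raise ValueError(f"unsupported chain id: {chain_id}") from None
```

Request handlers then call `resolve_chain(chain_id)` and never hard-code provider URLs, so adding a chain becomes a configuration change.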
What is the best RPC provider for Solana?
Helius and Triton are the top Solana-specific providers. Helius offers enhanced APIs (DAS API for NFTs, priority fee API, webhooks) with generous free tiers. Triton provides dedicated infrastructure for applications requiring the lowest latency. General providers like Alchemy and QuickNode also support Solana but with fewer specialized features.
How do I ensure my infrastructure is reliable?
Use multiple RPC providers with automatic failover (primary + backup). Implement health checks that verify response accuracy, not just availability. Set up monitoring and alerting for response times, error rates, and indexing lag. For critical applications, deploy infrastructure across multiple geographic regions and cloud providers.
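The failover pattern described above can be sketched as a loop over providers in priority order that treats an invalid response the same as an outage. Provider names and the `is_valid` check are assumptions; each provider is passed in as a plain callable so the transport stays out of the example.

```python
from typing import Any, Callable, Sequence


class AllProvidersFailed(Exception):
    """Raised when no provider returned a valid response."""


def call_with_failover(
    providers: Sequence[tuple[str, Callable[[], Any]]],
    is_valid: Callable[[Any], bool] = lambda result: result is not None,
) -> tuple[str, Any]:
    """Try providers in priority order; return (provider_name, result) from the first good one."""
    errors: list[str] = []
    for name, call in providers:
        try:
            result = call()
        except Exception as exc:  # network errors, timeouts, HTTP failures
            errors.append(f"{name}: {exc}")
            continue
        if is_valid(result):
            return name, result
        # The provider responded, but the answer failed the accuracy check.
        errors.append(f"{name}: invalid response {result!r}")
    raise AllProvidersFailed("; ".join(errors))
```

A stricter `is_valid` could, for instance, reject a block number that is far behind the one reported by another provider, which catches nodes that are reachable but out of sync.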
What is a subgraph and how do I build one?
A subgraph is a custom data indexing specification for The Graph protocol. It defines which smart contract events to index and how to transform them into queryable entities. You write a GraphQL schema (data model), a manifest (which contracts and events to index), and mapping handlers (AssemblyScript functions that process events). The Graph Network then indexes your data and serves it via a GraphQL API.