Technical Details

How We Ensure Fair and Accurate RPC Performance Comparisons with Enhanced Reliability

🔬

Research & Educational Tool

This demo is designed for research and educational purposes to demonstrate RPC performance comparison methodologies. Results may vary based on network conditions and should not be considered definitive benchmarks. Bugs may occur as this is an experimental tool.

🎯 Testing Methodology Overview

Our enhanced RPC performance testing system provides as fair and accurate a comparison as possible between different Ethereum RPC providers. We test Direct against industry leaders Alchemy, Infura, and QuickNode using scientifically rigorous methods with advanced cache-busting, reliability monitoring, and comprehensive statistical analysis.

Key Principles

  • Maximum Fairness: All providers tested under identical conditions with cache busting
  • True Parallelism: Simultaneous execution with randomized order to eliminate bias
  • Statistical Rigor: 50 calls with comprehensive metrics (median, 95th percentile, max)
  • Persistent Storage: Results saved locally for historical comparison
  • Transparency: Open methodology with detailed fairness and reliability metrics

⚖️ Enhanced Reliability & Fairness Approach

We use an advanced testing methodology with cache-busting, reliability monitoring, and comprehensive statistical analysis to ensure maximum fairness and eliminate potential biases that could skew performance comparisons.

1. 🔥 Connection Pre-Warming

Before each test, we perform dummy calls to all providers to establish connections and ensure they start from equal footing.

// Warm up every provider so no one pays connection-setup cost during the timed calls
await Promise.allSettled([
  directClient?.getChainId(),
  alchemyClient.getChainId(),
  infuraClient.getChainId(),
  quicknodeClient.getChainId()
]);

2. 🎲 Randomized Execution Order

For each test, we randomize the order in which providers are executed using the Fisher-Yates shuffle algorithm to prevent systematic bias.

Why this matters: If Direct always went first, it might have an advantage from being the first to establish connections or hit cached data.
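
A minimal sketch of the shuffle; the provider names and the shuffle function name are illustrative, not the exact identifiers in the test harness:

// Fisher-Yates shuffle: every ordering of providers is equally likely
function shuffle<T>(items: T[]): T[] {
  const result = [...items];
  for (let i = result.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [result[i], result[j]] = [result[j], result[i]];
  }
  return result;
}

const executionOrder = shuffle(['direct', 'alchemy', 'infura', 'quicknode']);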

3. ⚡ Shared Timing Reference

All providers are started simultaneously using Promise.allSettled() with a single global start time reference for precise timing measurements.

// One shared timestamp so every provider is measured against the same starting point
const globalStartTime = performance.now();
// Fire all provider requests at the same moment
const results = await Promise.allSettled(testPromises);

4. 🚫 Cache Busting & Reliability

We implement multiple cache-busting techniques to ensure accurate measurements and detect potentially cached responses.

Techniques: Randomized test addresses, no-cache headers, varied block numbers, and suspicious timing detection (<0.1ms responses).
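
As an illustration, the cache busting can look roughly like this; the address pool, block offset, and header set below are assumptions, not the exact values used by the demo:

// Rotate test addresses so no single key stays "hot" in a provider-side cache
const testAddresses = [
  '0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045', // Vitalik's address, used by the balance test
  // ...additional addresses
];
const address = testAddresses[Math.floor(Math.random() * testAddresses.length)];

// Vary the queried block instead of always asking for the same one
const latest = await alchemyClient.getBlockNumber();
const blockNumber = latest - BigInt(Math.floor(Math.random() * 10));

// Ask the HTTP layer not to serve cached responses (passed to the viem http transport)
const fetchOptions = { headers: { 'Cache-Control': 'no-cache' } };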

5. 📊 Fairness Verification

We measure the delta between each provider's actual start time and log a warning if fairness is compromised.

Fairness Score: We calculate a 0-100 fairness score where 100 is perfect. Timing deltas <1ms = excellent fairness, >5ms = potential bias warning.
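
The exact scoring formula is internal to the demo, but one plausible mapping from start-time skew to a 0-100 score (purely an assumption for illustration) looks like this:

// Map the worst start-time delta (ms) between providers to a 0-100 fairness score
function fairnessScore(startTimes: number[]): number {
  const delta = Math.max(...startTimes) - Math.min(...startTimes);
  if (delta < 1) return 100; // excellent fairness
  if (delta > 5) console.warn(`Potential bias: start delta of ${delta.toFixed(2)}ms`);
  // Assumed linear penalty: lose 10 points per millisecond of skew, floored at 0
  return Math.max(0, Math.round(100 - delta * 10));
}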

🧪 Test Types & Complexity

We test 5 different types of blockchain operations to simulate real-world usage patterns, from simple balance checks to complex DeFi interactions.

Simple Tests

💰 Balance Check

Tests basic eth_getBalance RPC call performance

Target: Vitalik's address (0xd8dA...6045)
Method: getBalance()

📦 Block Number

Tests eth_blockNumber to get latest block number

Target: Latest block number
Method: getBlockNumber()

Medium & Complex Tests

🪙 WETH Contract

Tests contract interaction with eth_call

Contract: WETH on Sepolia
Method: totalSupply()

🦄 Uniswap V3

Tests complex DeFi interactions with multiple parallel calls

Calls: Factory owner, fee spacing, position manager
Complexity: 3 parallel contract calls

🏦 Aave Protocol

Tests lending protocol interactions

Calls: Pool address, price oracle
Complexity: 2 parallel contract calls
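
For reference, these tests map onto standard Viem.js calls along the following lines; the client setup is simplified and the WETH address is a placeholder, not the exact Sepolia contract used by the demo:

import { createPublicClient, http, erc20Abi } from 'viem';
import { sepolia } from 'viem/chains';

const client = createPublicClient({ chain: sepolia, transport: http() });

// Simple tests: eth_getBalance and eth_blockNumber
const balance = await client.getBalance({ address: '0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045' });
const latestBlock = await client.getBlockNumber();

// Contract test (eth_call): WETH totalSupply(); replace with the demo's Sepolia WETH address
const wethAddress = '0x0000000000000000000000000000000000000000';
const totalSupply = await client.readContract({
  address: wethAddress,
  abi: erc20Abi,
  functionName: 'totalSupply',
});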

📈 Statistical Analysis

Our comprehensive testing approach ensures statistically significant results through large sample sizes and rigorous analysis.

Sample Size

  • 50 total calls per test run
  • 10 iterations of each test type
  • 5 test types covering different complexities
  • 4 providers tested simultaneously

Detailed Metrics

  • Speed multipliers (how much faster the winner is)
  • Win/loss ratios for each provider
  • Mean, median, 95th percentile, max response times
  • Standard deviation for consistency analysis
  • Fairness scores and timing deltas
  • Cache hit detection and reliability metrics

Why 50 Calls?

This sample size provides meaningful performance insights while being fast enough for interactive testing. Results are automatically saved to your browser's local database for comparison across multiple test runs.
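
Structurally, a run is a simple nested loop over test types and iterations; the helper names below are hypothetical:

// 5 test types x 10 iterations = 50 timed calls per provider per run
for (const testType of ['balance', 'blockNumber', 'weth', 'uniswap', 'aave']) {
  for (let i = 0; i < 10; i++) {
    const order = shuffle(providers); // fresh randomized order every iteration
    await runTestForAllProviders(testType, order); // hypothetical helper: one timed call per provider
  }
}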

🏆 Live Performance Score Methodology

Our live performance scoring system provides real-time insights into overall RPC provider performance by calculating comprehensive averages across all completed test runs.

📊 Average Response Time Analysis

Instead of just counting individual test wins, we calculate the average response time for each provider across all completed calls to determine overall performance.

Example: If Direct averages 150ms and Alchemy averages 600ms across all tests, Direct is consistently 4x faster regardless of individual test outcomes.

🥇 Three-Way Performance Comparison

We provide three key performance metrics for comprehensive analysis:

  • vs average: Main metric comparing fastest to average of all other providers
  • vs slowest: Shows maximum performance advantage
  • vs 2nd place: Compares fastest to the second-fastest provider
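
In code, these three comparisons reduce to ratios over each provider's average response time; a sketch, with the input shape assumed:

// averages: mean response time in ms per provider, e.g. { direct: 150, alchemy: 600, ... }
function compareFastest(averages: Record<string, number>) {
  const sorted = Object.values(averages).sort((a, b) => a - b);
  const [fastest, secondPlace] = sorted;
  const slowest = sorted[sorted.length - 1];
  const others = sorted.slice(1);
  const avgOfOthers = others.reduce((sum, v) => sum + v, 0) / others.length;
  return {
    vsAverage: avgOfOthers / fastest, // e.g. 4.0 means "4x faster than the rest of the field"
    vsSlowest: slowest / fastest,
    vsSecondPlace: secondPlace / fastest,
  };
}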

📈 Outlier Impact & Statistical Robustness

Our average-based approach captures the full performance picture, including outliers that affect real-world user experience.

Why averages matter: A provider might win 90% of individual tests but have occasional slow responses (500ms+) that significantly impact the user experience. Our methodology captures this real-world variability.

⚡ Real-Time Performance Tracking

The live performance score updates in real-time as tests complete, providing immediate feedback on provider performance trends.

Live insights: Watch how performance metrics evolve as more data is collected, revealing consistent patterns and temporary fluctuations.

🎯 Why This Approach is More Accurate

Traditional "win counting" can be misleading when one provider wins by 1ms while losing by 500ms. Our average-based scoring reflects actual user experience by weighing all response times equally, providing a more realistic view of consistent performance across varied network conditions.

🔍 Transparency & Verification

We believe in complete transparency. Every aspect of our testing methodology is open for inspection and verification.

🎲 Execution Order Logging

Each test logs the randomized execution order so you can verify that no provider consistently gets an advantage from execution position.

⏱️ Timing Delta Monitoring

We measure and display the timing difference between when each provider starts executing. Values <1ms indicate excellent fairness.

📊 Detailed Results

Every single call result is stored and can be analyzed, including success/failure rates, exact timing measurements, and error details.

🔬 Open Source Methodology

Our testing code is open for inspection. You can see exactly how we ensure fairness and calculate performance metrics.

💾 Local Data Storage

All test results are automatically saved to your browser's IndexedDB with complete metadata including timestamps, fairness scores, and statistical breakdowns. Data stays on your device and is never sent to external servers.

🔍 Reliability Monitoring

We actively monitor for suspicious results including responses faster than 0.1ms (potential cache hits), excessive speed multipliers (>50x), and timing inconsistencies.
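
A sketch of what these checks amount to; the thresholds come from the description above, and the result shape is assumed:

interface CallResult { provider: string; durationMs: number }

function flagSuspiciousResults(results: CallResult[]): string[] {
  const warnings: string[] = [];
  const durations = results.map(r => r.durationMs);
  const fastest = Math.min(...durations);
  const slowest = Math.max(...durations);
  for (const r of results) {
    if (r.durationMs < 0.1) warnings.push(`${r.provider}: <0.1ms response, possible cache hit`);
  }
  if (slowest / fastest > 50) warnings.push(`Speed multiplier of ${(slowest / fastest).toFixed(1)}x exceeds 50x`);
  return warnings;
}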

🌐 Network Conditions & Environment

Test Environment

  • Network: Ethereum Sepolia Testnet
  • Client: Browser-based (Viem.js v2.37.13)
  • Direct: @direct.dev/wagmi v0.6.3
  • Protocol: HTTPS/HTTP2 with no-cache headers
  • Location: User's browser location
  • Storage: IndexedDB for persistent results

Controlled Variables

  • Same network: All providers use Sepolia testnet
  • Same client: Identical Viem.js configuration with cache busting
  • Same timing: Simultaneous execution with randomized order
  • Varied data: Rotated addresses and block numbers to prevent caching
  • Same conditions: Pre-warmed connections for fair starting points
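
A minimal sketch of the shared client setup under these conditions; the endpoint URLs and API keys are placeholders, and the no-cache headers are attached through the viem http transport's fetchOptions:

import { createPublicClient, http } from 'viem';
import { sepolia } from 'viem/chains';

const noCache = { headers: { 'Cache-Control': 'no-cache' } };

// Identical configuration for every provider; only the endpoint URL differs
const alchemyClient = createPublicClient({
  chain: sepolia,
  transport: http('https://eth-sepolia.g.alchemy.com/v2/<API_KEY>', { fetchOptions: noCache }),
});
const infuraClient = createPublicClient({
  chain: sepolia,
  transport: http('https://sepolia.infura.io/v3/<API_KEY>', { fetchOptions: noCache }),
});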

⚠️ Real-World Variability

Results may vary based on your geographic location, internet connection, and current network conditions. Our enhanced methodology accounts for this variability through statistical analysis, cache busting, and reliability monitoring. All results are saved locally for comparison across different conditions and time periods.

📊 Comprehensive Statistical Analysis

Our testing system calculates detailed statistical metrics to provide comprehensive performance insights beyond simple averages.

Core Metrics

  • Mean: Average response time across all calls
  • Median: Middle value (more robust than mean)
  • 95th Percentile: 95% of calls were faster than this time (tail-latency indicator)
  • Minimum: Fastest single response time recorded
  • Standard Deviation: Consistency measurement

Performance & Reliability Metrics

  • Overall Speed vs 2nd Place: Main performance metric based on average response times
  • Overall Speed vs Slowest: Maximum performance advantage across all providers
  • Overall Speed vs Average: Performance compared to the mean of all providers
  • Fairness Score: 0-100 scale measuring test quality (timing consistency)
  • Cache Hit Rate: % of suspiciously fast responses (<0.1ms, likely cached)
  • Timing Delta: Maximum start time difference between providers
  • Execution Order: Randomized provider sequence for each test

Why These Metrics Matter

Overall Speed vs 2nd Place is our primary metric showing real-world performance advantage. Median is more reliable than mean for individual test analysis. 95th percentile shows tail latency (95% of calls were faster than this). Standard deviation indicates consistency: lower values mean more predictable performance. A Cache Hit Rate of 0.0% means all calls were genuine network requests (no caching bias). Our average-based scoring captures the full user experience, including outliers.
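
The core statistics themselves are standard; a compact sketch of how they can be computed from the raw per-call timings:

// Summarize one provider's call durations (nearest-rank percentile, upper median for even counts)
function summarize(durationsMs: number[]) {
  const sorted = [...durationsMs].sort((a, b) => a - b);
  const mean = sorted.reduce((sum, v) => sum + v, 0) / sorted.length;
  const median = sorted[Math.floor(sorted.length / 2)];
  const p95 = sorted[Math.min(sorted.length - 1, Math.floor(sorted.length * 0.95))];
  const variance = sorted.reduce((sum, v) => sum + (v - mean) ** 2, 0) / sorted.length;
  return { mean, median, p95, min: sorted[0], max: sorted[sorted.length - 1], stdDev: Math.sqrt(variance) };
}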

💾 Local Storage & Historical Analysis

Every test run is automatically saved to your browser's local database, enabling historical comparison and trend analysis over time.

🗄️ IndexedDB Storage

We use IndexedDB (a robust browser database) to store complete test results including individual call timings, statistical summaries, fairness scores, and metadata. This enables offline access and long-term performance tracking.
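
A bare-bones sketch of persisting a completed run with the raw IndexedDB API; the database and object store names are assumptions:

// Open (or create) the results database and store one completed run keyed by timestamp
function saveRun(run: { timestamp: number; fairnessScore: number; results: unknown[] }) {
  const request = indexedDB.open('rpc-benchmarks', 1);
  request.onupgradeneeded = () => {
    request.result.createObjectStore('runs', { keyPath: 'timestamp' });
  };
  request.onsuccess = () => {
    const tx = request.result.transaction('runs', 'readwrite');
    tx.objectStore('runs').add(run);
    tx.oncomplete = () => request.result.close();
  };
}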

📈 Historical Comparison

Compare performance across different time periods, network conditions, and Direct versions. Track how RPC performance changes over time and identify patterns in provider reliability.

🔒 Privacy First

All data stays on your device. No test results, timing data, or performance metrics are ever transmitted to external servers. You have complete control over your testing data.

⚠️ Limitations & Considerations

While we strive for maximum fairness, there are some inherent limitations to browser-based RPC testing that users should be aware of.

Browser Environment

Tests run in your browser's JavaScript environment, which has inherent limitations compared to server-side testing. However, this reflects real-world dApp usage.

Understanding Extreme Speed Differences

When you see Direct responding in 0.5-3ms while others take 100-300ms, this typically indicates:
  • Connection optimization: Direct's enhanced connection handling
  • Geographic proximity: Direct nodes closer to your location
  • Network routing: More efficient routing paths
  • Protocol optimization: Enhanced HTTP/2 multiplexing and compression

Network Variability

Internet conditions, geographic location, and ISP routing can affect results. Multiple test runs may show different winners based on current conditions.

Caching Effects

RPC providers may cache responses differently, which can affect performance. Our pre-warming helps minimize this, but some caching effects may remain.

Statistical Analysis

Our 50-call sample size provides meaningful insights while enabling rapid testing. We calculate comprehensive statistics including median (more robust than mean), 95th percentile (worst-case performance), and standard deviation (consistency). Results are automatically saved for historical comparison and trend analysis.