TL;DR
- Stake-weighted quality of service (SWQoS) prioritises transactions by the sender's stake, protecting the network from spam while ensuring reliable delivery for validators
- Born in response to the network outage in April 2022, SWQoS was a necessary safety net to prevent bots from overwhelming leaders and halting the network.
- The original 80/20 split reserved most connection slots for staked traffic, often leaving that capacity idle while unstaked nodes fought for the remaining 20%
- Staked senders were also throttled by TPS quotas proportional to their stake, even when the TPU was nearly empty
- High-latency connections were penalised by fixed receive windows that didn't account for round-trip time
- Already live in Agave 3.1: unstaked connection slots quadrupled from 500 to 2,000 (PR #9289), while staked slots stay at 2,000 in a separate pool, bringing the total to 4,000
- Agave 4.0 removes TPS throttling for staked senders entirely under normal load (PR #9580), with a 95% capacity safeguard that re-enables stake-proportional allocation during congestion
- Agave 4.0 also adds RTT-aware scaling (PR #10144), so high-latency connections are no longer penalised by fixed buffer sizes
- PR #10666 (targeting Agave 4.1) replaces sleep-based throttling with QUIC-level flow control: all staked senders will get ~10,000 TPS per connection, and allocation reverts to stake-proportional only during congestion
- The end goal is to let all senders, both staked and unstaked, send as many transactions as they want under normal load
- Until then, sending through Triton gives all customers SWQoS as a default fallback
- This spring, we're moving all customers' transaction traffic to Jet, our high-performance sending engine
- All Triton endpoints will get stake-weighted quality of service by default at no extra cost
Introduction
If you are building anything that sends transactions on Solana, you have likely faced a trade-off when it comes to SWQoS. You either peered with a high-stake validator to guarantee priority transaction delivery, or you accepted the reliability issues of unstaked lanes.
For the last two years, this system operated on a static partition model. While it did stop the network from collapsing under spam, it imposed new, rigid limits that resulted in inefficiencies.
We are writing this today because the SWQoS mechanism is shifting into a dynamic, congestion-aware system as part of the upcoming Agave 4.0 release, with Agave 4.1 bringing even bigger changes to how the protocol handles ingress.
The rest of this article breaks down the upcoming SWQoS changes in the Agave client and what they mean for trading firms, infrastructure providers, and application developers.
How SWQoS came to exist
Let's first look at what stake-weighted quality of service (SWQoS) is and why it was introduced in the first place.
Early networking constraints
In early 2022, Solana's networking layer relied heavily on raw UDP (User Datagram Protocol). UDP is fast but connectionless, with no handshakes, established sessions, inherent flow control, or identity verification for senders, which meant the network had very limited tools for dealing with abusive, bursty, excessive, and always anonymous traffic.
These limitations became clear during the April 2022 outage, when bots attempting to win NFT mints flooded the leader's Transaction Processing Unit (TPU) with almost 6 million transactions per second. Validators couldn't distinguish bot traffic from normal users and, with no way to filter it, a single legitimate transaction had the same chance of landing as one of 50,000 duplicates.
Naturally, it made the experience much worse for normal users. The network wasn't "down", but the noise level was so high that votes couldn't reliably get through, and the entire cluster ultimately halted.
The introduction of QUIC and stake-weighting
The first step was to change how traffic reached the leader. Solana moved the transport layer to QUIC, which added handshakes, sessions and flow control. Instead of just firing packets at the leader, senders now had to establish a connection that could be tracked, rate-limited and tied to a sender identity.
With that in place, Solana could control which traffic reached the TPU and in what share. The guiding logic was "skin in the game": 2,000 connections (80% of capacity) were allocated to staked nodes in proportion to their stake, while the remaining 500 were reserved for "anonymous" unstaked traffic. Whenever more than the allocated number of connections was in use, older connections would be closed ("evicted") to make room for new parties.
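As a rough sketch of that original split (the slot counts come from the text above; the function and its proportional formula are illustrative, not Agave's actual code):

```python
# Illustrative sketch of the original static SWQoS split (not Agave's real code).
STAKED_SLOTS = 2_000    # 80% of connection capacity, shared in proportion to stake
UNSTAKED_SLOTS = 500    # the remaining 20%, shared by anonymous senders

def staked_slot_share(node_stake: int, total_stake: int) -> int:
    """A staked node's proportional share of the reserved connection slots."""
    return (STAKED_SLOTS * node_stake) // total_stake

# A validator holding 1% of total stake gets ~20 of the 2,000 reserved slots,
# while every unstaked sender competes for the same 500 public slots.
print(staked_slot_share(node_stake=1_000_000, total_stake=100_000_000))  # 20
```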
This made it much harder for a malicious actor with zero stake to spin up thousands of clients and drown out honest participants.
It did, however, create a new problem. Most transactions were forwarded by RPC nodes with random identities, which had no way to leverage SWQoS without holding stake. One well-known solution is peering: validators allocate part of their stake budget to specific RPC identities, letting those RPCs use a priority lane as if they held that stake. But in practice, premium providers like Triton moved away from peering in favour of dedicated transaction sending software and services like Cascade, for higher speed and reliability.
At the same time, Solana rolled out fee-based execution priority, but that's a story for another blog post.
Limitations of the static 80/20 model
While the static SWQoS implementation protected the network, it introduced bottlenecks that hurt user experience. The 80/20 connection allocation stayed fixed regardless of real-time usage, driving frequent evictions of unstaked connections and repeatedly imposing the latency cost of fresh QUIC handshakes (a cost that didn't exist in the UDP era).
Reserved capacity vs actual usage. In previous versions (Agave v3.0 and prior), 80% of capacity was reserved strictly for staked validators. Even if those high-stake validators were not sending transactions, the barriers stayed up.
As a result, the 500 public connections were effectively gridlocked at all times, causing constant evictions and delayed or entirely failed transaction landing, while most of the network's ingress capacity sat idle to preserve a theoretical safety limit.
Connection churn. The combination of 500 unstaked connection slots and a limit of 8 connections per IP address introduced a different bottleneck: handshake churn. With this configuration, just 62 entities (500 / 8) could occupy all unstaked slots, leading to the following loop:
User connects → validator's 500 slots are full → validator evicts User → User immediately retries → validator evicts someone else to let User in.
Instead of forwarding transactions, RPC nodes were expending resources on repeated QUIC handshakes, which, from a developer's perspective, looked like random timeouts and increased latency on transaction submission.
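The arithmetic behind that churn is straightforward; a quick worked example using the limits quoted above:

```python
# Why just 62 entities could gridlock the pre-3.1 unstaked pool.
UNSTAKED_SLOTS = 500
MAX_CONNECTIONS_PER_IP = 8

# Minimum number of distinct senders (by IP) needed to occupy every slot.
entities_to_saturate = UNSTAKED_SLOTS // MAX_CONNECTIONS_PER_IP
print(entities_to_saturate)  # 62

# For comparison, after PR #9289 raises the pool to 2,000 slots:
print(2_000 // MAX_CONNECTIONS_PER_IP)  # 250
```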
Latency penalty. The validator's QUIC stream limits were fixed at values that assumed ~50 ms round-trip time. A sender with higher latency, say connecting from Europe to a US-based leader, would hit the stream ceiling sooner simply because their packets took longer to arrive, effectively reducing their throughput below what their stake or allocation would warrant.
Minimum stake threshold effect. To prevent low-stake validators from abusing staked connections and receiving disproportionate bandwidth, the protocol enforces a minimum stake threshold: to be considered "staked", a validator needs to hold ~1/50,000th of total network stake.
Under the static model, dropping even slightly below that line meant you were treated exactly like a zero-stake node and thrown back into the same public pool of 500 connections.
This created a steep barrier to entry for smaller operators trying to run high-performance infrastructure, and raised serious concerns about power concentrating around a small set of large, well-funded players.
Changes in Agave: rebalancing protection and ingress
Agave 4.0 marks a clear shift in Solana's networking approach, from "partition and protect" towards a model that attempts to maximise safe ingress. Here's how the protocol is evolving.
Quadrupling unstaked connection slots
While part of Agave 4.0, PR #9289 was also backported to Agave 3.1 and is already live on mainnet. It addresses the handshake churn by adjusting the connection limits.
- Total unstaked slots per validator increase from 500 to 2,000 (making the total 4,000, up from 2,500)
- That pushes the minimum number of entities required to saturate the unstaked pool from 62 to 250
The network will accommodate 4x as many distinct unstaked senders before hitting the same pressure point. This dramatically cuts the overhead spent on repeated handshakes and makes connections far more stable.
Latency compensation (RTT scaling)
In earlier Agave versions, the validator's QUIC stream limits assumed a round-trip time of roughly 50 ms. A sender connecting from further away hit the stream ceiling sooner, effectively losing throughput to latency alone.
PR #10144 fixes this by scaling the maximum concurrent QUIC streams in proportion to RTT, using 50 ms as a baseline. Connections with higher latency get proportionally larger stream limits, so distance from the leader no longer translates to lost throughput.
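A minimal sketch of that kind of RTT-proportional scaling (the 50 ms baseline is from the PR description; the base limit of 512 streams, the function name, and the exact formula are illustrative assumptions, not Agave's actual constants):

```python
BASELINE_RTT_MS = 50.0  # stream limits were originally tuned for ~50 ms RTT

def scaled_stream_limit(base_limit: int, rtt_ms: float) -> int:
    """Scale max concurrent QUIC streams in proportion to measured RTT,
    so a farther sender keeps the same effective throughput."""
    factor = max(rtt_ms / BASELINE_RTT_MS, 1.0)  # never shrink below baseline
    return int(base_limit * factor)

# A ~100 ms Europe-to-US connection gets twice the stream budget
# (512 is a hypothetical base limit used only for illustration).
print(scaled_stream_limit(base_limit=512, rtt_ms=100.0))  # 1024
```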
Removing TPS throttling for staked senders
PR #9580 is the most impactful change in Agave 4.0: it removes TPS throttling for staked senders entirely under normal load. In the old model, even when the TPU was nearly empty, staked senders still had to follow stake-proportional TPS quotas.
Now that staked senders can send unthrottled, the validator needs to monitor aggregate TPU throughput. If it crosses 95% of the configured max (~450,000 TPS, depending on configuration), the system falls back to stake-proportional allocation to protect staked traffic.
According to engineering reports, the fallback was never triggered during a three-week test on mainnet nodes, indicating that the software throttle itself, rather than actual demand, had been the bottleneck, and that it has now been removed.
Note: If the trigger activates, it stays on for only a few hundred milliseconds at a time, long enough to drain the backlog before checking load levels again and switching back off.
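The safeguard logic can be sketched roughly like this (the 95% threshold is from the PR; the configured max, the function name, and the return values are illustrative):

```python
MAX_TPS = 450_000          # example configured TPU capacity (varies per setup)
FALLBACK_THRESHOLD = 0.95  # stake-proportional mode re-engages above 95% load

def allocation_mode(observed_tps: int) -> str:
    """Unthrottled under normal load; stake-proportional near saturation."""
    if observed_tps >= MAX_TPS * FALLBACK_THRESHOLD:
        return "stake-proportional"
    return "unthrottled"

print(allocation_mode(200_000))  # unthrottled
print(allocation_mode(440_000))  # stake-proportional (440k >= 427.5k)
```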
Future protocol plans (PR #10666)
Looking ahead, PR #10666 (targeting Agave 4.1) proposes a fundamentally different approach to transaction processing.
Where the current system uses sleep-based delays to throttle senders (pausing a connection for up to 100 ms when quotas are exceeded), PR #10666 replaces that with QUIC-native flow control. Instead of the validator sleeping on a connection, it controls throughput by issuing QUIC stream credits to each sender. Each connection receives a credit budget scaled by round-trip time, so higher-latency senders aren't penalised for their distance from the leader. This also addresses the RTT blindness of earlier Agave versions, where a trader connecting from Frankfurt to a leader in New York could hit artificial throttling simply because the validator's buffers didn't account for latency.
Under normal conditions, any staked sender receives up to ~10,000 TPS per connection (1,024 stream credits at a 100 ms tick), regardless of their stake size, with the broader goal of letting all senders transmit freely when the TPU is not congested. When the TPU becomes congested, the system switches to stake-proportional allocation for staked senders (where your share scales with your stake) and immediately pauses unstaked traffic at 0 TPS until load drops, typically within milliseconds (the system rechecks every 10 ms).
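The ~10,000 TPS ceiling follows directly from the credit numbers quoted above:

```python
# Credit-based flow control: credits refreshed per tick cap per-connection TPS.
STREAM_CREDITS_PER_TICK = 1_024  # credits issued each tick
TICK_MS = 100                    # refresh interval

ticks_per_second = 1_000 // TICK_MS
max_tps_per_connection = STREAM_CREDITS_PER_TICK * ticks_per_second
print(max_tps_per_connection)  # 10240, i.e. ~10,000 TPS
```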
A second part of this work (PR #11459, also targeting 4.1) adds an additional safety layer. Because QUIC credits can't be revoked once issued, senders who received generous quotas before a congestion spike can keep sending even after the system switches to proportional mode. The second PR handles this by adding per-sender budgets: if a sender exhausts their budget while the system is under heavy load, their connection can be hard-closed as a last resort. Both PRs are still in active review.
Do you need a stake identity?
Anza's 4.1 plans aim to substantially improve throughput for all senders, but right now stake weight still makes a significant difference in transaction delivery.
If you're a Triton customer, this is already handled for you: all Triton endpoints include SWQoS by default, so every customer gets the protection of staked sending. Here's what it gives you:
Protection during congestion and attacks. When the TPU reaches saturation, whether from a DDoS attack or a demand spike, the system shifts to stake-weighted allocation, pausing unstaked traffic until load subsides (typically within 100 milliseconds). While normal users can wait for the load to clear, that difference can be critical for latency-sensitive workflows.
Connection stability. With 2,000 staked slots shared among ~900 active validators, staked connections face far less competition for slots than the thousands of unstaked senders sharing their pool, making eviction pressure significantly lower on a staked connection.
Unthrottled throughput. Staked senders will have no TPS limits under normal load. Anza is working to bring unstaked throughput substantially closer in 4.1, though during congestion, the system still reverts to stake-proportional allocation, giving staked senders the edge.
How to optimise your write path
When you talk about transaction landing, there are always two goals: current block inclusion (getting the leader to process your transaction as soon as it reaches its TPU) and low latency (the time it takes for your transaction to reach the leader's TPU in the first place).
The first is still achieved through stake-weighted routing and priority fees. The second is where Jet comes in.
Yellowstone Jet TPU client
In 2026, there is zero reason for a trading-related workload to rely on pure RPC transaction sending.
It adds two sources of latency. The first is processing overhead: regular RPC nodes are usually busy serving read traffic, so your transaction packet can sit in a queue for a few milliseconds before the node forwards it. The second is network distance: every extra hop adds delay, and if your backend, RPC node, and leader are not in the same city, you should rethink this setup.
Sending directly to the leader is the way to shrink latency to a minimum, but building a custom TPU client from scratch can be a lot of work. That's why we open-sourced and modularised our internal sending engine, Yellowstone Jet (read the deep dive), proven under millions of TPS in production.
Jet handles QUIC handshakes, connection caching without lock contention, transaction tracking, contact info overrides, sending to peers, multi-step identity updates, and Shield support (MEV-protection block and allowlists) out of the box.
But if you prefer a turnkey solution, we offer hosted Jet with stake-weighted routing included by default for all customer sendTransactions.
Conclusion
Anza's Sr. Networking Engineer, Alexander Pyattaev, summed up the SWQoS changes as follows: "They're not changing anything on the protocol level. We're just essentially fixing the system such that it behaves the way it was supposed to behave when it was first deployed two years ago".
He also shared the long-term target for the scheduler as a whole: "I would say that the ultimate goal is to be able to ingest every valid transaction that the internet can provide for us. So we would like to be able to ingest at line rate, preferably. If we have, let's say, 10 Gbits per second to spare, we should be able to ingest at 10 Gbits per second."
From Triton's side, we've already adapted to these changes. We retired Cascade Marketplace (learn more) because the combination of Jet's optimised routing and the Agave protocol improvements allows us to provide SWQoS to all our customers as part of the default service.
Frequently asked questions
What is SWQoS?
Stake-Weighted Quality of Service is a transaction traffic control mechanism, implemented via TPU configuration, that reserves 80% of a leader's capacity for packets coming from staked validators, in proportion to their stake. Unstaked peers share the remaining 20% and are typically limited to a maximum of 200 TPS per connection. SWQoS replaced the earlier first-come, first-served model in 2024; it reduced congestion and improved the network's resilience to spam and denial-of-service attacks, but also introduced strict and sometimes inefficient limits that Anza is now working to remove.
Is SWQoS being removed?
No, not in the Agave client. What's changing is the static, always-on 80/20 connection partition; SWQoS itself remains a core Solana protection mechanism. The split will now activate only when a leader's TPU goes above the 95% capacity threshold. If you're on Triton, SWQoS is included by default in all your transaction sending.
What is the minimum stake required for SWQoS?
The SWQoS threshold is still ~1/50,000 of the total network stake. At the time of writing, this is roughly 8,500 SOL. Validators below this threshold are treated as unstaked for SWQoS scheduling, and their traffic competes in the public pool.
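The ~8,500 SOL figure is just the ratio applied to total network stake (the total used below is an approximation implied by that figure, not an exact on-chain number):

```python
# SWQoS "staked" threshold: ~1/50,000th of total network stake.
TOTAL_STAKE_SOL = 425_000_000  # approximate total implied by the ~8,500 SOL figure
THRESHOLD_FRACTION = 1 / 50_000

threshold_sol = TOTAL_STAKE_SOL * THRESHOLD_FRACTION
print(threshold_sol)  # 8500.0
```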
How do the Agave 4.0 changes benefit smaller staked validators?
If your stake is below the SWQoS threshold, your conditions don't change much, apart from the general improvement from fewer arbitrary evictions for all users. If your validator has stake above about 8,500 SOL, you can now use more connections in practice. You keep your stake-proportional share of the reserved TPS capacity, plus you can also use the public pool of 2,000 connections.
Can staked connections be evicted (when sending directly to TPU)?
On an Agave node, 2,000 TPU QUIC connection slots are reserved for staked validators, of which there are ~900 active as of this writing. It's much harder for those ~900 validators to saturate 2,000 reserved slots than for thousands of unstaked senders to saturate theirs. A single staked connection per validator won't be evicted. If a staked sender opens multiple connections, eviction becomes possible once all 2,000 reserved slots are occupied, but that's a significantly higher bar.
Why am I seeing disconnects on a staked connection (when sending directly to TPU)?
If you are above the stake-threshold but still see frequent disconnects, a common cause is a Jito Relayer configuration issue. Some operators run a relayer version with old streamer behaviour that closes connections after about 5 seconds. In that case, the instability comes from the relayer implementation on that validator, not from the SWQoS protocol or your stake. It is also possible that you are trying to establish more than 16 connections using your stake identity. If you are using Jet, you can get detailed metrics on disconnections to understand the issue.
Will these changes lower transaction fees?
No. SWQoS is separate from transaction and priority fees. Priority fees control ordering and inclusion in the banking stage inside the TPU, which decides which transactions are actually executed and included in a block and in what order. SWQoS controls packet admission and connection capacity under high load.
You can pay high-priority fees and still fail to land if your packets are dropped before they enter the TPU, and you can have SWQoS treatment but still lose to a transaction that bids higher-priority fees once both are inside the banking stage.
What should I check if my transactions are failing?
- Priority fees: check your compute unit price and total fee budget against current network load and increase it if your success rate drops during spikes (Free priority fee checker / Improved priority fee API)
- Endpoint quality: if you are using a free/cheap RPC service, expect higher drop rates due to traffic competing at the RPC level, not the leader.
- Connection type: confirm whether you are using a staked connection. If you're on Triton, you already have stake-weighted routing included by default. If you're sending via another provider without SWQoS, expect occasional (now much less frequent) connection evictions
Where in the pipeline is SWQoS applied?
SWQoS only takes effect once packets reach the current leader's TPU ingress, before sigverify and banking. At that point, the leader allocates capacity between staked peers and the public pool according to SWQoS rules and configuration.