
Side-by-side comparison of Etched and Tenstorrent — pricing, features, and use cases. Reviewed by our editorial team in Jun 2026.


Etched Sohu and Tenstorrent Blackhole represent fundamentally different bets on how to displace NVIDIA in AI inference hardware as of mid-2026.
Etched has chosen extreme specialization, burning transformer attention operations directly into silicon on TSMC 4nm, abandoning programmability entirely to squeeze maximum throughput per watt for dense transformer workloads.
Tenstorrent has chosen the opposite path: an open RISC-V architecture, fully programmable Tensix cores, on-chip SRAM optimization, and a purpose-built open-source software stack (TT-Metal).
Etched claims unmatched single-model throughput: 500,000 tokens/sec on Llama 70B with eight chips at 90%+ FLOPS utilization.
But those numbers remain unverified in production, the chip is not yet shipping to customers as of April 2026, and the architecture cannot run mixture-of-experts models, state-space models, or any other workload outside dense transformers.
The trade-off is existential: if transformers remain dominant for 5+ years, Sohu's cost-per-token advantage could compound; if the industry shifts to MoE or SSMs, the hardware becomes obsolete.
Tenstorrent Blackhole, shipping in volume as of April 2026, offers programmability, backward compatibility with open frameworks, and the ability to run diverse workloads from LLM inference to video generation.
Its Galaxy Blackhole system (32 chips) delivers 23.8 petaFLOPS FP8 and deploys over an Ethernet mesh without proprietary interconnect. However, per-node memory bandwidth is lower (GDDR6 vs HBM3E), the software stack is younger than CUDA, and raw performance still lags NVIDIA H100/B200 on high-batch workloads.
Etched targets pure-transformer shops at hyperscale where architectural lock-in is acceptable; Tenstorrent targets research labs, sovereign AI programs requiring open stacks, and teams willing to trade raw throughput for flexibility and reduced vendor dependence.
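The system-level figures above are easier to compare once divided per chip. The sketch below does only that arithmetic, assuming throughput and FLOPS scale linearly across chips, which real clusters rarely achieve exactly. The inputs are the vendor-claimed numbers quoted above, not independent measurements, and the helper name `per_chip` is illustrative.

```python
# Figures quoted in the comparison above (vendor claims, not independent measurements).
SOHU_CLUSTER_TOKENS_PER_SEC = 500_000   # Llama 70B on eight Sohu chips (Etched claim)
SOHU_CHIPS = 8
GALAXY_FP8_PETAFLOPS = 23.8             # Galaxy Blackhole system total
GALAXY_CHIPS = 32


def per_chip(total: float, chips: int) -> float:
    """Split a system-level figure evenly across chips (assumes linear scaling)."""
    if chips <= 0:
        raise ValueError("chips must be positive")
    return total / chips


sohu_tokens_per_chip = per_chip(SOHU_CLUSTER_TOKENS_PER_SEC, SOHU_CHIPS)
# Convert petaFLOPS to teraFLOPS for a more readable per-chip number.
blackhole_tflops_per_chip = per_chip(GALAXY_FP8_PETAFLOPS, GALAXY_CHIPS) * 1000

print(f"Sohu: {sohu_tokens_per_chip:,.0f} tokens/sec per chip (claimed)")
print(f"Blackhole: {blackhole_tflops_per_chip:.2f} TFLOPS FP8 per chip")
```

The two results are in different units (tokens/sec versus FLOPS), so they describe each chip separately and cannot be ranked against each other directly.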
Maximum dense transformer throughput at small batch sizes
Etched Sohu claims 500,000 tokens/sec on Llama 70B with eight chips and 90%+ FLOPS utilization by hardwiring attention operations in fixed-function silicon, though these claims remain unverified in production deployment.
Flexible, multi-workload inference platforms
Tenstorrent Blackhole ships production-ready, running video generation, LLMs, and other diverse models on the fully open TT-Metal software stack, with no lock-in to the transformer architecture.
Near-term deployment and availability
Tenstorrent Galaxy Blackhole is in volume production and deployed at Cirrascale, Equinix, and ai& as of April 2026; Etched Sohu is not yet shipping to customers despite Series B funding.
4 use cases scored. Etched wins 1, Tenstorrent wins 1.
Neither tool publishes a starting price.
Neither tool offers a free tier or trial.
Etched and Tenstorrent both average 4.5 / 5.
Tenstorrent has 146 ratings vs 90 for Etched.
Quick answers to the questions readers ask before picking between these two.
As of April 2026, Sohu has not shipped to customers despite investor demonstrations and Series B funding. Etched was founded in 2022 and announced the chip in 2024; it has been demonstrated in controlled benchmarks but remains pre-production. All performance claims are therefore theoretical and unverified at production scale, which creates material execution risk.
Sohu cannot run MoE architectures because the hardware does not support dynamic expert routing. DeepSeek V4 and Qwen3, the two most widely deployed open-weight models as of April 2026, are MoE-based and incompatible with Sohu. This is a permanent hardware limitation, not something a software update can fix.
Tenstorrent offers TT-Metal, an open-source, low-level programming interface giving full ISA access and direct hardware control, unlike CUDA's higher-level abstractions. This enables research and custom kernel development impossible on CUDA but requires expertise and longer development cycles.
A single Tenstorrent Wormhole n300 card has 24GB GDDR6 at approximately 576GB/s, which is roughly 6x lower bandwidth than an H100 SXM5 with 80GB HBM3 at 3.35TB/s. On-chip SRAM compensates for small-batch inference where the working set fits in local memory.
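The bandwidth gap above can be checked with a few lines of arithmetic. The sketch below computes the ratio from the two quoted figures and then applies a common rule of thumb: at batch size 1, decode speed is capped at roughly memory bandwidth divided by the bytes of weights streamed per token. The 8 GB weight footprint is a hypothetical example (not from this comparison), and the bound ignores KV-cache traffic, compute limits, and any benefit from on-chip SRAM.

```python
# Bandwidth figures quoted in the text above.
N300_BANDWIDTH_GBPS = 576.0    # Wormhole n300, GDDR6 (approximate)
H100_BANDWIDTH_GBPS = 3350.0   # H100 SXM5, HBM3 (3.35 TB/s)

# Assumption for illustration only: a model whose weights occupy 8 GB
# (for example, roughly 8B parameters stored at one byte each).
HYPOTHETICAL_WEIGHT_GB = 8.0


def bandwidth_ratio(fast_gbps: float, slow_gbps: float) -> float:
    """How many times more memory bandwidth the faster device has."""
    return fast_gbps / slow_gbps


def decode_ceiling_tokens_per_sec(bandwidth_gbps: float, weight_gb: float) -> float:
    """Rough upper bound on batch-1 decode speed if every generated token
    must stream all weights from off-chip memory once."""
    if weight_gb <= 0:
        raise ValueError("weight_gb must be positive")
    return bandwidth_gbps / weight_gb


ratio = bandwidth_ratio(H100_BANDWIDTH_GBPS, N300_BANDWIDTH_GBPS)
n300_ceiling = decode_ceiling_tokens_per_sec(N300_BANDWIDTH_GBPS, HYPOTHETICAL_WEIGHT_GB)
h100_ceiling = decode_ceiling_tokens_per_sec(H100_BANDWIDTH_GBPS, HYPOTHETICAL_WEIGHT_GB)

print(f"H100 / n300 bandwidth ratio: {ratio:.2f}x")
print(f"n300 decode ceiling: {n300_ceiling:.0f} tokens/sec")
print(f"H100 decode ceiling: {h100_ceiling:.0f} tokens/sec")
```

Because the bound assumes every weight is read from off-chip memory, a working set that fits in on-chip SRAM can exceed it, which is the compensation the text describes.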
Tenstorrent Blackhole supports both training and inference via TT-Metal; hyperscalers have tested training workloads on Galaxy Blackhole clusters. Etched Sohu is inference-only and cannot train models. For training workloads, Tenstorrent is the only choice between the two.
As of April 2026, Tenstorrent hardware is not available on demand from any major cloud provider. It is sold direct and deployed on-premises at partner datacenters (Cirrascale, Equinix, ai&) or purchased standalone. NVIDIA H100/B200, by contrast, are available hourly on AWS, GCP, and Azure.
If transformers are displaced by a new architecture, Etched Sohu becomes obsolete and can be repurposed for general compute only to a limited degree, if at all, because fixed-function silicon has no way to adapt. Tenstorrent Blackhole, being programmable and open, can be retargeted to new architectures via software updates and kernel rewrites, preserving hardware value over time.
Choose Etched Sohu only if you operate a hyperscale inference service locked to dense transformer-only workloads (Llama, Mistral, GPT-family models) with extreme cost-per-token requirements and can tolerate a company bet that has not yet shipped production hardware.
Sohu's 20x H100 throughput claim, if validated, would justify the architectural lock-in. But with no independent benchmarks, no shipping date, and MoE models already dominating production, the risk profile resembles a venture-capital bet rather than operations-ready infrastructure.
Choose Tenstorrent Blackhole if you prioritize production-ready hardware shipped and deployed at scale, need flexibility across multiple workload types (inference, training, video generation), or operate under regulatory or geopolitical requirements favoring open and auditable AI infrastructure.
Blackhole trades raw throughput for operational maturity, open standards, and the ability to adapt as model architectures evolve. For enterprise teams, Tenstorrent is lower-risk; for hyperscalers betting on transformer stability and chasing order-of-magnitude cost reduction, Etched is higher-upside but unproven.
For cost-conscious teams, NVIDIA H100/B200 spot pricing on cloud marketplaces remains cheaper and more flexible than either. Watch Etched's customer announcements in Q3-Q4 2026; if volume production and independent benchmarks emerge, the narrative shifts.
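When independent benchmarks do appear, the 20x claim from the verdict above can be tested with simple arithmetic. The sketch below assumes the 20x figure and the 500,000 tokens/sec figure describe the same configuration (eight Sohu chips versus eight H100s on Llama 70B); this page does not state that explicitly. The function names and the 10% tolerance are illustrative choices.

```python
# Vendor claims quoted on this page (unverified).
SOHU_CLAIMED_TOKENS_PER_SEC = 500_000   # eight chips, Llama 70B
CLAIMED_SPEEDUP_VS_H100 = 20.0


def implied_baseline(claimed_throughput: float, claimed_speedup: float) -> float:
    """H100 throughput that the two claims jointly imply."""
    return claimed_throughput / claimed_speedup


def claim_holds(measured_sohu: float, measured_h100: float,
                claimed_speedup: float, tolerance: float = 0.10) -> bool:
    """True if the measured speedup is within `tolerance` (as a fraction)
    of the claimed speedup, or exceeds it."""
    measured_speedup = measured_sohu / measured_h100
    return measured_speedup >= claimed_speedup * (1.0 - tolerance)


h100_baseline = implied_baseline(SOHU_CLAIMED_TOKENS_PER_SEC, CLAIMED_SPEEDUP_VS_H100)
print(f"Implied H100 baseline: {h100_baseline:,.0f} tokens/sec")
```

For example, `claim_holds(400_000, 25_000, 20.0)` returns `False`, since a measured 16x speedup falls below the 18x floor that a 10% tolerance allows.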