Editorial matchup · June 2026

Etched vs Tenstorrent: Which AI Tool Is Better in 2026?

Side-by-side comparison of Etched and Tenstorrent — pricing, features, and use cases. Reviewed by our editorial team in Jun 2026.

Use-case score 11 · Updated Jun 2026
Etched

AI Infrastructure
4.5 · Paid · 80 saves
Tenstorrent

AI Infrastructure
4.5 · Paid · 145 saves
The verdict

Use-case score · 11

Etched Sohu and Tenstorrent Blackhole represent fundamentally different bets on how to displace NVIDIA in AI inference hardware as of mid-2026.

Etched has chosen extreme specialization, burning transformer attention operations directly into silicon on TSMC 4nm, abandoning programmability entirely to squeeze maximum throughput per watt for dense transformer workloads.

Tenstorrent has chosen the opposite path: open RISC-V architecture, fully programmable Tensix cores, on-chip SRAM optimization, and a purpose-built open-source software stack (TT-Metal). Etched claims unmatched single-model throughput—500,000 tokens/sec on Llama 70B with eight chips at 90%+ FLOPS utilization.

But those numbers remain unverified in production, the chip is not yet shipping to customers as of April 2026, and the architecture cannot run mixture-of-experts models, state-space models, or any non-transformer workload.
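To put the claimed eight-chip figure in per-chip terms, the arithmetic can be sketched as follows. This is a rough illustration only: it assumes throughput divides evenly across the eight chips and that the rate is sustained around the clock, neither of which the source confirms. The function names are illustrative.

```python
# Vendor-claimed figure quoted above; not independently verified.
CLAIMED_TOKENS_PER_SEC = 500_000  # Llama 70B on one eight-chip system
CHIPS = 8


def per_chip_throughput(total_tokens_per_sec: float, chips: int) -> float:
    """Average tokens/sec per chip, assuming throughput splits evenly across chips."""
    if chips <= 0:
        raise ValueError("chips must be positive")
    return total_tokens_per_sec / chips


def tokens_per_day(tokens_per_sec: float) -> float:
    """Tokens produced in 24 hours at a sustained rate (assumes 100% uptime)."""
    return tokens_per_sec * 24 * 60 * 60


per_chip = per_chip_throughput(CLAIMED_TOKENS_PER_SEC, CHIPS)
daily = tokens_per_day(CLAIMED_TOKENS_PER_SEC)
print(f"Per chip: {per_chip:,.0f} tokens/sec")
print(f"Per system per day: {daily:,.0f} tokens")
```

If the claim holds, that works out to 62,500 tokens/sec per chip, which is the number any independent benchmark would need to reproduce.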

The trade-off is existential: if transformers remain dominant for 5+ years, Sohu's cost-per-token advantage could compound; if the industry shifts to MoE or SSMs, the hardware becomes obsolete.

Tenstorrent Blackhole, shipping in volume as of April 2026, offers programmability, backward compatibility with open frameworks, and the ability to run diverse workloads from LLM inference to video generation.

Its Galaxy Blackhole system (32 chips) delivers 23.8 petaFLOPS FP8 and can be deployed on an Ethernet mesh without a proprietary interconnect. However, per-node memory bandwidth is lower (GDDR6 vs HBM3E), the software stack is younger than CUDA, and raw performance still lags NVIDIA H100/B200 for high-batch workloads.
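The system-level FP8 figure can be broken down per chip with simple division. The sketch below assumes the 23.8 petaFLOPS is a peak figure summed evenly over all 32 chips, which the source does not state explicitly. The helper name is illustrative.

```python
# System-level figures quoted above for Galaxy Blackhole.
GALAXY_PFLOPS_FP8 = 23.8
GALAXY_CHIPS = 32


def per_chip_tflops(system_pflops: float, chips: int) -> float:
    """Convert a system petaFLOPS figure to average teraFLOPS per chip."""
    if chips <= 0:
        raise ValueError("chips must be positive")
    return system_pflops * 1000 / chips


chip_tflops = per_chip_tflops(GALAXY_PFLOPS_FP8, GALAXY_CHIPS)
print(f"About {chip_tflops:.0f} TFLOPS FP8 per chip")
```

Under that assumption each chip contributes roughly 744 TFLOPS of FP8 compute.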

Etched targets pure-transformer shops at hyperscale where architectural lock-in is acceptable; Tenstorrent targets research labs, sovereign AI programs requiring open stacks, and teams willing to trade raw throughput for flexibility and reduced vendor dependence.

ToolDirectory.AI Editorial Team

Maximum dense transformer throughput at small batch sizes

Etched

Etched Sohu claims 500,000 tokens/sec on Llama 70B with eight chips and 90%+ FLOPS utilization by hardwiring attention operations as fixed silicon, though claims remain unverified in production deployment.

Flexible, multi-workload inference platforms

Tenstorrent

Tenstorrent Blackhole ships production-ready and runs video generation, LLMs, and other model types on the fully open TT-Metal software stack, with no lock-in to the transformer architecture.

Near-term deployment and availability

Tenstorrent

Tenstorrent Galaxy Blackhole is in volume production and deployed at Cirrascale, Equinix, and ai& as of April 2026; Etched Sohu is not yet shipping to customers despite Series B funding.

Section 01

Best for what

4 use cases scored. Etched wins 1, Tenstorrent wins 1.

  • Pricing value

    Neither tool publishes a starting price.

    Even
  • Free tier

    Neither tool offers a free tier or trial.

    Even
  • User ratings

    Etched averages 4.5 / 5 vs 4.5 / 5 on the other side.

    Etched
  • Review volume

    Tenstorrent has 146 ratings vs 90 on the other.

    Tenstorrent
Section 02

Pros & cons

Where each tool earns its rating — and where it falls short.

Etched

AI Infrastructure
Pros
  • Fixed-function transformer attention circuits deliver unmatched FLOPS utilization at 90%+ versus GPUs at 30-40%, potentially enabling 20x H100 speedup for dense transformer inference at optimal batch sizes.
  • HBM3E memory integration with 144GB total capacity and high bandwidth, matching NVIDIA B200 memory subsystem and reducing reliance on external DRAM.
  • TSMC 4nm process enables a reticle-limit die with density and thermal efficiency comparable to NVIDIA Blackwell, maximizing compute per watt in a fixed physical space.
  • Backed by Series B funding providing runway for TSMC fab capacity, compiler development, and early customer ramps.
  • Transformer dominance across frontier labs (DeepSeek, Qwen, Claude, GPT-4) validates core thesis that dense transformer specialization remains viable for 5+ years.
Cons
  • Not shipping to customers as of April 2026 despite public announcements, leaving performance claims unverified by independent benchmarks or production deployments.
  • Cannot run mixture-of-experts (MoE), state-space models (Mamba), or any non-transformer architecture; leading open models (DeepSeek V4, Qwen3) are MoE-based and incompatible.
  • No software abstraction layer or compiler flexibility; because transformer attention operations are hardwired as static circuits, future attention optimizations (FlashAttention-3, new KV cache formats) cannot be deployed without new silicon.
  • Requires proprietary Etched compiler; no CUDA, ROCm, or vLLM support means complete software stack migration for teams with mature GPU deployments.
  • Architecture risk: if transformer dominance wavers or model inference patterns shift (longer context, sparse attention, conditional routing), Sohu becomes a sunk cost with no recovery path.
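The utilization and speedup figures in the pros list above can be checked against each other with a little arithmetic. The sketch below assumes a deliberately simple model, delivered throughput ≈ peak FLOPS × utilization, which is an assumption for illustration and not Etched's own methodology. It shows how much of the claimed 20x would have to come from something other than utilization.

```python
# Figures quoted in the pros list above (vendor claims, unverified).
SOHU_UTILIZATION = 0.90
GPU_UTILIZATION_RANGE = (0.30, 0.40)
CLAIMED_SPEEDUP = 20.0


def utilization_gain(asic_util: float, gpu_util: float) -> float:
    """Speedup attributable to utilization alone, at equal peak FLOPS."""
    return asic_util / gpu_util


def residual_factor(claimed_speedup: float, util_gain: float) -> float:
    """Remaining factor that must come from higher peak compute or other effects."""
    return claimed_speedup / util_gain


results = []
for gpu_util in GPU_UTILIZATION_RANGE:
    gain = utilization_gain(SOHU_UTILIZATION, gpu_util)
    rest = residual_factor(CLAIMED_SPEEDUP, gain)
    results.append((gpu_util, gain, rest))
    print(f"GPU at {gpu_util:.0%}: utilization gives {gain:.2f}x, "
          f"remaining {rest:.2f}x must come from elsewhere")
```

Under this model, better utilization accounts for only about 2.25x to 3x, so most of the claimed 20x (roughly 6.7x to 8.9x) would have to come from higher peak compute per chip or other factors.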
Section 03

At a glance

Every spec on one page. Live-pulled from each tool's detail page.

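  • Pricing
    Etched: Paid
    Tenstorrent: Paid
  • Pricing model
    Etched: Paid
    Tenstorrent: Paid
  • Free tier
    Etched: No
    Tenstorrent: No
  • Free trial
    Etched: No
    Tenstorrent: No
  • Rating
    Etched: 4.5 / 5 (90 ratings)
    Tenstorrent: 4.5 / 5 (146 ratings)
  • Saves
    Etched: 80
    Tenstorrent: 145
  • Categories
    Etched: AI Infrastructure, Engineering & Simulation
    Tenstorrent: AI Infrastructure, Engineering & Simulation
  • Verified
    Etched: No
    Tenstorrent: No
  • Top 100 tier
  • Last updated
    Etched: May 2026
    Tenstorrent: Jun 2026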
Frequently asked

Etched vs Tenstorrent FAQs

Quick answers to the questions readers ask before picking between these two.

Is Etched Sohu shipping today?

No. As of April 2026, Sohu has not shipped to customers despite investor demonstrations and Series B funding. Etched was founded in 2022 and announced Sohu in June 2024; the chip has been demonstrated in controlled benchmarks but remains pre-production. All performance claims are therefore unverified at production scale, which creates material execution risk.

Can Etched Sohu run Mixture-of-Experts models?

No, Sohu cannot run MoE architectures because the hardware does not support dynamic expert routing. DeepSeek V4 and Qwen3, the two most widely deployed open-weight models as of April 2026, are MoE-based and incompatible with Sohu. This represents a permanent hardware limitation.

How does Tenstorrent's software stack compare to NVIDIA CUDA?

Tenstorrent offers TT-Metal, an open-source, low-level programming interface that gives full ISA access and direct hardware control, in contrast to CUDA's higher-level abstractions. This enables research and custom kernel development that is difficult or impossible on CUDA, but it requires more expertise and longer development cycles.

What is Tenstorrent Blackhole's total memory bandwidth compared to H100?

The figures here are for the previous-generation Wormhole n300 card rather than Blackhole itself. A single Wormhole n300 has 24GB GDDR6 at approximately 576GB/s, roughly 6x lower bandwidth than an H100 SXM5 with 80GB HBM3 at 3.35TB/s. On-chip SRAM compensates in small-batch inference when the working set fits in local memory.
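The bandwidth gap quoted above, and what it implies for memory-bound decoding, can be sketched with a simple roofline-style bound. The bound assumes every generated token re-reads all model weights from off-chip memory at batch size 1, so it ignores the on-chip SRAM effect mentioned above. The 8GB weight footprint is a hypothetical value chosen for illustration, not a figure from the source.

```python
# Bandwidth figures quoted above, in GB/s.
GDDR6_CARD_GB_PER_SEC = 576.0   # Wormhole n300
H100_SXM5_GB_PER_SEC = 3350.0   # 3.35 TB/s

# Hypothetical weight footprint, chosen only for illustration.
HYPOTHETICAL_WEIGHT_GB = 8.0


def bandwidth_ratio(fast_gb_per_sec: float, slow_gb_per_sec: float) -> float:
    """How many times more memory bandwidth the faster device has."""
    return fast_gb_per_sec / slow_gb_per_sec


def decode_ceiling(bandwidth_gb_per_sec: float, weight_gb: float) -> float:
    """Upper bound on batch-1 tokens/sec if each token reads all weights from DRAM."""
    if weight_gb <= 0:
        raise ValueError("weight_gb must be positive")
    return bandwidth_gb_per_sec / weight_gb


ratio = bandwidth_ratio(H100_SXM5_GB_PER_SEC, GDDR6_CARD_GB_PER_SEC)
n300_ceiling = decode_ceiling(GDDR6_CARD_GB_PER_SEC, HYPOTHETICAL_WEIGHT_GB)
h100_ceiling = decode_ceiling(H100_SXM5_GB_PER_SEC, HYPOTHETICAL_WEIGHT_GB)
print(f"Bandwidth ratio: {ratio:.1f}x")
print(f"Decode ceiling: n300 {n300_ceiling:.0f} tok/s, H100 {h100_ceiling:.0f} tok/s")
```

The ratio comes out at about 5.8x, consistent with the "roughly 6x" above. The ceilings are upper bounds under the stated assumptions, not performance predictions.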

Which chip is better for training large models?

Tenstorrent Blackhole supports both training and inference via TT-Metal; hyperscalers have tested training workloads on Galaxy Blackhole clusters. Etched Sohu is inference-only and cannot train models. For training workloads, Tenstorrent is the only choice between the two.

Can I use Tenstorrent Blackhole on cloud platforms like AWS or GCP?

Not as of April 2026. Tenstorrent hardware is sold directly: it is either purchased standalone or deployed at partner datacenters (Cirrascale, Equinix, ai&). No major cloud provider offers Galaxy Blackhole as an on-demand service yet, whereas NVIDIA H100/B200 are available hourly on AWS, GCP, and Azure.

What happens if transformer models get replaced by a new architecture?

Etched Sohu becomes obsolete: the fixed-function silicon has no way to adapt, so repurposing it for general compute is unlikely, if possible at all. Tenstorrent Blackhole, being programmable and open, can be retargeted to new architectures via software updates and kernel rewrites, preserving hardware value over time.

Bottom line

Choose Etched Sohu only if you operate a hyperscale inference service locked to dense transformer-only workloads (Llama, Mistral, GPT-family models) with extreme cost-per-token requirements and can tolerate a company bet that has not yet shipped production hardware.

Sohu's 20x H100 throughput claim, if validated, would justify the architectural lock-in. But with no independent benchmarks, no shipping date, and MoE models already dominating production, Sohu currently looks like a venture-style asymmetric bet rather than operations-ready infrastructure.

Choose Tenstorrent Blackhole if you prioritize production-ready hardware shipped and deployed at scale, need flexibility across multiple workload types (inference, training, video generation), or operate under regulatory or geopolitical requirements favoring open and auditable AI infrastructure.

Blackhole trades raw throughput for operational maturity, open standards, and the ability to adapt as model architectures evolve. For enterprise teams, Tenstorrent is lower-risk; for hyperscalers betting on transformer stability and chasing order-of-magnitude cost reduction, Etched is higher-upside but unproven.

For cost-conscious teams, NVIDIA H100/B200 spot pricing on cloud marketplaces remains cheaper and more flexible than either. Watch Etched's customer announcements in Q3-Q4 2026; if volume production and independent benchmarks emerge, the narrative shifts.
