Etched vs Tenstorrent (2026 Review)

Section 01

Best for what

4 use cases scored. Etched wins 1, Tenstorrent wins 1.

Pricing value: Neither tool publishes a starting price. Winner: Even.
Free tier: Neither tool offers a free tier or trial. Winner: Even.
User ratings: Etched averages 4.5 / 5 vs 4.5 / 5 for Tenstorrent. Winner: Etched.
Review volume: Tenstorrent has 146 ratings vs 90 for Etched. Winner: Tenstorrent.

Section 02

Pros & cons

Where each tool earns its rating — and where it falls short.

Etched

AI Infrastructure

Pros

Fixed-function transformer attention circuits are claimed to reach 90%+ FLOPS utilization versus 30-40% on GPUs, potentially enabling a 20x speedup over H100 for dense transformer inference at optimal batch sizes.
HBM3E memory with 144GB total capacity and high bandwidth, the same memory generation used by NVIDIA B200, reducing reliance on external DRAM.
TSMC 4nm die shrink enables reticle-limit density and thermal efficiency comparable to NVIDIA Blackwell, maximizing compute per watt in fixed physical space.
Backed by Series B funding providing runway for TSMC fab capacity, compiler development, and early customer ramps.
Transformer dominance across frontier labs (DeepSeek, Qwen, Claude, GPT-4) validates core thesis that dense transformer specialization remains viable for 5+ years.
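
As a rough sanity check on the first point above, the utilization gap alone does not account for a 20x speedup. The sketch below assumes effective throughput is simply peak FLOPS times utilization, a simplification that ignores memory and interconnect limits, and works out how much peak-FLOPS advantage would also be needed. The function name is illustrative; only the 90%, 30-40%, and 20x figures come from the claims above.

```python
def required_peak_ratio(target_speedup, asic_util, gpu_util):
    """Peak-FLOPS advantage needed to reach target_speedup at the given utilizations.

    Assumes effective throughput = peak FLOPS * utilization (a simplification).
    """
    utilization_gain = asic_util / gpu_util
    return target_speedup / utilization_gain


# Figures quoted above: 90% utilization vs 30-40% on GPUs, 20x claimed speedup.
for gpu_util in (0.30, 0.40):
    gain = 0.90 / gpu_util
    peak = required_peak_ratio(20.0, 0.90, gpu_util)
    print(f"GPU util {gpu_util:.0%}: utilization alone gives {gain:.2f}x, "
          f"remaining peak advantage needed is {peak:.2f}x")
```

Under these assumptions the utilization gap contributes roughly 2.25x to 3x, so the rest of the claimed 20x would have to come from higher peak throughput.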

Cons

Not shipping to customers as of April 2026 despite public announcements, leaving performance claims unverified by independent benchmarks or production deployments.
Cannot run mixture-of-experts (MoE) models, state-space models (Mamba), or other architectures outside dense transformers; leading open models (DeepSeek V4, Qwen3) are MoE-based and incompatible.
No software abstraction layer or compiler flexibility; because transformer attention operations are hardwired as static circuits, future attention optimizations (FlashAttention-3, new KV cache formats) cannot be deployed without new silicon.
Requires proprietary Etched compiler; no CUDA, ROCm, or vLLM support means complete software stack migration for teams with mature GPU deployments.
Architecture risk: if transformer dominance wavers or model inference patterns shift (longer context, sparse attention, conditional routing), Sohu becomes a sunk cost with no recovery path.

Tenstorrent

AI Infrastructure

Pros

Fully open-source TT-Metal software stack and open RISC-V architecture eliminate vendor lock-in; enterprises building sovereign AI or EU AI Act-compliant systems benefit from auditable compute stacks.
Shipping in production volume as of April 2026 with deployment at major colocation providers (Cirrascale, Equinix, ai&); Galaxy Blackhole delivers 23.8 petaFLOPS FP8 in a single 6U system.
Programmable architecture supports diverse workloads: LLM inference, video generation (720p faster than real-time on 4-node supercluster), and custom kernels via Python-based TT-Metal interface.
Reduced vendor dependencies and openness align with emerging regulatory requirements and geopolitical goals; Canadian company faces fewer export restrictions than NVIDIA.
Led by Jim Keller (Apple A4/A5, AMD Zen, Intel chip strategy), bringing legendary chip architecture expertise and credibility; Series D funding at multi-billion valuation from Jeff Bezos and Samsung.

Cons

GDDR6 memory subsystem delivers roughly 6x lower bandwidth per card than H100 HBM3 (about 576GB/s on a Wormhole n300 vs 3.35TB/s), so high-batch inference with large KV caches is bottlenecked on memory bandwidth.
Software stack significantly less mature than CUDA; TT-Buda deprecated as of early 2026, requiring migration to TT-Metal; limited model support compared to vLLM ecosystem.
Raw single-card performance trails NVIDIA GPUs by 1.5-2x at high batch sizes; Grayskull showed competitive power efficiency (1.55 TFLOPS/W) relative to A100, not dominance.
Ethernet-based interconnect avoids NVLink complexity but requires dense Ethernet mesh for cluster scaling, adding networking cost and operational complexity.
Requires on-premise deployment and direct purchase; no cloud rental options like H100/B200 on-demand, limiting experimentation for smaller teams.

Section 03

At a glance

Every spec on one page. Live-pulled from each tool's detail page.

Spec | Etched | Tenstorrent
Pricing | Paid | Paid
Pricing model | Paid | Paid
Free tier | No | No
Free trial | No | No
Rating | 4.5 / 5 (90 ratings) | 4.5 / 5 (146 ratings)
Saves | 80 | 145
Categories | AI Infrastructure, Engineering & Simulation | AI Infrastructure, Engineering & Simulation
Verified | No | No
Top 100 tier | - | -
Last updated | May 2026 | Jun 2026

Frequently asked

Etched vs Tenstorrent FAQs

Quick answers to the questions readers ask before picking between these two.

Is Etched Sohu shipping today?

No, as of April 2026 Sohu has not shipped to customers despite investor demonstrations and Series B funding. Etched was founded in 2022 and publicly announced Sohu in June 2024; the chip has been demonstrated in controlled benchmarks but remains pre-production. This leaves all performance claims theoretical and unverified at production scale, creating material execution risk.

Can Etched Sohu run Mixture-of-Experts models?

No, Sohu cannot run MoE architectures because the hardware does not support dynamic expert routing. DeepSeek V4 and Qwen3, the two most widely deployed open-weight models as of April 2026, are MoE-based and incompatible with Sohu. This represents a permanent hardware limitation.
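
To illustrate what "dynamic expert routing" means, here is a minimal sketch of top-k gating as it is commonly described for MoE layers. It is a generic illustration, not Etched's hardware behavior or any specific model's implementation; the function names, the toy gate scores, and k=2 are all assumptions. The point is that which experts run depends on each token's values at run time.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are renormalised over the selected experts so they sum to 1.
    """
    probs = softmax(gate_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


# Hypothetical gate scores for two tokens over four experts.
token_a = [2.0, 0.1, -1.0, 1.5]
token_b = [-0.5, 1.8, 2.2, 0.0]

print([i for i, _ in route_top_k(token_a)])  # experts picked for token A
print([i for i, _ in route_top_k(token_b)])  # experts picked for token B
```

The two tokens select different experts, so the set of weights that must be read and multiplied is not known until the gate has run for each token.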

How does Tenstorrent's software stack compare to NVIDIA CUDA?

Tenstorrent offers TT-Metal, an open-source, low-level programming interface giving full ISA access and direct hardware control, unlike CUDA's higher-level abstractions. This enables research and custom kernel development that CUDA's abstractions do not expose, but it requires more expertise and longer development cycles.

What is Tenstorrent Blackhole's total memory bandwidth compared to H100?

The figures here are for the previous-generation Wormhole n300 card rather than Blackhole. A single n300 has 24GB GDDR6 at approximately 576GB/s, roughly 6x lower bandwidth than an H100 SXM5 with 80GB HBM3 at 3.35TB/s. On-chip SRAM compensates in small-batch inference where the working set fits in local memory.
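
The ratio quoted above can be checked directly, and the same figures give a rough ceiling on decode speed when generation is memory-bandwidth bound. The sketch below assumes each generated token requires reading all model weights once from device memory, and it ignores KV cache traffic, on-chip SRAM reuse, and batching. The 8 GB weight footprint is a hypothetical example, not a figure from either vendor.

```python
N300_BANDWIDTH_GBPS = 576.0   # Wormhole n300 GDDR6, GB/s (figure quoted above)
H100_BANDWIDTH_GBPS = 3350.0  # H100 SXM5 HBM3, 3.35 TB/s (figure quoted above)


def bandwidth_ratio(fast_gbps, slow_gbps):
    """How many times more bandwidth the faster device has."""
    return fast_gbps / slow_gbps


def decode_ceiling_tokens_per_s(bandwidth_gbps, weights_gb):
    """Upper bound on single-stream tokens/s if every token reads all weights once."""
    return bandwidth_gbps / weights_gb


ratio = bandwidth_ratio(H100_BANDWIDTH_GBPS, N300_BANDWIDTH_GBPS)
print(f"H100 / n300 bandwidth ratio: {ratio:.1f}x")

weights_gb = 8.0  # hypothetical model footprint, chosen to fit in 24GB
for name, bw in (("n300", N300_BANDWIDTH_GBPS), ("H100", H100_BANDWIDTH_GBPS)):
    ceiling = decode_ceiling_tokens_per_s(bw, weights_gb)
    print(f"{name}: at most {ceiling:.1f} tokens/s per stream")
```

This is only a ceiling for the bandwidth-bound case; real throughput depends on batching, cache behavior, and kernel quality.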

Which chip is better for training large models?

Tenstorrent Blackhole supports both training and inference via TT-Metal; hyperscalers have tested training workloads on Galaxy Blackhole clusters. Etched Sohu is inference-only and cannot train models. For training workloads, Tenstorrent is the only choice between the two.

Can I use Tenstorrent Blackhole on cloud platforms like AWS or GCP?

Not as of April 2026. Tenstorrent hardware is sold direct and deployed on-premises at partner datacenters (Cirrascale, Equinix, ai&) or purchased standalone. No major cloud provider offers Galaxy Blackhole as an on-demand service yet, whereas NVIDIA H100/B200 are available hourly on AWS, GCP, and Azure.

What happens if transformer models get replaced by a new architecture?

Etched Sohu becomes obsolete and can be repurposed only as general compute, if at all. The fixed-function silicon has no way to adapt. Tenstorrent Blackhole, being programmable and open, can be retargeted to new architectures via software updates and kernel rewrites, preserving hardware value over time.

Bottom line

Choose Etched Sohu only if you operate a hyperscale inference service locked to dense transformer-only workloads (Llama, Mistral, GPT-family models) with extreme cost-per-token requirements and can tolerate a company bet that has not yet shipped production hardware.

Sohu's 20x H100 throughput claim, if validated, would justify the architectural lock-in. But with no independent benchmarks, no shipping date, and MoE models already dominating production, it currently looks more like a venture-style bet than operations-ready infrastructure.

Choose Tenstorrent Blackhole if you prioritize production-ready hardware shipped and deployed at scale, need flexibility across multiple workload types (inference, training, video generation), or operate under regulatory or geopolitical requirements favoring open and auditable AI infrastructure.

Blackhole trades raw throughput for operational maturity, open standards, and the ability to adapt as model architectures evolve. For enterprise teams, Tenstorrent is lower-risk; for hyperscalers betting on transformer stability and chasing order-of-magnitude cost reduction, Etched is higher-upside but unproven.

For cost-conscious teams, NVIDIA H100/B200 spot pricing on cloud marketplaces remains cheaper and more flexible than either. Watch Etched's customer announcements in Q3-Q4 2026; if volume production and independent benchmarks emerge, the narrative shifts.

July 2026 status check: Etched's June 30 stealth exit ($1B+ in stated contracts, summer rack shipments) converts part of the venture bet into a delivery test — the Q3-Q4 watch item is now independent benchmarks, not announcements.

Tenstorrent's Galaxy GA strengthens the production-ready case; verify current p150 core counts (120 since January) when comparing spec sheets.
