Etched vs FriendliAI (2026 Review)

Section 01

Best for what

4 use cases scored. Etched wins 1, FriendliAI wins 1, and 2 are even.

Pricing value. Winner: Even
Neither tool publishes a starting price.

Free tier. Winner: Even
Neither tool offers a free tier or trial.

User ratings. Winner: Etched
Etched averages 4.5 / 5 vs 4.5 / 5 for FriendliAI, a tie at the displayed precision.

Review volume. Winner: FriendliAI
FriendliAI has 125 ratings vs 90 for Etched.

Section 02

Pros & cons

Where each tool earns its rating — and where it falls short.

Etched

AI Infrastructure

Pros

Transformer-specific ASIC with hardwired attention circuits. Etched claims a 20x throughput advantage over the H100 from eliminating the instruction-fetch and scheduling overhead inherent to general-purpose GPUs.
144GB of HBM3E per chip on a reticle-limit TSMC 4nm die leaves room for long context windows and a large KV cache on frontier models like Llama 70B.
If the production timeline is met and performance is independently validated, the chip could cut CapEx by an order of magnitude for inference-only operations at hyperscale.
The foundational bet on transformer stability could also unlock edge deployment and real-time inference for humanoid robots, where latency and power efficiency are critical.
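
To put the 144GB figure in context, here is a rough back-of-envelope sketch of how many tokens of KV cache fit beside the model weights. It assumes the publicly documented Llama 70B shape (80 layers, 8 key-value heads under grouped-query attention, head dimension 128) and roughly 70 billion parameters. The precision choices, the decimal reading of "GB", and the single-chip layout are illustrative assumptions, not Etched specifications.

```python
def kv_bytes_per_token(layers=80, kv_heads=8, head_dim=128, bytes_per_value=2):
    """Bytes of KV cache one token occupies across all layers."""
    # Both keys and values are cached, hence the leading factor of 2.
    return 2 * layers * kv_heads * head_dim * bytes_per_value


def kv_token_budget(memory_gb=144, params=70e9, weight_bytes=1, **shape):
    """Tokens of KV cache that fit after the weights are loaded.

    memory_gb is read as decimal gigabytes (an assumption);
    weight_bytes=1 models 8-bit weights, weight_bytes=2 models 16-bit.
    """
    free_bytes = memory_gb * 1e9 - params * weight_bytes
    if free_bytes <= 0:
        return 0  # the weights alone do not fit
    return int(free_bytes // kv_bytes_per_token(**shape))


print(kv_bytes_per_token())             # 327680 bytes per token at 16-bit
print(kv_token_budget(weight_bytes=1))  # 225830 tokens with 8-bit weights
print(kv_token_budget(weight_bytes=2))  # 12207 tokens with 16-bit weights
```

Under these assumptions the usable cache depends heavily on weight precision: 16-bit weights consume nearly all of the 144GB on their own, while 8-bit weights leave room for a few hundred thousand cached tokens shared across all concurrent requests.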

Cons

Not available for purchase or rental as of April 2026, more than 20 months after announcement. No independent benchmarks exist; all performance claims come from Etched's own marketing materials and controlled demos.
Requires a complete serving-stack rebuild on Etched's proprietary compiler. There is no vLLM, TensorRT-LLM, or CUDA support, so migrating from established GPU frameworks carries high implementation and technical risk.
Cannot run MoE with dynamic expert routing, diffusion models, vision transformers, or any non-transformer architecture. A single architecture pivot by the AI community would leave the hardware obsolete.
Startup execution risk: founded in 2022, with no prior shipped silicon. Supply constraints on HBM3E and TSMC yield on a reticle-limit die could significantly delay the ramp.
No published pricing, no disclosed TDP, no clear server architecture, and no disclosed enterprise support roadmap.

FriendliAI

AI Infrastructure

Pros

Production-ready with active deployments: it powers inference for LG Electronics (EXAONE models), Scatter Lab, Upstage, and 25-30 enterprise customers as of August 2025.
Friendli Engine combines iteration batching and speculative decoding to deliver 3x the throughput of vLLM, with a claimed 50-90% cost reduction versus Together AI and Fireworks on equivalent workloads.
Flexible three-tier deployment: a serverless API for easy adoption, dedicated endpoints for predictable workloads, and Friendli Container for on-prem data sovereignty. This fits enterprises, startups, and hybrid architectures.
Supports 570,000+ Hugging Face models plus custom fine-tuned and proprietary models; works with vision, multimodal, MoE, LoRA adapters, and quantized models without architectural lock-in.

Cons

Bounded by GPU hardware: the memory-bandwidth limits of NVIDIA H100/B200 mean throughput gains plateau where Etched claims an orders-of-magnitude advantage.
The claimed 50-90% cost reduction depends on algorithmic optimizations and batching efficiency rather than an architectural advantage, so competitors with sufficient engineering resources could replicate it.
Not yet profitable despite 6-7x revenue growth in 2025. Heavy R&D spend is needed to maintain its performance lead as GPU efficiency improves and competitors adopt similar techniques.
The competitive moat rests on proprietary engineering (iteration batching, speculative decoding, TCache, Friendli DNN Library), and the advantage erodes if those techniques are published or adopted industry-wide.

Section 03

At a glance

Every spec on one page. Live-pulled from each tool's detail page.

Spec | Etched | FriendliAI
Pricing | Paid | Paid
Pricing model | Paid | Paid
Free tier | No | No
Free trial | No | No
Rating | 4.5 / 5 (90 ratings) | 4.5 / 5 (125 ratings)
Saves | 80 | 110
Categories | AI Infrastructure, Engineering & Simulation | AI Infrastructure, AI/ML Models
Verified | No | No
Top 100 tier | - | -
Last updated | May 2026 | Jul 2026

Frequently asked

Etched vs FriendliAI FAQs

Quick answers to the questions readers ask before picking between these two.

Is Etched Sohu available to rent or purchase today?

No. As of April 2026, Sohu has not shipped to customers and is not available for purchase or cloud rental. Etched has demonstrated the chip to investors and shown controlled benchmarks, but there is no public production deployment, no independent verification of performance claims, and no announced availability date.

Which platform has verified, independent benchmarks?

FriendliAI. Third-party comparisons from September 2024 and later studies confirm its 3x throughput improvement over vLLM on Llama benchmarks and lowest TTFT at 0.24 seconds among tested inference providers. Etched publishes only its own benchmarks; no third-party testing has validated its 500,000 tokens/sec claim on Llama 70B.

Can either platform run mixture-of-experts models?

FriendliAI supports MoE models with dynamic expert routing, quantized models, and LoRA adapters natively on standard GPU infrastructure. Etched's Sohu cannot support MoE with dynamic routing; Etched has mentioned a separate variant for fixed-MoE architectures, but that chip is not yet available.

What is the difference between Etched hardware and FriendliAI software?

Etched builds transformer-specific ASIC silicon that hardwires attention, projection, and normalization operations for maximum compute density. FriendliAI runs on standard NVIDIA GPUs and optimizes inference through software techniques like iteration batching and speculative decoding. Etched aims for a 20x throughput advantage; FriendliAI aims for a 3x improvement over vLLM on the same GPUs.
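
To make the software side concrete, here is a toy sketch of the general idea behind iteration batching (often called continuous batching): the scheduler admits and retires requests at every decoding step instead of waiting for a whole batch to finish. This is a simplified illustration, not Friendli Engine's implementation. The function names are hypothetical, each request is reduced to the number of tokens it must generate, and prefill cost and memory limits are ignored.

```python
from collections import deque


def static_batching_steps(lengths, batch_size):
    """Decode steps when each batch runs until its longest request ends."""
    total = 0
    for i in range(0, len(lengths), batch_size):
        # Short requests sit idle until the longest one in the batch finishes.
        total += max(lengths[i:i + batch_size])
    return total


def iteration_batching_steps(lengths, batch_size):
    """Decode steps when freed slots are refilled at every iteration."""
    queue = deque(lengths)
    active = []  # remaining tokens for each in-flight request
    steps = 0
    while queue or active:
        # Admit waiting requests into any free slots before this step.
        while queue and len(active) < batch_size:
            active.append(queue.popleft())
        steps += 1
        # Each active request emits one token; finished ones leave the batch.
        active = [remaining - 1 for remaining in active if remaining > 1]
    return steps


lengths = [2, 8, 3, 7, 1, 6]  # hypothetical output lengths in tokens
print(static_batching_steps(lengths, batch_size=2))     # 21
print(iteration_batching_steps(lengths, batch_size=2))  # 15
```

In this toy workload the same six requests finish in 15 steps instead of 21 because short requests no longer hold a slot idle. The gain shrinks toward zero when all requests have similar lengths.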

How much does each platform cost for inference?

FriendliAI's pricing is transparent: usage-based, billed per GPU-hour or per token, with claimed reductions of 50-90% versus Together AI and Fireworks for equivalent workloads. Etched has not published pricing and is not available for purchase, so cost-per-token comparisons cannot be made. Once Sohu ships, competitive pricing will depend on manufacturing yield and scale.
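
Per-GPU-hour and per-token prices can be compared by converting one into the other. The sketch below shows the arithmetic; the $3.00 hourly rate and the throughput figures are hypothetical placeholders, not quotes from FriendliAI or any other provider.

```python
def cost_per_million_tokens(gpu_hour_usd, tokens_per_second, gpus=1):
    """Convert an hourly GPU price and sustained throughput to USD per 1M tokens."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hour_usd * gpus / tokens_per_hour * 1_000_000


# Hypothetical baseline: one GPU at $3.00/hour sustaining 2,500 tokens/sec.
baseline = cost_per_million_tokens(3.00, 2500)
# Same hourly price with 3x the throughput (the improvement cited over vLLM).
optimized = cost_per_million_tokens(3.00, 7500)

print(round(baseline, 4))                      # 0.3333
print(round(optimized, 4))                     # 0.1111
print(round(1 - optimized / baseline, 3))      # 0.667
```

At an unchanged hourly price, a 3x throughput gain works out to roughly a 67% lower cost per token, which is one way to sanity-check a percentage claim against a throughput claim on your own workload.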

Which platform supports more AI models?

FriendliAI supports 570,000+ models from Hugging Face, plus custom fine-tuned, proprietary, and multimodal models. Etched supports only dense transformer models and cannot run diffusion, vision, MoE-routed, or SSM-based models like Mamba.

What happens if AI architectures shift away from transformers?

FriendliAI remains viable because it runs on general-purpose GPU silicon; you can immediately switch to new model families without hardware changes. Etched's entire value proposition depends on transformers remaining dominant. If architectures shift, Etched would require a full hardware redesign, a process the company estimates at 3+ years.

Bottom line

Choose FriendliAI for immediate, production-grade cost and latency reduction on transformer workloads running at scale today.

Teams serving Llama 70B across thousands of requests should compare FriendliAI's cost per token against their existing GPU infrastructure, especially if they run vLLM or TensorRT-LLM at batch sizes where continuous batching and speculative decoding yield measurable savings.

Choose Etched only if you operate at datacenter scale (5,000+ GPU equivalents), can absorb 18-24 month wait-and-see risk on production availability and performance validation, are willing to rebuild serving infrastructure with proprietary tooling, and your workloads are 100% transformer-only with zero need for multimodal, vision, or model-architecture flexibility.

Etched targets the same pain point, transformer inference CapEx, but through a hardware bet rather than software optimization, making it a long-cycle strategy for organizations with patient capital and high conviction on transformer stability. For enterprises making decisions in 2026, FriendliAI is the operational choice.

For hyperscalers or frontier labs betting 2-3 years ahead, Etched represents a potential category shift, but one that is still unproven and unshipped.
