Skymizer has introduced the HTX301, a PCIe AI accelerator that runs language models of up to 700 billion parameters locally while drawing just 240 watts, leveraging decade-old chip and memory technology for efficient LLM inference.

  • Runs 700 billion parameter LLMs with 240 W power draw
  • Uses six 28nm chips with LPDDR4/5 memory on a PCIe card
  • Supports local AI inference in standard air-cooled servers

What happened

Skymizer, a Taiwanese AI accelerator startup, has launched the HTX301, a PCIe card designed to run large language models with up to 700 billion parameters. Unlike most modern AI accelerators, which rely on cutting-edge chip fabrication and high-bandwidth memory (HBM), the card pairs older 28nm chip technology with LPDDR4 and LPDDR5 memory. This combination lets it deliver significant performance while consuming only 240 watts.

The HTX301 features six chips working cooperatively, providing up to 384 GB of total memory in a PCIe form factor that fits standard air-cooled servers. Skymizer claims the card delivers 30 tokens per second from just 0.5 TOPS of compute and 100 GB per second of memory bandwidth, enabled by efficient compression of model weights and the KV cache. The card is expected to preview publicly at Computex, allowing independent validation of its performance claims.
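One way to sanity-check the claimed figures is a bandwidth-bound decode estimate: on memory-limited hardware, tokens per second is roughly memory bandwidth divided by the bytes streamed per generated token. The sketch below is illustrative only; the per-token traffic figure is an assumption chosen to show how aggressive compression could reconcile 100 GB/s with 30 tokens/s, not Skymizer's published math.

```python
# Back-of-envelope model of bandwidth-bound LLM decode throughput.
# Assumption (not from Skymizer): each generated token streams the
# compressed active weights plus KV cache once from memory.

def est_tokens_per_sec(bandwidth_gb_s: float, bytes_per_token_gb: float) -> float:
    """Upper-bound tokens/s when decode is limited by memory bandwidth."""
    return bandwidth_gb_s / bytes_per_token_gb

# Hypothetical: compression reduces per-token memory traffic to ~3.3 GB.
# At the card's claimed 100 GB/s, that yields roughly 30 tokens/s.
print(round(est_tokens_per_sec(100.0, 3.3), 1))
```

Under this reading, the headline claim hinges almost entirely on how far the weight and KV-cache compression can shrink per-token memory traffic, since raw compute (0.5 TOPS) and bandwidth (100 GB/s) are both modest.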

Why it matters

This development challenges industry leaders AMD and Nvidia, whose latest AI accelerators consume far more power: Nvidia's RTX PRO 6000 Blackwell draws around 600 watts, and AMD's Instinct MI350P considerably more. Skymizer's approach promises a dramatic reduction in power consumption for running ultra-large language models, potentially lowering both the cost and the environmental impact of AI inference.

For enterprises, the HTX301 offers an alternative to costly, scale-out cloud-based GPU infrastructures. Running LLMs on-premises with this PCIe card promotes data sovereignty and predictable operational costs without needing expensive upgrades to power and cooling systems. This could accelerate broader adoption of agentic AI applications in coding, automation, and specialized workflows in environments where cloud reliance is a concern.

What to watch next

The key factor moving forward will be independent performance verification of the HTX301 under real-world conditions. While Skymizer's specifications look promising, actual throughput, latency, and power-efficiency tests will determine whether the card meets its claims, especially on workloads such as Llama 2 7B and larger models. Success would position Skymizer as a disruptive new player in the AI hardware sector.

Industry observers should also watch how AMD, Nvidia, and other established AI accelerator vendors respond to this challenge. If Skymizer’s product proves effective and scalable, it may inspire renewed interest in leveraging older chip technologies combined with software optimizations to balance cost, power, and performance in AI inference. Follow-up developments will likely emerge around broader ecosystem support, pricing, and adoption by enterprises seeking low-power, on-premises AI acceleration.

Source assisted: this briefing began from a source item discovered via TechRadar.
How SignalDesk reports: feeds and outside sources are used for discovery; public briefings are edited to add context, buyer relevance, and attribution before publication.