Ceva has extended its Ceva-NeuPro family of Edge AI NPUs by introducing the Ceva-NeuPro-Nano NPUs. These highly efficient, self-sufficient NPUs are designed to deliver the power, performance, and cost efficiencies that semiconductor companies and OEMs need to integrate TinyML models into their SoCs for consumer, industrial, and general-purpose AIoT products. TinyML refers to deploying machine learning models on low-power, resource-constrained devices, bringing the power of AI to the Internet of Things (IoT).
Driven by the increasing demand for efficient, specialized AI in IoT devices, the TinyML market is growing rapidly. According to ABI Research, by 2030 over 40% of TinyML shipments will run on dedicated TinyML hardware rather than general-purpose MCUs. The Ceva-NeuPro-Nano NPUs address the specific performance challenges of TinyML, aiming to make AI ubiquitous, economical, and practical for a wide range of use cases, including voice, vision, predictive maintenance, and health sensing in consumer and industrial IoT applications.

The new Ceva-NeuPro-Nano Embedded AI NPU architecture is fully programmable and efficiently executes neural networks, feature extraction, control code, and DSP code. It supports advanced machine-learning data types and operators, including native transformer computation, sparsity acceleration, and fast quantization. The optimized, self-sufficient architecture delivers superior power efficiency, a smaller silicon footprint, and better performance than existing processor solutions for TinyML workloads. In addition, Ceva-NetSqueeze AI compression technology processes compressed model weights directly, reducing memory footprint by up to 80% and addressing a key bottleneck in the adoption of AIoT processors.
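To make the quantization idea concrete: TinyML workloads typically run models whose weights have been converted from 32-bit floats to small integers such as int8, trading a little precision for a much smaller memory footprint and cheaper MAC operations. The sketch below shows generic symmetric per-tensor int8 quantization; it is an illustrative example of the standard technique, not Ceva's implementation, and the function names are our own.

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization (illustrative sketch).

    Maps float weights into [-127, 127] using a single scale factor,
    the common starting point for TinyML model compression.
    """
    scale = np.max(np.abs(weights)) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximation of the original float weights.
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.27, 0.02, 0.9], dtype=np.float32)
q, scale = quantize_int8(w)
w_approx = dequantize(q, scale)
# Each int8 weight uses 1 byte instead of 4; the reconstruction error
# per weight is bounded by half the scale step.
```

Storing the int8 tensor plus one float scale cuts weight storage roughly 4x by itself; techniques such as 4-bit types and direct processing of compressed weights (as in Ceva-NetSqueeze) push the footprint down further.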
Key Features of Ceva-NeuPro-Nano NPUs:
- Fully programmable for neural networks, feature extraction, control code, and DSP code.
- Scalable performance with configurations up to 64 int8 MACs per cycle.
- Supports advanced ML data types and operators, including 4-bit to 32-bit integers and native transformer computation.
- Advanced mechanisms like sparsity acceleration, non-linear activation acceleration, and fast quantization.
- Single-core design eliminates the need for a companion MCU for computational tasks.
- Ceva-NetSqueeze technology reduces memory footprint by up to 80%.
- Innovative energy optimization techniques, including automatic on-the-fly energy tuning and weight-sparsity acceleration.
- Ceva-NeuPro Studio provides a unified AI stack and supports open AI frameworks like TensorFlow Lite for Microcontrollers and microTVM.
- Fast time to market with a Model Zoo of pretrained and optimized TinyML models.
- Optimized runtime libraries and application-specific software.
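The weight-sparsity acceleration listed above rests on a simple observation: a zero weight contributes nothing to a multiply-accumulate, so hardware that skips zeros saves both cycles and energy. The sketch below illustrates that principle in plain Python; it is a conceptual model of sparsity skipping in general, not a description of the Ceva-NeuPro-Nano datapath.

```python
import numpy as np

def dense_mac(weights, activations):
    # Reference dense multiply-accumulate over int8 inputs,
    # widened to int32 as a real accumulator would be.
    return int(np.dot(weights.astype(np.int32), activations.astype(np.int32)))

def sparse_mac(weights, activations):
    # Skip zero weights entirely: only non-zero entries reach the
    # multiplier, which is the idea behind weight-sparsity acceleration.
    nz = np.nonzero(weights)[0]
    return int(np.dot(weights[nz].astype(np.int32),
                      activations[nz].astype(np.int32)))

w = np.array([3, 0, 0, -2, 0, 5], dtype=np.int8)
a = np.array([1, 4, 2, 7, 9, -1], dtype=np.int8)
assert dense_mac(w, a) == sparse_mac(w, a)
# Here only 3 of 6 weights are non-zero, so half the multiplies are skipped.
```

In hardware the benefit compounds with pruned models, where a large fraction of weights are zero, and with compressed-weight formats that encode the zeros implicitly.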