BONE-V1
Overview
Real-time augmented reality (AR) is being actively studied as a future user interface and experience for high-performance head-mounted display (HMD) systems. However, the small battery and limited computing power of current HMDs prevent real-time markerless AR from being implemented in the HMD. In this paper, we propose a real-time, low-power AR processor for advanced 3D-AR HMD applications. For high throughput, the processor adopts task-level pipelined SIMD-PE clusters and a congestion-aware network-on-chip (NoC). Both features exploit the high data-level parallelism (DLP) and task-level parallelism (TLP) of the pipelined multicore architecture. For low power consumption, it employs a vocabulary forest accelerator and mixed-mode support vector machine (SVM)-based DVFS control to reduce unnecessary external memory accesses and core activation. The proposed 4 mm x 8 mm HMD AR processor is fabricated in 65 nm CMOS technology for a battery-powered HMD platform with real-time AR operation. It consumes 381 mW average power and 778 mW peak power at an operating frequency of 250 MHz and a supply voltage of 1.2 V. It achieves 1.22 TOPS peak performance and 1.57 TOPS/W energy efficiency, which are 3.58x and 1.18x higher, respectively, than the state of the art.
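The energy-efficiency figure can be cross-checked against the other numbers quoted above. The overview does not say which power figure the 1.57 TOPS/W is based on; the short Python sketch below assumes it is peak performance divided by peak power, an assumption made only because that division reproduces the quoted value. The variable and function names are illustrative.

```python
# Figures quoted in the overview.
PEAK_TOPS = 1.22        # peak performance, in TOPS
PEAK_POWER_W = 0.778    # peak power, 778 mW expressed in watts


def tops_per_watt(tops, watts):
    """Energy efficiency as throughput divided by power."""
    return tops / watts


# Assumption: efficiency is computed from the peak figures.
efficiency = tops_per_watt(PEAK_TOPS, PEAK_POWER_W)
print(f"{efficiency:.2f} TOPS/W")  # prints "1.57 TOPS/W"
```

Under this assumption the result matches the quoted 1.57 TOPS/W to two decimal places.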
Architecture
Features
a. Flexible Streaming Processor
- 10 Processors + 8 Channel Memories + Network-on-Chip
- Data synchronization for producer-consumer stream processing
- NoC manages virtual memory channels among processors
b. Task-level parallelism
- Each of the 10 processors executes a different thread on the same or different data
- Reduces overall execution time
c. Limited computing power
- 1.4 W at 1.8 V (peak)
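The producer-consumer streaming and task-level parallelism listed under features a and b can be illustrated in software. The following Python sketch is only an analogy, not the chip's actual mechanism: bounded queues stand in for the channel memories, each thread stands in for a processor running its own task, and the blocking put/get calls play the role of data synchronization. The names run_pipeline, stage, and CHANNEL_DEPTH are hypothetical and do not come from the source.

```python
import queue
import threading

CHANNEL_DEPTH = 4          # hypothetical buffer depth, not from the source
END_OF_STREAM = object()   # sentinel marking the end of the stream


def stage(task, inbox, outbox):
    """Consume items from inbox, apply task, and produce into outbox."""
    while True:
        item = inbox.get()             # blocks while the channel is empty
        if item is END_OF_STREAM:
            outbox.put(END_OF_STREAM)  # propagate shutdown downstream
            return
        outbox.put(task(item))         # blocks while the channel is full


def run_pipeline(items, tasks):
    """Run one thread per task, chained by bounded FIFO channels."""
    channels = [queue.Queue(maxsize=CHANNEL_DEPTH)
                for _ in range(len(tasks) + 1)]
    workers = [
        threading.Thread(target=stage,
                         args=(task, channels[i], channels[i + 1]))
        for i, task in enumerate(tasks)
    ]

    def feed():
        # Feed from a separate thread so the caller can drain the output
        # concurrently; otherwise the bounded channels could fill up.
        for item in items:
            channels[0].put(item)
        channels[0].put(END_OF_STREAM)

    feeder = threading.Thread(target=feed)
    for t in workers + [feeder]:
        t.start()

    results = []
    while True:
        item = channels[-1].get()
        if item is END_OF_STREAM:
            break
        results.append(item)

    for t in workers + [feeder]:
        t.join()
    return results


# Two stages, each a different task operating on the same stream.
output = run_pipeline(range(5), [lambda x: x + 1, lambda x: x * 2])
print(output)  # prints [2, 4, 6, 8, 10]
```

Because each stage is a single thread reading from a FIFO queue, item order is preserved end to end. In CPython the threads do not run compute-bound work truly in parallel, so the sketch models only the synchronization pattern, not the speedup that separate hardware processors would give.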