< BONE-V1 >


  An 81.6 GOPS object recognition processor is implemented for mobile intelligent robot applications. The chip architecture and hardware features are chosen based on an analysis of the target application. The processor supports both task-level and data-level parallelism: ten processing elements are integrated for task-level parallelism, and single instruction multiple data (SIMD) instructions are added to exploit data-level parallelism. The Memory-Centric network-on-chip (NoC) is proposed to support efficient pipelined task execution across the ten processing elements. It also provides coherence and consistency schemes tailored for 1-to-N and M-to-1 data transactions in a task-level pipeline. A visual image processing memory is also implemented for further performance gain. The chip is fabricated in a 0.18-µm CMOS technology and computes the key-point localization stage of SIFT object recognition twice as fast as a 2.3 GHz Core 2 Duo processor.

Implementation results

Performance comparison


  10 PEs, 8 visual image processing memories, and a hierarchical star topology NoC


a. Flexible Streaming Processor
  - 10 Processors + 8 Channel Memories + Network-on-Chip
  - Data sync. for producer-consumer stream processing
  - NoC manages virtual memory channels among processors
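
The producer-consumer synchronization listed above can be pictured with a small software analogy. The sketch below is not the chip's mechanism: it assumes a bounded queue standing in for one channel memory and two Python threads standing in for two processors. The names `producer`, `consumer`, `run_pipeline`, and `SENTINEL`, as well as the per-item arithmetic, are hypothetical and chosen only for illustration.

```python
import queue
import threading

SENTINEL = None  # end-of-stream marker (illustrative convention, not from the source)


def producer(channel, items):
    """Stage 1: write each processed item to the channel.

    put() blocks while the bounded channel is full, so the producer
    cannot run arbitrarily far ahead of the consumer.
    """
    for item in items:
        channel.put(item * 2)  # stand-in for real per-item work
    channel.put(SENTINEL)


def consumer(channel, results):
    """Stage 2: read items from the channel.

    get() blocks until the producer has written data, which is the
    data synchronization being illustrated.
    """
    while True:
        item = channel.get()
        if item is SENTINEL:
            break
        results.append(item + 1)  # stand-in for real per-item work


def run_pipeline(items, depth=2):
    """Run the two stages concurrently over a channel of the given depth."""
    channel = queue.Queue(maxsize=depth)
    results = []
    threads = [
        threading.Thread(target=producer, args=(channel, items)),
        threading.Thread(target=consumer, args=(channel, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


print(run_pipeline([1, 2, 3, 4]))  # prints [3, 5, 7, 9]
```

Because the queue is first-in first-out and there is a single consumer, results come out in stream order regardless of how the two threads are scheduled.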

b. Task-level parallelism
  - Each of the 10 processors executes a different thread on the same or different data
  - Reduces overall execution runtime
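
As a rough illustration of why a task-level pipeline reduces runtime, the sketch below compares a single processor running every stage in turn with an idealized pipeline. It assumes equal stage times and no communication or synchronization overhead, neither of which is stated for the chip; the function names and the example numbers are hypothetical.

```python
def sequential_time(num_items, num_stages, stage_time):
    """One processor runs all stages of every item, one after another."""
    return num_items * num_stages * stage_time


def pipelined_time(num_items, num_stages, stage_time):
    """One processor per stage, idealized: the first item needs
    num_stages steps to fill the pipeline, then one item completes
    per step."""
    if num_items == 0:
        return 0
    return (num_stages + num_items - 1) * stage_time


# Hypothetical workload: 100 data items through a 10-stage pipeline,
# one time unit per stage.
seq = sequential_time(100, 10, 1.0)   # 1000.0 time units
pipe = pipelined_time(100, 10, 1.0)   # 109.0 time units
print(f"idealized speedup: {seq / pipe:.2f}x")
```

Under these assumptions the speedup approaches the number of stages as the stream gets longer; unbalanced stages or channel stalls would lower it.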

c. Limited computing power
  - 1.4 W at 1.8 V (peak)

Related Papers

  - DAC 08
  - A-SSCC 07
  - CICC 07
  - NOCS 07
  - IET CDT 09
  - TVLSI 09
