65 GOPS/neuron Photonic Tensor Core with Thin-film Lithium Niobate Photonics

Bibliographic Details
Title: 65 GOPS/neuron Photonic Tensor Core with Thin-film Lithium Niobate Photonics
Authors: Lin, Zhongjin, Shastri, Bhavin J., Yu, Shangxuan, Song, Jingxiang, Zhu, Yuntao, Safarnejadian, Arman, Cai, Wangning, Lin, Yanmei, Ke, Wei, Hammood, Mustafa, Wang, Tianye, Xu, Mengyue, Zheng, Zibo, Al-Qadasi, Mohammed, Esmaeeli, Omid, Rahim, Mohamed, Pakulski, Grzegorz, Schmid, Jens, Barrios, Pedro, Jiang, Weihong, Morison, Hugh, Mitchell, Matthew, Qiang, Xiaogang, Guan, Xun, Jaeger, Nicolas A. F., Rusch, Leslie A., Shekhar, Sudip, Shi, Wei, Yu, Siyuan, Cai, Xinlun, Chrostowski, Lukas
Publication Year: 2023
Collection: Computer Science; Physics (Other)
Subject Terms: Physics - Optics, Computer Science - Emerging Technologies, Physics - Applied Physics, 78A05
Description: Photonics offers a transformative approach to artificial intelligence (AI) and neuromorphic computing by providing low latency, high bandwidth, and energy-efficient computations. Here, we introduce a photonic tensor core processor enabled by time-multiplexed inputs and charge-integrated outputs. This fully integrated processor, comprising only two thin-film lithium niobate (TFLN) modulators, a III-V laser, and a charge-integration photoreceiver, can implement an entire layer of a neural network. It can execute 65 billion operations per second (GOPS) per neuron, including simultaneous weight updates, a hitherto unachieved speed. Our processor stands out from conventional photonic processors, which have static weights set during training, as it supports fast "hardware-in-the-loop" training and can dynamically adjust the inputs (fan-in) and outputs (fan-out) within a layer, thereby enhancing its versatility. Our processor can perform large-scale dot-product operations with vector dimensions up to 131,072. Furthermore, it successfully classifies (supervised learning) and clusters (unsupervised learning) 112×112-pixel images after "hardware-in-the-loop" training. To handle "hardware-in-the-loop" training for clustering AI tasks, we provide a solution, based on our processor, for multiplications involving two negative numbers.
Comment: 19 pages, 6 figures
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2311.16896
Accession Number: edsarx.2311.16896
Database: arXiv
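
Note: The core operation described in the abstract, a dot product formed by time-multiplexing the input and weight vectors onto two modulators and integrating the detected photocurrent, can be illustrated with a short numerical model. The sketch below is only an assumed, simplified model of that scheme, not the authors' implementation; the function name and the dt and responsivity parameters are introduced here for illustration.

```python
# Minimal numerical sketch (illustrative only, not the paper's implementation)
# of a time-multiplexed dot product with a charge-integrating receiver:
# one modulator encodes x(t), a second encodes w(t), the detector produces a
# photocurrent proportional to x_i * w_i at each time step, and the integrator
# accumulates the resulting charge, yielding y = sum_i x_i * w_i.
import numpy as np

def photonic_dot_product(x, w, dt=1.0, responsivity=1.0):
    """Model a time-multiplexed multiply-accumulate over one integration window."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    assert x.shape == w.shape, "fan-in is set by the common sequence length"
    photocurrent = responsivity * x * w   # per-time-step detector output
    charge = np.sum(photocurrent) * dt    # charge integration = dot product
    return charge

# Example: a fan-in comparable to the 131,072-element vectors reported
# in the abstract, checked against a conventional dot product.
rng = np.random.default_rng(0)
n = 131_072
x = rng.standard_normal(n)
w = rng.standard_normal(n)
print(np.isclose(photonic_dot_product(x, w), np.dot(x, w)))  # True
```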