“Machine Learning runtime library for neural network acceleration”

*Patent pending*, Apr 2019

Aaron Ng

University of Michigan

Advanced Computer Architecture Lab

M.S.E. Computer Science and Engineering

Hardware Systems, Distributed Computing

Advisor: Igor L. Markov

hello@aaron-ng.com

A. N. Ng, J. Zejda, E. Delaye, X. Teng, S. Santan, S. T. Soe, A. Sirasao, E. Ghasemi, S. Settle

“Machine Learning runtime library for neural network acceleration”

A. N. Ng, E. Delaye, E. Ghasemi, X. Teng, J. Zejda, Y. Wu, S. Settle, A. Sirasao

“Multi-layer neural network processing by a neural network accelerator using host-communicated merged weights and a package of per-layer instructions”

A. N. Ng, E. Delaye, J. Zejda, A. Sirasao

“Host-directed multi-layer neural network processing via per-layer work requests”

A. N. Ng, J. Zejda, E. Delaye, X. Teng, A. Sirasao

“Neural network processing system having host-controlled kernel accelerators”

E. Delaye, A. Sirasao, A. Ng, Y. Wu, J. Zejda

“Image preprocessing for generalized image processing”

X. Teng, A. N. Ng, A. Sirasao, E. Delaye

“Neural network processing system having multiple processors and a neural network accelerator”

S. Settle, E. Delaye, A. N. Ng, E. Ghasemi, A. Sirasao, X. Teng, J. Zejda

“Software-driven design optimization for mapping between floating-point and fixed-point multiply accumulators”

A. Sirasao, E. Delaye, A. N. Ng, E. Ghasemi

“Inline image preprocessing for convolution operations using a matrix multiplier on an integrated circuit”

A. N. Ng, S. Krishnamurthy, G. S. Gasparyan

“Timing Closure of Circuit Designs for Integrated Circuits”

E. Delaye, A. Sirasao, A. N. Ng

“Software-Defined Memory Bandwidth Reduction by Hierarchical Stream Buffering for General Matrix Multiplication In A Programmable IC”

A. N. Ng, P. Basu, S. Das

“Neural network based Physical Synthesis for Circuit Designs”

S. Settle, M. Bollavaram, P. D'Alberto, E. Delaye, O. Fernandez, N. Fraser, A. Ng, A. Sirasao, M. Wu

“Quantizing Convolutional Neural Networks for Low-Power High-Throughput Inference Engines”

R. Lu, Z. Wang, A. N. Ng, N. Shah, S. Das

“Fanout Optimization to Facilitate Timing Improvement in Circuit Designs”

I. Ganusov, A. N. Ng, R. Plyler, S. Das and F. Revenu

“Programmable Integrated Circuit Design Flow using Timing-driven Pipeline Analysis”

R. Lu, Z. Wang, A. N. Ng and S. Das

“Post-routing Structural Netlist Optimization for Circuit Designs”

I. Ganusov, H. Fraisse, A. N. Ng, R. T. Possignolo and S. Das

“Automated Extra Pipeline Analysis of Applications Mapped to Xilinx UltraScale+ FPGAs”

Q. Wang, A. N. Ng, R. Aggarwal

“Resource Mapping of Functional Areas on an Integrated Circuit”

J. A. Roy, A. N. Ng, R. Aggarwal, V. Ramachandran, and I. L. Markov

“Solving Modern Mixed-size Placement Instances”

A. N. Ng, R. Aggarwal, V. Ramachandran and I. L. Markov

“Solving Hard Instances of Floorplacement”

M. Moffitt, A. N. Ng and I. L. Markov

“Constraint-driven Floorplan Repair”

J. A. Roy, D. A. Papa, A. N. Ng, I. L. Markov

“Satisfying Whitespace Requirements in Top-down Placement”

J. A. Roy, D. A. Papa, S. N. Adya, H. H. Chan, J. F. Lu, A. N. Ng, and I. L. Markov

“Capo: Robust and Scalable Open-Source Min-cut Floorplacer”

A. N. Ng and I. L. Markov

“Toward Quality Tools and Tool Flows Through High-Performance Computing”

