Working Notes: a commonplace notebook for recording & exploring ideas.
Kunal

Challenges & Research Directions for LLM Inference Hardware

Notes

Inference basics

Decode challenge 1: memory

End-to-end latency

Research opportunities

For modern systems design

Directions in the paper