Tinygrad is a neural network framework designed for simplicity and speed, breaking complex networks down into three operation types: ElementwiseOps, ReduceOps, and MovementOps. ElementwiseOps apply per-element functions like SQRT and ADD, ReduceOps collapse a single tensor with functions like SUM and MAX, and MovementOps reshape or permute data without copying it, using ShapeTracker to do the index math.
tinygrad.org
5 min
3/21/2026
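To make the three op categories concrete, here is a minimal pure-Python sketch. The function names are illustrative, not tinygrad's API; tinygrad itself operates on lazy tensors, and its zero-copy view logic lives in ShapeTracker:

```python
import math

# Illustrative sketch of tinygrad's three op categories on flat Python lists.

def elementwise_sqrt(xs):
    # ElementwiseOp: applies a function independently to each element
    return [math.sqrt(x) for x in xs]

def reduce_sum(xs):
    # ReduceOp: collapses a tensor along an axis to fewer values
    return sum(xs)

def movement_reshape(xs, rows, cols):
    # MovementOp: a ShapeTracker-style view -- pure index arithmetic,
    # the underlying buffer xs is never copied
    return [[xs[r * cols + c] for c in range(cols)] for r in range(rows)]

data = [1.0, 4.0, 9.0, 16.0]
roots = elementwise_sqrt(data)          # [1.0, 2.0, 3.0, 4.0]
total = reduce_sum(data)                # 30.0
grid = movement_reshape(data, 2, 2)     # [[1.0, 4.0], [9.0, 16.0]]
```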
nCPU is a CPU architecture that runs entirely on the GPU, using tensors for its registers, memory, flags, and program counter. All arithmetic operations, including addition, multiplication, bitwise operations, and shifts, are performed by trained neural networks, with specific techniques such as Kogge-Stone carry-lookahead for addition and learned byte-pair lookup tables for multiplication.
github.com
8 min
3/4/2026
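The Kogge-Stone scheme mentioned above computes all carries in O(log n) parallel-prefix stages instead of rippling them bit by bit. Below is a plain-Python sketch of the algorithm on bit lists; nCPU reportedly realizes the equivalent gates with trained neural networks on tensors, so this only shows the carry-lookahead structure, not nCPU's implementation:

```python
def kogge_stone_add(a, b, width=8):
    """Kogge-Stone parallel-prefix adder, modulo 2**width (bits LSB-first)."""
    A = [(a >> i) & 1 for i in range(width)]
    B = [(b >> i) & 1 for i in range(width)]
    p = [x ^ y for x, y in zip(A, B)]   # propagate: carry passes through
    g = [x & y for x, y in zip(A, B)]   # generate: carry created here
    G, P = g[:], p[:]
    d = 1
    while d < width:                    # log2(width) prefix-combine stages
        # iterate high-to-low so G[i-d]/P[i-d] still hold the previous stage
        for i in range(width - 1, d - 1, -1):
            G[i] = G[i] | (P[i] & G[i - d])
            P[i] = P[i] & P[i - d]
        d *= 2
    carries = [0] + G[:-1]              # carry into bit i = group-generate below i
    s = [pi ^ ci for pi, ci in zip(p, carries)]
    return sum(bit << i for i, bit in enumerate(s))

kogge_stone_add(5, 3)      # 8
kogge_stone_add(200, 100)  # 44  (300 mod 256, width=8 wraps)
```

The appeal for a GPU-resident CPU is that every prefix stage is a uniform elementwise operation over all bit positions, which maps naturally onto tensor hardware.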