Performance per dollar for AI inference is improving: GLM5.2 served on AMD MI355X reaches 2,626 tokens per second per node and 213 tokens per second in a single stream, at less than half the cost of Blackwell. Demand for inference is growing rapidly and outpacing supply as new frontier models are released frequently.
wafer.ai
5 min
15h ago