You can now determine the status and health of a TPU slice and partition by monitoring these new beta system metrics:
• kubernetes.io/accelerator/slice/state: Indicates the current status of the slice.
• kubernetes.io/accelerator/partition/state: Indicates the health of the partition.
For more information, see the GKE system metrics documentation.
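
As a rough illustration of how these two metrics might be queried, the sketch below builds Cloud Monitoring filter strings and a `timeSeries.list` request URL for each metric type. The project ID `my-project`, the helper names, and the 10-minute lookback window are assumptions made for this example; the request still needs to be sent with valid credentials, and the value type and labels of each metric should be checked against the GKE system metrics documentation.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Metric types named in this release note.
METRIC_TYPES = [
    "kubernetes.io/accelerator/slice/state",
    "kubernetes.io/accelerator/partition/state",
]


def metric_filter(metric_type: str) -> str:
    """Return a Cloud Monitoring filter that selects one metric type."""
    return f'metric.type = "{metric_type}"'


def rfc3339(ts: datetime) -> str:
    """Format a timezone-aware datetime as an RFC 3339 UTC timestamp."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def time_series_list_url(project_id, metric_type, minutes=10, now=None):
    """Build a timeSeries.list URL covering the last `minutes` minutes.

    `project_id` and `minutes` are illustrative parameters, not values
    taken from the release note.
    """
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(minutes=minutes)
    query = urlencode({
        "filter": metric_filter(metric_type),
        "interval.startTime": rfc3339(start),
        "interval.endTime": rfc3339(end),
    })
    return (
        "https://monitoring.googleapis.com/v3/"
        f"projects/{project_id}/timeSeries?{query}"
    )


if __name__ == "__main__":
    for metric_type in METRIC_TYPES:
        # "my-project" is a hypothetical project ID.
        print(time_series_list_url("my-project", metric_type))
```

The same filter strings can also be pasted into a metrics query or alerting policy instead of being sent through the REST API.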