`cerebrium.toml` file:
`0.5`).
Hardware Type | Max CPU Cores | Max Memory (GB) |
---|---|---|
CPU Only | 48 | 96 |
ADA_L40 | 16 | 128 |
AMPERE_A100 | 12 | 140 |
AMPERE_A10 | 48 | 192 |
ADA_L4 | 48 | 192 |
TURING_T4 | 48 | 192 |
HOPPER_H100 | 24 | 256 |
TRN1 | 128 | 512 |
INF2 | 192 | 796 |
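The limits in the table can be checked before a deployment is attempted. The following is a minimal sketch, not part of the Cerebrium CLI or SDK: the `HARDWARE_LIMITS` mapping and the `check_request` helper are hypothetical names introduced here for illustration, and the numbers are copied from the table above.

```python
# Limits copied from the table above: (max CPU cores, max memory in GB).
# HARDWARE_LIMITS and check_request are illustrative names, not a platform API.
HARDWARE_LIMITS = {
    "CPU Only": (48, 96),
    "ADA_L40": (16, 128),
    "AMPERE_A100": (12, 140),
    "AMPERE_A10": (48, 192),
    "ADA_L4": (48, 192),
    "TURING_T4": (48, 192),
    "HOPPER_H100": (24, 256),
    "TRN1": (128, 512),
    "INF2": (192, 796),
}


def check_request(hardware: str, cpu: float, memory_gb: float) -> list[str]:
    """Return a list of problems; an empty list means the request fits."""
    if hardware not in HARDWARE_LIMITS:
        return [f"unknown hardware type: {hardware}"]
    max_cpu, max_memory = HARDWARE_LIMITS[hardware]
    problems = []
    if cpu > max_cpu:
        problems.append(
            f"cpu={cpu} exceeds the {max_cpu}-core limit for {hardware}"
        )
    if memory_gb > max_memory:
        problems.append(
            f"memory={memory_gb} GB exceeds the {max_memory} GB limit for {hardware}"
        )
    return problems
```

For example, `check_request("AMPERE_A100", 16, 64)` reports one problem, because the table allows at most 12 CPU cores on that hardware type, while `check_request("AMPERE_A100", 12, 140)` returns an empty list.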
`low_cpu_mem_usage` flag, which reduces the memory footprint at the cost of longer initialization times. Lazy loading of large datasets reduces memory usage further, because only the records currently being processed are held in memory. Monitor memory usage regularly through the platform metrics to identify optimization opportunities. For large-scale deployments, consider memory-efficient model loading techniques.
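Lazy loading can be sketched with a plain Python generator. This is one possible approach using only the standard library, not a platform feature: `iter_batches` and `count_records` are hypothetical helper names, and the dataset is assumed to be a text file with one record per line.

```python
from itertools import islice
from typing import Iterable, Iterator


def iter_batches(lines: Iterable[str], batch_size: int) -> Iterator[list[str]]:
    """Yield lists of up to batch_size lines, consuming the input lazily."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(lines)
    while True:
        # islice pulls at most batch_size items, so only one batch is in memory.
        batch = [line.rstrip("\n") for line in islice(iterator, batch_size)]
        if not batch:
            return
        yield batch


def count_records(path: str, batch_size: int = 1000) -> int:
    """Count records in a text file without loading the whole file."""
    total = 0
    # A file object iterates line by line, so the file is never read in full.
    with open(path, encoding="utf-8") as handle:
        for batch in iter_batches(handle, batch_size):
            total += len(batch)
    return total
```

Peak memory then depends on `batch_size` rather than on the size of the file, which is the trade-off to tune: smaller batches use less memory but add more per-batch overhead.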