Many companies that begin their AI projects in the cloud reach a point where cost and time become problems. That’s typically due to exponential growth in dataset size and in the complexity of AI models. “In an early phase, you might submit a job to the cloud where a training run would execute and the AI model would converge quickly,” says Tony Paikeday, senior director of AI systems at NVIDIA. “But as models and datasets grow, there’s a stifling effect associated with the escalating compute cost and time. Developers find that a training job now takes many hours or even days, and in t…