The NVIDIA L40S is a highly versatile GPU built for AI training, inference, graphics, and scientific workloads—all in one card.
On Novita AI, you can access the L40S for $0.55/hour. By comparison, RunPod lists the same GPU at $0.86/hour, making Novita AI a more cost-effective choice for high-performance computing in the cloud.
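To put the two hourly rates in perspective, here is a minimal sketch of the arithmetic. It uses only the prices quoted above, which may change over time; the 730-hour month and the function names are assumptions made for illustration.

```python
# Hourly prices as quoted in this article; actual prices may differ today.
NOVITA_USD_PER_HOUR = 0.55
RUNPOD_USD_PER_HOUR = 0.86


def savings_percent(cheaper: float, pricier: float) -> float:
    """Return how much lower `cheaper` is than `pricier`, in percent."""
    return (pricier - cheaper) / pricier * 100


def monthly_cost(usd_per_hour: float, hours: float = 730) -> float:
    """Cost of one GPU running for `hours` (730 is roughly one month, an assumption)."""
    return usd_per_hour * hours


if __name__ == "__main__":
    pct = savings_percent(NOVITA_USD_PER_HOUR, RUNPOD_USD_PER_HOUR)
    print(f"Hourly saving: {pct:.1f}%")
    print(f"Novita AI, one month: ${monthly_cost(NOVITA_USD_PER_HOUR):.2f}")
    print(f"RunPod, one month:    ${monthly_cost(RUNPOD_USD_PER_HOUR):.2f}")
```

At these rates the hourly saving works out to about 36%, before storage, networking, or other charges that this sketch ignores.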

What is the L40S?
The NVIDIA L40S GPU, built on the Ada Lovelace architecture, is designed for demanding AI, graphics, and high-performance computing (HPC) workloads. What sets the L40S apart is its versatility: it balances raw compute for AI training and inference with professional visualization and video processing.

Key Performance Metrics
| Metric | Value |
|---|---|
| Tensor Cores | 568 (Fourth-Generation) |
| CUDA Cores | 18,176 |
| RT Cores | 142 (Third-Generation) |
| FP32 Performance | 90.5 TFLOPS |
| TF32 Performance (Dense) | 183 TFLOPS |
| TF32 Performance (Sparse) | 366 TFLOPS |
| FP8 Performance (Dense) | 733 TFLOPS |
| FP8 Performance (Sparse) | 1,466 TFLOPS |
| FP64 Performance | 1.4 TFLOPS |
| Memory Capacity | 48GB GDDR6 ECC |
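| Memory Bandwidth | 864 GB/s |
| TDP | 300W - 350W |

1. Core Computational Performance
The L40S pairs 18,176 CUDA cores with 568 fourth-generation Tensor Cores and 142 third-generation RT Cores, and delivers 90.5 TFLOPS of FP32 performance.
2. Memory and Bandwidth
The L40S offers substantial memory and bandwidth for data-intensive workloads:
- Memory Capacity: 48GB of GDDR6 memory with ECC, enough to hold large models and datasets on a single card.
- Memory Bandwidth: 864 GB/s, which keeps the cores supplied with data during memory-bound workloads.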
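As a rough illustration of what 48GB means in practice, the sketch below estimates whether a model's weights alone would fit on one card. The bytes-per-parameter figures are the standard sizes of each data type; the 20% headroom for activations, KV cache, and framework overhead is an assumption, not a measured value, and the function names are hypothetical.

```python
# Storage size of one parameter for common data types, in bytes.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "fp8": 1, "int4": 0.5}

L40S_MEMORY_GB = 48  # capacity stated in this article


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def fits_on_l40s(num_params: float, dtype: str, usable_fraction: float = 0.8) -> bool:
    """True if the weights fit within the assumed usable share of 48GB."""
    return weight_memory_gb(num_params, dtype) <= L40S_MEMORY_GB * usable_fraction


if __name__ == "__main__":
    for params, dtype in [(7e9, "fp16"), (13e9, "fp32"), (70e9, "fp16"), (70e9, "int4")]:
        need = weight_memory_gb(params, dtype)
        print(f"{params / 1e9:.0f}B params in {dtype}: {need:.0f} GB, fits={fits_on_l40s(params, dtype)}")
```

This only checks the weights; real memory use depends on batch size, sequence length, and the serving framework.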
3. Multi-Instance GPU (MIG) Technology
The NVIDIA L40S does not support MIG.
4. FP64 Performance
Although the L40S focuses on AI, graphics, and general-purpose computing, it still offers 1.4 TFLOPS of FP64 (double-precision) performance.
This is lower than specialized GPUs like the H100, but it is sufficient for certain scientific and engineering applications that require higher numerical precision.
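To make the gap between single and double precision concrete, the following sketch compares the two peak figures from the table above. It assumes the workload is purely compute-bound and runs at peak throughput, which real code rarely achieves, so treat the runtimes as lower bounds; the 1e15-operation workload is a made-up example.

```python
# Peak throughput figures quoted in this article, in TFLOPS.
FP32_TFLOPS = 90.5
FP64_TFLOPS = 1.4


def ideal_runtime_seconds(total_flop: float, tflops: float) -> float:
    """Best-case runtime for `total_flop` floating-point operations at peak rate."""
    return total_flop / (tflops * 1e12)


fp32_to_fp64_ratio = FP32_TFLOPS / FP64_TFLOPS

if __name__ == "__main__":
    workload = 1e15  # hypothetical workload size in floating-point operations
    print(f"FP32 is about {fp32_to_fp64_ratio:.0f}x faster than FP64 at peak")
    print(f"FP32 best case: {ideal_runtime_seconds(workload, FP32_TFLOPS):.1f} s")
    print(f"FP64 best case: {ideal_runtime_seconds(workload, FP64_TFLOPS):.1f} s")
```

With a peak gap of roughly 65x, code that can tolerate FP32 or mixed precision benefits far more from this card than code that needs FP64 throughout.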
Cost Efficiency of the L40S

The L40S carries a considerable upfront cost, but it can offer better cost efficiency over time for enterprises, research institutions, and data centers that run diverse, computationally intensive tasks. Its long-term benefits can offset the purchase price:
- Consolidation Capability: Handle more diverse tasks with fewer cards.
- Higher Productivity: Complete tasks faster, processing larger datasets and models.
- Lower Operational Costs: Save on electricity and cooling expenses.
- Improved Reliability & Availability: Reduce downtime and rework due to fewer hardware failures or data errors.
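- Higher Resource Utilization: Keep the GPU busy by sharing it across jobs through scheduling or virtualization. Note that the L40S does not support MIG (Multi-Instance GPU), so it cannot be split into hardware-isolated partitions.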
In the long run, these factors contribute to a lower total cost of ownership (TCO), making the L40S a more cost-effective option for high-performance, multi-tasking environments.
Application of the L40S
Ultimate Versatility
The L40S combines strengths from all three areas—AI, graphics, and precision workloads—without the extreme specialization of other GPUs:
- Better Than H100 in graphics rendering and still efficient for mid-scale AI tasks.
- More Powerful Than Graphics Cards in AI, thanks to its Tensor Cores and large memory.
- Better Than Consumer GPUs with ECC memory and data-center reliability.

1. AI Training and Inference
- Training: With 48GB of memory and 4th-gen Tensor Cores, the L40S can train and fine-tune models such as LLMs, computer vision networks, and recommendation systems, within the limits of a single card's memory.
- Inference: Offers high throughput and low latency, ideal for AI applications like image recognition, NLP, and real-time transcription.
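For training, memory is usually the first constraint. The sketch below applies a common rule of thumb for mixed-precision training with the Adam optimizer: about 16 bytes per parameter (2 for FP16 weights, 2 for gradients, and 12 for FP32 master weights plus two optimizer moments). Activations are not counted, and the 20% reserve is an assumption, so actual limits will be lower.

```python
# Rule-of-thumb bytes per parameter for mixed-precision Adam training:
# fp16 weights + fp16 gradients + fp32 master weights + Adam first and second moments.
BYTES_PER_PARAM_MIXED_ADAM = 2 + 2 + 4 + 4 + 4  # = 16


def training_state_gb(num_params: float) -> float:
    """Memory for weights, gradients, and optimizer state, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM_MIXED_ADAM / 1e9


def max_trainable_params(memory_gb: float = 48, reserve_fraction: float = 0.2) -> float:
    """Largest parameter count whose training state fits after reserving
    `reserve_fraction` of memory for activations and overhead (assumed value)."""
    usable_bytes = memory_gb * 1e9 * (1 - reserve_fraction)
    return usable_bytes / BYTES_PER_PARAM_MIXED_ADAM


if __name__ == "__main__":
    print(f"7B model training state: {training_state_gb(7e9):.0f} GB")
    print(f"Approx. max params on one 48GB card: {max_trainable_params() / 1e9:.1f}B")
```

Under these assumptions, full training of a 7B-parameter model needs about 112 GB of state, so a single 48GB card would call for parameter-efficient fine-tuning, quantization, or multiple GPUs.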
2. Graphics and Visualization
- 3D Content Creation: Accelerates modeling, animation rendering, and VFX production.
- Real-time Ray Tracing: Provides top-tier virtual production capabilities, perfect for the film and broadcast industries.
- CAD/CAE/AEC: Delivers rapid rendering for engineering and architecture applications.
3. Precision Workloads
- Scientific Computing: FP32 performance supports CFD, FEA, and simulations in data analysis, genomics, and physical modeling.
How to run the L40S at a low price?
Novita AI provides a cloud-based platform with high-performance GPU instances. It handles complex tasks efficiently, makes deployment accessible across a range of GPU types, and costs less than maintaining local hardware for large-scale AI deployments.
Step 1: Register an account
Create your Novita AI account through our website. After registration, navigate to the “Explore” section in the left sidebar to view our GPU offerings and begin your AI development journey.

Step 2: Explore Templates and GPU Servers
Choose from templates like PyTorch, TensorFlow, or CUDA that match your project needs. Then select your preferred GPU configuration—options include the powerful L40S, RTX 4090 or A100 SXM4, each with different VRAM, RAM, and storage specifications.

Step 3: Tailor Your Deployment
Customize your environment by selecting your preferred operating system and configuration options to ensure optimal performance for your specific AI workloads and development needs.

Step 4: Launch an instance
Select “Launch Instance” to start your deployment. Your high-performance GPU environment will be ready within minutes, allowing you to immediately begin your machine learning, rendering, or computational projects.
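Once the instance is running, it is worth confirming which GPU it actually exposes. The sketch below is one possible check, assuming the instance image ships the standard `nvidia-smi` tool (an assumption about the template, not something this article confirms). The helper names are hypothetical.

```python
import shutil
import subprocess

# Ask nvidia-smi for the GPU name and total memory (in MiB, without unit suffix).
QUERY = ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader,nounits"]


def parse_gpu_lines(text: str) -> list[dict]:
    """Turn lines of the form 'name, memory_mib' into a list of dictionaries."""
    gpus = []
    for line in text.strip().splitlines():
        name, mem_mib = [part.strip() for part in line.split(",")]
        gpus.append({"name": name, "memory_mib": int(mem_mib)})
    return gpus


def list_gpus() -> list[dict]:
    """Return the visible GPUs, or an empty list if nvidia-smi is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    except subprocess.CalledProcessError:
        return []
    return parse_gpu_lines(result.stdout)


if __name__ == "__main__":
    gpus = list_gpus()
    if not gpus:
        print("No NVIDIA GPU detected (or nvidia-smi is unavailable).")
    for gpu in gpus:
        print(f"{gpu['name']}: {gpu['memory_mib']} MiB")
```

If the reported name or memory does not match the configuration you selected, recheck the instance settings before starting a long job.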

The NVIDIA L40S GPU is a versatile choice for AI, graphics, and scientific computing. With powerful specs and 48GB of ECC memory, it balances performance and cost for modern workloads, although it does not support MIG. For those who want easy access without buying hardware, Novita AI offers cloud-based L40S instances that are fast, flexible, and affordable.
Frequently Asked Questions
What makes the NVIDIA L40S GPU special?
It handles AI, graphics, and precision tasks all in one—something few GPUs can do.
Is the L40S good for AI training and inference?
Yes. Its Tensor Cores and 48GB memory make it ideal for both.
How can I try the L40S without buying it?
Use Novita AI to launch L40S cloud instances anytime—no setup needed.
[Novita AI](https://novita.ai/?utm_source=blogs_GPU&utm_medium=article&utm_campaign=NVIDIA A100 GPU Performance: Why It’s Still the Go-to Choice for AI Training) is an AI cloud platform that offers developers an easy way to deploy AI models using our simple API, while also providing an affordable and reliable GPU cloud for building and scaling.