ThinnestAI × Modal: Awarded Modal for Startups Credits for GPU Compute

Thinnest AI Team
Mar 20, 2026

Serverless GPUs for AI Innovation

Today, we're excited to share that Thinnest AI has received Modal for Startups credits — giving us access to serverless GPU compute for our most demanding AI workloads.

Modal has quietly become the go-to platform for AI teams that need GPU compute without the headache of managing Kubernetes clusters, Docker images, or cloud GPU quotas. With Modal, we can run a fine-tuning job on 4x A100 GPUs with a single Python decorator — and pay only for the seconds we actually use.

Why Modal?

GPU compute is the lifeblood of AI development, but traditional cloud GPUs come with painful trade-offs: long provisioning times, complex orchestration, and minimum commitments that don't make sense for a startup. Modal eliminates all of that.

1. Zero Infrastructure Management

With Modal, there are no Dockerfiles to maintain, no Kubernetes configs to debug, and no GPU drivers to install. You write a Python function, attach a decorator such as @app.function(gpu="A100"), and it runs on cloud GPUs. From code to GPU in seconds, not hours.
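As an illustration, here is roughly what that pattern looks like with Modal's Python SDK. This is a minimal sketch, not our production code: the app name, dataset path, and training body are placeholders, and decorator details can vary slightly between SDK versions.

```python
import modal

# Declaring an app is all the "infrastructure" there is; Modal builds and
# ships the container image for us.
app = modal.App("thinnest-finetune")  # hypothetical app name

# Request an A100 for this function: no Dockerfile, no drivers, no cluster.
@app.function(gpu="A100", timeout=60 * 60)
def fine_tune(dataset_path: str) -> str:
    # Placeholder training body: load data, train, return a checkpoint path.
    print(f"Training on {dataset_path} with an A100...")
    return "/checkpoints/latest"

# `modal run this_file.py` runs main() locally and fine_tune() on a cloud GPU.
@app.local_entrypoint()
def main():
    print(fine_tune.remote("s3://example-bucket/dataset"))  # hypothetical path
```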

2. Pay-Per-Second Pricing

Traditional cloud GPUs charge by the hour — even if your job finishes in 12 minutes. Modal charges by the second with no minimum commitment. For a startup running intermittent fine-tuning jobs, this can reduce GPU costs by 70-80% compared to reserved instances.
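To make that concrete, here is the back-of-the-envelope arithmetic behind the 12-minute example, using an illustrative hourly rate; the real per-second rates depend on GPU type and are published on Modal's pricing page.

```python
# Illustrative only: assume a GPU billed at $4.00/hour (hypothetical rate).
hourly_rate = 4.00
job_minutes = 12  # the 12-minute job from the paragraph above

# Hour-based billing rounds the job up to a full hour.
hourly_cost = hourly_rate * 1

# Per-second billing charges only the time actually used.
per_second_cost = hourly_rate * (job_minutes * 60) / 3600

savings = 1 - per_second_cost / hourly_cost
print(f"hourly: ${hourly_cost:.2f}, per-second: ${per_second_cost:.2f}, savings: {savings:.0%}")
# hourly: $4.00, per-second: $0.80, savings: 80%
```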

3. Instant Scaling

Need to process 10,000 knowledge base documents for vector embeddings? Modal can spin up 100 containers in parallel, process everything in minutes, and scale back to zero when done. Batch jobs that used to take hours now finish in minutes.
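Here is a sketch of what that fan-out can look like with Modal's Function.map; the app name, GPU choice, and embedding body are placeholders rather than our actual pipeline.

```python
import modal

app = modal.App("thinnest-embeddings")  # hypothetical app name

# Each call runs in its own container; Modal fans the calls out in parallel
# and scales back to zero when the batch is done.
@app.function(gpu="T4")
def embed_chunk(chunk: str) -> list[float]:
    # Placeholder: a real worker would load an embedding model once per
    # container and encode the chunk into a vector.
    return [float(len(chunk))]

@app.local_entrypoint()
def main():
    chunks = [f"document chunk {i}" for i in range(10_000)]
    # .map() schedules the calls across many containers and streams results back.
    vectors = list(embed_chunk.map(chunks))
    print(f"embedded {len(vectors)} chunks")
```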

What We're Running on Modal

Modal powers several critical AI workloads in our pipeline:

  • TTS Model Fine-Tuning: Fine-tuning text-to-speech models for Hindi and other Indian languages using LoRA adapters on high-memory GPUs (a sketch of this setup follows this list)
  • Batch Embedding Generation: When customers upload large knowledge bases (PDFs, documents, websites), we generate vector embeddings in parallel on Modal for fast indexing
  • Model Evaluation: Running evaluation benchmarks across multiple model variants to find the best configuration for each language and use case
  • Experimental Inference: Testing new model architectures and configurations before deploying to production
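As referenced in the first item above, here is hypothetical scaffolding for a LoRA fine-tuning job on Modal: a persistent volume for checkpoints, an image defined in code, and a GPU-backed function. The model id, package list, and training body are placeholders, not our actual pipeline.

```python
import modal

app = modal.App("thinnest-tts-lora")  # hypothetical app name

# A persistent volume keeps checkpoints around between runs.
checkpoints = modal.Volume.from_name("tts-checkpoints", create_if_missing=True)

# The image is defined in code; Modal builds and caches it remotely.
image = modal.Image.debian_slim().pip_install("torch", "transformers", "peft")

@app.function(gpu="A100", image=image, volumes={"/ckpt": checkpoints}, timeout=4 * 3600)
def train_lora(language: str, base_model: str) -> str:
    # Placeholder: a real job would load the base TTS model, attach PEFT LoRA
    # adapters, train on the language-specific corpus, and save the adapters.
    out_dir = f"/ckpt/{base_model.replace('/', '_')}-{language}-lora"
    print(f"Fine-tuning {base_model} for '{language}', saving to {out_dir}")
    checkpoints.commit()  # flush volume writes so later runs can read them
    return out_dir

@app.local_entrypoint()
def main():
    print(train_lora.remote("hi", "example-org/tts-base"))  # hypothetical model id
```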

What This Means for Our Users

Modal credits translate directly into better AI agents for our users:

  • Better voice quality: More fine-tuning iterations mean more natural-sounding TTS for Hindi and other Indian languages
  • Faster knowledge indexing: Large document uploads are processed in minutes instead of hours
  • More model options: We can test and validate new models faster, bringing the best options to our agent builder sooner
  • Lower costs: Efficient GPU usage means we can keep our prices low while improving quality

Try It Yourself

The AI agents powered by our Modal-accelerated models are live. Sign up and experience the difference:

Start Building Free →

25 free voice minutes • 200 chat messages • No credit card required

Thank You, Modal

We're grateful to the Modal team for supporting our mission. Serverless GPUs are a game-changer for AI startups, and we're excited to push the boundaries of what's possible with their platform.

— The Thinnest AI Team
