Thinnest AI × Modal: Awarded Modal for Startups Credits for GPU Compute

Serverless GPUs for AI Innovation
Today, we're excited to share that Thinnest AI has received Modal for Startups credits — giving us access to serverless GPU compute for our most demanding AI workloads.
Modal has quietly become the go-to platform for AI teams that need GPU compute without the headache of managing Kubernetes clusters, Docker images, or cloud GPU quotas. With Modal, we can run a fine-tuning job on 4x A100 GPUs with a single Python decorator — and pay only for the seconds we actually use.
Why Modal?
GPU compute is the lifeblood of AI development, but traditional cloud GPUs come with painful trade-offs: long provisioning times, complex orchestration, and minimum commitments that don't make sense for a startup. Modal eliminates all of that.
1. Zero Infrastructure Management
With Modal, there are no Dockerfiles to maintain, no Kubernetes configs to debug, no GPU drivers to install. You write a Python function, attach an @app.function(gpu="A100") decorator to a modal.App, and it runs on cloud GPUs almost instantly. From code to GPU in seconds, not hours.
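A minimal sketch of what this looks like, assuming Modal's current App-based API; the app name and function body here are placeholders, not our actual fine-tuning code:

```python
import modal

app = modal.App("example-gpu-job")  # hypothetical app name

# Requesting gpu="A100" tells Modal to run this function on an A100
# in its cloud; no Dockerfile or cluster config is needed.
@app.function(gpu="A100")
def train_step(batch: list[str]) -> int:
    # Placeholder for real training logic.
    return len(batch)

@app.local_entrypoint()
def main():
    # .remote() executes the function on a cloud GPU and returns the result.
    print(train_step.remote(["sample text"]))
```

Running `modal run` on this file provisions the GPU, executes the function, and tears everything down when it returns.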
2. Pay-Per-Second Pricing
Traditional cloud GPUs charge by the hour — even if your job finishes in 12 minutes. Modal charges by the second with no minimum commitment. For a startup running intermittent fine-tuning jobs, this can reduce GPU costs by 70-80% compared to reserved instances.
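The arithmetic behind that claim, using a hypothetical round-number rate of $4.00 per A100-hour (not a quoted price from either provider):

```python
# Compare hourly billing vs. per-second billing for a short GPU job.
HOURLY_RATE = 4.00            # dollars per GPU-hour (illustrative)
job_seconds = 12 * 60         # a 12-minute fine-tuning job

hourly_cost = HOURLY_RATE * 1                   # billed for a full hour
per_second_cost = HOURLY_RATE / 3600 * job_seconds

savings = 1 - per_second_cost / hourly_cost
print(f"${per_second_cost:.2f} vs ${hourly_cost:.2f} ({savings:.0%} saved)")
# → $0.80 vs $4.00 (80% saved)
```

The shorter and more intermittent the jobs, the larger the gap between the two billing models.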
3. Instant Scaling
Need to process 10,000 knowledge base documents for vector embeddings? Modal can spin up 100 containers in parallel, process everything in minutes, and scale back to zero when done. Batch jobs that used to take hours now finish in minutes.
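The fan-out pattern can be sketched like this, again assuming Modal's App API; the embed function is a stand-in, since a real one would load an embedding model:

```python
import modal

app = modal.App("example-embeddings")  # hypothetical app name

@app.function(gpu="T4")
def embed(doc: str) -> list[float]:
    # Placeholder: a real implementation would run an embedding model here.
    return [float(len(doc))]

@app.local_entrypoint()
def main():
    docs = [f"document {i}" for i in range(10_000)]
    # .map() fans the calls out across many containers in parallel;
    # Modal autoscales up for the batch, then back to zero when it finishes.
    vectors = list(embed.map(docs))
    print(len(vectors))
```

Because scaling to zero is automatic, the batch costs only the GPU-seconds it actually consumes.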
What We're Running on Modal
Modal powers several critical AI workloads in our pipeline:
- TTS Model Fine-Tuning: Fine-tuning text-to-speech models for Hindi and Indian languages using LoRA adapters on high-memory GPUs
- Batch Embedding Generation: When customers upload large knowledge bases (PDFs, documents, websites), we generate vector embeddings in parallel on Modal for fast indexing
- Model Evaluation: Running evaluation benchmarks across multiple model variants to find the best configuration for each language and use case
- Experimental Inference: Testing new model architectures and configurations before deploying to production
What This Means for Our Users
Modal credits translate directly into better AI agents for our users:
- Better voice quality: More fine-tuning iterations mean more natural-sounding Hindi and Indian language TTS
- Faster knowledge indexing: Large document uploads are processed in minutes instead of hours
- More model options: We can test and validate new models faster, bringing the best options to our agent builder sooner
- Lower costs: Efficient GPU usage means we can keep our prices low while improving quality
Try It Yourself
The AI agents powered by our Modal-accelerated models are live. Sign up and experience the difference:
25 free voice minutes • 200 chat messages • No credit card required
Thank You, Modal
We're grateful to the Modal team for supporting our mission. Serverless GPUs are a game-changer for AI startups, and we're excited to push the boundaries of what's possible with their platform.
— The Thinnest AI Team