I will provide an AI image & video generation API — text-to-image and image-to-video
About this gig
Ship generative visuals straight from your app: one REST endpoint turns prompts into images and stills into short motion clips, with an instant API key and pay-per-render billing.
This is a self-serve AI image and video generation API for builders who want text-to-image and image-to-video without standing up GPUs, queues, or model weights. You send a prompt (and optionally a source image), we return finished media. No infrastructure, no model fine-tuning, no cold-start headaches — just an HTTP call and a render back.
What you get
- Two core endpoints under one key. `POST /v1/text-to-image` takes a text prompt and returns a generated image. `POST /v1/image-to-video` takes a source image (URL or upload) and animates it into a short motion clip.
- Instant API key on order. Your key and base URL are issued immediately after purchase: no waitlist, no sales call, no manual provisioning. Paste it into an `Authorization: Bearer` header and you're live.
- Plain REST + JSON. Every request is a standard HTTPS call. If your stack can make an HTTP request, it can call this API: Node, Python, PHP, Go, Ruby, Rust, no-code tools like Make and n8n, or raw cURL.
- Per-render billing. You are metered by what you actually generate. One successful render is one unit; failed requests aren't charged. No idle GPU rental, no monthly minimums hiding in the fine print.
- Prompt controls for text-to-image. Set output dimensions / aspect ratio, choose a seed for reproducible results, request multiple variations per call, and pass negative prompts to steer away from unwanted elements.
- Motion controls for image-to-video. Drive a still into a short animated clip with control over clip length and motion intensity, returning a standard MP4 you can host or stream anywhere.
- Sync or async retrieval. Short jobs can return inline; longer renders return a job ID you poll on `GET /v1/jobs/{id}` so your request threads never block waiting on a GPU.
- Direct file output. Results come back as hosted URLs (or base64 when you prefer to keep bytes in-house), ready to drop into an `<img>`/`<video>` tag, a CDN, or object storage.
- Quick-start snippets. Copy-paste cURL and JavaScript/Python examples so the first successful render happens in minutes, not days.
- Usage visibility. Track your render counts so you always know where your consumption stands.
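To make the "one key, plain REST" point concrete, here is a minimal Python sketch of what an authenticated call to either endpoint might look like. The two paths and the `Authorization: Bearer` header come from the list above; the base URL, the `prompt` and `image_url` field names, and the `build_request` helper are assumptions for illustration, so check your quick-start snippets for the real request shape.

```python
import json
import urllib.request

# Assumption: placeholder base URL. The real one is delivered with your order.
BASE_URL = "https://api.example.com"


def build_request(api_key: str, path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated JSON POST request."""
    return urllib.request.Request(
        url=BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical body fields ("prompt", "image_url"); the real names may differ.
image_req = build_request(
    "YOUR_API_KEY", "/v1/text-to-image", {"prompt": "a lighthouse at dusk"}
)
video_req = build_request(
    "YOUR_API_KEY", "/v1/image-to-video", {"image_url": "https://example.com/still.png"}
)

# To send for real: urllib.request.urlopen(image_req), then JSON-decode the body.
```

Nothing is sent until you pass a request to `urllib.request.urlopen`, so you can inspect the headers and body first.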
Plans
All tiers share the same endpoints, the same instant key, and both text-to-image and image-to-video. Tiers differ by render volume, concurrency, and support depth — not by feature lockout.
| Tier | Best for | Included |
|---|---|---|
| Starter | Prototypes, side projects, proof-of-concept demos | Text-to-image + image-to-video access, single API key, baseline render volume, standard resolution outputs, modest concurrency, community/email docs support |
| Growth | Live apps and SaaS features in production | Everything in Starter, higher monthly render allotment, increased concurrent requests, higher-resolution image output and longer video clips, priority queue placement, faster email support |
| Scale | High-traffic platforms, agencies, white-label media pipelines | Everything in Growth, top-tier render volume, maximum concurrency, longest clips and largest dimensions, multiple keys for environment separation, priority support and onboarding help |
Prices, tokens, and amounts are intentionally not listed here. Message me with your expected monthly render volume and I'll point you to the right tier.
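Because tiers differ in how many requests you can have in flight at once, it can help to cap concurrency on your side. The sketch below is one way to do that in Python with a thread pool. The `render_all` helper and the `max_concurrent` value are hypothetical, and the actual limit for your tier is not stated in this listing.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def render_all(
    render: Callable[[str], dict],
    prompts: Iterable[str],
    max_concurrent: int,
) -> List[dict]:
    """Call render(prompt) for every prompt, with at most max_concurrent in flight.

    Results come back in the same order as the prompts.
    """
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return list(pool.map(render, prompts))


def fake_render(prompt: str) -> dict:
    """Stand-in for a real API call so the sketch runs offline."""
    return {"prompt": prompt, "status": "completed"}


# Assumption: a concurrency limit of 2, purely for illustration.
results = render_all(fake_render, ["sunrise", "forest", "harbor"], max_concurrent=2)
```

Swap `fake_render` for a function that performs the real HTTP call, and set `max_concurrent` to whatever your tier allows.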
How it works
- Order the tier that matches your expected volume. Checkout is self-serve.
- Receive your API key and base URL instantly in your order delivery — no provisioning delay.
- Drop the key into an `Authorization: Bearer <key>` header and grab the matching cURL/JS/Python snippet from the quick-start.
- Call `text-to-image` with a prompt (plus size, seed, or variation count) or `image-to-video` with a source image and motion settings.
- Read the result inline, or poll the job ID for longer renders until status is `completed`.
- Pull the returned URL (or base64) into your UI, CDN, or storage, and repeat at whatever scale your tier allows.
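The polling step above could be sketched in Python roughly as follows. Only the `completed` status and the `GET /v1/jobs/{id}` route come from this listing; the `failed` status, the response fields, and the `wait_for_job` helper are assumptions. The HTTP call is passed in as a function so the loop can be tried offline with a stub.

```python
import time
from typing import Callable


def wait_for_job(
    fetch_status: Callable[[str], dict],
    job_id: str,
    interval: float = 2.0,
    timeout: float = 300.0,
) -> dict:
    """Poll until the job reports 'completed', fails, or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_status(job_id)  # in real use: GET /v1/jobs/{id}
        status = job.get("status")
        if status == "completed":
            return job
        if status == "failed":  # assumption: the name of the failure state
            raise RuntimeError(f"job {job_id} failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} not completed after {timeout}s")
        time.sleep(interval)


# Offline stub standing in for the API: one pending response, then a finished one.
_responses = iter([
    {"id": "job_123", "status": "processing"},
    {"id": "job_123", "status": "completed", "url": "https://cdn.example.com/clip.mp4"},
])
result = wait_for_job(lambda job_id: next(_responses), "job_123", interval=0.0)
```

In production you would replace the lambda with a function that issues the authenticated GET and returns the decoded JSON.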
Why choose this
- Zero infra to babysit. No GPU rental, no model downloads, no autoscaling group to tune at 3am. The endpoint is the product.
- Truly instant access. The key is delivered the moment you order, so you can be rendering before your coffee cools.
- One key, two modalities. Text-to-image and image-to-video live behind a single credential, so a single integration covers both still and motion needs.
- Honest metering. You pay per successful render; failed calls don't bill. Costs track your real usage instead of reserved capacity.
- Integration-agnostic. Plain REST means it slots into existing backends, serverless functions, automation tools, and no-code workflows alike — no proprietary SDK required.
- Built to scale with you. Start on a prototype tier and move up to high concurrency and bigger outputs without re-architecting your integration — the contract stays the same.
Who it's for / use cases
- SaaS product teams adding a "generate an image" or "animate this" button without hiring an ML team.
- Marketing and ad-tech tools producing on-brand creative variations at volume from text prompts.
- E-commerce platforms turning flat product photos into short motion clips for listings, ads, and social.
- Social and content apps offering users avatar art, scene generation, or image-to-video effects in-app.
- Agencies and freelancers white-labeling a media pipeline into client deliverables without owning GPUs.
- Automation builders wiring text-to-image and image-to-video steps into Make, n8n, or Zapier-style flows.
- Game and prototyping studios spinning up concept art and animated mockups on demand.
FAQ
Q: How fast do I get access after ordering? Immediately. Your API key and base URL are issued the moment your order completes — there's no manual approval or waitlist, so you can make your first call right away.
Q: Do I need to install an SDK? No. Everything is plain REST over HTTPS with JSON. Any language or tool that can send an HTTP request works, and copy-paste cURL/JS/Python snippets are included to get you started.
Q: How am I billed? Per render. Each successful image or video generation counts as usage; requests that fail aren't charged. Your tier sets your volume and concurrency.
Q: What's the difference between the endpoints? text-to-image turns a written prompt into a generated image. image-to-video takes an existing image and animates it into a short motion clip. Both are available on every tier under the same key.
Q: Can I get reproducible images? Yes. The text-to-image endpoint accepts a seed parameter, so passing the same prompt and seed lets you regenerate consistent results, and you can request multiple variations per call.
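As a rough illustration of the seed workflow, the Python sketch below builds a text-to-image request body and shows that the same prompt and seed produce an identical request. All field names (`seed`, `width`, `height`, `variations`, `negative_prompt`) and the 1024 defaults are assumptions rather than the documented schema. Whether two identical requests yield matching images is up to the service, which this snippet does not test.

```python
import json
from typing import Optional


def text_to_image_payload(
    prompt: str,
    *,
    seed: Optional[int] = None,
    width: int = 1024,   # assumed default, not from the listing
    height: int = 1024,  # assumed default, not from the listing
    variations: int = 1,
    negative_prompt: Optional[str] = None,
) -> dict:
    """Assemble a request body, omitting optional fields that were not set."""
    payload = {
        "prompt": prompt,
        "width": width,
        "height": height,
        "variations": variations,
    }
    if seed is not None:
        payload["seed"] = seed
    if negative_prompt:
        payload["negative_prompt"] = negative_prompt
    return payload


first = text_to_image_payload("red fox in snow", seed=42, negative_prompt="text, watermark")
second = text_to_image_payload("red fox in snow", seed=42, negative_prompt="text, watermark")

# Same prompt + same seed -> byte-identical request body.
same_request = json.dumps(first, sort_keys=True) == json.dumps(second, sort_keys=True)
```

Storing the seed alongside each saved image makes it easy to regenerate or tweak a result later.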
Q: How do I handle longer renders without my request timing out? Use the async flow: longer jobs return a job ID, and you poll GET /v1/jobs/{id} until the status is completed. This keeps your own request threads from blocking on a GPU.
Q: What format do results come back in? Images and videos are returned as hosted URLs by default, or as base64 if you'd rather keep the bytes in your own infrastructure. Videos come back as standard MP4 you can host or stream anywhere.
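Handling both output styles might look like the Python sketch below. The `url` and `b64` response keys and the `save_result` helper are hypothetical; the listing only says that results arrive as hosted URLs or base64.

```python
import base64
import tempfile
from pathlib import Path


def save_result(result: dict, dest: Path) -> str:
    """Return the hosted URL, or decode base64 bytes to dest and return that path."""
    if "url" in result:  # assumed key for hosted output
        return result["url"]
    if "b64" in result:  # assumed key for inline base64 output
        dest.write_bytes(base64.b64decode(result["b64"]))
        return str(dest)
    raise ValueError("result has neither 'url' nor 'b64'")


# Hosted output: nothing to write, just hand the URL to your UI or CDN.
hosted = save_result({"url": "https://cdn.example.com/render.png"}, Path("unused.png"))

# Base64 output: bytes stay on your side. Fake payload used here for the demo.
with tempfile.TemporaryDirectory() as tmp:
    encoded = base64.b64encode(b"fake-image-bytes").decode("ascii")
    local_path = save_result({"b64": encoded}, Path(tmp) / "render.png")
    saved_bytes = Path(local_path).read_bytes()
```

The URL path suits direct embedding, while the base64 path suits pipelines that must store media in their own buckets.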
Q: Can I upgrade later without rewriting my integration? Yes. All tiers share the same endpoints and request format — moving from Starter to Growth or Scale raises your volume, concurrency, resolution, and clip length while your existing code keeps working unchanged.
Reviews ★ 5 (1)
- @dan360 ★★★★★ 5
Plugged the API into our app and the text-to-image endpoint returned clean results fast, and the image-to-video conversion worked exactly as described. Docs were clear and integration took less time than I expected.