Originally published at ffmpeg-micro.com
You need to join two or more video files into one. Maybe you're stitching together clips from a multi-camera shoot, assembling a highlight reel, or building an automated video pipeline. FFmpeg can do it, but the syntax trips up even experienced developers.
FFmpeg has three different ways to concatenate videos: the concat demuxer, the concat filter, and the concat protocol. Each one works differently, and picking the wrong one will give you corrupted output, missing audio, or a file that won't play. This guide covers all three methods, shows you when to use each, and includes working API examples so you can skip the CLI entirely.
The Concat Demuxer (Same Codec, Fastest Method)
The concat demuxer is the fastest way to merge videos because it doesn't re-encode. It copies the streams directly, which means zero quality loss and near-instant processing. The catch: all your input files need the same stream layout and the same codec, resolution, and frame rate (and, in practice, the same pixel format, time base, and audio parameters).
Create a text file listing your inputs:
# concat-list.txt
file 'clip1.mp4'
file 'clip2.mp4'
file 'clip3.mp4'
Then run:
ffmpeg -f concat -safe 0 -i concat-list.txt -c copy output.mp4
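The -c copy flag tells FFmpeg to copy streams without re-encoding. The -safe 0 flag turns off the demuxer's path safety check, which otherwise rejects absolute paths and other "unsafe" file names in the list. Relative paths in the list are resolved against the directory that contains the list file, not your current working directory. This approach processes a 10-minute video in seconds because it's just copying data.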
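Writing the list file by hand gets tedious once you have more than a few clips. Here is a minimal sketch of a helper that generates it, assuming all clips are .mp4 files in one directory and that alphabetical order is the order you want. The function name make_concat_list and the ./clips path are made up for illustration.

```shell
#!/bin/sh
# make_concat_list DIR LIST_FILE (hypothetical helper)
# Writes one "file '<absolute path>'" line per .mp4 in DIR, in the
# shell's glob (alphabetical) order.
make_concat_list() {
  src_dir=$(cd "$1" && pwd) || return 1
  list_file=$2
  : > "$list_file"                      # start with an empty list
  for clip in "$src_dir"/*.mp4; do
    [ -e "$clip" ] || continue          # the glob matched nothing
    # Inside single quotes, a literal ' has to be written as '\''
    escaped=$(printf '%s' "$clip" | sed "s/'/'\\\\''/g")
    printf "file '%s'\n" "$escaped" >> "$list_file"
  done
}

# Usage:
#   make_concat_list ./clips concat-list.txt
#   ffmpeg -f concat -safe 0 -i concat-list.txt -c copy output.mp4
```

The helper writes absolute paths so the list works no matter where it is stored, which is also why the usage line keeps -safe 0.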
When the concat demuxer breaks
If your clips have different codecs (one is H.264, another is H.265), different resolutions, or different audio sample rates, the demuxer will either fail or produce a broken file. You won't always get a clear error message. Sometimes the output just freezes at the transition point between clips.
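Since the failure is often silent, it helps to compare the clips before you try a stream copy. The sketch below uses ffprobe to read a few properties of the first video stream and checks that they are identical across files. The function names are hypothetical, and it only looks at video, so mismatched audio would still need a similar check with -select_streams a:0.

```shell
#!/bin/sh
# video_signature FILE (hypothetical helper)
# Prints codec, frame size, frame rate, and pixel format of the first
# video stream as a single CSV line.
video_signature() {
  ffprobe -v error -select_streams v:0 \
    -show_entries stream=codec_name,width,height,r_frame_rate,pix_fmt \
    -of csv=p=0 "$1"
}

# clips_match FILE... (hypothetical helper)
# Succeeds only if every file has the same signature as the first one.
clips_match() {
  first=$(video_signature "$1") || return 1
  for clip in "$@"; do
    if [ "$(video_signature "$clip")" != "$first" ]; then
      echo "mismatch: $clip" >&2
      return 1
    fi
  done
}

# Usage:
#   clips_match clip1.mp4 clip2.mp4 clip3.mp4 && echo "safe to use -c copy"
```

If the check fails, fall back to a method that re-encodes rather than hoping the copy works.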
The Concat Filter (Different Codecs or Resolutions)
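When your input files don't match, you need the concat filter. It decodes and re-encodes everything, which takes longer but lets you join clips with different codecs and containers. FFmpeg picks a common pixel format and audio format for you, but the concat filter still expects every segment to have the same frame size, so clips at different resolutions must be scaled to a common size first (for example with the scale filter).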
Here's how to merge two videos with different codecs:
ffmpeg \
-i clip1.mp4 \
-i clip2.mov \
-filter_complex "[0:v][0:a][1:v][1:a]concat=n=2:v=1:a=1[v][a]" \
-map "[v]" -map "[a]" \
-c:v libx264 -crf 23 -preset medium \
-c:a aac -b:a 192k \
output.mp4
The n=2 tells FFmpeg you have 2 segments. v=1 means one video stream per segment, a=1 means one audio stream. If your clips don't have audio, drop the audio parts:
ffmpeg \
-i clip1.mp4 \
-i clip2.mp4 \
-filter_complex "[0:v][1:v]concat=n=2:v=1[v]" \
-map "[v]" \
-c:v libx264 -crf 23 -preset medium \
-an \
output.mp4
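The concat filter wants every segment at the same frame size, so clips with different resolutions need a scaling step in front of it. Here is one way to sketch that for two clips: scale each to fit a target size, pad the remainder with black bars, reset the sample aspect ratio, and force a common frame rate. The function name and the 1280x720 at 30 fps target are illustrative assumptions, not a recommendation.

```shell
#!/bin/sh
# build_normalized_concat WIDTH HEIGHT FPS (hypothetical helper)
# Prints a filter graph that letterboxes two inputs to the same size and
# frame rate, then concatenates them with their audio.
build_normalized_concat() {
  w=$1
  h=$2
  fps=$3
  # Fit inside WxH, pad to exactly WxH, square pixels, fixed frame rate.
  norm="scale=${w}:${h}:force_original_aspect_ratio=decrease"
  norm="${norm},pad=${w}:${h}:(ow-iw)/2:(oh-ih)/2,setsar=1,fps=${fps}"
  printf '[0:v]%s[v0];[1:v]%s[v1];' "$norm" "$norm"
  printf '[v0][0:a][v1][1:a]concat=n=2:v=1:a=1[v][a]'
}

# Usage:
#   ffmpeg -i clip1.mp4 -i clip2.mov \
#     -filter_complex "$(build_normalized_concat 1280 720 30)" \
#     -map "[v]" -map "[a]" \
#     -c:v libx264 -crf 23 -preset medium -c:a aac -b:a 192k output.mp4
```

Padding keeps the original aspect ratio of each clip. If you would rather crop or stretch, swap the scale and pad steps for whatever fits your footage.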
Merging three or more videos
Scaling beyond two clips follows the same pattern. For three clips with audio:
ffmpeg \
-i clip1.mp4 \
-i clip2.mov \
-i clip3.avi \
-filter_complex "[0:v][0:a][1:v][1:a][2:v][2:a]concat=n=3:v=1:a=1[v][a]" \
-map "[v]" -map "[a]" \
-c:v libx264 -crf 23 \
-c:a aac -b:a 192k \
output.mp4
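Typing out the stream labels by hand is where mistakes creep in as the clip count grows. The pattern is regular enough to generate. This is a small sketch, assuming every input has exactly one video and one audio stream you care about; build_concat_filter is a made-up name.

```shell
#!/bin/sh
# build_concat_filter N (hypothetical helper)
# Prints the concat filter graph for N inputs, each contributing its
# video stream followed by its audio stream.
build_concat_filter() {
  n=$1
  labels=""
  i=0
  while [ "$i" -lt "$n" ]; do
    labels="${labels}[${i}:v][${i}:a]"
    i=$((i + 1))
  done
  printf '%sconcat=n=%d:v=1:a=1[v][a]' "$labels" "$n"
}

# Usage:
#   ffmpeg -i clip1.mp4 -i clip2.mov -i clip3.avi \
#     -filter_complex "$(build_concat_filter 3)" \
#     -map "[v]" -map "[a]" -c:v libx264 -crf 23 -c:a aac -b:a 192k output.mp4
```

The order of the labels matters: the filter reads them segment by segment, video first and then audio, which is exactly the order the loop emits.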
The Concat Protocol (MPEG-TS Only)
The concat protocol is the simplest syntax but the most limited. It joins files at the byte level, so it only works with formats that tolerate that, which in practice means MPEG-TS:
ffmpeg -i "concat:segment1.ts|segment2.ts|segment3.ts" -c copy output.ts
You'll rarely use this unless you're working with HLS segments or transport streams. For most video formats (MP4, MOV, WebM), use the demuxer or filter instead.
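If you do end up with a folder of transport stream segments, the pipe-separated input string can be assembled rather than typed. A minimal sketch, assuming segment names contain no | characters and that the shell's glob order is the playback order; concat_url is a hypothetical helper name.

```shell
#!/bin/sh
# concat_url SEGMENT... (hypothetical helper)
# Joins the given segment paths into a "concat:a|b|c" input string.
concat_url() {
  url="concat:$1"
  shift
  for seg in "$@"; do
    url="${url}|${seg}"
  done
  printf '%s' "$url"
}

# Usage:
#   ffmpeg -i "$(concat_url segment*.ts)" -c copy output.ts
```

Glob order is alphabetical, so segment10.ts sorts before segment2.ts unless the names are zero-padded.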
Which Method Should You Use?
| Scenario | Method | Why |
|---|---|---|
| Same codec, resolution, frame rate | Concat demuxer | No re-encoding, fastest |
| Different codecs or resolutions | Concat filter | Re-encodes; scale to a common resolution first |
| Need transitions (fade, slide) | xfade filter | Concat only produces hard cuts |
| MPEG-TS segments | Concat protocol | Simplest for transport streams |
| Automated pipeline | FFmpeg Micro API | No CLI, handles infrastructure |
Concatenating Videos with the FFmpeg Micro API
If you're building a pipeline that merges videos programmatically, you don't need to manage FFmpeg on a server. FFmpeg Micro is a cloud API that lets you add video processing to any app with a single HTTP call. No FFmpeg installation, no server management.
The API accepts multiple inputs natively. Pass all your video URLs in the inputs array and use filters for the concat filter:
curl -X POST https://api.ffmpeg-micro.com/v1/transcodes \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"inputs": [
{"url": "https://storage.example.com/clip1.mp4"},
{"url": "https://storage.example.com/clip2.mp4"}
],
"outputFormat": "mp4",
"filters": [
{"filter": "[0:v][0:a][1:v][1:a]concat=n=2:v=1:a=1[v][a]"}
],
"options": [
{"option": "-map", "argument": "[v]"},
{"option": "-map", "argument": "[a]"}
]
}'
For three clips, add another input and adjust the filter:
curl -X POST https://api.ffmpeg-micro.com/v1/transcodes \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"inputs": [
{"url": "https://storage.example.com/clip1.mp4"},
{"url": "https://storage.example.com/clip2.mp4"},
{"url": "https://storage.example.com/clip3.mov"}
],
"outputFormat": "mp4",
"filters": [
{"filter": "[0:v][0:a][1:v][1:a][2:v][2:a]concat=n=3:v=1:a=1[v][a]"}
],
"options": [
{"option": "-map", "argument": "[v]"},
{"option": "-map", "argument": "[a]"}
]
}'
The API supports up to 10 input files per request. Check job status with GET /v1/transcodes/{id} and download the result with GET /v1/transcodes/{id}/download.
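The two follow-up endpoints can be wrapped in small shell functions. This is only a sketch built from the paths mentioned above: the FFMPEG_MICRO_API_KEY variable and the function names are hypothetical, the shape of the status response is not documented here so it is printed as-is, and the download call assumes the endpoint either returns the file or redirects to it.

```shell
#!/bin/sh
API_BASE="https://api.ffmpeg-micro.com/v1"

# job_status JOB_ID (hypothetical helper)
# Prints the raw status response for a transcode job.
job_status() {
  curl -sS -H "Authorization: Bearer $FFMPEG_MICRO_API_KEY" \
    "$API_BASE/transcodes/$1"
}

# download_result JOB_ID OUTPUT_FILE (hypothetical helper)
# Saves the finished output to disk. -f fails on HTTP errors and -L
# follows redirects (assumption: the endpoint may redirect to the file).
download_result() {
  curl -fsSL -H "Authorization: Bearer $FFMPEG_MICRO_API_KEY" \
    -o "$2" "$API_BASE/transcodes/$1/download"
}

# Usage:
#   export FFMPEG_MICRO_API_KEY=YOUR_API_KEY
#   job_status JOB_ID
#   download_result JOB_ID merged.mp4
```

Check the API documentation for the actual response fields before scripting a polling loop around job_status.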
Common Pitfalls When Concatenating Videos
Resolution mismatch can fail silently. If you use the concat demuxer with files at different resolutions, the output might play fine in VLC but break on mobile or in browsers. Always check each input with ffprobe -v error -select_streams v:0 -show_entries stream=width,height -of csv=p=0 input.mp4 before merging.
Audio channel mismatch causes sync drift. If one clip is stereo and another is mono, FFmpeg won't always warn you, and the output audio can go out of sync after the first transition. Fix it by normalizing every clip to the same channel count before concatenating (this re-encodes the audio but copies the video):
ffmpeg -i input.mp4 -ac 2 -c:v copy normalized.mp4
Missing audio stream breaks the concat filter. If one clip has audio and another doesn't, the concat filter fails because it expects the same number of streams in every segment. Either add a silent audio track to the video-only clip or use v=1:a=0 and handle audio separately.
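Adding the silent track is a one-off command per clip. The sketch below uses FFmpeg's anullsrc source to generate silence and -shortest to stop when the video ends. The stereo layout and 48 kHz sample rate are assumptions; match them to your other clips. add_silent_audio is a hypothetical name.

```shell
#!/bin/sh
# add_silent_audio INPUT OUTPUT (hypothetical helper)
# Copies the video stream from INPUT and pairs it with generated silence
# so the clip has an audio stream for the concat filter to consume.
add_silent_audio() {
  ffmpeg -i "$1" \
    -f lavfi -i anullsrc=channel_layout=stereo:sample_rate=48000 \
    -map 0:v -map 1:a \
    -c:v copy -c:a aac \
    -shortest "$2"
}

# Usage:
#   add_silent_audio silent-clip.mp4 silent-clip-with-audio.mp4
```

anullsrc never ends on its own, so without -shortest the command would keep writing silence indefinitely.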
Container format matters. MP4 puts the moov atom at the end by default. If your pipeline streams the concatenated output, add -movflags +faststart to move the moov atom to the beginning so playback can start before the full download completes.
FAQ
Can I merge MP4 and MOV files with FFmpeg?
Yes. Use the concat filter (not the demuxer). The concat filter decodes both inputs and re-encodes them into one output, so different containers and codecs work fine, provided the clips share a resolution or you scale them to a common one first. FFmpeg Micro handles this automatically when you pass multiple URLs in the inputs array with a concat filter.
What's the fastest way to join videos without re-encoding?
The concat demuxer with -c copy is the fastest method. It copies streams directly with zero quality loss. But all inputs must share the same codec, resolution, and frame rate. If they don't match, you'll need the concat filter, which re-encodes.
How do I add a fade transition between concatenated clips?
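Use the xfade filter instead of concat. For a 1-second crossfade between two 10-second clips: [0:v][1:v]xfade=transition=fade:duration=1:offset=9[v]. The offset is the first clip's duration minus the fade duration. xfade only handles video, so crossfade the audio separately with the acrossfade filter. FFmpeg supports over 30 transition types including fade, wipeleft, slidedown, and dissolve.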
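The offset arithmetic is easy to get wrong, so here is a sketch that computes it and builds a matching audio crossfade in the same graph. It assumes you already know the first clip's duration in seconds and that both clips share a resolution and frame rate, which xfade needs. xfade_graph is a hypothetical helper name.

```shell
#!/bin/sh
# xfade_graph FIRST_CLIP_SECONDS FADE_SECONDS (hypothetical helper)
# Prints a filter graph that crossfades video with xfade and audio with
# acrossfade. The video offset is the first clip's length minus the fade.
xfade_graph() {
  offset=$(awk -v len="$1" -v fade="$2" 'BEGIN { printf "%.3f", len - fade }')
  printf '[0:v][1:v]xfade=transition=fade:duration=%s:offset=%s[v];' "$2" "$offset"
  printf '[0:a][1:a]acrossfade=d=%s[a]' "$2"
}

# Usage:
#   ffmpeg -i clip1.mp4 -i clip2.mp4 \
#     -filter_complex "$(xfade_graph 10 1)" \
#     -map "[v]" -map "[a]" -c:v libx264 -crf 23 -c:a aac output.mp4
```

To read the duration instead of hard-coding it, ffprobe -v error -show_entries format=duration -of csv=p=0 clip1.mp4 prints it in seconds.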
Does FFmpeg concat work with audio-only files?
Yes. The concat demuxer and filter both work with audio files like MP3, AAC, and WAV. For the filter, use a=1:v=0 to specify audio-only concatenation. The demuxer approach with -c copy works if all audio files share the same codec and sample rate.
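For the filter route, an audio-only merge looks like this. The sketch takes the output name first and any number of inputs after it, and re-encodes, so mixed formats such as MP3 and WAV are fine. concat_audio is a made-up name, and the output codec is whatever FFmpeg picks by default for the output extension.

```shell
#!/bin/sh
# concat_audio OUTPUT INPUT... (hypothetical helper)
# Joins the first audio stream of each input using the concat filter.
concat_audio() {
  out=$1
  shift
  n=$#
  labels=""
  i=0
  for f in "$@"; do
    labels="${labels}[${i}:a]"
    set -- "$@" -i "$f"     # append "-i FILE" after the original names
    i=$((i + 1))
  done
  shift "$n"                # drop the original names, keep the -i pairs
  ffmpeg "$@" \
    -filter_complex "${labels}concat=n=${n}:v=0:a=1[a]" \
    -map "[a]" "$out"
}

# Usage:
#   concat_audio episode.m4a intro.mp3 interview.wav outro.mp3
```

If every file already shares a codec and sample rate, the demuxer with -c copy is faster and avoids a generation of lossy re-encoding.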
How many videos can I concatenate at once?
FFmpeg itself has no hard limit. The concat demuxer handles hundreds of files via the file list. The concat filter gets complex beyond 5-10 inputs because you need to map every stream. FFmpeg Micro's API supports up to 10 inputs per request, which covers most production workflows.
For a deeper dive on FFmpeg filters and encoding fundamentals, check out the Learn FFmpeg course.
If you want to concatenate videos programmatically without managing FFmpeg servers, try FFmpeg Micro for free. Send a JSON request, get a merged video back.
This article was originally published by DEV Community and written by Javid Jamae.