File Carving
File carving extracts files by recognizing them, not by reading metadata. Every JPEG starts with the bytes FF D8 FF; every PDF starts with %PDF. Carvers scan disks looking for these signatures, extract whatever follows, and reconstruct files even when the file system is destroyed. The trade-off: file content yes, file names and folders no.
File carving is a data recovery technique that extracts files by scanning storage media for known file format signatures (the binary patterns at the start and end of files) rather than reading the file system’s metadata. It works when file system metadata is damaged, missing, or unrecognized; it works after formatting; it works on RAW partitions. The trade-off is significant: file carving generally recovers file content but loses file names, folder structure, original timestamps, and any data not stored within the file’s binary format. PhotoRec is the gold-standard free tool, supporting over 480 file format signatures.
How File Carving Works
File carving exploits a property that almost every file format shares: the file’s content begins with a recognizable binary pattern. The first few bytes of a JPEG, PDF, ZIP, MP4, DOCX, or PNG file follow predictable formats specified by the file format’s standard. These patterns are called magic numbers, file signatures, or just headers. When the file system is destroyed but the file’s content still sits on disk, those signatures are still there for any program that knows where to look.1
The key insight is what carving deliberately ignores. Traditional file system recovery reads the file system’s metadata (the NTFS MFT, the FAT directory table, the APFS Object Map, the ext4 inode table) to find files. File carving skips the metadata entirely and reads the raw bytes of the disk. When the metadata is intact, file system recovery is faster and preserves more information. When the metadata is destroyed, carving is the only option that still works.
A simple JPEG example
Consider a deleted JPEG photo on a hard drive. The file system has marked the photo’s clusters as free, possibly written new files over some of them, possibly damaged the file system structure entirely. The original photo data may or may not still be intact in those clusters. A file carver scans the disk byte-by-byte (or block-by-block) looking for the JPEG signature FF D8 FF at the start of any cluster. When it finds that pattern, it knows a JPEG starts there. It reads forward, parsing the JPEG’s internal structure to determine where the file ends, and writes the result to a recovery output folder. The carver doesn’t know what the photo was originally called or which folder it was in; it just knows it’s a JPEG and it’s recoverable.2
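The cluster-boundary scan described above can be sketched in a few lines of Python. This is a minimal illustration, not any particular tool's implementation: the function name `find_jpeg_starts`, the 4096-byte cluster size, and the in-memory toy "disk" are all assumptions made for the example.

```python
JPEG_SOI = b"\xff\xd8\xff"  # JPEG start-of-image signature


def find_jpeg_starts(image: bytes, cluster_size: int = 4096) -> list[int]:
    """Return offsets of clusters whose first three bytes match the JPEG signature."""
    return [
        offset
        for offset in range(0, len(image), cluster_size)
        if image[offset:offset + 3] == JPEG_SOI
    ]


# Toy "disk": three empty clusters, with a JPEG header at the start of the second.
disk = bytearray(3 * 4096)
disk[4096:4099] = JPEG_SOI
disk[5000:5003] = JPEG_SOI  # mid-cluster match, skipped by the boundary scan

starts = find_jpeg_starts(bytes(disk))
print(starts)  # → [4096]
```

Checking only cluster starts is what keeps the scan fast and filters out signatures that merely happen to occur inside other data; the cost is that a file not aligned to the assumed cluster size is missed.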
Why this works after a format
Quick formatting (the default in Windows and most operating systems) only rewrites the file system’s metadata structures (the boot sector, MFT, allocation tables); it doesn’t touch the file content clusters. The day after a quick format, virtually all the original file content is still on disk in the same physical clusters where it was stored before. The file system thinks the disk is empty and starts allocating those clusters to new files, but until that allocation actually happens, the original data is recoverable. File carving works as well as it would have before the format, because it doesn’t depend on the file system structures the format wiped.
The recovery industry context
File carving has been a standard recovery technique since the early 2000s, when forensic investigators needed reliable ways to extract evidence from damaged drives. The US Air Force Office of Special Investigations released Foremost as one of the first widely used file carvers, and Christophe Grenier’s PhotoRec (released 2002) became the gold-standard free tool with broad file format support. Commercial recovery tools (R-Studio, EaseUS Data Recovery, Disk Drill, Recoverit) all include file carving as a fallback “deep scan” mode for cases where normal file system parsing returns no results.3
File Signatures and the Carving Algorithm
The technical core of file carving is the file signature database and the algorithm that uses it. The implementation details vary across tools, but the underlying mechanics are universal.4
Common file signatures
Every common file format has a known starting pattern. A few examples:
| Format | Header (hex) | Header (ASCII) | Footer |
|---|---|---|---|
| JPEG | FF D8 FF | (non-printable) | FF D9 |
| PNG | 89 50 4E 47 | .PNG | 49 45 4E 44 (IEND) |
| PDF | 25 50 44 46 | %PDF | %%EOF |
| ZIP / DOCX / XLSX | 50 4B 03 04 | PK.. | Central directory record |
| MP4 / MOV | ?? ?? ?? ?? 66 74 79 70 | ....ftyp | (size in header) |
| GIF | 47 49 46 38 | GIF8 | 3B |
| BMP | 42 4D | BM | (size in header) |
| RAR | 52 61 72 21 1A 07 | Rar!.. | End-of-archive marker |
| MP3 | FF FB or 49 44 33 | (or ID3) | (no fixed footer) |
| DOC (legacy) | D0 CF 11 E0 A1 B1 1A E1 | (non-printable) | (no fixed footer) |
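A signature table like the one above maps naturally onto a small lookup structure. The sketch below transcribes the header column into Python; the names `SIGNATURES` and `identify` are illustrative, and real carvers keep far larger databases plus extra validation.

```python
# (offset of the magic bytes, magic bytes, label), transcribed from the table above.
# Two-byte signatures come last because short patterns match by accident more often.
SIGNATURES = [
    (0, bytes.fromhex("D0CF11E0A1B11AE1"), "DOC (legacy)"),
    (0, bytes.fromhex("526172211A07"), "RAR"),
    (0, bytes.fromhex("89504E47"), "PNG"),
    (0, bytes.fromhex("25504446"), "PDF"),
    (0, bytes.fromhex("504B0304"), "ZIP / DOCX / XLSX"),
    (0, bytes.fromhex("47494638"), "GIF"),
    (4, bytes.fromhex("66747970"), "MP4 / MOV"),  # "ftyp" follows a 4-byte size
    (0, bytes.fromhex("FFD8FF"), "JPEG"),
    (0, bytes.fromhex("494433"), "MP3 (ID3 tag)"),
    (0, bytes.fromhex("FFFB"), "MP3 (frame sync)"),
    (0, bytes.fromhex("424D"), "BMP"),
]


def identify(block: bytes):
    """Return the label of the first signature matching the block, or None."""
    for offset, magic, label in SIGNATURES:
        if block[offset:offset + len(magic)] == magic:
            return label
    return None


print(identify(b"%PDF-1.7"))               # → PDF
print(identify(bytes(4) + b"ftypisom"))    # → MP4 / MOV
print(identify(b"plain text"))             # → None
```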
Header-footer carving
The simplest carving algorithm: scan the disk for known headers, then scan forward for the matching footer, and treat everything between them as the file. Works well for formats with reliable footer patterns (JPEG ending in FF D9, PDF ending in %%EOF). Fails for formats without a footer (MP3, raw video) or where the footer pattern can appear inside the file content (some text-based formats).5
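As a rough sketch of header-footer carving, the function below finds each header, then takes the first footer that follows within a size cap. The name `carve_header_footer` and the 20 MB cap are assumptions for illustration only.

```python
def carve_header_footer(data: bytes, header: bytes, footer: bytes,
                        max_size: int = 20 * 1024 * 1024):
    """Yield (start, end) byte ranges that begin with header and end with footer."""
    pos = 0
    while True:
        start = data.find(header, pos)
        if start == -1:
            return
        # Look for the footer after the header, but no further than max_size bytes.
        end = data.find(footer, start + len(header), start + max_size)
        if end == -1:
            pos = start + 1  # no footer in range: treat the header as a false hit
            continue
        end += len(footer)
        yield start, end
        pos = end


disk = bytes(10) + b"\xff\xd8\xff\xe0" + b"pixels" + b"\xff\xd9" + bytes(10)
ranges = list(carve_header_footer(disk, b"\xff\xd8\xff", b"\xff\xd9"))
print(ranges)  # → [(10, 22)]
```

Because the first footer match wins, a footer-like byte pair inside the file content cuts the file short, which is exactly the weakness noted above.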
Header plus size carving
Many file formats include a size field in or near the header. BMP files store the total file size as a 4-byte little-endian value in bytes 2-5. MP4 files are built from boxes that each begin with a 4-byte big-endian size, so a carver can hop from box to box and add up the lengths. Carvers that parse these size fields can extract files of exactly the right length without needing a footer match, which avoids the false-positive problem of matching footer-like patterns inside the file content.
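For BMP, the size-field approach can be sketched as follows. The function name `bmp_declared_size` and the sanity limits are assumptions for the example; the only format fact relied on is the 4-byte little-endian file size at offset 2.

```python
import struct


def bmp_declared_size(data: bytes, start: int):
    """Return the file size a BMP header at `start` declares, or None if implausible."""
    if data[start:start + 2] != b"BM" or start + 6 > len(data):
        return None
    (size,) = struct.unpack_from("<I", data, start + 2)  # bytes 2-5, little-endian
    # Reject sizes smaller than the 14-byte file header or running past the media.
    if size < 14 or start + size > len(data):
        return None
    return size


# Toy 30-byte "BMP" surrounded by unrelated bytes.
bmp = b"BM" + struct.pack("<I", 30) + bytes(24)
disk = bytes(8) + bmp + bytes(8)

size = bmp_declared_size(disk, 8)
carved = disk[8:8 + size]
print(size, carved == bmp)  # → 30 True
```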
Content-based carving
For formats without reliable size or footer information, carvers use content-based heuristics: parse internal structure (JPEG segments, PDF objects), validate consistency, and cut off when consistency breaks. PhotoRec’s “paranoid” mode applies content-based heuristics to attempt fragmented file reassembly; the trade-off is significantly slower scans and more false positives.
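One way to picture structure-based validation is a walk over JPEG marker segments: each segment before the image data starts with 0xFF, a marker byte, and a 2-byte big-endian length that includes itself. The sketch below is deliberately simplified (it ignores padding bytes and the rare length-less markers) and the function name is hypothetical; it is not how any specific carver does it.

```python
import struct


def jpeg_header_end(data: bytes, start: int = 0):
    """Walk segments from SOI to SOS; return the offset where scan data begins,
    or None as soon as the structure stops making sense."""
    if data[start:start + 2] != b"\xff\xd8":
        return None
    pos = start + 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return None  # every segment must begin with 0xFF
        marker = data[pos + 1]
        (length,) = struct.unpack_from(">H", data, pos + 2)
        if length < 2:
            return None  # the length field counts its own two bytes
        pos += 2 + length
        if marker == 0xDA:  # SOS: compressed image data follows
            return pos if pos <= len(data) else None
    return None


# Toy JPEG: SOI, one APP0 segment, an SOS header, scan bytes, EOI.
jpeg = (b"\xff\xd8"
        + b"\xff\xe0" + struct.pack(">H", 16) + bytes(14)
        + b"\xff\xda" + struct.pack(">H", 4) + bytes(2)
        + b"scan" + b"\xff\xd9")

print(jpeg_header_end(jpeg))                           # → 26
print(jpeg_header_end(b"\xff\xd8\xffrandom text here"))  # → None
```

A candidate that begins with the right three bytes but fails this walk is a strong hint that the match was accidental or that the file was cut off by fragmentation.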
The PhotoRec block-by-block algorithm
PhotoRec specifically uses an interesting optimization. It first tries to determine the file system’s cluster size from the boot sector or superblock; if the file system is intact enough to read this, PhotoRec knows that files start at cluster boundaries and skips bytes in between. If the file system is too damaged, PhotoRec reads sector by sector. When it finds a header signature, it identifies the file type and starts saving data; when it finds a new header for any file type, it considers the previous file complete. The algorithm is read-only by design; PhotoRec never writes to the source media.6
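The "a new header closes the previous file" rule can be illustrated with a toy block scanner. This is a sketch of the idea as described above, not PhotoRec's code; the function name, the two-entry signature map, and the 512-byte block size are assumptions.

```python
def carve_until_next_header(image: bytes, block_size: int,
                            signatures: dict[bytes, str]):
    """Return (type, start, end) tuples; each file ends where the next header begins."""
    files = []
    current = None  # (type, start offset) of the file currently being saved
    for offset in range(0, len(image), block_size):
        block = image[offset:offset + block_size]
        kind = next((name for magic, name in signatures.items()
                     if block.startswith(magic)), None)
        if kind is not None:
            if current is not None:
                files.append((current[0], current[1], offset))  # close previous file
            current = (kind, offset)
    if current is not None:
        files.append((current[0], current[1], len(image)))  # last file runs to the end
    return files


B = 512
image = (bytes(B)                                  # empty block
         + b"\xff\xd8\xff".ljust(B, b"\x01")       # JPEG header block
         + b"\x01" * B                             # JPEG data block
         + b"%PDF".ljust(B, b"\x02")               # PDF header block
         + b"\x02" * B)                            # PDF data block

found = carve_until_next_header(image, B, {b"\xff\xd8\xff": "jpg", b"%PDF": "pdf"})
print(found)  # → [('jpg', 512, 1536), ('pdf', 1536, 2560)]
```

Only reads are performed on `image`; the carved ranges would be written to a separate destination.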
Real PhotoRec usage
PhotoRec is best run against a disk image rather than the original drive (the recommended workflow).
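A PhotoRec run against an image can also be started from a script. The sketch below only assembles the command line; it assumes `photorec` is installed and on the PATH, uses the `/log` and `/d` options from PhotoRec's documented command-line syntax, and the image and output paths are placeholders. PhotoRec still continues in its interactive text menu after launch.

```python
import subprocess


def build_photorec_command(image_path: str, output_dir: str) -> list[str]:
    """Assemble a PhotoRec invocation: /log writes a log file, /d sets the output folder."""
    return ["photorec", "/log", "/d", output_dir, image_path]


def run_photorec(image_path: str, output_dir: str):
    """Launch PhotoRec (assumed to be on PATH); raises if it exits with an error."""
    return subprocess.run(build_photorec_command(image_path, output_dir), check=True)


# Placeholder paths: the output folder should live on a different drive than the source.
command = build_photorec_command("disk.img", "/mnt/recovery/recup_dir")
print(" ".join(command))
```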
The default PhotoRec output names files f00000001.jpg, f00000002.jpg, etc. Original file names are gone; at best, the recovered file embeds identifying details internally (DOCX files carry a title and author in their document-properties XML, which can hint at the original name; JPEG files generally carry no name at all).
File Carving vs File System Recovery
The two recovery techniques are complementary, not competitive. The right tool depends on what the file system looks like.7
| Aspect | File System Recovery | File Carving |
|---|---|---|
| Reads | File system metadata (MFT, FAT, inodes) | Raw disk bytes |
| Recovers file names | Yes (when intact) | No (only embedded names) |
| Recovers folder structure | Yes | No |
| Recovers timestamps | Yes | No (only embedded ones) |
| Works with damaged file system | No | Yes |
| Works after format | Limited | Yes |
| Recovers fragmented files | Yes (FS knows fragments) | Limited |
| False positives | Rare | Common |
| Speed | Faster (reads metadata) | Slower (scans full disk) |
| Output organization | Folders mirror originals | Flat folder by file type |
When file system recovery is the right choice
File system parsing should always be the first attempt because it preserves more useful information:
- Recently deleted files on a working drive. The file system’s catalog still has the deleted entries; tools recover names and folder structure.
- Corrupted partition table with intact file systems inside. TestDisk rebuilds the partition table by scanning for file system signatures.
- RAW external drive after improper disconnect. Often a partition table issue with the file system mostly intact.
- Ransomware or malware corruption that damaged metadata but not file content. Recovery tools that parse partial file system metadata work well.
When file carving is the right choice
Carving is the fallback when file system recovery has failed:
- After running file system recovery and finding no useful results. If TestDisk and R-Studio both report empty or scrambled results, the file system is too damaged to parse and carving is the only option.
- Drives that have been completely reformatted (full format, not quick format). A full format rewrites the file system metadata and may also overwrite file content (on Windows Vista and later it writes zeros across the whole volume); carving recovers only whatever content survived.
- Severely corrupted drives with bad sectors damaging metadata sectors specifically. The data sectors may be intact even when metadata sectors aren’t.
- Unknown or unsupported file systems. Carving doesn’t depend on knowing the file system format, so it works on proprietary systems, old systems, and damaged systems alike.
- When you only need file content, not original organization. For example, recovering vacation photos where you don’t need the original folder structure as long as the photos come back.
The “deep scan” feature in commercial software
Commercial recovery software (Disk Drill, EaseUS Data Recovery, Wondershare Recoverit, R-Studio) typically offers two scan modes: a quick scan that reads file system metadata, and a deep scan that takes much longer. The deep scan is usually file carving under a different name; it’s the tool falling back to signature-based recovery when metadata-based recovery doesn’t find what the user is looking for. The deep scan’s slower speed and larger result count are the carving algorithm’s signature characteristics.8
File Carving Tools
The carving tool landscape ranges from free open-source utilities to expensive forensic suites. The right choice depends on the recovery goal: pure data recovery, forensic investigation with chain-of-custody, or specific file format targeting.9
Free and open-source tools
- PhotoRec (Linux / Windows / Mac, GPL). The gold standard. Recognizes more than 480 file extensions (about 300 file families). Read-only by design; cannot damage source media. Companion to TestDisk; both maintained by Christophe Grenier and the CGSecurity project. Awkward text UI but mature, reliable, and free.
- Foremost (Linux, public domain). The original file carver, developed by the US Air Force Office of Special Investigations. Command-line only. Works on disk images and raw devices. Configuration file lets users customize which file types to carve and add custom signatures.
- Scalpel (Linux, GPL). A fork of Foremost optimized for performance. Reads the same configuration file format as Foremost. Better memory usage and speed for large media; same general capability.
- Bulk Extractor (Linux / Windows, free). Specialized for forensic data extraction (emails, URLs, credit card numbers, Wi-Fi MAC addresses) rather than full file recovery. Useful for forensic investigations alongside traditional carving.
- Magic Rescue (Linux, free). Older tool focused on extensible signature recipes for unusual file formats.
Commercial recovery software with carving
Most consumer-facing recovery software includes file carving as a “deep scan” or “advanced scan” mode:
- EaseUS Data Recovery Wizard. Polished GUI, good preview features, file carving as a deep scan fallback. Free version recovers up to 2 GB.
- Disk Drill. Strong cross-platform support (Windows / Mac), 400+ file signatures in deep scan mode, 500 MB free recovery.
- Wondershare Recoverit. Wide file format support, video preview features, deep scan mode for damaged file systems.
- R-Studio. Professional-grade with stronger fragment reassembly than consumer tools; widely used in professional recovery labs.
- UFS Explorer ($85-$2,500). Strong file carving with file system-aware fragment handling.
Forensic-grade tools
Forensic suites add chain-of-custody documentation, more sophisticated fragment reassembly, and integration with broader investigation workflows:
- EnCase ($3,000+). The dominant commercial forensic suite; widely used in law enforcement and legal investigations. File carving is one of many capabilities.
- X-Ways Forensics ($1,500+). Professional forensic tool with deep file carving capabilities; highly regarded in the digital forensics community for its technical depth.
- FTK (Forensic Toolkit) by Exterro ($3,000+). Another mainstream commercial forensic platform; strong for processing very large data sets.
- Autopsy (free, open-source). Free forensic platform that includes PhotoRec and other carvers; popular as a free alternative to commercial suites for non-critical investigations.
Choosing the right tool
For most consumer recovery scenarios, the right path is:
- First, try file system recovery with EaseUS, Disk Drill, or R-Studio in normal scan mode. If files come back with names and folders, you’re done.
- If normal scan returns nothing useful, try the same tool’s deep scan mode. This activates the tool’s file carving fallback.
- If commercial software’s deep scan fails too, run PhotoRec. PhotoRec has the broadest file signature database and often finds files that other tools miss. It’s free and read-only.
- If you need forensic documentation (chain of custody, audit trail), use Autopsy as the free option or commercial forensic tools for legal contexts.
Limitations of File Carving
File carving is a powerful last-resort technique but has fundamental limitations that users should understand before depending on it.10
Lost metadata
The most consequential limitation. File names, folder structure, creation timestamps, modification timestamps, file owner, and access control information are all stored in the file system metadata that carving deliberately ignores. Carved files come out with auto-generated names like f00000001.jpg; folder organization is gone; timestamps reflect the recovery time, not the original.
Some metadata survives because it’s embedded inside the file format itself. JPEG files include EXIF metadata (camera model, capture date, GPS coordinates if enabled); MP3 files include ID3 tags (artist, album, track); DOCX and PDF files include document properties (author, creation date, title). Carving preserves all of this. But if the original file format doesn’t embed metadata internally (raw text files, plain audio, basic images), the carved version lacks it.
Fragmented files
The fundamental algorithmic limitation. Basic carving assumes files are stored as contiguous blocks of bytes; it reads from a header signature forward until it hits an end signature or maximum size. When the file is fragmented across multiple non-contiguous regions of the disk, the carver gets confused: it reads the first fragment correctly, then continues into whatever data happens to be in the next physical sectors (which is some other file’s content), producing a corrupted output.11
Research on real-world drives shows that 6% of all files are typically fragmented, but the rate is much higher for forensically-interesting types: documents, archives, video files, mailbox files. Advanced carvers attempt fragment reassembly through heuristics (PhotoRec’s paranoid mode, X-Ways’s fragment handling), but reliable fragment reassembly is unsolved in general; success depends on file format, fragmentation pattern, and adjacent file content.
False positives
Random data on a disk can coincidentally match file signature patterns. JPEG signatures (FF D8 FF) sometimes appear inside other files’ content, in compressed data, or in random unallocated bytes. Carvers extract these false positives and write them as recovered files; the output is unusable. Modern carvers use additional consistency checks to reduce false positives (parsing the JPEG headers for sane values, validating ZIP central directory entries), but false positives remain a real issue, especially in deep scan modes.
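The ZIP consistency check mentioned above can be approximated with Python's standard `zipfile` module, which parses the central directory and verifies each member's CRC. This is a sketch of the idea, with a hypothetical function name, rather than what any carver actually ships.

```python
import io
import zipfile


def looks_like_valid_zip(candidate: bytes) -> bool:
    """True if the bytes parse as a ZIP archive and every member passes its CRC check."""
    try:
        with zipfile.ZipFile(io.BytesIO(candidate)) as archive:
            return archive.testzip() is None  # testzip() names the first bad member
    except Exception:
        # Any parsing failure counts as a rejection for triage purposes.
        return False


# A real archive built in memory, and a false positive that only has the PK header.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("note.txt", "hello")
good = buffer.getvalue()
fake = b"PK\x03\x04" + bytes(100)

print(looks_like_valid_zip(good), looks_like_valid_zip(fake))  # → True False
```

The same check applies to DOCX and XLSX candidates, since both are ZIP containers.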
SSDs and TRIM
The TRIM command on SSDs (see the upcoming TRIM Command entry) tells the SSD that deleted clusters are no longer in use, and the controller often zeros out those clusters during background garbage collection. When carving runs against the SSD afterward, the deleted file’s clusters are full of zeros, not the original content. The signatures are gone; carving finds nothing where the deleted file used to be. SSDs without TRIM (older drives, drives with TRIM disabled) are still carvable.
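A quick way to gauge how much of an image has been wiped before spending hours on a carve is to count all-zero blocks. The sketch below assumes a 4096-byte block size and an image small enough to hold in memory; the function name is made up for the example. A high zero fraction is only a hint, since never-written areas of a drive also read back as zeros.

```python
def zero_block_fraction(image: bytes, block_size: int = 4096) -> float:
    """Return the fraction of blocks in the image that contain only zero bytes."""
    blocks = [image[i:i + block_size] for i in range(0, len(image), block_size)]
    if not blocks:
        return 0.0
    zero_blocks = sum(1 for block in blocks if block.count(0) == len(block))
    return zero_blocks / len(blocks)


# Toy image: three wiped blocks and one block that still holds data.
image = bytes(4096) * 3 + b"\x01" * 4096
print(zero_block_fraction(image))  # → 0.75
```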
Encrypted volumes
File carving operates on raw bytes, so it can’t read inside encrypted volumes. BitLocker, FileVault (see APFS), LUKS, and modern SSD hardware encryption all encrypt the file content before storage. Without the encryption key, carving sees only random-looking encrypted bytes and finds no signatures. Decryption must happen first; only then can carving extract files from the decrypted volume.
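The "random-looking bytes" of an encrypted volume can be recognized by their byte entropy, which sits close to the 8 bits per byte maximum. The sketch below computes Shannon entropy for a block; the function name and sample sizes are assumptions, and high entropy alone does not prove encryption because compressed data looks similar.

```python
import math
import os
from collections import Counter


def shannon_entropy(block: bytes) -> float:
    """Return the Shannon entropy of the block in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


zeros = bytes(4096)                 # wiped or unused space
uniform = bytes(range(256)) * 16    # every byte value equally often
random_like = os.urandom(65536)     # stand-in for encrypted content

print(shannon_entropy(zeros))       # → 0.0
print(shannon_entropy(uniform))     # → 8.0
print(shannon_entropy(random_like) > 7.9)
```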
The JPG soup problem
For forensic investigators and recovery customers alike, a common practical issue: carving a 4 TB drive can produce hundreds of thousands of recovered files with auto-generated names, no organization, and a mix of useful files, false positives, and old deleted files the user doesn’t actually want. Sorting through the result manually is enormously time-consuming. The standard approach is to filter by file size (very small files are usually false positives), file type (carve only JPG and PDF if those are what’s needed), and validation (open each file to check it’s not corrupted). Even with filtering, post-recovery organization remains the carving workflow’s weakest point.
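The size and type filters described above are easy to script. The following sketch walks a recovery folder and keeps only wanted extensions above a size threshold; the function name `triage`, the 10 KB cutoff, and the default extension list are illustrative choices, not recommendations from any tool.

```python
from collections import defaultdict
from pathlib import Path


def triage(recovered_dir, wanted=(".jpg", ".pdf"), min_bytes=10 * 1024):
    """Group carved files by extension, dropping unwanted types and tiny files."""
    kept = defaultdict(list)
    for path in sorted(Path(recovered_dir).rglob("*")):
        if not path.is_file():
            continue
        extension = path.suffix.lower()
        # Very small files are usually false positives, so they are skipped.
        if extension in wanted and path.stat().st_size >= min_bytes:
            kept[extension].append(path)
    return dict(kept)
```

Calling `triage("recup_dir.1")` on a PhotoRec output folder (the path is a placeholder) returns a dictionary such as `{".jpg": [...], ".pdf": [...]}`; the survivors still need to be opened or validated individually.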
Strengths and limitations summary
When carving wins
- Works when file system parsing fails completely
- Works after format on most non-overwritten data
- Works on unknown or proprietary file systems
- Read-only operation; cannot damage source media
- Free tools cover most consumer scenarios well
Where carving struggles
- Loses file names, folders, and timestamps
- Cannot reliably handle fragmented files
- Produces false positives requiring manual filtering
- Defeated by TRIM on modern SSDs
- Defeated by encryption without the key
File carving is the recovery technique that saves cases where everything else has failed. When TestDisk can’t rebuild the partition table, when R-Studio finds no MFT records, when the file system is so damaged that no traditional tool recognizes it, file carving still works as long as the file content sectors haven’t been overwritten. The trade-off (no names, no folders) matters less than recovering the data at all. For users facing total file system corruption after a botched format, after ransomware, after physical media damage that hit metadata sectors specifically, carving is often the difference between full recovery and total loss.12
The recovery workflow places carving at a specific point in the sequence. Image first with ddrescue or HDD Raw Copy Tool; carving runs against the image, never the original drive. Try file system recovery first (TestDisk, EaseUS, R-Studio in normal mode) because file names and folder organization are valuable. Try the same tools’ deep scan modes (which are usually carving) when normal scans return nothing. Run PhotoRec as the open-source ground-truth check; if PhotoRec finds files that commercial tools missed, you’ve found a tool capability gap. Manually filter and verify the results; the carving output usually contains noise that needs sorting before it’s useful. The mantra: file system recovery for organized output, file carving for everything else, image-first for both.
The single most important rule for carving success: act before the deleted clusters are overwritten. The carving algorithm only finds what’s still on disk; once new files have been written over the deleted file’s clusters, carving finds nothing useful. On HDDs, and on SSDs with TRIM disabled, the deleted data persists indefinitely until something writes over it (which on lightly used drives may take months). On modern SSDs with TRIM enabled, the data may be zeroed within minutes of deletion. Stop using the affected drive immediately, image it with bad-sector-tolerant tools, and run carving against the image. For drives with both file system damage and bad sectors, the lab-grade tools (PC-3000, DeepSpar Disk Imager) image at the firmware level and feed the result into PhotoRec or a forensic carver for the recovery itself. The combination handles even the hardest cases that consumer-grade tools can’t reach.
File Carving FAQ
What is file carving?
File carving is a data recovery technique that extracts files by scanning storage media for known file format signatures (the binary patterns at the start and end of files) rather than reading the file system’s metadata. It works when file system metadata is damaged, missing, or unrecognized; it works after formatting; it works on RAW partitions. The trade-off: file carving recovers file content but typically loses file names, folder structure, and original timestamps because that information is stored in the file system metadata that carving bypasses. PhotoRec is the gold-standard free file carving tool, recognizing over 480 file format signatures.
When should I use file carving instead of file system recovery?
File carving is the right tool when traditional file system parsing has failed. Specifically: after a deep file system corruption that tools like TestDisk and R-Studio can’t repair; after a complete reformat that overwrote the original file system structures; on a RAW partition where the file system is unrecognizable; on physically damaged media where metadata sectors specifically were destroyed; when you only need file content and don’t need file names or folder organization. Traditional file system recovery (R-Studio, EaseUS, Disk Drill in normal scan mode) is the right choice when those tools work, because they preserve file names and folders. Carving is the fallback when they don’t.
Why does file carving lose file names and folder structure?
File names, folder structure, creation dates, and modification timestamps are all stored in the file system’s metadata structures, separate from the file content itself. NTFS keeps this in the MFT, FAT32 in the directory table, ext4 in inodes, APFS in the Object Map. File carving deliberately ignores all of this and reads the raw bytes of the disk looking for known file format signatures (the binary patterns inside the file content itself). When carving recovers a JPEG, it gets the image data, but it doesn’t know what the photo was originally called or which folder it lived in. Some file formats embed metadata internally (EXIF in photos, ID3 in MP3s, document properties in DOCX) that carving can preserve, but file system-level metadata is permanently lost.
Can file carving recover fragmented files?
Generally no, with limited exceptions. Basic file carvers assume that files are stored as contiguous blocks of bytes on disk. Modern file systems try to store files contiguously, but fragmentation happens regularly, especially on heavily-used drives, when free space becomes scarce. When a file is fragmented, the carver finds the file’s header but the next physical sectors hold a different file’s data; the result is a corrupted recovery. Advanced carvers use heuristics to attempt fragment reassembly: PhotoRec has a ‘paranoid’ bruteforce mode for fragmented JPEGs; commercial tools like X-Ways and EnCase have more sophisticated fragment reassembly. None work reliably for severely fragmented files, especially when the file system metadata that would have described the fragmentation is also gone.
Is PhotoRec good enough, or do I need a commercial tool?
PhotoRec is the gold-standard free file carving tool and works for the vast majority of consumer recovery scenarios. It supports 480+ file format signatures, runs on Windows, Mac, and Linux, and is read-only by design (cannot damage the source media). The text-based interface is awkward but functional. Commercial tools like X-Ways Forensics and EnCase have advantages in forensic contexts: detailed reporting, chain-of-custody documentation, more sophisticated fragment reassembly, and integration with broader investigation workflows. For pure recovery without forensic requirements, PhotoRec produces equivalent or better results than commercial tools at zero cost. Foremost and Scalpel are similar free alternatives with command-line interfaces and configurable signature databases.
Can file carving recover data from SSDs?
It can, but TRIM significantly limits the success rate. When the OS deletes a file on an SSD, the TRIM command tells the drive that the file’s clusters are no longer in use, and the drive’s controller often zeros out those clusters during background garbage collection. When carving runs against the SSD later, the deleted file’s clusters are full of zeros, not the original file content. Carving still works on SSDs for files that haven’t been deleted (where the data is still in active flash cells) and on SSDs with TRIM disabled. For deleted files on TRIM-enabled SSDs, recovery odds drop within minutes of deletion. The encryption layer that most modern SSDs use also prevents carving from reading the underlying flash data without the encryption key.
Related glossary entries
- Data Recovery: the umbrella concept; carving is one of several recovery techniques.
- Disk Image: image first then carve; the universal recovery rule.
- Sector: carving operates at sector boundaries; the foundational unit.
- Bad Sectors: damaged metadata sectors are a common reason carving becomes necessary.
- TRIM Command: the SSD feature that defeats carving by zeroing deleted clusters.
- NTFS: file system whose MFT carving deliberately ignores.
- Best data recovery software: software roundup including tools with file carving capabilities.
Sources
- Apriorit: How to Recover Lost or Deleted Files with Data Carving (accessed April 2026)
- CGSecurity: PhotoRec: Digital Picture and File Recovery
- Hawk Eye Forensic: File Carving in Data Recovery
- Salvation Data: File Carving vs Metadata Recovery
- Infosec Institute: File carving
- photorec.cc: PhotoRec Features
- Belkasoft: Carving and its Implementations in Digital Forensics
- Cyber Forensics Academy: Top File Carving Tools for Data Recovery
- datarecovery.com: Data Carving: Data Recovery Techniques Explained
- photorec.cc FAQ: PhotoRec FAQ
- Laurenson, T.: Performance Analysis of File Carving Tools (2017 research; 6% file fragmentation rate)
- CGSecurity: TestDisk Documentation
About the Authors
Data Recovery Fix earns revenue through affiliate links on some product recommendations. This does not influence our reference content. Glossary entries are written and reviewed independently based on documented research, vendor documentation, independent testing, and recovery-engineer review. If anything on this page looks inaccurate, outdated, or worth revisiting, please reach out at contact@datarecoveryfix.com and we’ll review it promptly.
