In criminal investigations, it is often necessary to search for evidence on electronic devices. Digital traces can be found on computers, tablets, phones, USB drives, but also on post-it notes, photo cameras and smart devices. The job of a forensic investigator is to secure the information on these devices and use it to find evidence of a crime.
Normally, such investigators use special hardware to create 1:1 copies of hard drives, and commercial software to analyze them. But did you know that there are free tools that are nearly as good? This is what I will show you in this blog post.
To follow along, you do not need any knowledge about the structure of hard drives or how file systems work, but you may have already heard of FAT and NTFS, two file systems by Microsoft, which we will analyze here.
Before we start, we need some files to work on. For training, we will use two example drive images. The first image contains a number of different partitions, and the second image is a raw partition image of an NTFS file system:
1-extend-part.zip [160KB; uncompressed 150MB]
8-jpeg-search.zip [2MB; uncompressed 10MB]
Extract the ZIP files and you should get two files: ext-part-test-2.dd and 8-jpeg-search.dd. Files of this type are created with the Linux dd tool, which is used to create 1:1 copies of hard drives and partitions. I will rename the first file to image.dd and the second to jpeg.ntfs, as these names are easier to type later.
Installing the tools
I will be using The Sleuth Kit (TSK), a collection of command line tools created by Brian Carrier (note: he is not only a developer, but also a lead author of many publications in the field). There are Windows binaries available, so you should be able to follow along even though I will be using Linux.
A better option (and you should know by now that I am a big fan) is using Docker. You can use my preconfigured docker image that is hosted on Docker Hub:
# Pull the image
docker pull dbof/kali-forensics
# Run the container
docker run -v "$(pwd)":/home/forensics --rm -t -i dbof/kali-forensics /bin/bash
The above commands create a container instance and mount the current directory into the container, so that you can add the downloaded images to this folder and access them from inside the container. The Docker image contains all the tools used in this tutorial.
Analysis of the partition table
We will start with the mmls command, which lets you view the partitions contained in the image file. Execute the following command:
mmls image.dd
The output is as follows:
DOS Partition Table
Offset Sector: 0
Units are in 512-byte sectors
      Slot      Start        End          Length       Description
000:  Meta      0000000000   0000000000   0000000001   Primary Table (#0)
001:  -------   0000000000   0000000062   0000000063   Unallocated
002:  000:000   0000000063   0000052415   0000052353   DOS FAT16 (0x04)
003:  000:001   0000052416   0000104831   0000052416   DOS FAT16 (0x04)
004:  000:002   0000104832   0000157247   0000052416   DOS FAT16 (0x04)
005:  Meta      0000157248   0000312479   0000155232   DOS Extended (0x05)
006:  Meta      0000157248   0000157248   0000000001   Extended Table (#1)
007:  -------   0000157248   0000157310   0000000063   Unallocated
008:  001:000   0000157311   0000209663   0000052353   DOS FAT16 (0x04)
009:  -------   0000209664   0000209726   0000000063   Unallocated
010:  001:001   0000209727   0000262079   0000052353   DOS FAT16 (0x04)
011:  Meta      0000262080   0000312479   0000050400   DOS Extended (0x05)
012:  Meta      0000262080   0000262080   0000000001   Extended Table (#2)
013:  -------   0000262080   0000262142   0000000063   Unallocated
014:  002:000   0000262143   0000312479   0000050337   DOS FAT16 (0x06)
This might seem overwhelming at first, but you can see 15 different sections of this drive. There are different types of partitions:
Unallocated: There is no content here, the space is unused and a new partition can be created on top of it.
DOS FAT16: An MS-DOS partition, which is used by Microsoft operating systems. It is a pretty old format, which you can find on very old hard drives or USB drives.
Primary / Extended: These are not partitions, just metadata information that describes the structure of the hard drive.
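To make the table less magical, here is a minimal Python sketch of where the first rows of such a listing come from. It assumes a classic DOS/MBR layout: four 16-byte entries starting at byte 446 of sector 0, followed by the 0x55AA signature. The function name parse_mbr is made up for this post, and the sketch reads only the primary table; it does not follow the chain of extended tables the way mmls does.

```python
import struct

SECTOR_SIZE = 512    # matches "Units are in 512-byte sectors" in the mmls output
TABLE_OFFSET = 446   # the four primary entries start at this byte of sector 0
ENTRY_SIZE = 16


def parse_mbr(sector: bytes):
    """Return (slot, type, start_sector, length_sectors) for each used primary entry.

    Hypothetical helper for illustration; extended tables are not followed.
    """
    if len(sector) < SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a DOS partition table: 0x55AA signature missing")
    entries = []
    for slot in range(4):
        base = TABLE_OFFSET + slot * ENTRY_SIZE
        part_type = sector[base + 4]  # e.g. 0x04 = FAT16, 0x05 = extended
        # bytes 8-11: first sector (LBA), bytes 12-15: number of sectors
        start, length = struct.unpack_from("<II", sector, base + 8)
        if part_type != 0:  # type 0 marks an unused slot
            entries.append((slot, part_type, start, length))
    return entries
```

To try it, read the first 512 bytes of image.dd in binary mode and pass them to parse_mbr. If the layout assumptions hold, the tuples should correspond to the 000:000 to 000:002 rows and the first DOS Extended row of the mmls output.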
So, we have a total of 6 small partitions on this drive. Now, to inspect a partition, we can operate directly on the image file, or extract the partition and write it to a new file first. For this, we need to look at the numbers in the Start column, which show the offset of the partition (in sectors) inside the image file.
To extract the first partition, one can use the dd tool, but The Sleuth Kit provides a much cooler tool: mmcat
mmcat image.dd 2 > first.fat
This extracts slot 2 of the image to a new file. Why 2? Look at the output of mmls again. The slot column shows a number, which can be used with mmcat, without the need to use offsets. You can extract the other partitions with the same command, but for this blog post, we will only use the first one.
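If you are curious what happens behind the slot number, the extraction boils down to simple arithmetic: the byte offset is the start sector times the sector size. For the first partition that is 63 * 512 = 32256 bytes, and its length is 52353 * 512 = 26804736 bytes. The following Python sketch copies such a range to a new file. The function name extract_partition and its parameters are my own invention for illustration; this is not how mmcat is implemented internally.

```python
SECTOR_SIZE = 512  # taken from the mmls header line


def extract_partition(image_path, out_path, start_sector, length_sectors,
                      chunk_size=1024 * 1024):
    """Copy length_sectors sectors, beginning at start_sector, into out_path.

    Returns the number of bytes actually written.
    """
    remaining = length_sectors * SECTOR_SIZE
    written = 0
    with open(image_path, "rb") as src, open(out_path, "wb") as dst:
        src.seek(start_sector * SECTOR_SIZE)  # jump to the partition start
        while remaining > 0:
            data = src.read(min(chunk_size, remaining))
            if not data:  # image is shorter than the table claims
                break
            dst.write(data)
            written += len(data)
            remaining -= len(data)
    return written
```

Calling extract_partition("image.dd", "first.fat", 63, 52353) should produce the same bytes as the mmcat command, assuming the Start and Length values from the mmls output above.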
Now, you can look at the file system with fls:
fls first.fat
The output contains the file structure of the drive, which should be the following:
r/r 4: primary-1.txt
v/v 831123: $MBR
v/v 831124: $FAT1
v/v 831125: $FAT2
V/V 831126: $OrphanFiles
So, we see 5 entries. Any entry that starts with a $ (dollar) symbol is not a real file, but only a virtual file or folder, which is created by the tool. Therefore, we only have one regular file: primary-1.txt. Unfortunately, the file is empty (you can learn how to check that in the next section), and so is the rest of the partition, so there is nothing else to show here. Therefore, we will continue with the jpeg.ntfs image, which is way more interesting!
Analyzing the second image
Before starting to analyze the second image, jpeg.ntfs, let’s first look at what it is. I can tell you right now that it is an NTFS partition, not a whole drive. NTFS is the default file system of Microsoft Windows for hard drives. If you use Windows and have a C:\ drive, it is probably NTFS.
But you can retrieve this information for yourself. The Sleuth Kit has the right tool for this, fsstat:
fsstat jpeg.ntfs
I will only show the first part of the output, as it is the most relevant for now:
FILE SYSTEM INFORMATION
File System Type: NTFS
Volume Serial Number: 325C284B5C280C63
OEM Name: NTFS
Volume Name: JPEG-SRCH
Version: Windows XP
You can see all the relevant data here, and even the volume name that is displayed in any file explorer. If you try to use mmls here, it will fail, because this image has no partition table. Remember that mmls is for inspecting a whole drive (with multiple partitions). So, let’s look into the partition and reveal its contents.
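Part of this information lives in the very first sector of the partition, the boot sector. As a rough illustration, the Python sketch below reads a few well-known fields: the OEM name at byte 3, and for NTFS the sector size, cluster size and volume serial number. The function name identify_boot_sector is hypothetical, the FAT check relies on a label string that is only informational, and the real fsstat reads far more than this.

```python
import struct


def identify_boot_sector(sector: bytes) -> dict:
    """Guess the file system from a 512-byte boot sector (illustrative only)."""
    if len(sector) < 512:
        raise ValueError("need at least one full 512-byte sector")
    oem = sector[3:11]  # 8-byte OEM name field
    info = {"oem_name": oem.decode("ascii", "replace").strip(), "type": "unknown"}
    if oem == b"NTFS    ":
        info["type"] = "NTFS"
        # 0x0B: bytes per sector (uint16), 0x0D: sectors per cluster (uint8)
        bytes_per_sector, sectors_per_cluster = struct.unpack_from("<HB", sector, 0x0B)
        info["cluster_size"] = bytes_per_sector * sectors_per_cluster
        # 0x28: total sectors (uint64), 0x48: volume serial number (uint64)
        info["total_sectors"] = struct.unpack_from("<Q", sector, 0x28)[0]
        info["serial"] = "%016X" % struct.unpack_from("<Q", sector, 0x48)[0]
    elif sector[54:59] in (b"FAT12", b"FAT16"):
        info["type"] = sector[54:59].decode("ascii")  # label at 0x36
    elif sector[82:87] == b"FAT32":
        info["type"] = "FAT32"                        # label at 0x52
    return info
```

Feeding it the first 512 bytes of jpeg.ntfs should report NTFS, and the first 512 bytes of first.fat should most likely report FAT16, provided the images follow the standard layouts assumed here.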
Inspecting the file system
Start with the same command as before, now using the jpeg.ntfs file:
fls jpeg.ntfs
Now you see the following contents:
r/r 4-128-4: $AttrDef
r/r 8-128-2: $BadClus
r/r 8-128-1: $BadClus:$Bad
r/r 6-128-1: $Bitmap
r/r 7-128-1: $Boot
d/d 11-144-4: $Extend
r/r 2-128-1: $LogFile
r/r 0-128-1: $MFT
r/r 1-128-1: $MFTMirr
r/r 9-128-8: $Secure:$SDS
r/r 9-144-11: $Secure:$SDH
r/r 9-144-5: $Secure:$SII
r/r 10-128-1: $UpCase
r/r 3-128-3: $Volume
d/d 27-144-1: alloc
d/d 37-144-1: archive
d/d 30-144-1: del1
d/d 47-144-1: del2
d/d 33-144-1: invalid
d/d 41-144-1: misc
d/d 48-144-1: RECYCLER
d/d 45-144-1: System Volume Information
V/V 52: $OrphanFiles
As I said before, you can ignore the entries starting with a $. Instead, look at the last few entries, which are marked d/d and are therefore directories. You can find more information about the output in the SleuthKit Wiki.
Now, focus on the second column, which shows some numbers separated by dashes. This is the metadata address of each entry. For NTFS it has the form entry-type-id: the MFT entry number, the attribute type (128 is $DATA, 144 is $INDEX_ROOT) and the attribute ID. You can use these addresses to select files and directories. For example, let’s look inside the alloc folder:
fls jpeg.ntfs 27-144-1
The output shows the contents of the folder:
r/r 29-128-3: file1.jpg
r/r 28-128-3: file2.dat
So we found 2 files on the partition, inside the alloc folder. Now we would like to know what is inside file1.jpg, which looks like a picture. Guess what? You can do it with the icat command, by passing the identifier of the file from the second column:
icat jpeg.ntfs 29-128-3 > file1.jpg
The icat command works similarly to the cat command and writes the file content to standard output. To avoid making a mess of your terminal, the command above redirects the output to a file, which you can then open with an image viewer.
Okay, so we extracted a file from the image. How cool is that?! You can try to extract more files from the image (Spoiler: file2.dat also contains a picture).
You can use these techniques to extract any kind of file, even deleted files. To search for all deleted files recursively in the file system, you can again use fls:
fls -rd jpeg.ntfs
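If you want to post-process such listings, for example to feed every address to icat, it helps to split each line into its parts. Here is a small Python sketch that does this for the line shapes shown in this post. It assumes that deleted entries carry a * after the type field and that recursive listings prefix each line with one + per directory level. The names FlsEntry and parse_fls_line are made up for this example and are not part of The Sleuth Kit.

```python
import re
from typing import NamedTuple, Optional

FLS_LINE = re.compile(
    r"^(?P<depth>\+*)\s*"             # one '+' per directory level (with -r)
    r"(?P<ntype>\S)/(?P<mtype>\S)\s+"  # type per file name / per metadata
    r"(?P<deleted>\*\s+)?"            # '*' marks a deleted entry
    r"(?P<addr>\d+(?:-\d+-\d+)?)"     # "4" on FAT, "29-128-3" on NTFS
    r"(?:\(realloc\))?:\s*"
    r"(?P<name>.*)$"
)


class FlsEntry(NamedTuple):
    depth: int
    name_type: str
    meta_type: str
    deleted: bool
    address: str              # the string you can pass on to fls or icat
    entry: int                # MFT entry (NTFS) or inode-like number
    attr_type: Optional[int]  # e.g. 128 ($DATA), 144 ($INDEX_ROOT)
    attr_id: Optional[int]
    name: str


def parse_fls_line(line: str) -> Optional[FlsEntry]:
    """Parse one line of fls output; return None if it does not match."""
    m = FLS_LINE.match(line.rstrip("\n"))
    if m is None:
        return None
    parts = [int(p) for p in m.group("addr").split("-")]
    attr_type, attr_id = (parts[1], parts[2]) if len(parts) == 3 else (None, None)
    return FlsEntry(len(m.group("depth")), m.group("ntype"), m.group("mtype"),
                    m.group("deleted") is not None, m.group("addr"),
                    parts[0], attr_type, attr_id, m.group("name"))
```

For instance, parse_fls_line("d/d 27-144-1: alloc") yields an entry with address "27-144-1", MFT entry 27 and attribute type 144, which is what we passed to fls earlier to list the alloc folder.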
Next I will show you another forensic technique called Data Carving, which can be used to find hidden files that are not easily found with the above methods.
Sometimes hard drives are damaged or corrupted, or partition tables are missing, which makes working with the above tools impossible. One can try to repair the image manually or with software, but often the last resort is Data Carving.
So what is Data Carving? Let’s start by thinking about regular files, a JPEG file, for example. The JPEG format contains structures that make it identifiable, such as sequence markers and a recognizable file header. Therefore, if you look at a continuous stream of bytes, one can detect a JPEG picture by searching for specific patterns. This is what file carvers do: They have a database of fingerprints for a number of popular formats (JPEG, PNG, video, MS office files, …) and search for them in a block of bytes. So, if you are looking for pictures, a carving software digs through the hard drive image and tries to extract the picture by looking for the start and estimating (or calculating) the end. Then, it just copies all the bytes to a new file. If you are interested in more information about File Carving, check the Forensics Wiki.
Whether the file is an actual picture or not can only be decided by the user, who has to open the file and inspect its content. Therefore, file carvers often “discover” files that are not actually files, but random sequences of bytes that look like they might be a picture or a video.
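To give you a feeling for the idea, here is a deliberately naive carver in Python. It assumes that a JPEG starts with the bytes FF D8 FF and ends at the next FF D9, and it simply copies everything in between. The function name carve_jpegs and the max_size limit are my own choices for illustration. Real carvers are much smarter: this sketch will cut a picture short if it contains an embedded thumbnail, and it cannot handle fragmented files.

```python
SOI = b"\xff\xd8\xff"  # JPEG start-of-image marker plus the first byte of the next marker
EOI = b"\xff\xd9"      # JPEG end-of-image marker


def carve_jpegs(data: bytes, max_size: int = 10 * 1024 * 1024):
    """Return a list of (offset, candidate_bytes) for JPEG-looking byte ranges.

    Naive by design: the first end marker after a start marker wins.
    """
    found = []
    pos = 0
    while True:
        start = data.find(SOI, pos)
        if start == -1:
            break
        # look for the end marker, but only within max_size bytes of the start
        end = data.find(EOI, start + len(SOI), start + max_size)
        if end == -1:
            pos = start + 1  # no plausible end, keep scanning
            continue
        end += len(EOI)
        found.append((start, data[start:end]))
        pos = end
    return found
```

You could read jpeg.ntfs into memory (it is only about 10 MB), call carve_jpegs on the bytes and write each candidate to its own file. As explained above, you still have to open every result to see whether it is a real picture.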
Extract all images with photorec
For the following section, I will be using photorec, which is my favourite interactive tool for multimedia recovery in Linux. Before we begin, create a new directory called results, which is going to be the destination for all recovered files.
Start photorec by specifying the image file to search:
photorec jpeg.ntfs
This starts the text-based interface, where you can “select a media”. Press Enter to select the above image file and you will see a new menu with the following selection:
PhotoRec 7.0, Data Recovery Utility, April 2015
Christophe GRENIER email@example.com
Disk jpeg.ntfs - 10289 KB / 10048 KiB (RO)
Partition Start End Size in sectors
Unknown 0 0 1 4 62 62 20096 [Whole disk]
P NTFS 0 0 1 4 62 62 20096 [JPEG-SRCH]
[ Search ] [Options ] [File Opt] [ Quit ]
Start file recovery
You can navigate left and right to change settings. When you select [ File Opt ], you can choose the file types to look for. The default settings include most file types anyway, so you can ignore this. Select [ Search ] and press Enter to continue to the next screen.
The following screen lets you select the file system type. photorec only distinguishes between ext2/ext3 and everything else, so for our NTFS image select [ Other ] and press Enter. In the next screen, select [ Whole ] to search the whole partition.
Now select the results folder that we created at the beginning. Press Enter to select the folder, then press c to start the recovery.
In my case, I got the following output, which can differ depending on your settings:
Disk jpeg.ntfs - 10289 KB / 10048 KiB (RO)
Partition Start End Size in sectors
P NTFS 0 0 1 4 62 62 20096 [JPEG-SRCH]
10 files saved in /home/forensics/results/recup_dir directory.
So, 10 files were recovered and saved in our results folder. Press q repeatedly to exit the software and inspect the results. There should be some JPEG files, some archives and a .doc file, containing up to 9 pictures (in my case, picture 8 was not found). If you got fewer results, try to tweak the file settings in photorec and try again.
So, let’s recap this (really short) introduction to IT forensics. We used The Sleuth Kit as a software package to analyze hard drive images. Hard drives contain one or multiple partitions, which can be extracted and analyzed in more depth. Partitions contain file systems, which define how files are structured on a hard drive. File systems include special identifiers for every file, which map a file to its position on the drive. These files can be extracted from the image, and sometimes deleted files are not really deleted and can be recovered. As a last resort, file carvers can systematically search the drive for patterns that might represent actual files, so that they can be retrieved even if the partition structure is damaged.
I hope you were able to follow along and get the same results. I encourage you to try to analyze more images that you can find here, as it can be a lot of fun to discover hidden files like a real forensic investigator!
If the above links do not work, you can download the images from here.