J. Fang*; Z.L. Jiang†; S. Li*; S.-M. Yiu‡; L.C.K. Hui‡; K.-P. Chow‡
* Jinan University, Guangzhou, China
† Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China
‡ The University of Hong Kong, HKSAR, China
Mobile phones have become part of our lives. Because of the potential profit, there exist more than 400 types of Shanzhai mobile phones, which are Chinese pirated brands of mobile phones. More and more criminals use these Shanzhai mobile phones to perpetrate crimes because they are much cheaper and easy to obtain. The adverse impact on forensics is the difficulty of obtaining useful evidence from these phones, owing to the absence of system manuals and of knowledge of the memory layout. In this paper, we describe how the phonebook and phone call records are stored inside an MTK-based Chinese Shanzhai mobile phone, MTK being the most popular platform for Shanzhai mobile phones. This information also gives investigators insight into retrieving deleted phonebook records and phone call history. It may also help investigators to reconstruct and analyze the timeline of the user’s activity using the snapshots located in the memory dump.
Digital evidence; Mobile forensics; Timeline analysis; MTK-based Shanzhai phone
This work was partially supported by the China State Scholarship Fund (No. 201506785014), the National Natural Science Foundation of China (No. 61401176, 61402136, 61361166006), the Natural Science Foundation of Guangdong Province (No. 2014A030310205, 2014A030313697), the Science and Technology Projects of Guangdong Province (2016A010101017), the Project of Guangdong High Education (YQ2015018), and the NSFC/RGC Joint Research Scheme (N_HKU 72913), Hong Kong.
In the last decade, worldwide mobile phone usage has increased dramatically. Globally, the number of mobile cellular subscriptions reached 5.3 billion by the end of 2010, as reported by the International Telecommunication Union (ITU). Vendors shipped 371.8 million units in Q1 2011, a growth of 19.8% year-over-year according to IDC (Wauters, 2011). At the same time, the computational power and storage of mobile phones keep increasing, notably with the deployment of dual-core CPUs and gigabytes of internal memory (Lomas, 2011). Owing to their mobility and portability, mobile phones have become second nature to people and, as a result, are often involved in criminal cases (Mislan, 2010). More seriously, a powerful mobile phone can be used as a criminal tool anytime and anywhere. In both cases, a lot of digital evidence may be stored inside the mobile phone, so mobile phone forensics techniques are necessary for retrieving and investigating this information (Barmpatsalou et al., 2013).
Mobile phone forensics has been studied for quite a long time, and there are several commercial products for investigating mobile phones running world-leading platforms, such as Symbian (Mokhonoana and Olivier, 2007), Android (Vidas et al., 2011), Windows (Klaver, 2010), Blackberry, and iPhone. However, in China, a new category of mobile phone commonly known as the “Shanzhai mobile phone” (Shanzhai phone for short) emerged in 2007, after China’s government removed the licensing policy for manufacturing mobile phones, and it is now flooding the global mobile phone market owing to its high cost-performance ratio (Nanyang, 2010). The Chinese word “Shanzhai” originally means “mountain village,” but it now also refers to imitation, low-end, and unprofessional brands and goods, particularly electronics. In contrast to the remarkable growth of the Shanzhai phone, little research on Shanzhai phone forensics has been published. The reason may lie in the shortage of technical documentation for Shanzhai phones and in the great number of Shanzhai phone models. Benefiting from the turnkey solutions provided by MediaTek (MTK) (http://www.mediatek.com/en/index.php) and Spreadtrum (http://www.Spreadtrum.com), the development period for a Shanzhai phone can be shortened from over one year to one month. This means that thousands of Shanzhai phone models will appear on the market in a single year. Unfortunately, such rapid change is a nightmare for researchers performing digital forensics on Shanzhai phones. Since Shanzhai phones are spreading worldwide and there is a trend of criminals using them to perpetrate crimes, a deeper investigation of Shanzhai phones is necessary, and Shanzhai phone forensics is unavoidably becoming more and more important.
In this paper, a method for retrieving data from the internal memory of a typical MTK-based Shanzhai phone is introduced. The data structures used to store the call log, phonebook, short message service (SMS) messages, and some advanced media content are also parsed via reverse engineering. Furthermore, the extracted information is analyzed together with historical information to reconstruct the sequence of operations and thus help determine a suspect’s activity.
There has been research on mobile phone forensics since the early 2000s. A wide range of mobile forensics tools has been developed to acquire data from the flash memory of mobile phones (Ayers et al., 2014; McCarthy, 2005). However, most of these tools use command and response protocols to access the memory indirectly. These commands and protocols depend on the operating system (OS) and actually change the contents of the memory. Only data visible to the OS can be recovered. Such tools also fail to retrieve data from dead or faulty mobile phones, and they cannot recover deleted data.
Flasher tools are the easiest noninvasive way to read flash memory data (Breeuwsma et al., 2007) and have been used in quite a few mobile forensics cases (Gratzer and Naccache, 2006; Purdue University, 2007). However, this approach cannot ensure a complete dump of the memory and may depend heavily on the OS. Moreover, if the data connector of the mobile phone is not supported by the flasher tool, wiring to the communication pins on the phone’s printed circuit board (PCB) may be required to connect the tool.
The physical extraction approach is to remove the internal flash memory chip from the mobile phone and read it with a memory reader. This procedure requires professional engineers because memory chips may be damaged during desoldering. Joint Test Action Group (JTAG) is a standard test access port and boundary-scan architecture, an embedded test technique for automatically testing the functionality and quality of the integrated components soldered on a PCB. It controls the phone’s microprocessor in debug mode to communicate with the memory chip and dumps the memory bit by bit. Therefore, it ensures the completeness of the forensic binary image and is OS-independent. A brief comparison of these approaches is shown in Table 1.
Table 1
Comparison of Three Internal Memory Acquisition Approaches
Criterion | Desoldering | JTAG | Flasher Tools |
Risk of chip damage | High | Low | Low |
Risk of data modification | Low | Low | Medium |
Complexity of usage | High | Medium | Low |
Electronic soldering | High | Medium | Not required in many cases |
Completeness of data | High | Medium | Medium (may not be guaranteed) |
In this paper, we choose the easier solution of using a flasher tool to obtain the memory contents instead of using JTAG, because our focus is on how the information is stored in the memory. Note that using JTAG should provide a lower-level picture of the memory.
From the viewpoint of the OS, various forensic tools target a specific OS, such as Symbian (Mokhonoana and Olivier, 2007), Windows Mobile (Klaver, 2010), Android (Vidas et al., 2011; Hoog, 2011), and iPhone (Hoog and Strzempka, 2011). A more detailed survey can be found in the review by Barmpatsalou et al. (2013). Since these tools are OS-dependent, they cannot be used directly to acquire data from Shanzhai phones. There has also been some research on mobile forensics using JTAG (Willassen, 2005; Zhang, 2010).
In this paper, a typical Shanzhai phone model is selected for study in the experiments. The model is an imitation of the iPhone 4. It is based on a low-end MediaTek processor, the MT6253, which is MediaTek’s first monolithic GSM/GPRS handset chip solution that offers the highest level of integration with lowest power consumption and best-in-class features. Most Shanzhai phone models were developed on this platform.
Inside the Shanzhai phone, a 16M bytes NOR flash chip (Toshiba TC58FYM7T8C) integrated with 4M bytes of RAM serves as read-only memory (ROM) for the OS and as nonvolatile random access memory (NVRAM) for data storage. As shown in Fig. 1, the 16M bytes of NOR flash of the Shanzhai mobile phone are divided into two parts. The first 14M bytes (memory addresses 0 to 0xDFFFFF) store code and remain unchanged after the Shanzhai phone is produced. Note that this is the default configuration in the MTK development solution.
The remaining 2M bytes (memory addresses 0xE00000 to 0xFFFFFF) are further divided logically into two areas. As shown in Fig. 2, each of the two areas appears as a removable drive under Windows when the Shanzhai phone is connected to a computer via a USB data cable. Note that this is only the logical layout of the two areas; physically, the flash blocks of the two areas are mixed and not separated as clearly as in this figure. From the partition information, we know that both drives are formatted as FAT12, but only the drive corresponding to the USER area (drive H: here) can be accessed via Windows; the other one (drive I:), corresponding to the SYSTEM area, cannot be read, written, or viewed by a normal user. In general, the USER area serves the Shanzhai phone user as storage for exchanging data between the phone and a computer, while the SYSTEM area serves the phone’s OS as a virtual memory for the data it manages. Note that some of the data saved in the SYSTEM area, such as the phone settings, phonebook, call log, and SMS, can be viewed or edited by the user via the user interface (UI) of the Shanzhai phone.
With the help of a flasher tool, the total 16M bytes of raw data in the flash memory can be retrieved as a memory dump and can be further investigated in a computer as a binary file.
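The layout described above can be expressed as a short Python sketch that splits a raw dump into its code and data regions. The address boundaries come from the text; the function name `split_dump` and the size check are illustrative assumptions, not part of any flasher tool.

```python
# Region boundaries as stated in the text (inclusive addresses).
CODE_START, CODE_END = 0x000000, 0xDFFFFF   # 14M bytes of code
DATA_START, DATA_END = 0xE00000, 0xFFFFFF   # 2M bytes holding USER and SYSTEM areas
FLASH_SIZE = 16 * 1024 * 1024               # total size of the NOR flash


def split_dump(dump: bytes) -> tuple[bytes, bytes]:
    """Return (code_region, data_region) of a full 16M-byte flash dump.

    Hypothetical helper: it only slices by address and does not
    interpret the FAT12 structures inside the data region.
    """
    if len(dump) != FLASH_SIZE:
        raise ValueError(f"expected {FLASH_SIZE} bytes, got {len(dump)}")
    code = dump[CODE_START:CODE_END + 1]
    data = dump[DATA_START:DATA_END + 1]
    return code, data
```

Because the two logical areas are physically interleaved, the data region returned here still contains both USER and SYSTEM blocks; separating them would require parsing the file system structures.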
In this section, the flash memory dump of the Shanzhai phone is reverse engineered to work out the storage formats of three kinds of baseline mobile phone content: the phonebook, call log, and SMS.
In the MTK-based platform, the phonebook, call log, SMS and other system-related user information are organized as data items and stored as files in NVRAM. A data item management system is deployed to manage NVRAM data in the file system and maintains an internal lookup table to retrieve the data items. Fig. 3 shows the logical relationship between the data items and the files in NVRAM.
Usually, the phonebook, call log, and SMS are saved as data items named “NVRAM_EF_PHB_LID,” “NVRAM_EF_PHB_LN_ENTRY_LID,” and “NVRAM_EF_SMSAL_SMS_LID,” respectively, under the subdirectory “NVD_DATA.” This high-level information helps us understand the storage mechanism of the MTK-based Shanzhai phone, but since we cannot directly access the phone’s file system, we must reverse engineer the binary dump to work out the storage pattern of the data items, as follows.
Each kind of content has a different data structure for storage. The phonebook saves the basic information of a contact as one entry, which is 86 bytes long and includes 62 bytes for the contact’s name and 20 bytes for the contact’s phone number in binary-coded decimal (BCD) coding. An example is illustrated in Fig. 4. Note that when a user deletes an entry from the phonebook, the logical memory space for that entry is released and filled with the hexadecimal value 0xFF.
The call log data item saves each call event as one 92-byte entry, including 32 bytes for the caller’s name, 7 bytes for the time of the call, 41 bytes for the caller’s number, and 4 bytes for the call duration. An example is illustrated in Fig. 5. Note that when the user deletes an entry from the call log, the following entry moves one unit forward to take over the memory space of the deleted entry, and so on. As a result, all the entries later than the deleted entry move one unit forward as a whole.
The SMS data item saves each SMS as one 184-byte entry containing a status byte and a 183-byte protocol data unit (PDU). The status byte indicates whether the SMS is “new event,” “read,” “sent,” or “draft.” The PDU includes the number of the SMS Center (SMSC), the time the SMS was received (given by the SMSC), the phone number of the sender, and the content of the message. An example is illustrated in Fig. 6.
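As a minimal sketch of how such entries might be walked in Python, the code below cuts a phonebook region into 86-byte slots, flags slots that consist entirely of 0xFF as deleted, and decodes a BCD field. The entry size and the 0xFF fill come from the text. The field offsets inside an entry are not given here, so the sketch does not locate the name or number; the low-nibble-first digit order in `decode_bcd` is an assumption (it is the order commonly used in GSM number fields) and should be checked against a real dump.

```python
PHB_ENTRY_SIZE = 86  # bytes per phonebook entry, as stated in the text


def iter_slots(region: bytes):
    """Yield (offset, slot_bytes) for each complete 86-byte slot in region."""
    for off in range(0, len(region) - PHB_ENTRY_SIZE + 1, PHB_ENTRY_SIZE):
        yield off, region[off:off + PHB_ENTRY_SIZE]


def is_deleted(slot: bytes) -> bool:
    """A deleted entry is overwritten with 0xFF across the whole slot."""
    return all(b == 0xFF for b in slot)


def decode_bcd(raw: bytes) -> str:
    """Decode packed BCD digits.

    Assumption: the low nibble holds the first digit of each byte.
    Nibbles above 9 (such as 0xF padding) are skipped.
    """
    digits = []
    for b in raw:
        for nibble in (b & 0x0F, b >> 4):
            if nibble <= 9:
                digits.append(str(nibble))
    return "".join(digits)
```

For example, under the assumed nibble order, `decode_bcd(bytes([0x21, 0x43, 0xF5]))` yields the digit string `12345`.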
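The shifting behavior on deletion can be modeled with a small Python sketch. The 92-byte entry size comes from the text; what the phone writes into the slot freed at the end of the item is not stated, so the 0xFF filler below is purely an assumption, as is the function name.

```python
CALL_ENTRY_SIZE = 92  # bytes per call log entry, as stated in the text


def delete_call_entry(region: bytes, index: int, filler: int = 0xFF) -> bytes:
    """Model deletion of one call log entry.

    All later entries move one slot forward. The freed last slot is
    padded with `filler` (an assumption, not confirmed by the text).
    """
    count = len(region) // CALL_ENTRY_SIZE
    if not 0 <= index < count:
        raise IndexError("entry index out of range")
    start = index * CALL_ENTRY_SIZE
    end = count * CALL_ENTRY_SIZE
    shifted = region[:start] + region[start + CALL_ENTRY_SIZE:end]
    return shifted + bytes([filler]) * CALL_ENTRY_SIZE + region[end:]
```

In this model, unlike the phonebook, a deleted call leaves no 0xFF gap between the surviving entries of the current copy of the data item.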
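A minimal Python sketch of cutting an SMS data item into its entries is given below. The 184-byte entry size and the 1 + 183 split come from the text; placing the status byte first in each entry is an assumption, and the numeric values of the status byte are not given here, so the sketch leaves the status undecoded.

```python
SMS_ENTRY_SIZE = 184  # bytes per SMS entry, as stated in the text
PDU_SIZE = 183        # bytes of PDU per entry, as stated in the text


def split_sms_entries(region: bytes):
    """Return a list of (offset, status_byte, pdu_bytes) tuples.

    Assumption: the status byte is the first byte of each entry and
    the PDU occupies the remaining 183 bytes.
    """
    entries = []
    for off in range(0, len(region) - SMS_ENTRY_SIZE + 1, SMS_ENTRY_SIZE):
        status = region[off]
        pdu = region[off + 1:off + SMS_ENTRY_SIZE]
        entries.append((off, status, pdu))
    return entries
```

The PDU bytes returned here would still need to be decoded to obtain the SMSC number, timestamp, sender number, and message content.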
In this section, we first investigate the recovery of content deleted by the user from the flash dump. We then discuss one characteristic of data management in the MTK-based Shanzhai phone: every time a data item is modified, the previous version of the file storing that data item is kept. Based on this property of the Shanzhai phone, we propose a timeline analysis method to retrace the suspect’s activity.
As described in Section 3.2, when an entry in a data item is deleted or modified, or an entry is added to the data item, the memory space storing the data item is modified accordingly. However, this holds only at the logical level of NVRAM files. When we go through the flash dump at the binary level, we find multiple copies of the data items, some of which are previous versions of a data item before modification. This may be due to the way the flash file system works: when the flash store is to be updated, the file system writes a new copy of the changed data to a fresh block, remaps the file pointers, and erases the old block later when it has time. So, for example, when a phonebook entry is updated, the pointer to “NVRAM_EF_PHB_LID” is changed from the gray block in Fig. 7 to the dark block. When a user accesses the phonebook via the UI of the phone’s OS, the newest version of the phonebook stored in the dark block is displayed, while the previous version of the phonebook remains in the original flash memory block until that block needs to be recycled. We call such a historical version of a data item a “snapshot.”
Since the previous data is merely marked as erased in the file system and not actually wiped from physical storage, the recovery of deleted content is possible. In our experiments, a tool was developed to parse the flash memory dump automatically and extract all versions of the data item using pattern matching techniques. An example is illustrated in Fig. 8. In this example, nine snapshots were found in the flash memory dump. Snapshot 7 should be the first version of the phonebook file, as it contains only one entry. Recall that each modification of the data item generates one more snapshot. Snapshot 6 should then be the second version, as it differs from Snapshot 7 by only one operation (appending “jack1”). Following this logic and comparing the entries in the snapshots, the operation sequence can be easily deduced as Snapshots 7, 6, 5, 4, 3, 2, 1, 9, and 8, in that order.
The above example shows a simple case of rebuilding the operation sequence of the phonebook. Note that the snapshots there are continuous; that is, no snapshot has been erased by the flash recycling mechanism. When some snapshots are lost, the situation becomes more complicated, and an algorithm is needed in the analysis to help rebuild the timeline.
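The write-to-fresh-block behavior described above can be illustrated with a toy Python model. It is not the MTK file system: the class `ToyFlashStore`, its methods, and the entry names are hypothetical, and block recycling is deliberately left out so that every old copy survives.

```python
class ToyFlashStore:
    """Toy model of a flash store that never updates a block in place."""

    def __init__(self, n_blocks: int) -> None:
        self.blocks = [None] * n_blocks  # physical blocks, None = unused
        self.live_block = {}             # data item name -> index of current copy
        self.next_free = 0

    def update(self, item: str, entries) -> int:
        """Write a new copy to a fresh block and remap the pointer."""
        if self.next_free >= len(self.blocks):
            raise RuntimeError("no fresh block left; a real system would recycle")
        index = self.next_free
        self.blocks[index] = (item, tuple(entries))
        self.live_block[item] = index
        self.next_free += 1
        return index

    def current(self, item: str):
        """What the phone UI would show: only the newest copy."""
        return self.blocks[self.live_block[item]][1]

    def snapshots(self, item: str):
        """What a raw dump would show: every copy still in flash."""
        return [blk[1] for blk in self.blocks
                if blk is not None and blk[0] == item]


store = ToyFlashStore(n_blocks=4)
store.update("NVRAM_EF_PHB_LID", ["entry0"])
store.update("NVRAM_EF_PHB_LID", ["entry0", "entry1"])
```

After the two updates, `store.current(...)` returns only the two-entry version, while `store.snapshots(...)` also returns the earlier one-entry version, which corresponds to the gray block in the description above.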
Next, we carry out an experiment with some snapshots lost and propose an algorithm to help analyze the timeline of the user’s activity. First, we manually perform a series of operations on the Shanzhai phone with the following steps:
2. Add entry named “memory1”
3. Add entry named “memory2”
4. Add entry named “memory3”
5. Add entry named “memory4”
6. Delete entry “memory1”
7. Delete entry “memory3”
8. Add entry named “memory5”
9. Add entry named “memory6”
The flash memory is dumped to a computer for investigation after Step 9 is done. Running our tool on this flash dump extracts all the phonebook information, as shown in Fig. 9. Note that Snapshot 6 is the latest version of the phonebook. We define the distance (d) between two Snapshots A and B as the minimum number of operations needed to change A into B. Since each modification of the data item generates one snapshot, the more similar two snapshots are, the closer the corresponding operations are in the time sequence. For example, in Fig. 9, Snapshot 1 contains the entry “memory0” and Snapshot 2 contains the entries {“memory0,” “memory2,” “memory4,” “memory5”}, so changing Snapshot 1 into Snapshot 2 requires three insert operations, and the distance between Snapshots 1 and 2 is three. The distances between all pairs of snapshots in Fig. 9 are calculated and shown in Table 2.
Table 2
The Distances Between Any Two Snapshots (S)
Snapshot | S6 | S5 | S4 | S3 | S2 | S1 |
S6 | - | 4 | 2 | 4 | 1 | 4 |
S5 | 4 | - | 2 | 2 | 3 | 4 |
S4 | 2 | 2 | - | 2 | 1 | 2 |
S3 | 4 | 2 | 2 | - | 3 | 2 |
S2 | 1 | 3 | 1 | 3 | - | 3 |
S1 | 4 | 4 | 2 | 2 | 3 | - |
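The distance defined above can be sketched in Python as the size of the symmetric difference between two sets of entries. This assumes that the only operations are adding and deleting whole entries and that an entry is identified by its name; under that assumption an in-place edit of an entry would count as two operations. The function name is illustrative.

```python
def snapshot_distance(a, b) -> int:
    """Minimum number of add/delete operations to turn snapshot a into b.

    Assumption: snapshots are compared as sets of entry names.
    """
    return len(set(a) ^ set(b))


# The two snapshots whose contents are given in the text.
snapshot_1 = {"memory0"}
snapshot_2 = {"memory0", "memory2", "memory4", "memory5"}

d_12 = snapshot_distance(snapshot_1, snapshot_2)  # three insert operations
```

For Snapshots 1 and 2 this gives a distance of three, which agrees with the corresponding cell of Table 2.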
Starting from Snapshot 6 and applying the shortest path principle to Table 2, we can reconstruct the timeline of Snapshots 6, 2, and 4, but three snapshots remain whose order cannot be determined with the shortest path principle alone. The partial sequence is shown in Fig. 10.
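One way to read the shortest path principle is as a greedy nearest-neighbor walk that starts from the latest snapshot and stops when the next step is ambiguous. The Python sketch below follows that reading, which is an interpretation rather than a confirmed description of the tool; the distances are copied from Table 2, and the function name is hypothetical.

```python
# Pairwise distances copied from Table 2.
DIST = {
    "S6": {"S5": 4, "S4": 2, "S3": 4, "S2": 1, "S1": 4},
    "S5": {"S6": 4, "S4": 2, "S3": 2, "S2": 3, "S1": 4},
    "S4": {"S6": 2, "S5": 2, "S3": 2, "S2": 1, "S1": 2},
    "S3": {"S6": 4, "S5": 2, "S4": 2, "S2": 3, "S1": 2},
    "S2": {"S6": 1, "S5": 3, "S4": 1, "S3": 3, "S1": 3},
    "S1": {"S6": 4, "S5": 4, "S4": 2, "S3": 2, "S2": 3},
}


def greedy_chain(dist, start):
    """Walk backward in time from `start`, always taking the nearest snapshot.

    Returns (chain, tied). `tied` lists the candidates when the nearest
    snapshot is not unique, and is empty if the chain is complete.
    """
    chain = [start]
    remaining = set(dist) - {start}
    while remaining:
        current = chain[-1]
        best = min(dist[current][s] for s in remaining)
        nearest = sorted(s for s in remaining if dist[current][s] == best)
        if len(nearest) > 1:
            return chain, nearest  # ambiguous: extra evidence is needed
        chain.append(nearest[0])
        remaining.remove(nearest[0])
    return chain, []


chain, tied = greedy_chain(DIST, "S6")
```

With the Table 2 values, the walk yields the chain S6, S2, S4 and then stops with S1, S3, and S5 tied at distance 2 from S4, so the remaining order has to be settled by comparing the snapshot contents.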
Since Snapshot 5 contains all three entries that exist in Snapshot 4, Snapshot 5 should be nearer to Snapshot 4 than Snapshots 1 and 3 are. Fig. 10 can then be redrawn as Fig. 11, with the whole sequence determined.
Thus, the timeline of the operations can be rebuilt with this method. Furthermore, the other kinds of content stored in the Shanzhai phone share this characteristic, so the method can also be applied to their timeline analysis.
This paper presents an investigation of how phone call records and phonebook entries are stored in an MTK-based Shanzhai phone. The investigation reveals important information on how the system handles the addition and deletion of phonebook entries and phone call records. Although only the most recent phone call and phonebook entries are displayed through the OS interface, valuable evidence can still be retrieved if the memory has not yet been overwritten. Furthermore, a deep analysis will be performed on the extracted information and the historical information to construct the corresponding timeline array to help determine a suspect’s activity.