Cyber CSI: Digital Forensics and the Fingerprints You Leave Behind
In the 1980s, computers slowly started to become a part of daily life for the general population. By the mid-1990s, this was accelerated by the rapid adoption of the Internet by people outside the previous tight-knit circles of academia and scientific work.
Both of these factors led to the introduction of computer evidence in criminal cases because criminals, a subset of the population as a whole, were also starting to use computers in their daily lives. Whether those crimes were “computer crimes” or more traditional crimes, the trail of clues used by Law Enforcement and prosecution teams increasingly included digital evidence.
Criminal and Civil Use Cases
These days, electronic devices are a part of so many aspects of our lives that the once tiny field of “digital forensics” has grown immensely. This growth has been both in personnel and in the tools and services that provide capability to those personnel. However, many of the basic principles of the field remain the same.
In this article, we hope to cover a broad range of those principles, as well as get a little deeper “into the weeds” with some specific digital forensics tools at the disposal of forensic professionals today. While there may be a psychological association between “digital forensics” and “computer crime,” the reality is that computer-based evidence of non-computer crimes is plentiful, whether it’s a photograph that proves possession of a stolen item, an email indicating an intent to commit a crime, or any number of other potentially useful pieces of information.
In addition to the application of digital forensics in the prosecution of crimes, digital forensics also has widespread applications in the world of “E-Discovery” within corporate environments. An example of this use would be the examination of a departing employee’s computer, in order to search for violations of non-disclosure agreements, development of projects on company time, company resources that the departing employee has kept possession of, or even the theft of patents and other intellectual property which the company might wish to contest.
Incident Response Forensics
In cases where a computer system has been breached, the primary goal of those charged with maintaining that system will be to get that system back up and running as soon as possible. Unfortunately, this can sometimes mean that evidence of the actual breach may be overwritten. It’s preferable, though not always possible, to replace a system that has been breached with a system that hasn’t been, or to make a block-by-block copy (dd is your friend) of the server to an external, removable drive before resolving the breach.
Once you’ve got a copy of the entire disk, you can remove the external drive and later analyze it (after you’ve made sure to put it in read-only mode) to see the exact state the intruder left it in, along with any evidence or trail the intruder may have left behind that might help determine their identity, motivation, or other useful information.
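As a rough sketch of what a block-by-block copy involves, the Python below reads a source in fixed-size chunks, writes each chunk to an image file, and computes a SHA-256 digest along the way so the copy can be checked later. The function names, the chunk size, and the choice of SHA-256 are my own assumptions for illustration. In practice you would reach for dd or a dedicated imaging tool, and removing write permission from a file is not a substitute for a hardware write blocker or a read-only mount.

```python
import hashlib
import os
import stat

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary illustrative choice


def image_device(source_path, image_path, chunk_size=CHUNK_SIZE):
    """Copy source_path to image_path chunk by chunk.

    Returns the SHA-256 hex digest of the bytes that were copied.
    """
    digest = hashlib.sha256()
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            digest.update(chunk)
    # Drop write permission on the copy to guard against accidental changes.
    os.chmod(image_path, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    return digest.hexdigest()


def hash_file(path, chunk_size=CHUNK_SIZE):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If the digest returned by image_device matches hash_file run against the image at a later date, you have reasonable assurance that the copy you are analyzing is the copy you made.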
We won’t go too far down the incident response forensics rabbit hole now, as that’s an entirely separate discipline and our main goal here is just to cover some of the fundamentals, as well as to provide a glimpse of some of the modern tools and techniques of digital forensics.
Preservation of Evidence
An absolutely key principle of digital forensics is the preservation of evidence. As is the case with a physical crime scene, keeping things as the investigator finds them is of paramount importance. With a physical crime scene, an investigator may be looking for footprints of a suspect. If the crime scene is riddled with the footprints of careless officers or investigators, sorting those out in order to find only the footprints of a suspect can make the job of the investigator harder than it needs to be, or worse still, can actually obliterate evidence that existed before the crime scene was secured.
In much the same way, a search for digital evidence on a file system that allows writing by a user or operating system may inadvertently obliterate digital evidence that could prove crucial to the case. Another element to preserving evidence is establishing a chain of custody process. In Law Enforcement environments, this process most likely already exists and you need only learn to follow it.
In corporate environments, it may be less formal, though the benefits of establishing such a process should be obvious; doing so makes sure that the evidence you may produce for use in a court case can be reasonably proven not to have been tampered with at any point.
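For those in a less formal environment, here is a minimal sketch of what a chain of custody log could look like in Python. It appends one JSON line per hand-off, each carrying a timestamp, the handler, the action, and a SHA-256 digest of the evidence file, and it can check that the evidence still matches the digest recorded at acquisition. The field names, the log format, and the function names are all assumptions of mine, not an established standard, and a real process would also need to protect the log itself from tampering.

```python
import hashlib
import json
from datetime import datetime, timezone


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_custody(log_path, evidence_path, handler, action):
    """Append one custody entry to the log and return it."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "handler": handler,      # who has the evidence (hypothetical field)
        "action": action,        # what they did with it (hypothetical field)
        "sha256": sha256_of(evidence_path),
    }
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
    return entry


def matches_first_entry(log_path, evidence_path):
    """True if the evidence still has the digest recorded in the first entry."""
    with open(log_path, encoding="utf-8") as log:
        first = json.loads(log.readline())
    return first["sha256"] == sha256_of(evidence_path)
```

Every time the evidence changes hands, a call to record_custody documents who took it and proves, through the digest, that it was unchanged at that moment.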
Types of Data
Active Data is made up of data like spreadsheets, word processing documents, inventories, application and operating system files.
Metadata is “data about data,” meaning file creation dates and times, file editing times, origin data and such.
Operating System Data is data created by the operating system, whether in log form, permissions details, web history data or authentication data (showing for instance, that a given user logged into a system at a given time).
Temporary Files contain data that’s saved by the operating system or by an application, without the user specifically requesting such a save operation. When you open a word processing application and type something out, even if you don’t deliberately save the file you’ve created, the application in question saves changes you make periodically to temporary or “cache” files.
Communications Data is any type of data pertaining to communications of any sort. This could be recorded data from Skype conversations, email, SMS messages, iMessages, Telegram logs or anything of that sort.
Residual Data is data that may have been deleted, but hasn’t actually been removed from the device by means of overwriting the space it used to occupy.
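Slack Space is space on a drive that’s been allocated to a file, but isn’t used by that file’s contents, such as the unused remainder of the file’s last cluster, which may still hold fragments of older data.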
Backup Data is data obtained from backup files, whether compressed or uncompressed, that can be pulled out of backup copies and presented as evidence.
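To make two of these categories a little more concrete, the sketch below pulls basic metadata for a file using Python’s os.stat and estimates its slack space. The 4096-byte cluster size is an assumption (the real allocation unit depends on the file system and how the volume was formatted), and the function names are mine. Treat the slack figure as an estimate only, since some file systems store very small files without allocating a full cluster.

```python
import os
from datetime import datetime, timezone

CLUSTER_SIZE = 4096  # assumed allocation unit in bytes; varies by file system


def slack_bytes(file_size, cluster_size=CLUSTER_SIZE):
    """Bytes in the file's last cluster that its contents do not use."""
    remainder = file_size % cluster_size
    if remainder == 0:
        return 0  # the file ends exactly on a cluster boundary (or is empty)
    return cluster_size - remainder


def file_metadata(path, cluster_size=CLUSTER_SIZE):
    """Return size, timestamps and estimated slack for one file."""
    info = os.stat(path)
    return {
        "size": info.st_size,
        "modified": datetime.fromtimestamp(info.st_mtime, timezone.utc).isoformat(),
        "accessed": datetime.fromtimestamp(info.st_atime, timezone.utc).isoformat(),
        "slack_estimate": slack_bytes(info.st_size, cluster_size),
    }
```

A 5000-byte file on a volume with 4096-byte clusters occupies two clusters (8192 bytes), which leaves 3192 bytes of slack where remnants of an earlier file could survive.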
Forensics Examination Stages
There are basically four phases or stages of a digital forensics examination: Evaluation, Collection, Analysis and Presentation. However, there are two processes which should really be considered in addition to those four stages: Readiness at the beginning (which is to say “preparation before an incident happens,” to include training and process definition) and Review, as part of a [Presentation > Review] loop. Generally, the end user of the presentation isn’t the same person that performs the examination.
In Law Enforcement applications, the digital forensics examiner provides a report to the prosecution team. In commercial applications, the examiner provides reports to legal or human resources teams. In both of these applications, the final consumer of the report should provide feedback to the examiner to either prune out unnecessary information in the report or go back and search for additional information that the end user thinks they may need.
Readiness: The “Readiness Stage” includes tool selection, training and policy implementation to maximize the effectiveness of any digital forensics examination. The best time to implement a policy to make sure auditing data exists is prior to an incident occurring. In commercial examinations, this includes making sure your corporate systems are logging and maintaining backups of data that may some day be pertinent if something requires a forensic examination.
This can be a policy that’s implemented on potential investigation targets, or a policy that sets network-level policies, like keeping logs of who performs LDAP lookups and from what IP address (or some other unique identifier per each lookup). In Law Enforcement forensic examinations, this would mean that tools to perform the analysis have already been selected and that personnel who would perform the investigations would be trained in the use of these tools, as well as the general principles of such an examination.
Evaluation: The Evaluation Stage is where environmental factors would be considered prior to the collection of evidence. If potential evidence is to be “live acquired,” or acquired “in the field,” the safety of a collection site should be considered ahead of time. In addition, the specifics of what should be collected should be determined, either on a case-by-case basis or as a matter of defining general and specific policy.
Collection: The Collection Stage is where data is “acquired.” Ideally, this is done in a controlled environment, but not all situations are ideal. Say a suspect is taken into custody in their own home and their computer or phone is found in an “unlocked state.” Let’s also say that person might be expected to be uncooperative in regard to providing their passcode or password. It may be expedient then to do a “live acquisition” of the drive while it’s still in an unlocked state. Barring the use of drive-level or user-level disk encryption, this can likely still be done after such a device is brought back into a controlled environment, but some pieces may not be fully retrievable.
Analysis: The analysis phase varies widely depending on the specifics of the case, but these days the first part of the analysis would be the feeding of drive images into the analysis software as evidence, followed by more refined searches for pertinent data by the forensic examiner.
Presentation: The final product of a digital forensic examination is a report. This is the case in Military/Intelligence operations, Law Enforcement operations and commercial operations. A solid final report will contain actionable or court-usable information, presented in a formal document, with all pertinent data and metadata preserved and ready for the end user.
Review: In Law Enforcement and commercial examinations, it’s important that the end user scrutinize the presentation, looking for any holes that may exist in the information, or weeding out any excess information that may not be necessary. With adjustments made, the Presentation stage happens again, followed by another review, until the final user of the report is satisfied with the contents of that report.
My first experience with digital forensics was in the mid-1990s, when the organization I was employed by suspected that a particular employee had violated its acceptable use policies by downloading and viewing pornography on his organization-owned workstation.
The basic process for performing the analysis was to clone the hard drive of the device to a read-only volume, after which I had to manually dig through the log files and subdirectories, searching for suspicious images and traffic. The process was painstaking and time-consuming, relying heavily on my own human speed of analysis. The longer I took investigating each nugget of potential evidence, the more the overall process slowed down, and in the end I spent about three solid weeks combing through this user’s device, documenting a lot of information about every individual violation I was able to find.
These days, analysis software makes that process a lot less labor intensive. The basic process is the same; one first “acquires” a device by creating a block-by-block copy of the media, then makes sure that the copy is read-only, so that running analysis tools on the copy won’t tamper with the evidence contained on that copy. The acquisition tools now are significantly faster and more user-friendly, but what has seen the most improvement is the mass analysis of an acquired device.
Tools like Guidance Software’s Encase Forensic and BlackBag Technologies’ Blacklight allow the forensics examiner to add a device copy as evidence, after which the tools themselves run automated analysis processes in order to document every single file on the device. This includes not only the file data itself, but also the metadata, which contains all sorts of useful information, such as the date the file was created, the last time the file was modified, the original location from which the file was obtained (if it was copied from the Internet), the type of file, the author of the data, GPS location data, image resolution data and even a calculated value for the percentage of an image algorithmically determined to be “flesh”.
In Blacklight, each piece of data is cataloged with a unique hash value to describe that file, which allows the examiner to exclude duplicates of the same file, or search for multiple instances of the file. For Law Enforcement use, there’s even a database component which will compare file hashes to known pieces of evidence in other cases (particularly child pornography cases), thus reducing the amount of evidence of that nature that the examiner would have to personally inspect.
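The idea behind hash-based cataloging is simple enough to sketch in a few lines of Python. The code below walks a directory, groups files by digest to expose duplicates, and checks the catalog against a set of known digests. I don’t know which hash algorithm or database format the commercial tools use internally, so SHA-256 and the plain Python set of known hashes are assumptions for illustration, as are the function names.

```python
import hashlib
import os
from collections import defaultdict


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def catalog_by_hash(root):
    """Walk root and map each digest to the sorted list of files that have it."""
    catalog = defaultdict(list)
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            catalog[sha256_of(path)].append(path)
    return {digest: sorted(paths) for digest, paths in catalog.items()}


def duplicates(catalog):
    """Keep only the digests shared by more than one file."""
    return {d: paths for d, paths in catalog.items() if len(paths) > 1}


def known_matches(catalog, known_hashes):
    """Return catalog entries whose digest appears in a set of known digests."""
    return {d: catalog[d] for d in known_hashes if d in catalog}
```

Because the digest depends only on a file’s contents, renaming a file or moving it to another folder does nothing to hide it from this kind of comparison.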
Both Encase Forensic and Blacklight thoroughly analyze the files found (and in OS X analysis, the SQLite databases that often contain information about files that no longer exist). They then categorize them into different groupings, like “communications” (text messages, instant messaging application logs, email and the like), web browsing histories and caches, images and video files and “productivity” type data (for instance, word processing or spreadsheet data).
This allows the examiner to browse entire communication histories displayed in their preferred format (hex data, strings, or “Chat view,” in the case of text messages), as well as providing quantitative data on overall communications. For instance, in Blacklight you can see who the person communicated with the most, by what method and you can look at all text messages at once, or drill down to all text messages between the user and a specific contact they were in communication with.
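To give a feel for the kind of query that sits behind those views, here is a small sketch using Python’s built-in sqlite3 module. The messages table, its columns, and the demo rows are entirely invented for illustration. Real message stores use their own schemas, which vary by application and version, and this is not a description of how Blacklight works internally.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical messages table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages (sent_at TEXT, contact TEXT, method TEXT, body TEXT)"
    )
    rows = [  # invented sample data
        ("2016-01-01T09:00", "alice", "sms", "hi"),
        ("2016-01-01T09:05", "alice", "sms", "there"),
        ("2016-01-02T10:00", "bob", "email", "report"),
        ("2016-01-03T11:00", "alice", "email", "fwd"),
    ]
    conn.executemany("INSERT INTO messages VALUES (?, ?, ?, ?)", rows)
    return conn


def message_counts(conn):
    """Return (contact, method, count) rows, busiest pairing first."""
    query = """
        SELECT contact, method, COUNT(*) AS total
        FROM messages
        GROUP BY contact, method
        ORDER BY total DESC, contact ASC, method ASC
    """
    return conn.execute(query).fetchall()


def conversation(conn, contact):
    """Return every message exchanged with one contact, oldest first."""
    query = """
        SELECT sent_at, method, body FROM messages
        WHERE contact = ? ORDER BY sent_at
    """
    return conn.execute(query, (contact,)).fetchall()
```

The first function answers “who did this person talk to the most, and how,” while the second is the drill-down to a single contact.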
In addition to analyzing the entire contents of an acquired device, Blacklight is able to analyze the data still in RAM of Windows devices. Both Blacklight and Encase Forensic are able to perform analysis on iOS or Android devices, as well. Both also have the capability to “carve” files from sections of the drive that may still contain data the user may have deleted, provided the user didn’t overwrite the drive space with random other data.
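File carving, at its simplest, means scanning raw bytes for the signatures that mark the start and end of a known file type. The Python sketch below looks for the JPEG start marker (FF D8 FF) and end marker (FF D9) in a blob of raw data. It is a deliberately naive illustration with function and parameter names of my own choosing. Real carvers cope with fragmented files, embedded thumbnails that contain their own end markers, and many more file formats.

```python
JPEG_START = b"\xff\xd8\xff"  # start-of-image marker plus first segment byte
JPEG_END = b"\xff\xd9"        # end-of-image marker


def carve_jpegs(raw, max_size=20 * 1024 * 1024):
    """Return byte strings in raw that look like complete JPEG files.

    max_size is an assumed sanity limit that discards implausibly large hits.
    """
    carved = []
    position = 0
    while True:
        start = raw.find(JPEG_START, position)
        if start == -1:
            break  # no further start markers
        end = raw.find(JPEG_END, start + len(JPEG_START))
        if end == -1:
            break  # start marker without an end marker; nothing to recover
        end += len(JPEG_END)
        if end - start <= max_size:
            carved.append(raw[start:end])
        position = end  # resume scanning after this candidate
    return carved
```

Run against the unallocated portion of a disk image, a routine like this can recover pictures whose directory entries are long gone, as long as the underlying bytes haven’t been overwritten.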
On top of “carved files,” when analyzing pre-Windows 10 devices, Blacklight will catalog the “Volume Shadow Copies” of files. Even though a user may have deleted a live copy of a file they wanted to obscure, there are often additional copies of those files created by Windows and stored on the device.
Another interesting forensics vector is the analysis of iOS or Android backup files. Say a criminal has an iPad in addition to a Mac and they’ve synced their iPad to their Mac one or more times. Provided they didn’t encrypt the backup, Blacklight would be able to do a deep dive into the backup image created by iTunes and acquire all the data and associated metadata within that backup.
Both software tools can look at when external drives or devices were connected to the device acquired, with extensive details on those devices. If a user copied files that may be evidence onto a USB drive, evidence of that copy being made may be logged, as well as the manufacturer and serial number of the specific drive. Even if they renamed the drive, the specific identity of the drive is still able to be determined through the serial number. This can be hugely advantageous, as it enables crime scene processors to be directed to look for additional devices (USB drives, etc) that should be collected as evidence and analyzed in addition to the main device image itself.
Additionally, device analysis can determine connections or attempted connections to specific wireless networks. This includes not only wireless networks that the user may have intended to connect to, but also any other networks that may have attempted to establish connections with the imaged device.
So What Does This Mean To You?
Unless you’re interested in entering the field of digital forensics, your eyes may have already glazed over at this point in the article. So long as you obey the law and don’t commit crimes, you have nothing to worry about, right?
Wrong. The flaw in this theory is the supposition that only Law Enforcement, Military/Intelligence, or corporate information security people have access to the tools and knowledge described. In truth, a number of people have this knowledge and while they may not have the financial resources to legally purchase the analysis software described here, there are a fair number of free Open Source tools that can perform a number of similar functions.
What you should consider is just how much information about you is stored on your computer and how that information might not be something you want to share with everyone in the world. Every single electronic communication you engage in, every single web page you browse, every single file you create, even if you don’t save it: all of this is likely present on your personal computer. The passwords you store for your electronic banking, any photographs you took of your passport or your credit cards or your driver’s license: all of it is on that drive.
Even the things you thought better of keeping and dragged to your trash can are probably still there. Every WiFi network you happened to come within radio range of is documented. Every single USB drive you plopped into that USB port, complete with serial numbers and file histories, is there. Every photograph you took on your phone and for that matter, everything else you had on your phone, which you synced with your computer, is sitting there, ripe for the taking, in a .sparsebundle on your drive.
The digital evidence of your life is on that device and is most likely enough to establish your behavioral travel patterns and reconstruct a shockingly robust picture of what your day to day life consists of. All of that is just sitting there on your laptop.
So What Can You Do?
Not only should you make sure you destroy any and all drives in any computers you decide to get rid of, you should also consider just how secure your device is when you leave it behind. That applies whether you’re just running into the convenience store while it’s sitting in a laptop bag on the passenger seat of your car, or you leave yourself logged in when you leave home for work, figuring that no one is going to break into your house while you’re gone.
There are things you can do to make that data harder to acquire and I’m not providing these suggestions with the hopes of enabling criminal use, but rather to help protect the law-abiding citizen from those with criminal intent. Encrypt your disk. Password protect your firmware. Use two-factor authentication whenever you can. Keep your system up-to-date. Don’t leave your user account password on a note stuck to the bottom of your computer. I’ve said these things before and they’re extremely important steps to take to protect yourself.
Are most criminals going to know how to assemble an entire life profile from the data on the laptop they yanked from your car with little more than some broken shards of spark plug? Probably not. However, it only takes one to significantly impact your day.
Editor-in-Chief’s Note: Matthew Sharp is a Plank Owner and Life Member at ITS and goes by the username “viator.” He lives in The People’s Republic of Northern California and enjoys long range shooting, carrying heavy objects great distances and fuzzy little puppies.