
A hash based technique for the identification of objectionable imagery

by Daniel Fraser




Institution: AUT University
Department:
Year: 0
Keywords: Objectionable image; Forensics
Record ID: 1301130
Full text PDF: http://hdl.handle.net/10292/7423


Abstract

The Internet has been one of the greatest man-made achievements of the 20th century. It has revolutionized the computer and communications world like nothing before. The invention of the telegraph, telephone, radio, and computer set the stage for this unprecedented integration of capabilities (Leiner, et al., 2009, p.22). The Internet has given us the ability to communicate globally with ease and to access the world’s libraries simultaneously; it has enabled surgery to be performed remotely, created virtual markets for the sale of goods and services, and allowed media on demand. At the same time, criminals have taken the opportunity to research new ways of profiting by using the Internet as a platform. According to the Symantec Cyber Crime Report 2013, the global cost of cybercrime was US$113 Billion (NZ$1.35 Billion) in 2013 (Horbury, 2013). One of the fields in which criminals are profiting is child exploitation. This research aims to provide a better solution for identifying objectionable images on a global scale, in order to reduce the amount of material that is so easily distributed. The difficulty in identifying objectionable images is that an enormous number of images on the Internet require analysis. Forensic investigators currently use two techniques for identification: Content Based Image Retrieval (CBIR) and Concept Based Image Indexing (CBII). CBIR uses the visual aspects of an image for identification, and CBII uses the metadata of the image. CBIR is very accurate but very slow, whereas CBII is very fast but very inaccurate in image retrieval. This research attempts to solve the issue by proposing a new hash based technique for the identification of objectionable imagery. The Hash Based Image Retrieval (HBIR) technique is proposed to be faster than CBIR and more accurate than CBII.
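The general idea of hash based identification can be illustrated with a minimal sketch. It is not the HBIR artefact described in this research: the abstract does not state which hash function or matching strategy is used, so SHA-256 and the names file_digest, find_known_images, and known_digests are assumptions made for illustration only.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large images are never loaded into memory whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_known_images(directory, known_digests):
    """Return the files under `directory` whose digest is in `known_digests`.

    `known_digests` is a hypothetical set of hex digests of previously
    identified images (assumed here; the source does not describe it).
    """
    matches = []
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file() and file_digest(path) in known_digests:
            matches.append(path)
    return matches
```

In this sketch each file costs one pass over its bytes plus a set lookup, with no analysis of visual content. A cryptographic hash such as SHA-256 matches only byte-identical copies, so a resized or re-encoded image would not be found by this exact-match approach.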
A constructionist approach using the design science methodology (DSM) was used to conduct the research. The DSM is a flexible methodology that allows the researcher to slightly redirect the research while it is being undertaken, in order to create a robust and efficient artefact. Four phases of research were undertaken. The first phase was the software and pilot testing phase. The second phase was HBIR testing using the proposed artefact. The third phase was CBIR testing using Forensic Toolkit software. The fourth phase was CBII testing using EnCase software. The findings illustrated that the CBII technique was the fastest at processing the 10 batches of 10,000 images, with processing times ranging from 21 seconds for the 50KB batch of images to 407 seconds for the 950KB batch of images. The HBIR technique was slightly slower, with processing times ranging from 446 seconds for the 50KB batch of images to 922 seconds for the 950KB batch of images. The CBIR was the slowest technique, with processing times ranging from 2024 seconds for the 50KB batch of images to 3188 seconds for the 950KB batch of…