Big Data: Finding Frequencies of Faulty Multimedia Data

Document Type

Conference Proceeding

Publication Date



In many health care domains, big data has arrived. How to manage and use big data better has become the focus of all walks of life. Many data sources provide the repeated fault data - the repeated fault data forming the delay of processing time and storage capacity. Big data includes properties like volume, velocity, variety, variability, value, complexity, and performance put forward more challenges. Most healthcare domains face the problem of testing for structured and unstructured data validation in big data. It provides low-quality data and delays in response. In testing process is delay and not provide the correct response. In Proposed, pre-testing and post-testing are used for big data testing. In pre-testing, classify fault data from different data sources. After Classification to group big data using SVM algorithms such as Text, Image, Audio, and Video file. In post-testing, to implement the pre-processing, remove the zero file size, unrelated file extension, and de-duplication after pre-processing to implement the Map-reduce algorithm to find out the big data efficiently. This process reduces the pre-processing time, reduces the server energy, and increases the processing time. To remove the fault data before pre-processing means to increase the processing time and data storage.

Publication Title

ACM International Conference Proceeding Series

First Page Number


Last Page Number




This document is currently not available here.