Authors: Rekha A G, Mohammed Shahid Abdulla and Asharaf S
Outlier detection is critical for many business organizations. For example, it can translate to fraud detection in the financial industry, intrusion detection in cyber security, or disease diagnosis in the medical field. The increasing volume of data demands efficient algorithms that detect outliers in real time with minimal human intervention. Faster detection of outliers in Big Data remains a major challenge, and this work addresses it. We propose a novel model for porting an outlier detection technique, Efficient Lightly Trained Support Vector Data Description (ELT_SVDD), to Hadoop using the MapReduce programming paradigm. We also show how its computations can be formalized as Map and Reduce operations.
Keywords: Big Data; Outlier detection; SVDD; Hadoop; MapReduce
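
To make the idea of formalizing a computation as Map and Reduce operations concrete, the following is a minimal Python sketch. It is not the ELT_SVDD algorithm and does not use Hadoop; it assumes a generic kernel-based score (the mean RBF kernel similarity between a query point and a data set), and the names `rbf_kernel`, `map_partition`, `reduce_pairs`, and `mean_kernel_similarity` are hypothetical. The point it illustrates is only that a sum over all data points can be split into per-partition partial sums (Map) that are then combined (Reduce).

```python
import math
from functools import reduce


def rbf_kernel(x, z, gamma=0.5):
    """RBF kernel between two points given as tuples (illustrative choice)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def map_partition(partition, query):
    """Map step: emit (partial kernel sum, point count) for one data split."""
    partial_sum = sum(rbf_kernel(x, query) for x in partition)
    return (partial_sum, len(partition))


def reduce_pairs(a, b):
    """Reduce step: combine two (sum, count) pairs by element-wise addition."""
    return (a[0] + b[0], a[1] + b[1])


def mean_kernel_similarity(partitions, query):
    """Mean kernel similarity of `query` to all points across partitions."""
    partials = map(lambda p: map_partition(p, query), partitions)
    total, count = reduce(reduce_pairs, partials, (0.0, 0))
    return total / count


# Toy data, split into two partitions as a distributed file system might do.
partitions = [
    [(0.0, 0.0), (0.1, 0.0)],
    [(0.0, 0.1), (0.1, 0.1)],
]

inlier_score = mean_kernel_similarity(partitions, (0.05, 0.05))
outlier_score = mean_kernel_similarity(partitions, (5.0, 5.0))
```

In this sketch a point far from the data receives a much lower score than a point inside it. Because the reduce step is associative and commutative, the result should not depend on how the data is partitioned, which is the property that makes such a computation suitable for MapReduce.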