Stock market firms have been transforming gradually over the years to handle big data. This article discusses the key transformations these firms are undergoing, along with the use of big data technologies in capital markets.
First, market data was downloaded from NASDAQ and NYSE using a Java API and streamed in real time. Unstructured stock data from various sources, including social media and news networks, was also aggregated, structured, and then processed to extract meaning from it.
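As a rough illustration of the "structuring" step, the sketch below turns a raw text post into structured records keyed by ticker symbol. Everything in it is an assumption made for illustration: the `StockMention` record, the `KNOWN_TICKERS` watchlist, and the cashtag convention (`$AAPL`) are hypothetical and do not describe the pipeline actually used here.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical watchlist; a real pipeline would load this from reference data.
KNOWN_TICKERS = {"AAPL", "MSFT", "GOOG"}

# Assumed convention: tickers appear as cashtags such as "$AAPL".
CASHTAG = re.compile(r"\$([A-Z]{1,5})\b")


@dataclass
class StockMention:
    source: str      # e.g. "social" or "news"
    ticker: str      # ticker symbol found in the text
    text: str        # cleaned-up original text
    timestamp: str   # ISO 8601 timestamp in UTC


def structure_post(source, text, epoch_seconds):
    """Turn one raw post into zero or more structured records."""
    ts = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    iso_ts = ts.strftime("%Y-%m-%dT%H:%M:%SZ")
    # Keep only cashtags that match the watchlist, without duplicates.
    tickers = sorted({t for t in CASHTAG.findall(text) if t in KNOWN_TICKERS})
    return [StockMention(source, t, text.strip(), iso_ts) for t in tickers]
```

Calling `structure_post("social", "$AAPL beats estimates", 0)` yields one record for AAPL. Posts that mention no watched ticker produce an empty list and can be dropped before indexing.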
Solr Indexing and HDFS Integration:
The unstructured stock data was sent to Apache Solr for indexing. Solr is a highly capable open source search technology that makes it easy for organizations to speed up data access dramatically. With the 4.x line of Lucene and Solr, it’s easier than ever to add scalable search capabilities to your data-driven applications. Solr also has built-in support for writing and reading its own index and transaction log files to HDFS. This does not use Hadoop MapReduce to process Solr data; rather, it uses HDFS only for index and transaction log file storage.
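To make the indexing step more concrete, here is a minimal sketch of how documents could be posted to Solr's JSON update handler over HTTP. The host, the collection name `stocks`, and the document fields are assumptions for illustration; the article does not say which client or schema was used.

```python
import json
import urllib.request


def build_solr_update_request(base_url, collection, docs):
    """Build a POST request that sends a JSON array of documents to Solr.

    base_url and collection are placeholders for a real Solr deployment.
    """
    # commit=true makes the documents searchable right after the request.
    url = f"{base_url.rstrip('/')}/solr/{collection}/update?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The request can then be sent with `urllib.request.urlopen(request)` against a running Solr instance. Committing on every request is convenient for a demo but is usually too expensive for high-volume feeds, where batched or automatic commits are a better fit.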
The Hadoop Distributed File System (HDFS) is designed to run on commodity hardware. The structured stock data was also stored in HDFS.
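The article does not say how the structured data was written to HDFS. One possible route is the WebHDFS REST API, and the sketch below builds the URL for its file creation call. The NameNode address, the target path, and the user name are hypothetical.

```python
from urllib.parse import quote, urlencode


def webhdfs_create_url(namenode, hdfs_path, user, overwrite=False):
    """Build the WebHDFS URL for creating a file at hdfs_path.

    hdfs_path must be absolute (start with "/"). All arguments are
    placeholders for a real cluster.
    """
    query = urlencode({
        "op": "CREATE",
        "user.name": user,
        "overwrite": str(overwrite).lower(),
    })
    return f"{namenode.rstrip('/')}/webhdfs/v1{quote(hdfs_path)}?{query}"
```

In WebHDFS, file creation takes two steps: an HTTP PUT to this URL on the NameNode returns a redirect to a DataNode, and the file contents are then sent with a second PUT to the redirected location.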
Banana is an open source dashboard that works with all kinds of time series (and non-time series) data stored in Apache Solr. Its goal is to provide a rich and flexible UI, enabling users to rapidly develop end-to-end applications that leverage the power of Apache Solr. Data can be ingested into Solr in a variety of ways, for example with Apache Flume.
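A time series panel in a dashboard ultimately needs counts per time bucket from Solr. The sketch below builds such a query with Solr's range faceting parameters. It only illustrates the kind of query involved and is not taken from Banana's own code; the collection name and the `timestamp` field are assumptions.

```python
from urllib.parse import urlencode


def build_time_series_query(base_url, collection, time_field,
                            start, end, gap, query="*:*"):
    """Build a Solr select URL that counts documents per time bucket."""
    params = [
        ("q", query),
        ("rows", 0),                     # only facet counts are needed
        ("wt", "json"),
        ("facet", "true"),
        ("facet.range", time_field),     # date field to bucket on
        ("facet.range.start", start),    # e.g. "NOW-1DAY"
        ("facet.range.end", end),        # e.g. "NOW"
        ("facet.range.gap", gap),        # e.g. "+1HOUR"
    ]
    return f"{base_url.rstrip('/')}/solr/{collection}/select?{urlencode(params)}"
```

For example, `build_time_series_query("http://localhost:8983", "stocks", "timestamp", "NOW-1DAY", "NOW", "+1HOUR", query="ticker:AAPL")` asks for hourly counts of AAPL documents over the last day.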
Implementing big data technology comes with its own costs. The 3 V’s of data (variety, volume, and velocity) play an extremely important role in the stock market. Firms that are still considering an investment in big data technologies should act soon, before it becomes too late to remain competitive.