Big Data

Big Data is an emerging field in which innovative technology offers alternatives for resolving the inherent problems that arise when working with huge amounts of data, while providing new ways to reuse and extract value from information.
Volume
Big data companies such as EMC can now offer storage solutions such as the 4-petabyte VMAX array. Even this pales in comparison to CERN's announcement in July 2012 that it had amassed about 200 petabytes of data from more than 800 trillion collisions in the search for the Higgs boson. Companies such as Akamai monitor global Internet conditions around the clock; with this real-time data they identify the global regions with the greatest attack traffic, the cities with the slowest Web connections (latency), and the geographic areas with the most Web traffic (traffic density). Meanwhile, in the financial services domain, the world's financial markets and financial services companies generate and store petabytes of data on a daily basis. However, financial services companies struggle to structure, query, analyse and act on these data, mainly because they are stored in disparate data sets. In turn, visualisations of market activity cannot be accurately produced from such disparate and unstructured data.
Variety
Big data is any type of data: structured and unstructured data such as text, sensor data, audio, video, click streams, log files and more. New insights are found when these data types are analysed together. The Super Stream Collider (SSC) is a platform that provides a web-based interface and tools for building sophisticated mash-ups, combining semantically annotated Linked Stream and Linked Data sources into easy-to-use resources for applications. The system includes drag-and-drop construction tools along with a visual SPARQL/CQELS editor and visualisation tools for novice users, while at the same time supporting full access and control for expert users. Tied in with this development platform is a cloud deployment architecture that enables the user to deploy the generated mash-ups into a cloud, thus supporting both the design and deployment of stream-based web applications in a very simple and intuitive way.
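The core pattern behind such mash-ups, continuously joining a live stream with a static Linked Data source inside a time window, can be illustrated with a minimal Python sketch. This is not the SSC API or CQELS syntax; the data source, window size and field names are all hypothetical.

```python
# Minimal sketch of the mash-up pattern behind SSC: join a live stream
# of events (Linked Stream data) with a static reference dataset
# (Linked Data) inside a sliding time window. This is not the SSC API
# or CQELS syntax; all identifiers and data are illustrative.
import time
from collections import deque

STATIC_METADATA = {  # static "Linked Data" source keyed by instrument id
    "XS123": {"issuer": "ExampleCorp", "market": "XLON"},
    "XS456": {"issuer": "OtherCorp", "market": "XNYS"},
}

WINDOW_SECONDS = 60.0
window = deque()  # (timestamp, instrument_id, price) tuples

def on_stream_event(instrument_id, price, now=None):
    """Handle one stream event and return it enriched with static metadata."""
    now = time.time() if now is None else now
    window.append((now, instrument_id, price))
    # Evict events that have fallen out of the sliding window.
    while window and now - window[0][0] > WINDOW_SECONDS:
        window.popleft()
    meta = STATIC_METADATA.get(instrument_id, {})  # the mash-up join step
    return {"instrument": instrument_id, "price": price, **meta}

print(on_stream_event("XS123", 101.5))
# {'instrument': 'XS123', 'price': 101.5, 'issuer': 'ExampleCorp', 'market': 'XLON'}
```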
Velocity
For time-sensitive processes such as catching fraud, big data must be used as it streams into the enterprise in order to maximise its value. High Frequency Trading (HFT) is one of the most time-sensitive data applications in financial services. One class of high-frequency trading strategy relies exclusively on ultra-low-latency direct market access technology. These strategies rely on speed to gain minuscule advantages in arbitraging price discrepancies in a particular security that is trading simultaneously on disparate markets. A challenge to be addressed is to bring a common approach to these bespoke solutions and to find a way to report these data visually in sub-minute or sub-second timeframes.
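The kind of cross-market discrepancy such strategies exploit can be sketched in a few lines of Python. The venue names, the symbol and the threshold below are hypothetical, and a real system would also account for quote freshness, fees and latency.

```python
# Illustrative sketch of flagging a cross-market price discrepancy, the
# kind of signal an HFT arbitrage strategy acts on in sub-second time.
# Venue names, the symbol and the 25 bp threshold are all hypothetical,
# and a real system would also check quote freshness and fees.
THRESHOLD = 0.0025  # flag gaps above 25 basis points

latest_quote = {}  # (venue, symbol) -> most recent price seen

def on_quote(venue, symbol, price, other_venue):
    """Record a quote and report any arbitrage-sized gap versus the other venue."""
    latest_quote[(venue, symbol)] = price
    other = latest_quote.get((other_venue, symbol))
    if other is None:
        return None  # no quote from the other venue yet
    gap = abs(price - other) / min(price, other)
    if gap > THRESHOLD:
        return f"{symbol}: {venue}={price} vs {other_venue}={other} ({gap:.2%})"
    return None

on_quote("NYSE", "ACME", 100.00, "BATS")
print(on_quote("BATS", "ACME", 100.40, "NYSE"))
# ACME: BATS=100.4 vs NYSE=100.0 (0.40%)
```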

The FIORD project proposes the development of the Big GRC Data concept and framework to advance the state of the art in Big Data for financial services. This will be a semantic, ontology-based framework.
Big Data Topic Advances by FIORD
Volume: Petabytes of data can be amassed in large storage arrays. FIORD will structure these data through the use of financial services industry standard ontologies so that they can be used in real time by market participants and financial regulators.
Variety: Solutions like the Super Stream Collider (SSC) can combine multiple data streams with large amounts of changing data. FIORD will combine, query and analyse massive, rapidly changing datasets from multiple financial services trading exchanges and facilities.
Velocity: Solutions like the SSC can combine multiple data streams with large amounts of changing data. FIORD will structure massive, rapidly changing datasets such as high-frequency trading data together with more static data such as legal entity identifiers, using federated databases; a sketch of this join follows below.
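The sketch below illustrates the Volume and Velocity advances together: a fast-moving trade record is expressed in ontology terms and joined with static legal entity identifier (LEI) reference data held in a separate, federated store. The ontology prefix, the LEI values and all field names are invented for illustration.

```python
# Sketch of the advances above: express a fast-moving trade record with
# ontology terms and join it with static legal entity identifier (LEI)
# reference data held in a separate, federated store. The ontology
# prefix, the LEI values and all field names are invented.
ONTOLOGY = "http://example.org/fin-ontology#"

LEI_REFERENCE = {  # slow-moving reference store, queried separately
    "529900EXAMPLE0000001": "Example Bank AG",
}

def to_triples(trade):
    """Express one trade record as ontology-annotated triples, enriched via LEI lookup."""
    subject = f"trade:{trade['trade_id']}"
    triples = [
        (subject, ONTOLOGY + "hasInstrument", trade["symbol"]),
        (subject, ONTOLOGY + "hasQuantity", trade["qty"]),
    ]
    # Federated join: resolve the counterparty LEI in the reference store.
    name = LEI_REFERENCE.get(trade["counterparty_lei"])
    if name is not None:
        triples.append((subject, ONTOLOGY + "hasCounterpartyName", name))
    return triples

for triple in to_triples({"trade_id": "T-1", "symbol": "ACME", "qty": 500,
                          "counterparty_lei": "529900EXAMPLE0000001"}):
    print(triple)
```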

Case Study: SEC Requires Consolidated Audit Trail to Monitor and Analyse Trading Activity
In July 2012 the Securities and Exchange Commission voted to require the national securities exchanges and the Financial Industry Regulatory Authority (FINRA) to establish a market-wide consolidated audit trail that will significantly enhance regulators’ ability to monitor and analyse trading activity. 

The new rule adopted by the Commission requires the exchanges and FINRA to jointly submit a comprehensive plan detailing how they would develop, implement, and maintain a consolidated audit trail that must collect and accurately identify every order, cancellation, modification, and trade execution for all exchange-listed equities and equity options across all U.S. markets. 
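The scope of the rule, every order, cancellation, modification and trade execution across all U.S. markets, implies a common event record. The Python sketch below shows one plausible minimal shape for such a record; the field set is inferred from the rule's description here, not taken from the SEC's actual specification.

```python
# Minimal sketch of the kind of record a consolidated audit trail must
# capture for every order lifecycle event. The field set here is
# inferred from the rule's description above, not taken from the SEC's
# actual specification.
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class EventType(Enum):
    ORDER = "order"
    CANCELLATION = "cancellation"
    MODIFICATION = "modification"
    EXECUTION = "trade_execution"

@dataclass(frozen=True)
class AuditTrailEvent:
    event_id: str
    event_type: EventType
    timestamp_ns: int        # high-resolution time for sequencing across markets
    market: str              # reporting exchange or FINRA facility
    symbol: str              # exchange-listed equity or equity option
    order_id: str
    quantity: int
    price: Optional[float]   # no price on a plain cancellation

event = AuditTrailEvent("E-1", EventType.ORDER, 1_594_000_000_000_000_000,
                        "XNAS", "ACME", "O-42", 100, 99.95)
print(event.event_type.value, event.symbol, event.price)
```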

Currently, there is no single database of comprehensive and readily accessible data regarding orders and executions. Each self-regulatory organisation (SRO) instead uses its own separate audit trail system to track information relating to orders in its respective markets. Existing audit trail requirements vary significantly among markets, which means that regulators must obtain and merge large volumes of disparate data from different entities when analysing market activity. A consolidated audit trail will increase the data available to regulators investigating illegal activities such as insider trading and market manipulation, and it will significantly improve the ability to reconstruct broad-based market events in an accurate and timely manner.

A consolidated audit trail will also significantly increase the ability of regulators to monitor overall market structure and assess how SEC rules are affecting the markets, and it will reduce the regulatory data production burdens on SROs and broker-dealers by reducing the number of ad hoc requests regulators currently make. Such a system would have to scale to accommodate the large amounts of data that may enter the system during times of market stress, when volumes are high and detecting illegal activities becomes more challenging.

The reaction of market participants is important: the data must be available within an agreed timescale to allow regulators to complete their analysis. A complete and robust system is needed; interaction with other initiatives such as the Legal Entity Identifier programme will therefore be required to provide aggregated data for market participants.