The portion of an asset's return that is not explained by exposure to a benchmark (such as the S&P 500) is called alpha. The signals that aim to produce such uncorrelated returns are called alpha factors.
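One common way to make this definition concrete is to regress the asset's returns on the benchmark's returns: the slope (beta) measures benchmark exposure, and the intercept is the alpha. The sketch below is only an illustration under simplifying assumptions: it ignores the risk-free rate, the function name estimate_alpha_beta is hypothetical, and the return series are made up.

```python
def estimate_alpha_beta(asset_returns, benchmark_returns):
    """Ordinary least squares fit of asset returns on benchmark returns.

    Returns (alpha, beta): beta is the slope (benchmark exposure) and
    alpha is the intercept (return not explained by that exposure).
    """
    n = len(asset_returns)
    mean_asset = sum(asset_returns) / n
    mean_bench = sum(benchmark_returns) / n
    covariance = sum(
        (a - mean_asset) * (b - mean_bench)
        for a, b in zip(asset_returns, benchmark_returns)
    )
    variance = sum((b - mean_bench) ** 2 for b in benchmark_returns)
    beta = covariance / variance
    alpha = mean_asset - beta * mean_bench
    return alpha, beta


# Made-up daily returns, for illustration only.
benchmark = [0.010, -0.020, 0.015, 0.005, -0.010]
asset = [0.014, -0.021, 0.020, 0.006, -0.009]

alpha, beta = estimate_alpha_beta(asset, benchmark)
print(f"alpha per period: {alpha:.5f}, beta: {beta:.3f}")
```

With only a handful of observations the estimates are very noisy; in practice one would use a much longer return history and excess returns over the risk-free rate.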

Data: Sources and Techniques

Data has always been an essential driver of trading. Traders have long made efforts to gain an advantage from access to superior information.

The primary source of market data is the order book, which updates in real time throughout the day to reflect all trading activity. In the US, stock markets provide quotes in three tiers, known as Level I, Level II, and Level III, that offer increasingly granular information.

The trading activity is reflected in numerous messages about orders sent by market participants. These messages typically conform either to the Financial Information eXchange (FIX) protocol, an electronic communications standard for the real-time exchange of securities transactions and market data, or to a native exchange protocol.
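To give a feel for what such a message looks like, the sketch below builds and parses a simplified FIX new-order message. FIX messages in the classic tag=value encoding are sequences of numeric tags and values separated by the SOH character (ASCII 0x01). The order details, the sender and target identifiers, and the helper names fix_checksum and parse_fix are all hypothetical, and the message omits fields (such as the sending time) that a real session would require.

```python
SOH = "\x01"  # FIX field delimiter (ASCII 0x01)


def fix_checksum(data: str) -> str:
    """Sum of all bytes modulo 256, rendered as three digits (tag 10)."""
    return f"{sum(data.encode('ascii')) % 256:03d}"


def parse_fix(message: str) -> dict[int, str]:
    """Split a tag=value message into {tag: value}.

    Simplification: repeated tags (repeating groups) would overwrite
    each other, so this is only suitable for flat messages.
    """
    fields = {}
    for pair in message.strip(SOH).split(SOH):
        tag, _, value = pair.partition("=")
        fields[int(tag)] = value
    return fields


# Hypothetical limit order: buy 100 shares of AAPL at 150.25.
body_fields = [
    (35, "D"),         # MsgType: D = New Order Single
    (49, "TRADER1"),   # SenderCompID (made up)
    (56, "EXCHANGE"),  # TargetCompID (made up)
    (34, "1"),         # MsgSeqNum
    (11, "ORD0001"),   # ClOrdID (made up)
    (55, "AAPL"),      # Symbol
    (54, "1"),         # Side: 1 = Buy
    (38, "100"),       # OrderQty
    (40, "2"),         # OrdType: 2 = Limit
    (44, "150.25"),    # Price
]
body = "".join(f"{tag}={value}{SOH}" for tag, value in body_fields)

# Header: BeginString (8) and BodyLength (9, byte count of the body).
header = f"8=FIX.4.2{SOH}9={len(body)}{SOH}"
# Trailer: CheckSum (10) over everything that precedes it.
message = header + body + f"10={fix_checksum(header + body)}{SOH}"

parsed = parse_fix(message)
print(message.replace(SOH, "|"))
print(parsed[55], parsed[54], parsed[38], parsed[44])
```

Replacing SOH with a visible character such as | when printing is a common convention for making FIX messages readable in logs.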

The NASDAQ TotalView-ITCH data feed

While FIX has a dominant market share, exchanges also offer native protocols. Nasdaq offers TotalView-ITCH, a direct data-feed protocol that allows subscribers to track individual orders for equity instruments from placement to execution or cancellation.
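Unlike the text-based FIX encoding, ITCH messages are compact binary records with fixed-width fields. The sketch below decodes a single Add Order message with Python's struct module. The field layout is an assumption based on a reading of the ITCH 5.0 specification (36 bytes, big-endian, price stored as an integer with four implied decimal places) and should be checked against the official specification before use; the message itself is synthetic, and the length prefix that frames messages in a real feed file is not handled.

```python
import struct

# Assumed Add Order ('A') layout, big-endian, 36 bytes in total:
#   c   message type            (1 byte)
#   H   stock locate            (2 bytes)
#   H   tracking number         (2 bytes)
#   6s  timestamp, ns since midnight (6 bytes)
#   Q   order reference number  (8 bytes)
#   c   buy/sell indicator      (1 byte)
#   I   shares                  (4 bytes)
#   8s  stock symbol, space-padded (8 bytes)
#   I   price, 4 implied decimals  (4 bytes)
ADD_ORDER = struct.Struct(">cHH6sQcI8sI")


def parse_add_order(raw: bytes) -> dict:
    """Decode one Add Order message under the assumed layout."""
    (msg_type, _locate, _tracking, timestamp, order_ref,
     side, shares, stock, price) = ADD_ORDER.unpack(raw)
    if msg_type != b"A":
        raise ValueError(f"not an Add Order message: {msg_type!r}")
    return {
        "timestamp_ns": int.from_bytes(timestamp, "big"),
        "order_ref": order_ref,
        "side": side.decode("ascii"),
        "shares": shares,
        "stock": stock.decode("ascii").rstrip(),
        "price": price / 10_000,
    }


# Synthetic message: buy 100 AAPL at 150.25, stamped 09:30:00.
raw = ADD_ORDER.pack(
    b"A", 1, 0,
    (34_200 * 10**9).to_bytes(6, "big"),  # 09:30:00 in nanoseconds
    1001, b"B", 100, b"AAPL".ljust(8), 1_502_500,
)
order = parse_add_order(raw)
print(order)
```

The order reference number is what lets a subscriber follow an order through its life: later messages that execute, cancel, or delete the order refer back to the same reference.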

In pandas, HDFStore is a class for reading and writing files in the Hierarchical Data Format (HDF), specifically its HDF5 version. HDF is designed to store and organize large amounts of data in a hierarchical structure.

HDFStore is a convenient way to store and retrieve pandas data in a binary format, which makes it efficient for working with large datasets. It supports compression and organizes data within the file under hierarchical keys. Keep in mind that pandas requires the PyTables library (package name: tables), which it uses under the hood to handle HDF5 files.
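A minimal sketch of this workflow follows, assuming pandas and PyTables are installed. The file name, the key itch/trades, and the sample data are made up for illustration; the compression settings are just one reasonable choice.

```python
import os
import tempfile

import pandas as pd

# Made-up trade records, for illustration only.
trades = pd.DataFrame(
    {"price": [150.25, 150.30, 150.28], "shares": [100, 200, 50]}
)

# Write to a temporary location so the example leaves no stray files behind.
path = os.path.join(tempfile.mkdtemp(), "market_data.h5")

# Write with compression; the key acts like a path inside the file.
with pd.HDFStore(path, mode="w", complevel=5, complib="zlib") as store:
    # format="table" allows querying on disk; data_columns lists the
    # columns that can appear in a where clause.
    store.put("itch/trades", trades, format="table", data_columns=["shares"])
    keys = store.keys()

# Read everything back, then only the rows matching a condition.
with pd.HDFStore(path, mode="r") as store:
    restored = store["itch/trades"]
    large = store.select("itch/trades", where="shares >= 100")

print(keys)
print(restored)
print(large)
```

The default fixed format is faster to write and read but does not support where queries, so the table format is used here only because the example filters rows on disk.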