Basic Pair Trading Study Using Linear Regression #6

Open
wants to merge 23 commits into
base: main

chraberturas
Contributor

@chraberturas chraberturas commented Feb 12, 2024

Overview

This pull request implements a basic pair-trading study that uses linear regression to identify and evaluate candidate trading pairs.

Solves: /issues/2

Changes Made

  • Linear Regression Module: Introduced a new module linear_regression.q that implements linear regression analysis on historical price data to identify correlated stock pairs.
  • Pair Selection Process: Developed a process for selecting stock pairs based on their historical price correlations, applying linear regression to calculate the strength and significance of these correlations.
  • Backtesting Framework: Enhanced the existing backtesting framework to incorporate the selection of pairs through linear regression analysis and to evaluate the strategy's historical performance.
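As an illustration of the regression step described in the list above, here is a minimal sketch in plain Python. It is not the q code from linear_regression.q, which is not shown in this conversation. The names fit_line and spread and the sample prices are hypothetical, and the sketch assumes an ordinary least squares fit of y = alpha + beta * x, with alpha as the intercept.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least squares fit of y = alpha + beta * x.

    Returns (alpha, beta, r), where r is the Pearson correlation.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined slope or correlation")
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    r = sxy / sqrt(sxx * syy)
    return alpha, beta, r


def spread(x, y, alpha, beta):
    """Residual of y against the fitted line, one value per observation."""
    return [yi - (alpha + beta * xi) for xi, yi in zip(x, y)]


# Made-up prices for illustration only: py is exactly 2 * px + 1.
px = [10.0, 11.0, 12.0, 13.0, 14.0]
py = [21.0, 23.0, 25.0, 27.0, 29.0]
alpha, beta, r = fit_line(px, py)
residuals = spread(px, py, alpha, beta)
```

A pair whose residuals stay close to zero and revert after deviations is the kind of candidate the selection step is looking for; a high r alone does not guarantee that behaviour.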

@chraberturas chraberturas added the enhancement label (New feature or request) Feb 12, 2024
@chraberturas chraberturas linked an issue Feb 12, 2024 that may be closed by this pull request
2 tasks
@chraberturas chraberturas changed the title from "Implementing Kalman Filter for Pair Trading Strategy Enhancement" to "Basic Pair Trading Study Using Linear Regression" Feb 12, 2024
@chraberturas
Contributor Author

chraberturas commented Feb 12, 2024

As a general comment, we should upload every file and replace absolute paths with relative paths when reading data and importing functions. I've just uploaded the data files to the main branch, so please replace the absolute paths.

I've also uploaded the historical data, so we should fit the linear regression on that data.

@Kokechacho
Contributor

OK, I have uploaded all the changes. There are still some details to fix, which I have marked with comments in caps.
It works fine now, though.

Contributor Author

@chraberturas chraberturas left a comment

Good work here, just a few minor comments. Also, could you please reset my doc comments? :)

streamPair.q Outdated
// We calculate spreads for linear regression
// WE SHOULD IMPLEMENT HERE RATIO OF RETURN SO WE CAN CALCULATE EWMA
s: priceY[.streamPair.i][`bid] - ((priceX[.streamPair.i][`bid] * betaF[px;py])-alphaF[px;py]); // I NEED TO CALCULATE BETA AND ALPHA AGAIN I THINK IT HAS TO DO WITH THE LOCAL SCOPE MINOR DETAIL
resSpread: enlist `dateTime`spread`mean`up`low`operation!("p"$(priceX[.streamPair.i][`dateTime]);"f"$(s);"f"$(0);"f"$(0);"f"$(0);"f"$(0)); // MEAN AND STD FROM streamPair.i#.streamPair.spreads ?
Contributor Author

mean, up, low and operation are always 0. Why? I think you left something unfinished here.

Contributor

I didn't know whether the mean should be the average of the spreads calculated so far or of the historical data; the same goes for the standard deviation. I still need to implement operation (I guess we use the same logic as in the Python version):
df['buy'] = df['spread'][((df['spread'] < df['lower']) & (df['spread'].shift(1) > df['lower']) | (df['spread'] < df['mean']) & (df['spread'].shift(1) > df['mean']))]
df['sell'] = df['spread'][((df['spread'] > df['upper']) & (df['spread'].shift(1) < df['upper']) | (df['spread'] > df['mean']) & (df['spread'].shift(1) < df['mean']))]
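
For reference, the crossing rule in the two pandas lines above can be sketched without pandas, one tick at a time, which is closer to how a streaming process would see the data. This is only an illustration under assumptions: the function name crossing_signals and the sample values are hypothetical, the mean and bands are taken as fixed numbers, and the first tick never produces a signal because it has no previous value (as with shift(1)).

```python
def crossing_signals(spreads, mean, lower, upper):
    """Return 'buy', 'sell' or None for each tick.

    buy:  the spread crosses below the lower band or below the mean.
    sell: the spread crosses above the upper band or above the mean.
    """
    if not spreads:
        return []
    signals = [None]  # no previous value for the first tick
    for prev, cur in zip(spreads, spreads[1:]):
        buy = (cur < lower and prev > lower) or (cur < mean and prev > mean)
        sell = (cur > upper and prev < upper) or (cur > mean and prev < mean)
        if buy:
            signals.append("buy")
        elif sell:
            signals.append("sell")
        else:
            signals.append(None)
    return signals


# Made-up spread values for illustration only.
example = [0.0, 1.2, 0.8, -0.2, -1.3, -0.5, 0.3]
result = crossing_signals(example, mean=0.0, lower=-1.0, upper=1.0)
# result == [None, 'sell', None, 'buy', 'buy', None, 'sell']
```

A buy needs the spread to fall between two ticks and a sell needs it to rise, so the two conditions cannot both hold on the same tick.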

Contributor Author

Okay, I have some doubts here. However, I believe trading signals should also be calculated using historical data. So, you would calculate spreads over the historical dataset (using the same linear regression) and then calculate the mean and standard deviation for those historical spreads. Does that make sense?
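
A minimal sketch of that suggestion, in plain Python rather than q: compute the spreads over the historical dataset with the fitted line, derive the mean and standard deviation once, and attach the resulting bands to each live spread. The names historical_bands and live_row, the band width k, and the sample numbers are hypothetical. It assumes alpha is the regression intercept, so the spread is y - (alpha + beta * x); whether alphaF in the q snippet follows the same sign convention is not confirmed here. It also uses the sample standard deviation, which is a choice, not something the thread specifies.

```python
from statistics import mean, stdev


def historical_bands(hist_x, hist_y, alpha, beta, k=1.0):
    """Mean and +/- k standard deviation bands of the historical spreads."""
    spreads = [y - (alpha + beta * x) for x, y in zip(hist_x, hist_y)]
    m = mean(spreads)
    s = stdev(spreads)  # sample standard deviation (n - 1 denominator)
    return {"mean": m, "low": m - k * s, "up": m + k * s}


def live_row(x_bid, y_bid, alpha, beta, bands):
    """Spread for one live tick, with the static historical bands attached."""
    s = y_bid - (alpha + beta * x_bid)
    return {"spread": s, **bands}


# Made-up values for illustration only.
alpha, beta = 1.0, 2.0
hist_x = [10.0, 11.0, 12.0, 13.0]
hist_y = [21.5, 22.5, 25.5, 26.5]  # spreads: 0.5, -0.5, 0.5, -0.5
bands = historical_bands(hist_x, hist_y, alpha, beta)
row = live_row(12.0, 26.0, alpha, beta, bands)
```

With this layout the mean, up and low fields of each streamed row are filled from the historical fit instead of being left at 0.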

@chraberturas chraberturas marked this pull request as ready for review February 16, 2024 11:15
Kokechacho and others added 5 commits February 16, 2024 12:38
Fixed EWMA values. The Spreads table now has up and up2 columns, referring to the static std and the moving std respectively, plus an ewma column for the EWMA deviation values
Remove some unnecessary lines
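
The q implementation behind these commits is not shown in this conversation. As a rough sketch of what a moving band based on an exponentially weighted mean and variance can look like, here is a plain Python version. Everything in it is an assumption: the function name ewma_bands, the smoothing factor a, the band width k, the low2 field, and the exact way up2 is derived. It uses the standard incremental update for an exponentially weighted mean and variance.

```python
from math import sqrt


def ewma_bands(spreads, a=0.1, k=1.0):
    """Exponentially weighted mean and +/- k moving-std bands per tick.

    a is the smoothing factor in (0, 1]; larger values react faster.
    """
    if not 0 < a <= 1:
        raise ValueError("a must be in (0, 1]")
    rows = []
    ewma = None
    var = 0.0
    for s in spreads:
        if ewma is None:
            ewma = s  # seed the average with the first observation
        else:
            diff = s - ewma
            ewma = ewma + a * diff
            var = (1 - a) * (var + a * diff * diff)
        std = sqrt(var)
        rows.append({
            "spread": s,
            "ewma": ewma,
            "up2": ewma + k * std,
            "low2": ewma - k * std,
        })
    return rows


# Made-up values for illustration only.
rows = ewma_bands([0.0, 2.0], a=0.5, k=1.0)
```

Unlike the static bands computed once from historical data, these bands widen and narrow as recent spreads become more or less volatile.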
Successfully merging this pull request may close these issues.

Basic Study of Pair Trading Using Linear Regression
2 participants