[ENH] Remove yfinance as a dependency and implement data_loader #721
Shuvam586 wants to merge 3 commits into PyPortfolio:main from
instead of editing tickers mentioned in the also removed |
fkiraly
left a comment
Nice, thanks!
May I request to not use data downloaded via yfinance from Yahoo services at all? This is due to terms of use, we should not distribute data from Yahoo services at all in the repository or package.
Could you instead use similar data? Either completely randomly generated (Brownian motion random walk or similar, with same column names and time index), or taking some inspiration from the actual data in how you randomize - but it cannot be the exact values.
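A minimal sketch of the "completely randomly generated" option, for illustration only. The function name `make_random_walk_prices`, the ticker labels, the start price, and the step size are all assumptions and do not come from the PR. It keeps one column per ticker and a business-day `date` index, and contains no values from any real data source.

```python
import numpy as np
import pandas as pd


def make_random_walk_prices(tickers, start="2020-01-01", n_days=250,
                            start_price=100.0, seed=0):
    """Return synthetic prices as a wide DataFrame, one column per ticker.

    Hypothetical helper: every parameter default here is an assumption.
    """
    rng = np.random.default_rng(seed)
    # Business-day index named "date", mimicking a daily price table.
    index = pd.bdate_range(start=start, periods=n_days, name="date")
    # Independent Gaussian steps per ticker; the first step is zeroed so
    # that every series starts exactly at start_price.
    steps = rng.normal(loc=0.0, scale=1.0, size=(n_days, len(tickers)))
    steps[0] = 0.0
    prices = start_price + np.cumsum(steps, axis=0)
    return pd.DataFrame(prices, index=index, columns=list(tickers))


df = make_random_walk_prices(["AAA", "BBB"], n_days=10)
```

An additive random walk like this one can in principle drift below zero, which real prices cannot do, so it only resembles price data over short horizons.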
Thanks! Could you kindly post here the plots in the notebooks before/after, just to check if they look similar?
also, code formatting tests are failing, please look at |
i have run |
Thanks! I suspect the actual data would be closer to exponential Brownian motion - that should be achieved by simply taking |
    return pd.read_csv(f, **read_csv_kwargs)

def load_stockdata(tickers: list = None, start: str = None, end: str = None):
please add docstrings (numpydoc format)
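For reference, a numpydoc-style docstring on this signature might look like the sketch below. The function body and the `_toy_prices` helper are hypothetical stand-ins (the PR's actual implementation is not shown in this thread), and the described behaviour of `tickers`, `start`, and `end` is an assumption based on the parameter names only.

```python
import pandas as pd


def _toy_prices():
    # Hypothetical in-memory stand-in for the bundled CSV file.
    index = pd.date_range("2020-01-01", periods=4, name="date")
    data = {"AAA": [100.0, 101.0, 102.0, 101.5],
            "BBB": [50.0, 49.5, 49.0, 50.5]}
    return pd.DataFrame(data, index=index)


def load_stockdata(tickers=None, start=None, end=None):
    """Load daily stock prices as a wide DataFrame.

    Parameters
    ----------
    tickers : list of str, optional
        Column names to keep. If None, all tickers are returned.
    start : str, optional
        First date to include, e.g. ``"2020-01-02"``. If None, the
        data starts at the earliest available date.
    end : str, optional
        Last date to include (inclusive). If None, the data runs to
        the latest available date.

    Returns
    -------
    pandas.DataFrame
        Prices indexed by date, with one column per ticker.
    """
    df = _toy_prices()
    # Label-based slicing on a DatetimeIndex includes both endpoints.
    df = df.loc[start:end]
    if tickers is not None:
        df = df[list(tickers)]
    return df


subset = load_stockdata(tickers=["AAA"], start="2020-01-02", end="2020-01-03")
```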
def available_tickers():
    df = _load_raw_data("stock_prices.csv", parse_dates=["date"])
is this not known in advance? Instead of loading the whole csv, you could simply load the header, or return the known list
okay, i will rewrite it to return only the list of the tickers in the csv file.
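One way to read only the header is `pd.read_csv(..., nrows=0)`, which parses the column names without loading any data rows. The sketch below assumes a wide layout with a `date` column followed by one column per ticker; the CSV text and ticker names are made up for illustration, and the function takes a `source` argument here only so the example is self-contained.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for the bundled price file.
CSV_TEXT = "date,AAA,BBB\n2020-01-01,100.0,50.0\n2020-01-02,101.0,49.5\n"


def available_tickers(source):
    """Return ticker names from the CSV header without reading the rows."""
    # nrows=0 yields an empty DataFrame that still carries the columns.
    header = pd.read_csv(source, nrows=0)
    return [col for col in header.columns if col != "date"]


tickers = available_tickers(io.StringIO(CSV_TEXT))
```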
fkiraly
left a comment
Thanks, looks good!
- can you re-execute the notebooks after a clean reset?
- I think the simulated data should be exponential Brownian motion to resemble the actual data
- please add numpydoc docstrings to the data loaders
- further comments above
non-blocking - would it be possible to include the simulation code somewhere in the |
i am confused. the code i used to generate the synthetic data uses exponential brownian motion.

    for t in range(1, n_days):
        z = np.random.normal(size=len(tickers))
        prices[t] = prices[t-1] * np.exp(
            (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
        )
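The update rule quoted above is the standard discretisation of exponential (geometric) Brownian motion. It can also be written without a Python loop by accumulating the log increments. The sketch below is one possible vectorised form, not the PR's code: `simulate_gbm`, `s0`, `seed`, and all default values are assumptions, while `mu`, `sigma`, `dt`, `n_days`, and `tickers` follow the names used in the comment.

```python
import numpy as np


def simulate_gbm(tickers, n_days=252, s0=100.0, mu=0.05, sigma=0.2,
                 dt=1 / 252, seed=0):
    """Simulate exponential Brownian motion paths, one column per ticker.

    Hypothetical helper: all default values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n_assets = len(tickers)
    # One standard normal draw per day and ticker (the first row is fixed).
    z = rng.normal(size=(n_days - 1, n_assets))
    # Same exponent as the per-step update in the quoted loop.
    log_increments = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
    # Summing the log increments equals multiplying the exp() factors.
    log_paths = np.vstack([np.zeros((1, n_assets)),
                           np.cumsum(log_increments, axis=0)])
    return s0 * np.exp(log_paths)


prices = simulate_gbm(["AAA", "BBB"], n_days=252)
```

Because every price is `s0` times an exponential, the simulated values stay strictly positive, which is the property that makes this model resemble price data more closely than an additive random walk.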
yes. i re-executed the |