Data Collector

Introduction

Scripts for data collection

Custom Data Collection

For a reference implementation, see https://github.com/microsoft/qlib/tree/main/scripts/data_collector/yahoo

  1. Create a dataset code directory in the current directory
  2. Add collector.py
    • add collector class:
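      import sys
      from pathlib import Path

      # Add the directory two levels up to sys.path so that data_collector is importable
      CUR_DIR = Path(__file__).resolve().parent
      sys.path.append(str(CUR_DIR.parent.parent))
      from data_collector.base import BaseCollector, BaseNormalize, BaseRun

      class UserCollector(BaseCollector):
          ...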
    • add normalize class:
      class UserNormalize(BaseNormalize):
          ...
    • add CLI class:
      class Run(BaseRun):
          ...
  3. Add README.md
  4. Add requirements.txt
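
The steps above can be scaffolded with a short script. This is a minimal sketch and not part of qlib: the helper name `scaffold_dataset_dir` and the dataset name `my_dataset` are hypothetical, and the generated collector.py contains only the class stubs listed in step 2, which still have to be filled in.

```python
from pathlib import Path

# Stub written to collector.py; it mirrors the class skeletons from step 2.
COLLECTOR_STUB = '''\
import sys
from pathlib import Path

CUR_DIR = Path(__file__).resolve().parent
sys.path.append(str(CUR_DIR.parent.parent))

from data_collector.base import BaseCollector, BaseNormalize, BaseRun


class UserCollector(BaseCollector):
    ...


class UserNormalize(BaseNormalize):
    ...


class Run(BaseRun):
    ...
'''


def scaffold_dataset_dir(parent: Path, name: str) -> Path:
    """Create <parent>/<name> with collector.py, README.md and requirements.txt.

    Hypothetical helper for illustration; existing files are left untouched.
    """
    dataset_dir = parent / name
    dataset_dir.mkdir(parents=True, exist_ok=True)
    files = {
        "collector.py": COLLECTOR_STUB,
        "README.md": f"# {name}\n",
        "requirements.txt": "",
    }
    for filename, content in files.items():
        path = dataset_dir / filename
        if not path.exists():
            path.write_text(content, encoding="utf-8")
    return dataset_dir
```

Calling `scaffold_dataset_dir(Path.cwd(), "my_dataset")` from the data_collector directory would create the layout described in steps 1 to 4.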

Description of dataset

| Basic data | Contents |
|---|---|
| Features | Price/Volume: $close/$open/$low/$high/$volume/$change/$factor |
| Calendar | .txt files: day.txt, 1min.txt |
| Instruments | .txt files: all.txt (required); csi300.txt/csi500.txt/sp500.txt |

  • Features: the data must be numeric
    • if the data is not adjusted, factor=1
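
The two rules above can be enforced before the data is written out. The following is a minimal sketch, assuming each record is a plain dict keyed by the field names without the `$` prefix; the helper `prepare_record` is hypothetical and not part of qlib.

```python
FEATURE_FIELDS = ("close", "open", "low", "high", "volume", "change", "factor")


def prepare_record(record: dict, adjusted: bool) -> dict:
    """Return a copy of `record` whose feature fields are all floats."""
    out = dict(record)
    if not adjusted:
        # Unadjusted prices carry no adjustment, so the factor is 1.
        out["factor"] = 1.0
    for field in FEATURE_FIELDS:
        if field not in out:
            raise KeyError(f"missing feature field: {field}")
        # float() raises ValueError for non-numeric strings such as "n/a".
        out[field] = float(out[field])
    return out
```

For adjusted data the caller is expected to supply `factor` itself; the sketch raises `KeyError` if it is missing.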

Data-dependent component

For each component to run correctly, the data it depends on must be available:

| Component | Required data |
|---|---|
| Data retrieval | Features, Calendar, Instruments |
| Backtest | Features[Price/Volume], Calendar, Instruments |
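
The component requirements listed above can also be expressed as a small lookup, which is handy for checking a dataset before running a component. This is an illustrative sketch only: `REQUIRED_DATA` and `missing_data` are hypothetical names, not part of qlib.

```python
# Data each component depends on, as listed above.
REQUIRED_DATA = {
    "Data retrieval": {"Features", "Calendar", "Instruments"},
    "Backtest": {"Features[Price/Volume]", "Calendar", "Instruments"},
}


def missing_data(component: str, available: set) -> set:
    """Return the data kinds `component` needs that are not in `available`."""
    return REQUIRED_DATA[component] - set(available)
```

An empty result means every dependency is present; for example, `missing_data("Data retrieval", {"Features", "Calendar", "Instruments"})` returns an empty set.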