Scrape and analyze the Federal Reserve Beige Book reports

ramgulsin/beige-book

Beige Book

We use several off-the-shelf sentiment analysis tools to score the Federal Reserve's Beige Book reports from 1970 to 2020.

The raw text is scraped from the Minneapolis Fed by scrape.py and stored in the txt directory. Do whatever you want with this data (I clearly do not own the copyright).
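As a rough illustration of the HTML-to-text step, here is a minimal sketch using only the standard library. It is not the code in scrape.py (which uses requests and beautifulsoup4); the names ReportTextExtractor and html_to_text are made up for this example. It treats `<br>` and `<br />` the same way, drops inline tags such as `<strong>`, and turns `&nbsp;` into a plain space.

```python
import re
from html.parser import HTMLParser


class ReportTextExtractor(HTMLParser):
    """Collect visible text, treating <br> and </p> as line breaks.

    Hypothetical helper for illustration; not part of scrape.py.
    """

    def __init__(self):
        # convert_charrefs=True decodes entities such as &nbsp; and &amp;.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Both <br> and <br /> end up here, so the two spellings behave alike.
        if tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        # Close each paragraph with a blank line; other tags are ignored.
        if tag == "p":
            self.parts.append("\n\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Return plain text for an HTML fragment."""
    parser = ReportTextExtractor()
    parser.feed(html)
    parser.close()
    # A decoded &nbsp; is the non-breaking space U+00A0.
    text = "".join(parser.parts).replace("\xa0", " ")
    text = re.sub(r"[ \t]+", " ", text)      # collapse runs of spaces
    text = re.sub(r" *\n *", "\n", text)     # trim spaces around newlines
    text = re.sub(r"\n{3,}", "\n\n", text)   # at most one blank line
    return text.strip()


if __name__ == "__main__":
    sample = "<p>Sales <strong>rose</strong>&nbsp;modestly.<br>Prices were flat.</p>"
    print(html_to_text(sample))
```

Subclassing HTMLParser keeps the sketch dependency-free; the same idea carries over to BeautifulSoup if that is already in use.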

The reports are scored in sentiment.py using three pre-trained models. I am in the process of adding more. So far the models used are:

  • VADER from NLTK
  • Pattern analyzer from TextBlob
  • LSTM text classifier from flair
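To give a feel for how the three models can be called, here is a hedged sketch; it is not the code in sentiment.py. The function names are invented for this example, the flair model name "sentiment-fast" is an assumption about which pre-trained classifier is meant, and VADER needs a one-time `nltk.download("vader_lexicon")`. The libraries are imported inside each function so that the pure helper signed_score works without them installed.

```python
def vader_score(text):
    """VADER compound score, in [-1, 1]."""
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    return SentimentIntensityAnalyzer().polarity_scores(text)["compound"]


def textblob_score(text):
    """TextBlob polarity from its default pattern analyzer, in [-1, 1]."""
    from textblob import TextBlob

    return TextBlob(text).sentiment.polarity


def signed_score(label, confidence):
    """Fold a flair label and its confidence into one signed number.

    flair reports a label (POSITIVE or NEGATIVE) plus the confidence of
    that label, so the signed value has magnitude of at least about 0.5.
    """
    return confidence if label == "POSITIVE" else -confidence


def flair_score(text, model_name="sentiment-fast"):
    """Signed flair score. The default model name is an assumption."""
    from flair.data import Sentence
    from flair.models import TextClassifier

    # A real script would load the classifier once and reuse it.
    classifier = TextClassifier.load(model_name)
    sentence = Sentence(text)
    classifier.predict(sentence)
    label = sentence.labels[0]
    return signed_score(label.value, label.score)


if __name__ == "__main__":
    report = "Economic activity expanded at a modest pace."
    print(vader_score(report), textblob_score(report), flair_score(report))
```

Because flair returns a winning label with its confidence, the signed score never falls near zero, which is why its values need separate handling when compared with the other two models.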

The analysis is done in analysis.py, where we get the following graphs:

GDP Growth Rate Comparison

Sentiment by District
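The three models score on different scales, so putting them on one plot usually means standardizing first. The sketch below shows one common way to do that (z-scores); it is an assumption for illustration, not necessarily what analysis.py does, and the name zscores is made up here.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a series to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant series has no spread; return zeros instead of dividing by 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


if __name__ == "__main__":
    # Made-up scores for a handful of reports, purely for illustration.
    print(zscores([0.2, 0.5, 0.8]))
```

Standardized series from different models can then be drawn on one axis or compared against a standardized GDP growth series.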

Dependencies

Dependencies are listed in requirements.txt. Developed and tested on Python 3.8.

  • scrape.py
    • requests
    • beautifulsoup4
  • files.py
    • pandas
  • clean.py
    • cleantext
  • sentiment.py
    • pandas
    • nltk
    • textblob
    • flair
    • transformers
  • analysis.py
    • numpy
    • pandas
    • statsmodels
    • matplotlib

TODO

  • Fix parsing errors
    • Bug with <br> tag instead of <br /> (no more breaks)
    • Remove <strong> (ignored)
    • &nbsp; problem (check if this gets removed)
    • Delete "learn more" <p> at the bottom (grep -RIl "www\." txt/)
  • Find missing/incomplete files
    • Some files are empty
    • Analyze errors.txt
  • Grab missing files
    • Grab missing 2016-0(4|6)-su files
    • Grab missing 2015-07-* files
    • Try to find missing 1971-01-bo
  • Clean text
    • Replace &%-+ with text?
    • Replace numbers with words
    • Check that text is ASCII
  • Run sentiment analysis
    • Check out flair package
    • flair gives values x<-0.5 | x>0.5 (fixed in analysis)
    • Check if all text is used or just first n words
    • transformers package
    • Just extract numbers (bigger is better)
  • Get exact dates of publication
  • Generate histograms
    • Normalize values
    • Check out outlier (1971-01-bo missing doc)
  • Regress national sentiment on regional sentiments
    • Do you add a constant here?
    • See if coefficients sum to 1
    • Create proxy measure including all regions
  • Graph time series
    • Pretty up plots (title+legend)
    • Get GDP data
    • Check stock market data
    • Think about timing of Beige Book data
    • By region in a grid
    • Bond yields
  • Time series regression
  • Investigate discrepancies between sentiment scores
    • In su, TextBlob is high during the 1974 recession and higher during the 1990s boom
  • Add info + pictures to README.md
  • GDP Growth by region
    • Aggregate state data
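For the missing and empty file items above, a small audit script can help. This is a sketch under assumptions: reports are taken to be saved as `YYYY-MM-xx.txt` (names like 2016-04-su suggest this, but the `.txt` extension is a guess), and audit_reports and NAME_PATTERN are hypothetical names, not part of the repository.

```python
import re
from pathlib import Path

# Assumed naming scheme: year, month, two-letter district code (e.g. 2016-04-su).
NAME_PATTERN = re.compile(r"^(\d{4})-(\d{2})-([a-z]{2})$")


def audit_reports(directory, expected_names):
    """Report missing, empty, and oddly named report files in a directory."""
    directory = Path(directory)
    # Map each file's stem (name without .txt) to its path.
    present = {p.stem: p for p in directory.glob("*.txt")}
    missing = sorted(set(expected_names) - set(present))
    empty = sorted(name for name, p in present.items() if p.stat().st_size == 0)
    malformed = sorted(name for name in present if not NAME_PATTERN.match(name))
    return {"missing": missing, "empty": empty, "malformed": malformed}


if __name__ == "__main__":
    # Example: check two report names mentioned in the TODO list.
    print(audit_reports("txt", ["2016-04-su", "2016-06-su"]))
```

The list of expected names has to come from somewhere else, for example the publication calendar; the sketch only compares it with what is on disk.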
