Category Archives: Data

Link FactSet and CRSP

Both FactSet and CRSP offer identifier files that contain primary identifiers at the entity level and security level (note: an entity may issue multiple securities). These files provide a mapping between their primary identifiers and all other historical identifiers such … Continue reading

Posted in Data | 7 Comments

The calculation of average credit rating using ratings from three rating agencies

I was working on a finance project and needed to calculate the average credit rating across three rating agencies, rounded to a whole number. This requires translating textual grades (e.g., AAA, Baa) into numerical values. I found a clue in the following paper: Becker, B., and … Continue reading
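
The translation step can be sketched in Python as follows. This is only an illustration and not the scale from the paper cited above: the choice of agencies (S&P, Moody's, Fitch), the numeric values (1 for the best grade), the restriction to investment-grade notches, and the round-half-up rule are all assumptions, and every function name is hypothetical. Grades without a notch modifier, such as a bare Baa, are not covered.

```python
import math
from typing import Iterable, Optional

# Assumed scale: 1 = best grade. Only investment-grade notches are listed.
SP_FITCH_GRADES = ["AAA", "AA+", "AA", "AA-", "A+", "A", "A-", "BBB+", "BBB", "BBB-"]
MOODYS_GRADES = ["Aaa", "Aa1", "Aa2", "Aa3", "A1", "A2", "A3", "Baa1", "Baa2", "Baa3"]

# Both symbol sets map onto the same numeric scale, position by position.
GRADE_TO_NUMBER = {}
for position, (sp_grade, moodys_grade) in enumerate(zip(SP_FITCH_GRADES, MOODYS_GRADES), start=1):
    GRADE_TO_NUMBER[sp_grade] = position
    GRADE_TO_NUMBER[moodys_grade] = position


def average_rating(grades: Iterable[Optional[str]]) -> Optional[int]:
    """Average the available grades and round half up (an assumed rounding rule).

    Missing (None) or unrecognized grades are skipped; returns None if nothing is left.
    """
    numbers = [GRADE_TO_NUMBER[g] for g in grades if g in GRADE_TO_NUMBER]
    if not numbers:
        return None
    mean = sum(numbers) / len(numbers)
    return int(math.floor(mean + 0.5))


# Example: AAA -> 1, Aa1 -> 2, AA -> 3, so the average is (1 + 2 + 3) / 3 = 2.
print(average_rating(["AAA", "Aa1", "AA"]))
```

Skipping missing grades means an issuer rated by only two agencies is averaged over two values; whether that is appropriate depends on the research design.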

Posted in Data | 1 Comment

Use Python to download data from the DTCC’s Swap Data Repository

I helped a friend download data from the DTCC’s Swap Data Repository. I am not familiar with the data and simply used this as programming practice. This article gives an introduction to the origin of the data: http://www.dtcc.com/news/2013/january/03/swap-data-repository-real-time The … Continue reading

Posted in Data, Python | Leave a comment

Download FR Y-9C data from WRDS

WRDS currently populates FR Y-9C data quarter by quarter in individual datasets, such as BHCF200803, BHCF200806, BHCF200809 and so on. WRDS has not stacked these individual datasets into a single time-series dataset, as is available for COMPUSTAT. There are two ways to overcome … Continue reading
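
As a rough illustration of what stacking means here, the Python sketch below generates the quarterly dataset names and appends their rows into one list. It is not one of the two approaches from the post. It assumes the names follow the pattern BHCF plus year plus quarter-end month (including 12, which the excerpt does not show), and the function names, the `source_dataset` tag, and the sample rows are hypothetical.

```python
from typing import Dict, Iterable, List

# Assumption: the fourth quarter follows the same pattern as 03/06/09.
QUARTER_END_MONTHS = (3, 6, 9, 12)


def quarterly_dataset_names(first_year: int, last_year: int, prefix: str = "BHCF") -> List[str]:
    """Build names such as BHCF200803 for every quarter in the year range."""
    return [
        f"{prefix}{year}{month:02d}"
        for year in range(first_year, last_year + 1)
        for month in QUARTER_END_MONTHS
    ]


def stack_quarters(tables: Dict[str, Iterable[dict]]) -> List[dict]:
    """Append the rows of each quarterly table, tagging each row with its source dataset."""
    stacked = []
    for name in sorted(tables):  # names sort chronologically because of YYYYMM
        for row in tables[name]:
            stacked.append({**row, "source_dataset": name})
    return stacked


# Made-up rows standing in for two downloaded quarterly datasets.
sample_tables = {
    "BHCF200806": [{"entity_id": 1, "total_assets": 110}],
    "BHCF200803": [{"entity_id": 1, "total_assets": 100}],
}
print(quarterly_dataset_names(2008, 2008))
print(stack_quarters(sample_tables))
```

Tagging each row with its source dataset keeps the reporting quarter recoverable after the tables are combined.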

Posted in Data, SAS | 2 Comments

Use Python to download TXT-format SEC filings on EDGAR (Part II)

[Update on 2019-07-31] This post, together with its sibling post “Part I”, has been my most-viewed post since I created this website. However, the landscape of 10-K/Q filings has changed dramatically over the past decade, and the text-format filings are … Continue reading

Posted in Data, Python | 59 Comments

How to remove duplicate GVKEY-DATADATE when using Compustat Annual (FUNDA) and Quarterly (FUNDQ)?

The annual data (FUNDA) is easy to deal with. We just need to apply the following conditions: indfmt=="INDL" & datafmt=="STD" & popsrc=="D" & consol=="C". If we have converted FUNDA to Stata format, the uniqueness of GVKEY-DATADATE can be verified using … Continue reading
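
For readers working outside Stata, the same filter and uniqueness check can be sketched in plain Python. This is a minimal illustration, not the post's Stata code: it assumes the records are loaded as dictionaries with lowercase field names, and the function names and sample rows are made up.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# The four conditions quoted in the post for FUNDA.
STANDARD_FILTER = {"indfmt": "INDL", "datafmt": "STD", "popsrc": "D", "consol": "C"}


def apply_standard_filter(rows: Iterable[dict]) -> List[dict]:
    """Keep only rows that satisfy all four standard conditions."""
    return [
        row for row in rows
        if all(row.get(field) == value for field, value in STANDARD_FILTER.items())
    ]


def duplicate_keys(rows: Iterable[dict]) -> List[Tuple[str, str]]:
    """Return the (gvkey, datadate) pairs that appear more than once."""
    counts = Counter((row["gvkey"], row["datadate"]) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


# Made-up rows: the same firm-year reported under two industry formats.
sample_rows = [
    {"gvkey": "001000", "datadate": "2010-12-31", "indfmt": "INDL",
     "datafmt": "STD", "popsrc": "D", "consol": "C"},
    {"gvkey": "001000", "datadate": "2010-12-31", "indfmt": "FS",
     "datafmt": "STD", "popsrc": "D", "consol": "C"},
]
print(duplicate_keys(sample_rows))                         # duplicates before filtering
print(duplicate_keys(apply_standard_filter(sample_rows)))  # duplicates after filtering
```

An empty result from `duplicate_keys` after filtering indicates that each GVKEY-DATADATE pair is unique in the sample.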

Posted in Data, Stata | 18 Comments

EDGAR index files in Stata dataset (from 1993 Q1 to March 2, 2017)

The SEC makes all EDGAR filings publicly available. We can download all 10-Ks, 10-Qs, and 8-Ks filed since 1993. However, the SEC does not make this as easy as a few mouse clicks (in order to reduce the server load and avoid the possible abuse … Continue reading

Posted in Data | 16 Comments

Link RSSD with PERMCO

If you are working on bank holding company data, such as FR Y-9C, you may need to link the unique identifier (RSSD) in the data to the unique identifier (PERMCO) in CRSP. The Federal Reserve Bank of New York provides such a … Continue reading

Posted in Data | 7 Comments

Use Python to download TXT-format SEC filings on EDGAR (Part I)

[Update on 2019-07-31] This post, together with its sibling post “Part II”, has been my most-viewed post since I created this website. However, the landscape of 10-K/Q filings has changed dramatically over the past decade, and the text-format filings are extremely … Continue reading

Posted in Data, Python | 67 Comments