Category Archives: Python
My thoughts on Python for accounting research
Accounting PhD students are often tempted to invest in learning Python. However, I would recommend that they focus more on SAS and Stata than on Python in their first year, for a few practical and technical reasons: … Continue reading
Posted in Learning Resources, Python
1 Comment
The art of regular expression
Regular expressions are a powerful tool for text search. They are the foundation of much textual analysis research, though today’s textual analysis in computer science has gone far beyond text search. Regular expression operations are programming language … Continue reading
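As a small illustration of text search with regular expressions, here is a minimal sketch using Python's built-in re module. The sample sentence and both patterns are made up for this example and are not taken from the post itself.

```python
import re

# Hypothetical sample text, for illustration only.
text = "The firm filed its 10-K on 2019-03-01 and a 10-Q on 2019-07-31."

# Match ISO-style dates: four digits, two digits, two digits.
# findall returns one tuple of captured groups per match.
date_pattern = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")
dates = date_pattern.findall(text)

# Match the form types 10-K and 10-Q, ignoring case.
form_pattern = re.compile(r"\b10-[KQ]\b", re.IGNORECASE)
forms = form_pattern.findall(text)

print(dates)
print(forms)
```

Compiling a pattern once with re.compile is convenient when the same search is applied to many documents.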
Use Python to download lawsuit data from Stanford Law School’s Securities Class Action Clearinghouse
[Update on 2022-01-08] This website now requires login. I have added a function to log in and retrieve the protected content. The code may terminate (e.g., every 80 pages) due to a timeout or connection error, and thus you may need to run it … Continue reading
Posted in Python
14 Comments
Use Python to extract URLs to HTML-format SEC filings on EDGAR
[Update on 2020-06-26] Eduardo has made a significant improvement to the code. You can now specify a starting date and download the index file for the period from that starting date to the most recent date. I expect it to … Continue reading
Posted in Python
47 Comments
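The idea of covering every index file from a starting date onward can be sketched as below. This is not Eduardo's code. It is a minimal sketch that assumes EDGAR's public quarterly full-index layout (full-index/YEAR/QTRn/master.idx), and the function name quarterly_index_urls is hypothetical. It only builds the URLs and does not download anything.

```python
from datetime import date

# Assumed URL layout of EDGAR's quarterly index files.
BASE = "https://www.sec.gov/Archives/edgar/full-index"

def quarterly_index_urls(start, end):
    """Return one master.idx URL per quarter from start to end, inclusive."""
    urls = []
    year = start.year
    quarter = (start.month - 1) // 3 + 1
    end_quarter = (end.month - 1) // 3 + 1
    # Tuples compare element by element, so this walks quarter by quarter.
    while (year, quarter) <= (end.year, end_quarter):
        urls.append(f"{BASE}/{year}/QTR{quarter}/master.idx")
        quarter += 1
        if quarter > 4:
            year, quarter = year + 1, 1
    return urls

# Example: every quarter from a chosen starting date up to today.
for url in quarterly_index_urls(date(2019, 11, 15), date.today()):
    print(url)
```

Separating URL generation from downloading makes it easier to resume after a failed request.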
Use Python to download data from the DTCC’s Swap Data Repository
I helped a friend download data from the DTCC’s Swap Data Repository. I am not familiar with the data and did this just as programming practice. This article gives an introduction to the origin of the data: http://www.dtcc.com/news/2013/january/03/swap-data-repository-real-time The … Continue reading
Posted in Data, Python
Use Python to download TXT-format SEC filings on EDGAR (Part II)
[Update on 2019-07-31] This post, together with its sibling post “Part I”, has been my most-viewed post since I created this website. However, the landscape of 10-K/Q filings has changed dramatically over the past decade, and the text-format filings are … Continue reading
Posted in Data, Python
59 Comments
Use Python to extract Intelligence Indexing fields in Factiva articles
First of all, I acknowledge that I benefited a lot from Neal Caren’s blog post “Cleaning up LexisNexis Files”. Thanks Neal. Factiva (as well as LexisNexis Academic) is a comprehensive repository of newspapers, magazines, and other news articles. I first … Continue reading
Posted in Python
15 Comments
Use Python to calculate the tone of financial articles
[Update on 2019-03-01] I completely rewrote the Python program. The updates include: I included two domain-specific dictionaries (Loughran and McDonald’s and Henry’s), and you can choose which one to use. I added a negation check as suggested by Loughran and … Continue reading
Posted in Python
14 Comments
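A dictionary-based tone measure with a negation check can be sketched as follows. This is not the program from the post. The word lists are tiny made-up stand-ins for a real dictionary, and the negation rule (a positive word preceded by a negation word within three words is counted as negative) is an assumption made for this example.

```python
import re

# Tiny hypothetical word lists; real dictionaries are far larger.
POSITIVE = {"gain", "improve", "strong"}
NEGATIVE = {"loss", "decline", "weak"}
NEGATIONS = {"no", "not", "never", "none"}

def tone(text, window=3):
    """Return (positive - negative) / (positive + negative), or 0.0 if no hits."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = neg = 0
    for i, word in enumerate(words):
        if word in POSITIVE:
            # Assumed rule: a negated positive word is counted as negative.
            preceding = words[max(0, i - window):i]
            if any(p in NEGATIONS for p in preceding):
                neg += 1
            else:
                pos += 1
        elif word in NEGATIVE:
            neg += 1
    total = pos + neg
    return (pos - neg) / total if total else 0.0

print(tone("Results were not strong and the loss grew"))
print(tone("A strong gain this quarter"))
```

Swapping in a different dictionary only requires replacing the POSITIVE and NEGATIVE sets.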
Use Python to download TXT-format SEC filings on EDGAR (Part I)
[Update on 2019-07-31] This post, together with its sibling post “Part II”, has been my most-viewed post since I created this website. However, the landscape of 10-K/Q filings has changed dramatically over the past decade, and the text-format filings are extremely … Continue reading
Posted in Data, Python
67 Comments