Lawsuit data from Stanford Law School’s Securities Class Action Clearinghouse

The Python script in the original post has been removed as its use violates the Terms of Service of the data provider.

Stanford Law School’s Securities Class Action Clearinghouse is always happy to share the data (subject to a Non-Disclosure Agreement) with academic researchers for non-commercial research or analysis. If you have any data needs, please contact their SCAC Content Manager at scac@law.stanford.edu.

14 Responses to Lawsuit data from Stanford Law School’s Securities Class Action Clearinghouse

  1. Griffin Geng says:

    Awesome! Thanks for sharing!

  2. Tigran says:

    I was about to go through building a scraper for this from scratch… you saved me so much time! This is great!

  3. Tianhua says:

    Hi Dr. Chen,
    Thanks so much for this code. I got stuck using it because the Securities Class Action Clearinghouse requires a login to get the full data. I tried the “mechanize” package to log in, but it doesn’t work. Do you have any ideas about how to get access to the website?

  4. Pengyuan li says:

    I added error handling to the get_class_period method to avoid the error that occurs when the case’s status is currently Active.

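    def get_class_period(soup):
        # soup is the parsed case page (a BeautifulSoup object)
        section = soup.find("section", id="fic")
        try:
            text = section.find_all("div", class_="span4")
            start_date = text[4].get_text()
            end_date = text[5].get_text()
        except (AttributeError, IndexError):
            # section missing or too few cells, e.g. for Active cases
            start_date = 'null'
            end_date = 'null'
        return start_date, end_date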

    • Md Enayet Hossain says:

      Thanks for the correction. But this only solves the error issue. It does not return the class period for any lawsuits. Any idea how I can get the class period and access the lawsuit files? The HTML does not even show the contents beyond the case summary.

  5. Yuchen says:

    How do you parse the settlement value?

  6. Mengxi Chen says:

    Hi Dr. Chen and Shiyu! Thank you so much for sharing this! I appreciate it!

  7. Elisha Yu says:

    Thank you so much, Dr. Chen!

    Just a small note: you need to set the Chrome default to maximize the window, or add this before line 18:
    driver.maximize_window()

  8. Thanks for sharing the files and codes, very useful!

  9. Lin says:

    Dr. Chen,

    This is awesome. Thank you for your generous sharing!

  10. carlos rivas says:

    Hi Kai,

    Thanks again for making your code available.
    I am also trying to scrape the legal documents/PDFs.
    This code works to download PDF files from other URLs, but not the PDFs from the Stanford Class Action Clearinghouse (the only difference I can see is that a login is required, but I am already logged in by the time we reach this code):
    import requests
    # file_url = "https://www.bu.edu/econ/files/2014/08/DLS1.pdf"
    file_url1 = 'http://securities.stanford.edu/filings-documents/1080/IBMC00108070/2023113_f01c_23CV00332.pdf'
    r = requests.get(file_url1, stream=True)

    with open("C:/Users/inter/OneDrive/Desktop/securities_class_action_docs/test.pdf", "wb") as file:
        for block in r.iter_content(chunk_size=1024):
            if block:
                file.write(block)
