Analyzing Company Filings for Stock Selection – A Practical Report
Speaker: Laura Jehl
Summary
This talk will present our work on gathering and analyzing 10-K and 10-Q filings using NLP techniques. These publicly available annual and quarterly reports contain valuable information for stockholders on the performance of listed companies. We explore how information can be extracted automatically from the full text and deployed in a quantitative stock selection model.
Description
While the NLP methods presented are not novel, we highlight lessons learned from processing the data and transferring academic results into a real-world application.
On the technical side, the talk will (1) sketch our Python pipeline for data set construction and daily updating; (2) describe the methods for analyzing content within reports and relations between reports. On the analytical side, we present experimental results and discuss challenges which arise when determining the usefulness of these methods in the context of a financial model.
We will see that:
- Analyzing this data set requires large-scale resources, and keeping information up-to-date can be tricky.
- Simple bag-of-words based methods for document similarity, readability and sentiment can easily be implemented. Evaluations in a model context show promise, but putting them into practice poses additional challenges.
- Promising research on measuring competitiveness using network analysis is hard to replicate when evaluated with more business-related metrics.
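To give a feel for how simple the bag-of-words methods mentioned above can be, here is a minimal Python sketch using only the standard library. It is not the pipeline presented in the talk: the function names, the tokenizer, and the tiny negative-word list are assumptions made for illustration, and a real system would likely use a finance-specific lexicon and more careful text cleaning.

```python
import math
import re
from collections import Counter

# Hypothetical toy lexicon (an assumption for illustration only);
# a real system would use a finance-specific word list.
NEGATIVE_WORDS = {"loss", "decline", "impairment", "litigation"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(text):
    """Map each token to the number of times it occurs in the text."""
    return Counter(tokenize(text))


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


def negative_share(text):
    """Fraction of tokens that appear in the negative-word list."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(token in NEGATIVE_WORDS for token in tokens) / len(tokens)


# Made-up example sentences, not taken from any real filing.
previous_report = "Revenue increased and operating margins improved."
current_report = "Revenue declined and we recorded an impairment loss."

print(cosine_similarity(bag_of_words(previous_report), bag_of_words(current_report)))
print(negative_share(current_report))
```

Comparing a company's current report with its previous one in this way gives a rough signal of how much the text has changed, and the negative-word share is a crude sentiment score. As the talk points out, the hard part is less the implementation than judging whether such signals are useful inside a financial model.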
The talk will be application-focused. It should be of interest to developers and researchers looking into financial NLP, and anyone interested in the company filings data set.
Laura Jehl's Bio
Currently working as a researcher (Team Forecasts) at Quoniam Asset Management GmbH.
PyData Global 2021
Website: https://pydata.org/global2021/
LinkedIn: https://www.linkedin.com/company/pydata-global
Twitter: https://twitter.com/PyData
www.pydata.org
PyData is an educational program of NumFOCUS, a 501(c)(3) non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.
PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.