R code examples to teach basic Web scraping with rvest and related packages.
Used at a two-day workshop in November 2018: refer to the introductory slides, in French, for details.
Please report any bugs or errors in the issues of this repository, or email me.
DEMOS
- lagasafn: legal cross-references in Icelandic law
- jorf: XML field extraction from the French Official Journal
- cop21: word extraction from the UNCC Paris Accord
- qosd: keyword co-occurrence in French parliamentary questions
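As a rough illustration of the kind of extraction these demos rely on, here is a minimal rvest sketch. It is not taken from the repository: the inline HTML, the link targets, and the `refs` data frame are hypothetical stand-ins for a downloaded page, chosen so that the example runs without network access.

```r
library(rvest)

# Inline HTML stands in for a downloaded page (hypothetical content).
page <- read_html('
<html><body>
  <h1>Example act</h1>
  <p>See <a href="/law/1990-12">Act 12/1990</a> and
     <a href="/law/2005-7">Act 7/2005</a>.</p>
</body></html>')

# Select every link with a CSS selector, then pull its text and target.
links <- html_nodes(page, "a")
refs <- data.frame(
  label = html_text(links),
  href  = html_attr(links, "href"),
  stringsAsFactors = FALSE
)
print(refs)
```

On a real site, the string passed to `read_html()` would be a URL or a path to a saved file, and the CSS selector would need to match that site's markup.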
Projects mentioned but not included in the repository:
- marsad: voting behaviour in the Tunisian parliament
- parlnet: bill cosponsorship in European parliaments
- parlviz: interactive visualizations of the above
Slides shown but not included in the repository (available on request):
- "Large-scale legislative data collection from online sources" (2016)
- "Web scraping et APIs avec R" (2017)
HOWTO
- Run the dependencies.r script to install all required packages.
- Run each code folder separately. Each has its own .Rproj file.
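For readers curious about what a dependency script of this kind typically does, the sketch below installs only the packages that are missing. It is an assumption about the general approach, not a copy of dependencies.r: the helper names `missing_packages` and `install_missing` and the CRAN mirror URL are hypothetical.

```r
# Return the packages from `pkgs` that are not yet installed.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Install only what is missing; the CRAN mirror is an assumption.
install_missing <- function(pkgs, repos = "https://cloud.r-project.org") {
  todo <- missing_packages(pkgs)
  if (length(todo) > 0) install.packages(todo, repos = repos)
  invisible(todo)
}

# Base packages ship with R, so nothing is reported as missing here.
missing_packages(c("stats", "utils"))
```

Calling `install_missing(c("rvest", "xml2"))` would then install those two packages if needed. The actual package list for the workshop is the one in dependencies.r.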
THANKS
- Sabrina Granger and Isabelle Scarpat-Bouvet for excellent logistics.
- Thomas J. Leeper for his word_count function, used in the cop21 example.
- Emiliano Grossman for inspiring the qosd example.