Tag Archive: wikipedia

Jun 16

Getting all the links from a MediaWiki format using PyParsing

Hi, just sharing a snippet of code. As part of a project I'm doing, I need to analyse the links in the Wikipedia corpus. While using the API is one solution, it doesn't retain the order in which links appear on the page. It also returns links that are not part of the main text, which …
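To give a rough idea of what pulling links out of raw MediaWiki markup in page order looks like, here is a minimal sketch. It is not the PyParsing snippet from the full post, which this excerpt doesn't show. It swaps in the standard library `re` module so that it runs without extra dependencies. The names `WIKILINK` and `extract_links` are made up for illustration, and the pattern assumes only plain `[[Target]]` and `[[Target|label]]` links; nested links, templates and file captions are not handled.

```python
import re

# Assumption: only simple [[Target]] and [[Target|label]] links are matched.
# Nested links, templates and file/image captions are deliberately ignored.
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def extract_links(wikitext):
    """Return (target, label) pairs in the order they appear in the text."""
    links = []
    for match in WIKILINK.finditer(wikitext):
        target = match.group(1).strip()
        # A missing or empty label falls back to the link target itself.
        label = (match.group(2) or target).strip()
        links.append((target, label))
    return links


sample = "[[Python (programming language)|Python]] was created by [[Guido van Rossum]]."
print(extract_links(sample))
```

Because `finditer` scans the text from left to right, the result keeps the order in which the links occur, which is the property the API route loses.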

