We have already looked at the SimpleXML and DOM extensions. Now I want to show how you can make working with DOM easier using the Symfony2 component called CssSelector.
In the previous blog post I described how to scrape data using SimpleXML, along with its pros and cons. Now I want to go one step further and introduce a slightly more convenient way to do this: the DOM extension.
From time to time developers need to parse content to extract data from it. Usually these are just plain HTML pages, but sometimes you need to scrape data from more advanced sites, where you have to use more powerful tools. In this series of blog posts I want to show you how to accomplish this. I'll describe the approaches one by one and show their pros and cons. First, we will look at what PHP offers out of the box for working with XML (SimpleXML and DOM). Then we will explore increasingly powerful libraries like CssSelector, DomCrawler, Goutte and CasperJS that can help you reach your goals and make your life much easier and more pleasant. Are you ready to dive in? Let's go.