I have to write CLI applications from time to time, and in this blog post I want to tell you how I do it. We will talk about Symfony2's Console component, which makes such tasks trivial and even pleasant.
We're still on our way to better data scraping. We did a great job last time, but it's time for improvements. Now I want to introduce another, even higher-level tool for data scraping: Goutte. It was originally written by Fabien Potencier, the creator of the Symfony framework, and is now maintained by FriendsOfPHP.
We're moving forward on our way to understanding the best tools for web scraping. In this blog post I want to introduce the Symfony2 DomCrawler component. It provides even more flexibility, and you will definitely love it.
We have already looked at the SimpleXML and DOM extensions. Now I want to show how you can make working with the DOM easier using the Symfony2 component called CssSelector.
In the previous blog post I described how we can scrape data using SimpleXML, along with its pros and cons. Now I want to go one step further and introduce a slightly more convenient way to do this: the DOM extension.
From time to time developers need to parse content to extract data from it. Usually it's just HTML pages, but sometimes you need to scrape data from more advanced sites, where you have to use more powerful tools. In this series of blog posts I want to show you how you can accomplish this. I'll describe the approaches one by one and show their pros and cons. First of all, we will look together at what PHP offers out of the box for working with XML (SimpleXML and DOM). Then we will explore more and more powerful libraries like CssSelector, DomCrawler, Goutte and CasperJs, which can help you achieve your goals and make your life much easier and more pleasant. Are you ready to dive in? Let's go then.