Open-source project: FriendsOfPHP/Goutte
Repository: https://github.com/FriendsOfPHP/Goutte
Language: PHP 100.0%

Goutte, a simple PHP Web Scraper

Goutte is a screen scraping and web crawling library for PHP. It provides an API to crawl websites and extract data from HTML/XML responses.

Requirements

Goutte depends on PHP 7.1+.

Installation

Install Goutte with Composer:

    composer require fabpot/goutte

Usage

Create a Goutte Client instance:

    use Goutte\Client;

    $client = new Client();

Make requests with the request() method:

    // Go to the symfony.com website
    $crawler = $client->request('GET', 'https://www.symfony.com/blog/');

The method returns a Crawler object.

To use your own HTTP settings, you may create and pass an HttpClient instance to Goutte. For example, to add a 60-second request timeout:

    use Goutte\Client;
    use Symfony\Component\HttpClient\HttpClient;

    $client = new Client(HttpClient::create(['timeout' => 60]));

Click on links:

    // Click on the "Security Advisories" link
    $link = $crawler->selectLink('Security Advisories')->link();
    $crawler = $client->click($link);

Extract data:

    // Get the latest post in this category and display the titles
    $crawler->filter('h2 > a')->each(function ($node) {
        print $node->text()."\n";
    });

Submit forms:

    $crawler = $client->request('GET', 'https://github.com/');
    $crawler = $client->click($crawler->selectLink('Sign in')->link());
    $form = $crawler->selectButton('Sign in')->form();
    $crawler = $client->submit($form, ['login' => 'fabpot', 'password' => 'xxxxxx']);
    $crawler->filter('.flash-error')->each(function ($node) {
        print $node->text()."\n";
    });

More Information

Read the documentation of the BrowserKit, DomCrawler, and HttpClient Symfony Components for more information about what you can do with Goutte.

Pronunciation

Goutte is pronounced "goot": it rhymes with "boot" and not "out".

Technical Information

Goutte is a thin wrapper around the following Symfony Components: BrowserKit, CssSelector, DomCrawler, and HttpClient.

License

Goutte is licensed under the MIT license.
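
Offline sketch

Because the client accepts an HttpClient instance, the request and extraction steps from the Usage section can be tried without network access. The following is a minimal sketch, not part of the original README: it assumes the dependencies were installed with Composer into ./vendor, uses MockHttpClient and MockResponse from the Symfony HttpClient component to return canned HTML, and the page content and the example.test URL are hypothetical placeholders.

```php
<?php
// Assumption: dependencies were installed with "composer require fabpot/goutte"
// and this script sits next to the generated vendor/ directory.
require __DIR__.'/vendor/autoload.php';

use Goutte\Client;
use Symfony\Component\HttpClient\MockHttpClient;
use Symfony\Component\HttpClient\Response\MockResponse;

// Hypothetical page content standing in for a real blog listing.
$html = <<<'HTML'
<html>
  <body>
    <h2><a href="/blog/first">First post</a></h2>
    <h2><a href="/blog/second">Second post</a></h2>
  </body>
</html>
HTML;

// MockHttpClient serves the canned response instead of touching the network.
$httpClient = new MockHttpClient([
    new MockResponse($html, [
        'response_headers' => ['Content-Type' => 'text/html; charset=UTF-8'],
    ]),
]);

$client = new Client($httpClient);

// Placeholder URL: no real request is sent.
$crawler = $client->request('GET', 'https://example.test/blog/');

// Same selector as in the "Extract data" example; each() returns the
// values returned by the closure, so the titles can be collected.
$titles = $crawler->filter('h2 > a')->each(function ($node) {
    return $node->text();
});

print implode("\n", $titles)."\n";
```

Swapping the mock for HttpClient::create() should give the same flow against a live site, subject to that site's markup matching the selector.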