# How do you build a multi-server cluster crawler?


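> Crawling from a single machine is often too slow. For sites such as JD.com and Taobao, which easily run to tens of millions of pages, a single-machine crawl can take practically forever. Crawling quickly has become the hardest problem in crawling today; by comparison, getting past anti-scraping pages and extracting content with regular expressions is trivial.\
> The PHPSpider framework now ships with built-in cluster support, so even beginners can easily run the same code on several machines and crawl from all of them.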

Here is the code needed to run a cluster crawler:

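```
$configs = array(
    'name' => 'Qiushibaike test example',
    'multiserver' => true,  // whether to enable cluster crawling
    'serverid' => 1,        // ID of this server within the cluster
    // ... (other configuration options)
);
$spider = new phpspider($configs);
$spider->start();
```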
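Since every machine runs the same code, `serverid` is the one value that has to differ between servers. The fragment below is a minimal sketch of one way to manage that, not something this page prescribes: it reads the ID from an environment variable, so the script itself stays identical everywhere. The variable name `SPIDER_SERVER_ID` and the Redis address are hypothetical. The `queue_config` block is an assumption based on PHPSpider's configuration reference, where cluster mode shares its task queue through Redis; check the key names against the version you have installed.

```php
<?php
// Hypothetical environment variable: set it to 1 on the first server,
// 2 on the second, and so on. Falls back to 1 when it is not set.
$serverId = (int) (getenv('SPIDER_SERVER_ID') ?: 1);

$configs = array(
    'name'        => 'Qiushibaike test example',
    'multiserver' => true,       // enable cluster crawling
    'serverid'    => $serverId,  // unique per machine
    // Assumption: all servers point at the same Redis instance so they
    // share one task queue. Host and values below are placeholders.
    'queue_config' => array(
        'host'    => '10.0.0.10',
        'port'    => 6379,
        'pass'    => '',
        'db'      => 5,
        'prefix'  => 'phpspider',
        'timeout' => 30,
    ),
    // ... (other configuration options)
);
```

With this in place, each machine could be started with something like `SPIDER_SERVER_ID=2 php your_spider.php`, where `your_spider.php` is a placeholder for your own script, and `$configs` is then passed to `new phpspider($configs)` as in the example above.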

Runtime screen:\
![](/files/diQWFP0Du6NEK63mcKXw)
