# Overview

### PHP Spider Development Documentation <a href="#php-zhi-zhu-pa-chong-kai-fa-wen-dang" id="php-zhi-zhu-pa-chong-kai-fa-wen-dang"></a>

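This is the framework used in the article "I Used a Crawler to 'Steal' One Million Zhihu Users in One Day, Just to Prove That PHP Is the Best Language in the World".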

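Writing a web crawler in PHP requires the following skills:

* PHP itself, since the crawler is written in PHP
* XPath, for extracting data from web pages ( [XPath selector tutorial](http://www.w3school.com.cn/xpath/index.asp) )
* CSS selectors, which can be used as an alternative for extraction ( [CSS selector tutorial](http://www.w3school.com.cn/cssref/css_selectors.asp) )
* Regular expressions, which are needed in many cases ( [regular expression tutorial](https://www.w3cschool.cn/regexp/) )
* Chrome's developer tools, which are invaluable for analyzing the many AJAX requests a page makes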
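
To make the XPath and regular-expression skills above concrete, here is a minimal sketch that uses only PHP's built-in `DOMDocument`, `DOMXPath`, and `preg_match`. It does not use this framework's own API. The HTML fragment, the `user` class name, and the field names are assumptions made up for illustration.

```php
<?php
// Hypothetical HTML fragment standing in for a downloaded page.
$html = <<<HTML
<html><body>
  <div class="user"><a href="/people/alice">Alice</a><span>followers: 120</span></div>
  <div class="user"><a href="/people/bob">Bob</a><span>followers: 45</span></div>
</body></html>
HTML;

// Collect libxml warnings internally instead of printing them;
// real-world HTML is rarely well-formed.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$users = [];

// XPath selects each user block; relative queries pick the fields inside it.
foreach ($xpath->query('//div[@class="user"]') as $node) {
    $link = $xpath->query('./a', $node)->item(0);
    $span = $xpath->query('./span', $node)->item(0);

    // A regular expression pulls the number out of the free-form text.
    $followers = preg_match('/(\d+)/', $span->textContent, $m) ? (int) $m[1] : 0;

    $users[] = [
        'name'      => trim($link->textContent),
        'url'       => $link->getAttribute('href'),
        'followers' => $followers,
    ];
}

print_r($users);
```

XPath handles the structural part of the extraction, and the regular expression handles the unstructured text inside a node, which is a common division of labor between the two.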

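**Note: this framework can only be run from the command line. Command line, command line, command line: important things are worth saying three times ^\_^**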
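
As an illustration of the command-line requirement, a script can check which SAPI it is running under with PHP's built-in `PHP_SAPI` constant. This is a sketch of one possible guard, not the framework's own check; the helper name `is_cli` is hypothetical.

```php
<?php
// Hypothetical helper: true only when PHP runs under the CLI SAPI.
// The parameter defaults to the current SAPI so the check is easy to test.
function is_cli(string $sapi = PHP_SAPI): bool
{
    return $sapi === 'cli';
}

// Stop early when invoked through a web server (e.g. Apache or PHP-FPM).
if (!is_cli()) {
    exit("This script must be run from the command line.\n");
}

echo "Running under the CLI\n";
```

Saved as, say, `spider.php` (a hypothetical file name), the script would be started with `php spider.php` from a terminal rather than opened in a browser.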

