This continues from the previous article.
In the spider file myspider.py, write the following code:
import scrapy


class MyspiderSpider(scrapy.Spider):
    # Spider name
    name = 'itcast'
    # Domains the spider is allowed to crawl
    allowed_domains = ['itcast.cn']
    # URLs the spider starts crawling from
    start_urls = ['http://www.itcast.cn/channel/teacher.shtml#apython']

    def parse(self, response):
        # Save the raw response body (bytes) to a local file
        with open("teacher.html", "wb") as f:
            f.write(response.body)
Then run the crawl command from the directory that contains the spider file, and the crawl starts.
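
The command itself is not spelled out here. Assuming a standard Scrapy project, the usual invocation is "scrapy crawl itcast", where itcast is the value of the spider's name attribute. As a minimal sketch, the same command could also be launched from Python. The helper names crawl_command and run_crawl are hypothetical, and the scrapy executable is assumed to be on PATH.

```python
import subprocess


def crawl_command(spider_name):
    """Build the argument list for Scrapy's crawl command (hypothetical helper)."""
    return ["scrapy", "crawl", spider_name]


def run_crawl(spider_name, project_dir="."):
    """Run the crawl from project_dir and return the process exit code.

    Assumes the scrapy executable is on PATH and that project_dir
    is inside a Scrapy project.
    """
    completed = subprocess.run(crawl_command(spider_name), cwd=project_dir)
    return completed.returncode
```

Calling run_crawl("itcast") from the spider's directory should then leave teacher.html in that directory, provided the request succeeds.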