爱玺玺

爱玺玺's life diary. WeChat: lb87626

Crawling your first page with Scrapy

This post continues from the previous one.

Write the following code in the spider file myspider.py:

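import scrapy


class MyspiderSpider(scrapy.Spider):
    # Spider name
    name = 'itcast'

    # Domains the spider is allowed to crawl
    allowed_domains = ['itcast.cn']

    # URLs the spider starts crawling from
    start_urls = ['http://www.itcast.cn/channel/teacher.shtml#apython']

    def parse(self, response):
        # response.body is raw bytes, so the file is opened in binary mode
        with open("teacher.html", "wb") as f:
            f.write(response.body)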
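The parse() callback above only writes response.body to disk. Since the body is raw bytes, the file has to be opened in "wb" mode. The following is a minimal offline sketch of the same logic, with no network access and no Scrapy install needed. FakeResponse, save_body and the sample HTML are hypothetical stand-ins made up for illustration; they are not part of Scrapy or of the original spider.

```python
import os
import tempfile


class FakeResponse:
    """Hypothetical stand-in for a Scrapy response; only `body` is modeled."""

    def __init__(self, body: bytes):
        self.body = body


def save_body(response, path):
    # Same logic as the parse() callback: write the raw bytes unchanged.
    with open(path, "wb") as f:
        f.write(response.body)


# Illustrative HTML only, not the real page content.
fake = FakeResponse("<html><body>讲师</body></html>".encode("utf-8"))
out_path = os.path.join(tempfile.mkdtemp(), "teacher.html")
save_body(fake, out_path)

# Read the file back in binary mode to compare with what was written.
with open(out_path, "rb") as f:
    saved = f.read()
```

Opening the file in text mode ("w") and writing bytes to it would raise a TypeError in Python 3, which is why the spider uses "wb".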



Then run the crawl command from the directory that contains the spider file, and the crawl starts.
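The exact command is not given in the text. Inside a Scrapy project, the usual form is `scrapy crawl itcast`, using the spider's name attribute. As an alternative to the command line, and not the method used in this post, a spider can also be started from a Python script with Scrapy's CrawlerProcess. Here is a rough sketch, assuming Scrapy is installed; run_spider is a hypothetical helper name.

```python
def run_spider(spider_cls):
    """Run a spider class in-process and block until the crawl finishes."""
    # Imported lazily so this module can be loaded without Scrapy installed.
    from scrapy.crawler import CrawlerProcess

    process = CrawlerProcess()  # default settings
    process.crawl(spider_cls)   # schedule the spider
    process.start()             # starts the Twisted reactor; blocks until done


# Usage (not executed here; assumes myspider.py is importable):
#     from myspider import MyspiderSpider
#     run_spider(MyspiderSpider)
```

Either way, the result should be a teacher.html file in the working directory, as written by parse().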

[Image: 1596866939(1).jpg]

