Python web crawler: fetching pages with requests
 moons   2021-03-21 17:56:20   188
Category: python
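    # Request helper class for fetching HTML pages
    import requests
    import bs4
    
    class Request(object):
        def get_resource(self, url, params=None):
            """Fetch url and return the page as a BeautifulSoup object, or None on failure."""
            try:
                # params=None avoids sharing one mutable default dict across calls
                resp = requests.get(url, params=params)
                resp.encoding = 'utf-8'
                soup = bs4.BeautifulSoup(resp.text, 'html.parser')
                return soup
            except Exception as e:
                # Request failed; a retry could be attempted here, but for now the error is only printed
                print(e)
                return None
    
        def get_resource_txt(self, url, params=None):
            """Fetch url and return the raw HTML source as a string, or None on failure."""
            try:
                resp = requests.get(url, params=params)
                resp.encoding = 'utf-8'
                return resp.text
            except Exception as e:
                # Request failed; a retry could be attempted here, but for now the error is only printed
                print(e)
                return None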
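
The comments in the except branches mention retrying a failed request, but the class itself only prints the error. Below is a minimal sketch of what such a retry might look like. The function name `get_with_retry` and its parameters `retries`, `delay`, `timeout`, and `get` are hypothetical and not part of the original class. The `get` parameter exists only so the HTTP call can be swapped out.

```python
import time
import requests


def get_with_retry(url, params=None, retries=3, delay=1.0, timeout=10, get=requests.get):
    """Try the request up to `retries` times; return the response, or None if all attempts fail.

    Hypothetical helper: the name and every parameter here are assumptions.
    """
    for attempt in range(1, retries + 1):
        try:
            resp = get(url, params=params, timeout=timeout)
            resp.raise_for_status()  # treat 4xx/5xx status codes as failures too
            return resp
        except requests.RequestException as e:
            print(f"attempt {attempt}/{retries} failed: {e}")
            if attempt < retries:
                time.sleep(delay)  # wait before the next attempt
    return None
```

Catching `requests.RequestException` rather than a bare `Exception` limits the retry to network and HTTP errors, so a programming mistake elsewhere is not silently retried.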
    
	
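Calling get_resource_txt returns the page's HTML source as a string.
Calling get_resource returns the page parsed by bs4 into a BeautifulSoup object, so its tags can be navigated directly. Both methods return None if the request fails.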
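
To illustrate what the parsed result offers over the raw text, here is a small sketch that runs without network access. The inline `html` string is an assumption standing in for `resp.text`, and the tag names in it are made up for the example.

```python
import bs4

# Stand-in for resp.text (an assumption so the example runs offline)
html = (
    "<html><head><title>Demo</title></head>"
    "<body><a href='/a'>A</a><a href='/b'>B</a></body></html>"
)

# Same parser call that get_resource uses
soup = bs4.BeautifulSoup(html, 'html.parser')

title = soup.title.string                            # text inside <title>
links = [a['href'] for a in soup.find_all('a')]      # href of every <a> tag

print(title)
print(links)
```

With the raw string from get_resource_txt, the same extraction would need manual string searching or regular expressions.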
Copyright © mos360.cn By Moons Soft Studio