Python Web Scraping Example: Taking On 电影天堂 (Movie Heaven)

Python | Updated: 2024-05-10 14:28:12 | Published: 638 days ago

Without further ado, here is the code:

import requests
import re

# Fetch the home page source
domain = "https://www.dytt89.com/"
resp = requests.get(domain)
resp.encoding = 'gbk'
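# Locate the "2022必看热片" (2022 must-see hits) block and store it in movie
# NOTE: the tag literals in this pattern were lost in extraction;
# <ul>...</ul> is an assumed reconstruction of the page markup.
obj1 = re.compile(r'2022必看热片.*?<ul>(?P<movie>.*?)</ul>', re.S)
result1 = obj1.finditer(resp.text)
movie = next(result1).group('movie')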
resp.close()
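# Extract the child-page links
# NOTE: the tag literals in this pattern were lost in extraction;
# "<a href='" is an assumed reconstruction of the page markup.
obj2 = re.compile(r"<a href='(?P<href>.*?)' title", re.S)
result2 = obj2.finditer(movie)
child_href_list = []  # URLs of the individual movie pages
for i in result2:
    # domain already ends with "/", so drop any leading "/" from the relative link
    child_href_list.append(domain + i.group('href').lstrip('/'))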
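# Extract each child page's download link and save it to a file
# NOTE: the tag literals here were lost in extraction; '<br />' and '<a href="...">' are
# assumed reconstructions, and the group name 'title' is a placeholder.
obj3 = re.compile(r'◎片　　名(?P<title>.*?)<br />.*?<a href="(?P<load>.*?)">magnet', re.S)
with open(file='movies_download.txt', mode='w', encoding='utf-8') as f:
    for href in child_href_list:
        resp = requests.get(href)
        resp.encoding = 'gbk'
        child_href = obj3.search(resp.text)
        resp.close()
        if child_href is None:  # page layout did not match the pattern; skip it
            continue
        print(child_href.group('load'))
        f.write(child_href.group('load') + '\n\n')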
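
To try the two-step extraction (isolate the "必看热片" block, then pull the links out of it) without hitting the site, here is a rough offline sketch. The SAMPLE_HTML string is a made-up, simplified stand-in rather than the real page markup, the helper name extract_child_links is hypothetical, and the patterns assume the links sit in a <ul> right after the heading. It also swaps plain string concatenation for urljoin from the standard library, which handles the leading "/" on relative links.

```python
import re
from urllib.parse import urljoin

# Hypothetical, simplified stand-in for the home page markup (NOT the real site HTML).
SAMPLE_HTML = """
<div>2022必看热片</div>
<ul>
<li><a href='/i/100001.html' title="Movie A">Movie A</a></li>
<li><a href='/i/100002.html' title="Movie B">Movie B</a></li>
</ul>
<ul>
<li><a href='/i/999999.html' title="Other">Other</a></li>
</ul>
"""

# Assumed patterns: the first <ul> after the heading, then each href inside it.
BLOCK_RE = re.compile(r"2022必看热片.*?<ul>(?P<movie>.*?)</ul>", re.S)
LINK_RE = re.compile(r"<a href='(?P<href>.*?)' title", re.S)


def extract_child_links(html, base_url):
    """Return absolute URLs for the links in the first <ul> after the heading."""
    block = BLOCK_RE.search(html)
    if block is None:  # heading or list not found
        return []
    return [
        urljoin(base_url, m.group("href"))
        for m in LINK_RE.finditer(block.group("movie"))
    ]


links = extract_child_links(SAMPLE_HTML, "https://www.dytt89.com/")
print(links)
```

Because the non-greedy `.*?` stops at the first `</ul>`, only the two links in the first list are returned and the link in the second list is ignored.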

Please credit the source when reposting: reposted from www.wk8.com.cn

Article URL: https://www.wk8.com.cn/it/1037082.html
