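1. Requirements
Save two HTML pages from the browser to the local disk, for example:
Page A (my blog home page)
Page B (the post 爬虫四大金刚, "the four pillars of web scraping")
Then, in page A, find the <a> tag that links to the scraping post and change its href attribute to the local path of page B, so that opening page A locally and clicking that link jumps to the local copy of page B.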
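As a rough illustration of the goal, the sketch below rewrites the href of one link in a small inline HTML string. The markup, the example.com URL, and the local path B.html are stand-ins chosen for the example, not the real saved pages. It uses Python's built-in html.parser so that lxml is not required.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for page A: one link that still points at the live site.
html = '<div><a class="postTitle2" href="https://example.com/post">Crawler post</a></div>'

soup = BeautifulSoup(html, "html.parser")

# Find the first link with the target class and point it at a local file.
link = soup.find("a", class_="postTitle2")
link["href"] = "B.html"  # hypothetical local path of page B

result = str(soup)
print(result)
```

Assigning to link["href"] changes the existing tag in place, which is a simpler alternative to building a new tag when only one attribute needs to change.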
2. Code
from bs4 import BeautifulSoup

parent_page = r"C:\Users\ffm11\Desktop\Maple_feng - 博客园.html"
sub_page = r"C:\Users\ffm11\Desktop\爬虫四大金刚:requests,selenium,BeautifulSoup,Scrapy - Maple_feng - 博客园.html"

# Read and parse the saved copy of page A
with open(parent_page, 'r', encoding="utf-8") as file:
    pcontent = file.read()
sp = BeautifulSoup(pcontent, 'lxml')

# The first link with class "postTitle2" is the pinned post; its text is:
# [置顶] 爬虫四大金刚:requests,selenium,BeautifulSoup,Scrapy
old_tag = sp.find_all('a', class_='postTitle2')[0]
text = old_tag.get_text()
print(text)

# Build a new <a> tag that points at the local copy of page B
new_tag = sp.new_tag("a")
new_tag.attrs = {"href": sub_page, "class": "postTitle2"}
new_tag.string = text

# Replace the original link using the `replace_with` method
old_tag.replace_with(new_tag)

# Write the modified soup back to page A (this overwrites the original file)
with open(parent_page, 'w', encoding="utf-8") as fp:
    fp.write(sp.prettify())
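The code above puts a raw Windows path into href, and browsers may not all resolve that the same way. One possible refinement, offered as a sketch and not part of the original code, is to convert the path to a file:// URI with the standard library's pathlib before assigning it. The helper name to_file_uri and the sample path are hypothetical.

```python
from pathlib import PureWindowsPath


def to_file_uri(windows_path: str) -> str:
    """Convert an absolute Windows path to a file:// URI (hypothetical helper)."""
    # PureWindowsPath parses Windows-style paths on any operating system;
    # as_uri() requires an absolute path and percent-encodes unsafe characters.
    return PureWindowsPath(windows_path).as_uri()


# Sample path for illustration only, not one of the real saved pages.
uri = to_file_uri(r"C:\Users\demo\Desktop\page b.html")
print(uri)
```

The returned string could then be used as the href value in place of sub_page.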