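How do I use a regular expression to replace a specific fragment in an HTML string?
Replacing string content
How can a specified fragment of a string be replaced with new content? Take the following HTML string as an example: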
<!DOCTYPE html><html><head> <meta charset="utf-8"> <meta name="viewport" content="width=device-width, user-scalable=no, initial-scale=1.0, maximum-scale=1.0, minimum-scale=1.0"> <meta http-equiv="X-UA-Compatible" content="ie=edge"> <link rel="stylesheet" href="//test.baidu.com/533fb44/umi.10d72219.css"> <script>window.publicPath = window.__INJECTED_PUBLIC_PATH_BY_QIANKUN__ || "//test.baidu.com/533fb44/";</script> </head> <body> <div id="root"></div> <script src="//test.baidu.com/533fb44/umi.b271d884.js" entry=""></script>
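Requirement: replace
window.publicPath = window.__INJECTED_PUBLIC_PATH_BY_QIANKUN__ || "//test.baidu.com/533fb44/";
with
window.publicPath = "//test.baidu.com/533fb44/";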
Solution:
The replacement can be done with the re.sub function from Python's re module:
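import re

# Named html rather than str so the built-in str type is not shadowed.
html = '''<!DOCTYPE html><html><head> <meta charset="utf-8"> <meta name="viewport" content="width=device-width, user-scalable=no, initial-scale=1.0, maximum-scale=1.0, minimum-scale=1.0"> <meta http-equiv="X-UA-Compatible" content="ie=edge"> <link rel="stylesheet" href="//test.baidu.com/533fb44/umi.10d72219.css"> <script>window.publicPath = window.__INJECTED_PUBLIC_PATH_BY_QIANKUN__ || "//test.baidu.com/533fb44/";</script> </head> <body> <div id="root"></div> <script src="//test.baidu.com/533fb44/umi.b271d884.js" entry=""></script> '''

# The fragment contains "." and "|", which are regex metacharacters.
# Left unescaped, "||" adds an empty alternative that matches at every
# position, so the replacement text would be inserted throughout the string.
# re.escape makes the whole fragment match literally.
pattern = re.escape("window.publicPath = window.__INJECTED_PUBLIC_PATH_BY_QIANKUN__ || ")

replaced_html = re.sub(pattern, "window.publicPath = ", html)
print(replaced_html)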
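The pattern is meant to match the literal fragment window.publicPath = window.__INJECTED_PUBLIC_PATH_BY_QIANKUN__ || that has to be replaced. Because "." and "|" are regex metacharacters, they must be escaped (for example with re.escape) for the pattern to match that text exactly.
Running the script prints the modified string: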
<!DOCTYPE html><html><head> <meta charset="utf-8"> <meta name="viewport" content="width=device-width, user-scalable=no, initial-scale=1.0, maximum-scale=1.0, minimum-scale=1.0"> <meta http-equiv="X-UA-Compatible" content="ie=edge"> <link rel="stylesheet" href="//test.baidu.com/533fb44/umi.10d72219.css"> <script>window.publicPath = "//test.baidu.com/533fb44/";</script> </head> <body> <div id="root"></div> <script src="//test.baidu.com/533fb44/umi.b271d884.js" entry=""></script>
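Two variations on the approach above may be worth considering; the following is only a sketch under stated assumptions, not the one correct way to do it. Since the fragment is fixed text, plain str.replace avoids escaping altogether. If the spacing or the fallback path can vary between builds, a regex with a capture group can keep whatever quoted path follows the ||. The shortened snippet input and the variable names are hypothetical and used only for illustration.

```python
import re

# Hypothetical shortened input: only the <script> tag from the example above.
snippet = (
    '<script>window.publicPath = window.__INJECTED_PUBLIC_PATH_BY_QIANKUN__ '
    '|| "//test.baidu.com/533fb44/";</script>'
)

# Option 1: the fragment is fixed text, so str.replace needs no escaping.
literal = "window.publicPath = window.__INJECTED_PUBLIC_PATH_BY_QIANKUN__ || "
by_replace = snippet.replace(literal, "window.publicPath = ")

# Option 2: a more tolerant regex.
#   \.  and  \|\|  match the literal dot and pipes
#   \s*            allows variable whitespace around "=" and "||"
#   ("[^"]*")      captures the double-quoted fallback path as group 1
pattern = re.compile(
    r'window\.publicPath\s*=\s*'
    r'window\.__INJECTED_PUBLIC_PATH_BY_QIANKUN__\s*\|\|\s*("[^"]*")'
)
# \1 in the replacement re-inserts the captured path unchanged.
by_regex = pattern.sub(r"window.publicPath = \1", snippet)

print(by_replace)
print(by_regex)
```

Both options give the same result for this input. Option 1 is the simpler choice when the text never changes; option 2 assumes the fallback path is always a double-quoted string and would need adjusting for single quotes.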