SmallGFW: a DFA-based module for detecting and replacing sensitive words

Published: 2012-11-07 21:29:56 | Source: 红联 | Author: empast
smallgfw: a DFA-based module for detecting and replacing sensitive words. Its usage is shown in the following doctest.[code]>>> gfw = GFW()
>>> gfw.set(["sexy","girl","love","shit"])  # set the list of sensitive words
>>> s = gfw.replace("shit!,Cherry is a sexy girl. She loves python.","*")
>>> print s
*!,Cherry is a * *. She *s python.  # the text after masking

>>> gfw = GFW()
>>> gfw.set(["abd","defz","bcz"])
>>> print gfw.check("xabdabczabdxaadefz")  # find where the sensitive words occur
[(1, 3, 'abd'), (5, 3, 'bcz'), (8, 3, 'abd'), (14, 4, 'defz')]  # e.g. (5, 3, 'bcz') means the substring of length 3 starting at index 5[/code]

Homepage: http://code.google.com/p/smallgfw/

Download: http://code.google.com/p/smallgfw/downloads/list
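
For readers who want to see how this kind of matching can work, here is a minimal sketch in Python 3. It is not the SmallGFW source: the class name `TrieFilter` and everything inside it are hypothetical, and it uses a plain trie that restarts from the root at each position rather than a full automaton with failure links. It also assumes that the longest word wins at each starting position and that matches do not overlap, which is consistent with the doctest above but not confirmed by it.

```python
class TrieFilter:
    """Hypothetical stand-in for GFW; a sketch, not the SmallGFW implementation."""

    END = object()  # marker key: a complete word ends at this node

    def __init__(self, words=()):
        self.root = {}
        for word in words:
            self.add(word)

    def add(self, word):
        # Walk or create one trie node per character, then mark the end.
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node[self.END] = word

    def check(self, text):
        """Return (start, length, word) for each non-overlapping match."""
        hits = []
        i = 0
        while i < len(text):
            node = self.root
            found = None
            for j in range(i, len(text)):
                node = node.get(text[j])
                if node is None:
                    break
                if self.END in node:
                    # Assumption: keep walking so the longest word wins.
                    found = node[self.END]
            if found is None:
                i += 1  # nothing starts here; try the next position
            else:
                hits.append((i, len(found), found))
                i += len(found)  # resume after the match
        return hits

    def replace(self, text, mask="*"):
        """Replace every matched word with a single copy of mask."""
        out = []
        pos = 0
        for start, length, _ in self.check(text):
            out.append(text[pos:start])
            out.append(mask)
            pos = start + length
        out.append(text[pos:])
        return "".join(out)
```

Constructing `TrieFilter(["abd", "defz", "bcz"])` and calling `check("xabdabczabdxaadefz")` yields tuples in the same `(index, length, word)` shape as the doctest. In the worst case the restart-from-root scan costs time proportional to the text length times the longest word, which is usually acceptable for short word lists.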

From: 开源中国社区 (the OSChina community)