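Filtering Chinese with Regular Expressions: How to Filter Chinese Characters with Regular Expressions in ORACLE

⑴ How to remove Chinese characters from a string with a regular expression

⑵ Selecting Chinese text with a regular expression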

preg_match("/<\/label>[\s]*(?:<span.*?>)?(.+?)(?:<\/span>)?[\s]*<li>/is", $test, $getcontent);
echo $getcontent[1];
// Give this a try

⑶ Filtering Chinese with a regular expression

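/^[^\u4E00-\u9FA5\uFE30-\uFFA0]*$/

Is this what you are after? It matches a string that contains no Chinese characters. Keep in mind that ^ also means "start of string" in a regular expression: it only negates when it is the first character inside a bracketed class [...].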
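The two meanings of ^ can be seen side by side in a small Python sketch. The helper names are made up for illustration, and the range U+4E00..U+9FA5 is only the commonly quoted block for Chinese characters, not a complete definition.

```python
import re

# Leading ^ anchors at the start of the string; the ^ inside [...] negates the class.
# Assumption: U+4E00..U+9FA5 is taken as "Chinese" for this sketch.
NO_CHINESE = re.compile(r'^[^\u4e00-\u9fa5]*$')
HAS_CHINESE = re.compile(r'[\u4e00-\u9fa5]')


def contains_chinese(text):
    """Return True if at least one character falls in the assumed range."""
    return HAS_CHINESE.search(text) is not None


def strip_chinese(text):
    """Remove every character in the assumed range."""
    return HAS_CHINESE.sub('', text)


if __name__ == '__main__':
    print(bool(NO_CHINESE.match('abc123')))
    print(contains_chinese('abc你好'))
    print(strip_chinese('123abc你好efc'))
```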

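⑷ How to write a regular expression for Chinese characters, in order to find all Chinese characters in notepad++ (file encoding is ANSI)

I just tried this in notepad++, and its regular expression engine appears to be byte-oriented: neither [\u4e00-\u9fa5] nor [^\x00-\xff] matches Chinese properly. That was frustrating, since [\u4e00-\u9fa5] works for matching Chinese in Java, C# and JS. Then it occurred to me that \u4e00 is simply 「一」 and \u9fa5 is simply 「龥」, so I tried:
[一-龥]
This finds all the Chinese characters. Full-width punctuation is still not matched at this point; adding [\uFF01-\uFF5E], that is [！-～], takes care of it. (Marks such as 、 and 。 sit at U+3001 and U+3002, outside that range, so add them separately if you need them.)

So the regular expression for matching Chinese in notepad++ and UltraEdit is:
[一-龥！-～]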
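To double-check that the literal endpoints really are the same code points as the \u escapes, the class can be rebuilt from the numbers. This is a minimal Python sketch; `build_class` and `find_chinese` are hypothetical helper names.

```python
import re


def build_class():
    """Build the character class from code points instead of typed literals."""
    return '[{}-{}{}-{}]'.format(
        chr(0x4E00), chr(0x9FA5),   # first and last ideograph of the range
        chr(0xFF01), chr(0xFF5E),   # full-width '!' through full-width '~'
    )


PATTERN = re.compile(build_class())


def find_chinese(text):
    """Return every character matched by the class, in order."""
    return PATTERN.findall(text)


if __name__ == '__main__':
    print(build_class())
    print(find_chinese('abc你好，123'))
```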

If this does not solve your problem, please send me a message.

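⑸ How to filter Chinese characters with regular expressions in ORACLE

Extracting Chinese characters from a table requires taking the character set into account, because different character sets encode Chinese characters differently.
GB2312 is used as the example here, with a function that accurately extracts Simplified Chinese characters from a table.

Assume the database character set is GB2312, the client character set (environment variable, registry or elsewhere) is also GB2312,
and the Chinese characters stored in the table are all GB2312-encoded as well.

In that case each Chinese character takes two bytes, and the code range of Simplified Chinese characters is
B0A1 - F7FE
Converted to decimal, byte by byte:
B0 = 176, A1 = 161, F7 = 247, FE = 254
that is, 176,161 - 247,254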
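The hex-to-decimal conversion and the two-byte range test can be sketched in Python as follows. `is_gb2312_hanzi` is a hypothetical helper, and the sketch relies on Python's built-in gb2312 codec rather than on Oracle.

```python
def hex_bounds():
    """Convert the four boundary bytes from hex to decimal."""
    return [int(h, 16) for h in ('B0', 'A1', 'F7', 'FE')]


def is_gb2312_hanzi(ch):
    """True if ch encodes to two GB2312 bytes inside B0A1..F7FE."""
    try:
        raw = ch.encode('gb2312')
    except UnicodeEncodeError:
        return False  # not representable in GB2312 at all
    return len(raw) == 2 and 0xB0 <= raw[0] <= 0xF7 and 0xA1 <= raw[1] <= 0xFE


if __name__ == '__main__':
    print(hex_bounds())
    for ch in '啊A★':
        print(ch, is_gb2312_hanzi(ch))
```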

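Let's first look at the definition of the asciistr function:
Non-ASCII characters are converted to the form \xxxx, where xxxx represents a UTF-16 code unit.
However, this does not mean that every character rendered with a leading "\" is a Chinese character.

For example: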
SQL> select * from test;

NAME
--------------------
,啊OO10哈
你好aa
大家好aa/
☆大海123
★ABC

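Note that the fifth record contains a solid five-pointed star (★).
Now let's run the values through the asciistr function:
SQL> select name,asciistr(name) from test;

NAME                 ASCIISTR(NAME)
-------------------- ----------------------
,啊OO10哈            ,\554AOO10\54C8
你好aa               \4F60\597Daa
大家好aa/            \5927\5BB6\597Daa/
☆大海123            \2606\5927\6D77123
★ABC                \2605ABC

As we can see, the solid star in the last record is also rendered with a leading "\".
So we cannot use the presence of "\" in asciistr(column) to decide whether a value contains Chinese characters.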
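The same effect can be reproduced outside the database with a rough Python emulation of the \xxxx rendering. `asciistr_like` is a hypothetical stand-in written for this illustration; it is not Oracle's implementation and ignores any special cases the real function may handle.

```python
def asciistr_like(text):
    """Render non-ASCII characters as \\XXXX per UTF-16 code unit (rough emulation)."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
            continue
        units = ch.encode('utf-16-be')          # two bytes per UTF-16 code unit
        for i in range(0, len(units), 2):
            out.append('\\%04X' % int.from_bytes(units[i:i + 2], 'big'))
    return ''.join(out)


if __name__ == '__main__':
    # Both the star and the Chinese characters come out with a leading backslash.
    print(asciistr_like('★ABC'))
    print(asciistr_like('你好aa'))
```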

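My function is below. The basic idea is to check whether each character's byte codes fall within the Chinese character range defined by GB2312.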
[PHP]
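create or replace function get_chinese(p_name in varchar2) return varchar2
as
  v_code    varchar2(30000) := '';  -- comma-separated decimal byte codes of p_name
  v_chinese varchar2(4000)  := '';  -- Chinese characters collected so far
  v_comma   pls_integer;            -- position of the first comma in v_code
  v_code_q  pls_integer;            -- first byte of the current character
  v_code_w  pls_integer;            -- second byte of the current character
begin
  if p_name is not null then
    -- dump(x,1010) lists the bytes in decimal after the character set name;
    -- this assumes the database character set is ZHS16GBK
    select replace(substrb(dump(p_name,1010),instrb(dump(p_name,1010),'ZHS16GBK:')),'ZHS16GBK: ','') into v_code from dual where rownum=1;
    for i in 1..length(p_name) loop
      if lengthb(substr(p_name,i,1))=2 then
        v_comma := instrb(v_code,',');
        v_code_q := to_number(substrb(v_code,1,v_comma-1));
        v_code_w := to_number(substrb(v_code,v_comma+1,abs(instrb(v_code,',',1,2)-v_comma-1)));
        -- keep the character only if both bytes are inside B0A1 - F7FE
        if v_code_q>=176 and v_code_q<=247 and v_code_w>=161 and v_code_w<=254 then
          v_chinese := v_chinese||substr(p_name,i,1);
        end if;
        -- consume the first byte of the two-byte character
        v_code := ltrim(v_code,'1234567890');
        v_code := ltrim(v_code,',');
      end if;
      -- consume the remaining (or only) byte of the character
      v_code := ltrim(v_code,'1234567890');
      v_code := ltrim(v_code,',');
    end loop;
    return v_chinese;
  else
    return '';
  end if;
end;
/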
[/PHP]

OK, now let's run some statements:
SQL> select * from test;

NAME
--------------------
,啊OO10哈
你好aa
大家好aa/
☆大海123
★ABC

5 rows selected.

1. List the records that contain Chinese characters
SQL> select name from test where length(get_chinese(name))>0;

NAME
--------------------
,啊OO10哈
你好aa
大家好aa/
☆大海123

4 rows selected.

2. List the records that contain Chinese characters, showing only the Chinese characters

SQL> select get_chinese(name) from test where length(get_chinese(name))>0;

GET_CHINESE(NAME)
---------------------------------------------------------------------------
啊哈
你好
大家好
大海

4 rows selected.

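Note that GB2312 defines 6763 Chinese characters in total, that is, 72*94-5 = 6763.
The range check above covers all 72*94 = 6768 code points and does not exclude the 5 unassigned ones. I will subtract them once I have looked up which ones they are.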
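One way to look up the unassigned code points is to try decoding every pair in B0A1 - F7FE. The sketch below uses Python's built-in gb2312 codec as a stand-in for the standard's table, which is an assumption, and `gb2312_hanzi_census` is a hypothetical name. In the GB2312 standard the five empty positions are D7FA through D7FE.

```python
def gb2312_hanzi_census():
    """Count decodable code points in B0A1..F7FE and collect the rest."""
    assigned = 0
    unassigned = []
    for lead in range(0xB0, 0xF7 + 1):          # 72 rows
        for trail in range(0xA1, 0xFE + 1):     # 94 cells per row
            try:
                bytes([lead, trail]).decode('gb2312')
                assigned += 1
            except UnicodeDecodeError:
                unassigned.append('%02X%02X' % (lead, trail))
    return assigned, unassigned


if __name__ == '__main__':
    count, holes = gb2312_hanzi_census()
    print(count, holes)
```

The list of holes can then be excluded with one extra condition in the range check.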
============

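The function can be rewritten to extract either the Chinese or the non-Chinese characters.
It takes two parameters: the first is the string to process; the second is 1 to extract Chinese characters, or any other value to extract non-Chinese characters.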

[PHP]
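create or replace function get_chinese
(
  p_name    in varchar2,
  p_chinese in varchar2
) return varchar2
as
  v_code        varchar2(30000) := '';  -- comma-separated decimal byte codes of p_name
  v_chinese     varchar2(4000)  := '';  -- Chinese characters collected so far
  v_non_chinese varchar2(4000)  := '';  -- all other characters collected so far
  v_comma       pls_integer;            -- position of the first comma in v_code
  v_code_q      pls_integer;            -- first byte of the current character
  v_code_w      pls_integer;            -- second byte of the current character
begin
  if p_name is not null then
    -- dump(x,1010) lists the bytes in decimal after the character set name;
    -- this assumes the database character set is ZHS16GBK
    select replace(substrb(dump(p_name,1010),instrb(dump(p_name,1010),'ZHS16GBK:')),'ZHS16GBK: ','') into v_code from dual where rownum=1;
    for i in 1..length(p_name) loop
      if lengthb(substr(p_name,i,1))=2 then
        v_comma := instrb(v_code,',');
        v_code_q := to_number(substrb(v_code,1,v_comma-1));
        v_code_w := to_number(substrb(v_code,v_comma+1,abs(instrb(v_code,',',1,2)-v_comma-1)));
        -- two-byte character: Chinese only if both bytes are inside B0A1 - F7FE
        if v_code_q>=176 and v_code_q<=247 and v_code_w>=161 and v_code_w<=254 then
          v_chinese := v_chinese||substr(p_name,i,1);
        else
          v_non_chinese := v_non_chinese||substr(p_name,i,1);
        end if;
        -- consume the first byte of the two-byte character
        v_code := ltrim(v_code,'1234567890');
        v_code := ltrim(v_code,',');
      else
        -- single-byte character: never Chinese
        v_non_chinese := v_non_chinese||substr(p_name,i,1);
      end if;
      -- consume the remaining (or only) byte of the character
      v_code := ltrim(v_code,'1234567890');
      v_code := ltrim(v_code,',');
    end loop;
    if p_chinese = '1' then
      return v_chinese;
    else
      return v_non_chinese;
    end if;
  else
    return '';
  end if;
end;
/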
[/PHP]
SQL> select * from a;

NAME
--------------------
我們啊、
他（艾呀）是★們
他的\啊@

SQL> select get_chinese(name,1) from a;

GET_CHINESE(NAME,1)
-----------------------------------------
我們啊
他艾呀是們
他的啊

SQL> select get_chinese(name,0) from a;

GET_CHINESE(NAME,0)
-----------------------------------------
、
（）★
\@

SQL>
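For readers who want to try the same classification outside Oracle, here is a minimal Python sketch of the byte-range split. `split_chinese` is a hypothetical helper, and Python's gbk codec is assumed to be a reasonable stand-in for ZHS16GBK. Characters outside GB2312, such as traditional forms, encode outside B0A1 - F7FE and therefore land in the non-Chinese part.

```python
def split_chinese(text, encoding='gbk'):
    """Split text into (chinese, non_chinese) using the B0A1..F7FE byte range."""
    chinese = []
    other = []
    for ch in text:
        try:
            raw = ch.encode(encoding)
        except UnicodeEncodeError:
            raw = b''                      # not representable: treat as non-Chinese
        if len(raw) == 2 and 176 <= raw[0] <= 247 and 161 <= raw[1] <= 254:
            chinese.append(ch)
        else:
            other.append(ch)
    return ''.join(chinese), ''.join(other)


if __name__ == '__main__':
    print(split_chinese('他的\\啊@'))
    print(split_chinese('★ABC'))
```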

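⑹ Java regular expression to filter out special characters, allowing only Chinese, letters and digits: how should it be written? Urgent...

String str = "*(^YUIGHUGU^^&*()*6哈哈89324328uewh~!@#$%^&*()_+,./<>?;':[]\\{}|-="; // the string to filter
// \pP covers Unicode punctuation; ~ $ ^ < > | + = are symbols, not punctuation, so they are listed explicitly
str = str.replaceAll("[\\pP~$^<>|+=]+", "");
System.out.println(str);
Output: YUIGHUGU6哈哈89324328uewh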

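⑺ How to remove Chinese characters from a Java string with a regular expression

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The class name is arbitrary; it only makes the snippet compilable
public class RemoveChinese {
    public static void main(String[] args) {
        String str = "123abc你好efc";
        // The backslashes are required: "[u4e00-u9fa5]" would match ordinary letters and digits instead
        String reg = "[\\u4e00-\\u9fa5]";
        Pattern pat = Pattern.compile(reg);
        Matcher mat = pat.matcher(str);
        String repickStr = mat.replaceAll("");
        System.out.println("After removing Chinese: " + repickStr);
    }
}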

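⑻ How to filter Chinese-format dates with regular expressions in Python

import re

def double(matched):
    # Zero-pad the matched number to at least two digits
    value = int(matched.group('value'))
    if value < 10:
        return "0" + str(value)
    return str(value)

s = '《2017年7月3日》'
s = re.sub(r'(?P<value>\d+)', double, s)  # pad month and day
s = re.sub(r'\D', '', s)                  # drop everything that is not a digit
print(s)  # 20170703

s = '《2017年6月5日與6月12日合集》'
s = re.sub(r'(?P<value>\d+)', double, s)
s = re.sub('與', '-', s)                   # turn the conjunction into a range separator
s = re.sub(r'[^\d-]', '', s)              # keep only digits and the hyphen
print(s)  # 20170605-0612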

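⑼ Looking for a regular expression that matches Chinese characters, Chinese punctuation, English letters, digits and underscore, but rejects special characters such as @ and #.

\w+|[，。《》（）、—]+

\w matches Chinese characters, English letters, digits and underscore, provided the regex engine treats \w as Unicode-aware (for example Python 3 str patterns or .NET). In engines where \w is ASCII-only by default, such as Java and JavaScript, add an explicit range like \u4e00-\u9fa5.
As for Chinese punctuation, it depends on what you need; add any further marks inside the square brackets.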
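As a validation check, the two alternatives can be folded into a single class and matched against the whole input. This is a minimal Python 3 sketch in which `is_valid` is a hypothetical name. Note that a Unicode-aware \w also accepts letters from other scripts, which may be broader than intended.

```python
import re

# Word characters (Unicode-aware in Python 3) plus the listed Chinese punctuation.
ALLOWED = re.compile(r'[\w，。《》（）、—]+')


def is_valid(text):
    """True only if the whole string consists of allowed characters."""
    return ALLOWED.fullmatch(text) is not None


if __name__ == '__main__':
    print(is_valid('你好，world_123'))
    print(is_valid('abc@163'))
```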

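⑽ Can a regular expression filter out Chinese special characters?

String s1 = "我是正確測試數據aasdf2342343ASFASDF";
String s2 = "我是錯誤測試數據@#！@#";
// Keep only digits, ASCII letters and Chinese characters; remove everything else
String reg = "[^0-9a-zA-Z\u4e00-\u9fa5]+";
System.out.println(s1.replaceAll(reg, ""));
System.out.println(s2.replaceAll(reg, ""));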
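The same whitelist idea carries over to Python unchanged: negate the class of allowed characters and delete whatever matches. `keep_allowed` is a hypothetical helper name, and the range U+4E00..U+9FA5 is the same assumption as in the Java snippet.

```python
import re

# Everything that is NOT a digit, an ASCII letter or in U+4E00..U+9FA5.
DISALLOWED = re.compile(r'[^0-9a-zA-Z\u4e00-\u9fa5]+')


def keep_allowed(text):
    """Delete every run of characters outside the whitelist."""
    return DISALLOWED.sub('', text)


if __name__ == '__main__':
    print(keep_allowed('我是錯誤測試數據@#！@#'))
    print(keep_allowed('aasdf2342343ASFASDF'))
```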
