Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


How to select rows that contain non-English characters in SQL Server 2005 (it should filter only non-English characters, not special characters)

My table has a column that contains both non-English characters (text in several different languages) and special characters. I need to filter out only the rows with non-English characters; rows that merely contain special characters should not be filtered out.

I tried several methods, but each one failed to filter out a few rows. Could someone please help me with this? Thanks in advance.

Example: the column LOCATION contains the following rows:

row 1: ??? ??????????? ????????, North Street, Idyanvillai, Tamil Nadu, India

row 2:Dr.Hakim M.Asgar Ali's ROY MEDICAL CENTRE? Unani Clinic In Kerala India, Thycaud Hospital Road, Opp. Amritha Hotel,, Thycaud.P.O.,, Thiruvananthapuram, Kerala, India

row 3: ???????? ???? ????????, Shivaji Nagar, Davangere, Karnataka, India

The rows above contain characters in several languages. Can anyone help me select only row 2? Thanks.


1 Reply


T-SQL's string-handling capability is pretty rudimentary.

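If the "non-English" values are the ones that need Unicode (an NVARCHAR column, which SQL Server 2005 stores as UCS-2/UTF-16), you can try something like

SELECT * FROM MyTable WHERE MyField = CAST(MyField AS VARCHAR(MAX))

to pull only the rows that survive a round trip through VARCHAR, that is, rows whose characters all exist in the code page of the column's collation. (This is not UTF-8; SQL Server 2005 does not store VARCHAR data as UTF-8.) The cast replaces any character outside that code page with ?, so the comparison fails for those rows. Give the cast an explicit length, because a bare CAST(... AS VARCHAR) truncates the value to 30 characters.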
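
To see what the cast comparison does, here is a minimal sketch written for SQL Server 2005. The table variable @Sample, its columns, and the addresses are made up for illustration (the question's original non-English text did not survive, so the Tamil and Kannada strings are stand-ins), and it assumes the database's default collation uses a Latin code page such as 1252.

```sql
-- Hypothetical sample data; names and values are illustrative only.
DECLARE @Sample TABLE (Id INT, Location NVARCHAR(400));

-- SQL Server 2005 has no multi-row VALUES, so insert one row at a time.
INSERT INTO @Sample VALUES (1, N'வடக்கு தெரு, North Street, Tamil Nadu, India');
INSERT INTO @Sample VALUES (2, N'Thycaud Hospital Road, Opp. Amritha Hotel, Kerala, India');
INSERT INTO @Sample VALUES (3, N'ಶಿವಾಜಿ ನಗರ, Shivaji Nagar, Karnataka, India');

-- Keep rows whose text is unchanged after a round trip through VARCHAR.
-- The explicit length keeps the cast from truncating the value.
SELECT Id, Location
FROM @Sample
WHERE Location = CAST(Location AS VARCHAR(400));
```

Note that this test follows the code page, not the language: accented Latin characters such as é exist in code page 1252, so a French or German address would pass as well.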

The only way I know how to test whether a field is drawn from an arbitrary set of characters is with a user-defined function, like this:

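CREATE FUNCTION dbo.IsAllowed (@input VARCHAR(MAX)) RETURNS BIT
-- Returns 1 if every character of @input is in @allowables, 0 otherwise.
-- Usages: SELECT dbo.IsAllowed('Hello'); -- returns 1
--         SELECT dbo.IsAllowed('Hello, world!'); -- returns 0
-- Note: under a case-insensitive collation CHARINDEX ignores case, so
--       @allowables doesn't need both cases.
--       VARCHAR(MAX) is different under SQL Server 2005 than 2008+
--       and use of defined VARCHAR size might be necessary.
--       SQL Server 2005 does not accept DECLARE with an initializer
--       (that syntax is 2008+), so values are assigned with SET.
--       Unicode characters outside the code page arrive here as '?'.
AS
BEGIN
  DECLARE @allowables CHAR(26);
  DECLARE @allowed BIT;
  DECLARE @index INT;
  SET @allowables = 'abcdefghijklmnopqrstuvwxyz';
  SET @allowed = 0;  -- an empty or NULL input returns 0
  SET @index = 1;
  WHILE @index <= LEN(@input)
    BEGIN
    IF CHARINDEX(SUBSTRING(@input, @index, 1), @allowables) = 0
      BEGIN
      -- Found a character outside the allowed set: stop checking.
      SET @allowed = 0;
      BREAK;
      END
    ELSE
      BEGIN
      SET @allowed = 1;
      SET @index = @index + 1;
      END
    END
  RETURN @allowed;
END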

User-defined functions can be applied to columns in SELECT, like this:

SELECT * FROM MyTable WHERE dbo.IsAllowed(MyField) = 1

Note that the schema name (dbo in this case) is required when calling a scalar user-defined function.
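
For the case in the question, where letters, digits, and ordinary punctuation should all count as English, one alternative to a character-by-character loop is a LIKE pattern under a binary collation. This is a different technique from the function above, offered only as a sketch: the table variable and sample rows are hypothetical, and "English" is assumed to mean printable ASCII (space through tilde).

```sql
-- Hypothetical sample data; names and values are illustrative only.
DECLARE @Sample TABLE (Id INT, Location NVARCHAR(400));

INSERT INTO @Sample VALUES (1, N'வடக்கு தெரு, North Street, Tamil Nadu, India');
INSERT INTO @Sample VALUES (2, N'Dr. Ali''s Clinic? Opp. Amritha Hotel,, Kerala, India');
INSERT INTO @Sample VALUES (3, N'Café Road, Kochi, Kerala, India');

-- [^ -~] matches any single character outside the range space..tilde.
-- The binary collation makes the range compare by code point instead of
-- dictionary order, so the range covers the printable ASCII characters.
SELECT Id, Location
FROM @Sample
WHERE Location NOT LIKE N'%[^ -~]%' COLLATE Latin1_General_BIN2;
```

The query is intended to keep row 2, punctuation included, and to reject rows 1 and 3. Tabs and line breaks also fall outside the space-to-tilde range, so rows containing them would be rejected unless the pattern is widened.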

If a T-SQL user-defined function is inadequate, you can also use a CLR function, which lets you apply a regular expression or any other .NET logic to a column. However, many sysadmins don't allow CLR functions, because they break portability and pose a security risk. (This includes Microsoft's SQL Azure product.)

