IT박스

What are better techniques for trimming leading zeros in SQL Server?

itboxs 2020. 6. 15. 21:59


I've been using this for some time:

SUBSTRING(str_col, PATINDEX('%[^0]%', str_col), LEN(str_col))

However, I recently found a problem with columns that contain only "0" characters, such as '00000000', because the pattern never finds a non-"0" character to match.
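To make the failure concrete, here is a minimal sketch with two literal values. The derived table v and its sample rows are illustrative, and the VALUES constructor assumes SQL Server 2008 or later. When no non-zero character exists, PATINDEX returns 0, and SUBSTRING starting at position 0 returns one character fewer than the requested length rather than an empty string.

```sql
-- Illustrative sample rows; not from the original post.
SELECT v.str_col,
       PATINDEX('%[^0]%', v.str_col) AS first_non_zero,
       SUBSTRING(v.str_col, PATINDEX('%[^0]%', v.str_col), LEN(v.str_col)) AS trimmed
FROM (VALUES ('000123'), ('00000000')) AS v(str_col);
-- '000123'   -> first_non_zero = 4, trimmed = '123'
-- '00000000' -> first_non_zero = 0, trimmed = '0000000' (seven zeros, not '' or '0')
```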

An alternative technique I've seen uses LTRIM:

REPLACE(LTRIM(REPLACE(str_col, '0', ' ')), ' ', '0')

This has a problem if there are embedded spaces, because they are turned into "0"s when the spaces are converted back into "0"s.
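A small sketch of that side effect, again using illustrative literal values (VALUES assumes SQL Server 2008 or later):

```sql
-- Illustrative sample rows; not from the original post.
SELECT v.str_col,
       REPLACE(LTRIM(REPLACE(v.str_col, '0', ' ')), ' ', '0') AS trimmed
FROM (VALUES ('000123'), ('00A B')) AS v(str_col);
-- '000123' -> '123'
-- '00A B'  -> 'A0B' (the embedded space has become a zero)
```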

I'm trying to avoid a scalar UDF. I've found a lot of performance problems with UDFs in SQL Server 2005.


SUBSTRING(str_col, PATINDEX('%[^0]%', str_col+'.'), LEN(str_col))

Why don't you just cast the value to INTEGER and then back to VARCHAR?

SELECT  CAST(CAST('000000000' AS INTEGER) AS VARCHAR)

--------
       0

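The other answers here do not take into account the case where the value is all zeros (or even a single zero).
Some always default an empty string to zero, which is wrong when it is supposed to remain blank.
Re-read the original question. This answers what the questioner wants.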

Solution #1:

--This example uses both leading and trailing zeros.
--Avoid losing those trailing zeros and converting embedded spaces into more zeros.
--I added a non-whitespace character ("_") to retain trailing zeros after calling Replace().
--Simply remove the RTrim() function call if you want to preserve trailing spaces.
--If you treat zeros and empty strings as the same thing for your application,
--  then you may skip the CASE expression entirely and just use CN.CleanNumber .
DECLARE @WackadooNumber VarChar(50) = ' 0 0123ABC D0 '--'000'--
SELECT WN.WackadooNumber, CN.CleanNumber,
       (CASE WHEN WN.WackadooNumber LIKE '%0%' AND CN.CleanNumber = '' THEN '0' ELSE CN.CleanNumber END)[AllowZero]
 FROM (SELECT @WackadooNumber[WackadooNumber]) AS WN
 OUTER APPLY (SELECT RTRIM(RIGHT(WN.WackadooNumber, LEN(LTRIM(REPLACE(WN.WackadooNumber + '_', '0', ' '))) - 1))[CleanNumber]) AS CN
--Result: "123ABC D0"

Solution #2 (with sample data):

SELECT O.Type, O.Value, Parsed.Value[WrongValue],
       (CASE WHEN CHARINDEX('0', T.Value)  > 0--If there's at least one zero.
              AND LEN(Parsed.Value) = 0--And the trimmed length is zero.
             THEN '0' ELSE Parsed.Value END)[FinalValue],
       (CASE WHEN CHARINDEX('0', T.Value)  > 0--If there's at least one zero.
              AND LEN(Parsed.TrimmedValue) = 0--And the trimmed length is zero.
             THEN '0' ELSE LTRIM(RTRIM(Parsed.TrimmedValue)) END)[FinalTrimmedValue]
  FROM 
  (
    VALUES ('Null', NULL), ('EmptyString', ''),
           ('Zero', '0'), ('Zero', '0000'), ('Zero', '000.000'),
           ('Spaces', '    0   A B C '), ('Number', '000123'),
           ('AlphaNum', '000ABC123'), ('NoZero', 'NoZerosHere')
  ) AS O(Type, Value)--O is for Original.
  CROSS APPLY
  ( --This Step is Optional.  Use if you also want to remove leading spaces.
    SELECT LTRIM(RTRIM(O.Value))[Value]
  ) AS T--T is for Trimmed.
  CROSS APPLY
  ( --From @CadeRoux's Post.
    SELECT SUBSTRING(O.Value, PATINDEX('%[^0]%', O.Value + '.'), LEN(O.Value))[Value],
           SUBSTRING(T.Value, PATINDEX('%[^0]%', T.Value + '.'), LEN(T.Value))[TrimmedValue]
  ) AS Parsed

Results:

[Image: MikeTeeVee_SQL_Server_Remove_Leading_Zeros (screenshot of the query output)]

Summary:

You could use what I have above for a one-off removal of leading zeros.
If you plan on reusing it a lot, then place it in an Inline Table-Valued Function (ITVF).
Your concerns about performance problems with UDFs are understandable.
However, this problem only applies to scalar functions and multi-statement table-valued functions.
Using ITVFs is perfectly fine.
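As a rough sketch of what such an ITVF could look like: the function name dbo.StripLeadingZeros_Sketch, its parameter, and the sample rows are hypothetical and not from the answer above. It wraps the PATINDEX expression with the appended '.' and returns '0' when the input held only zeros.

```sql
-- Hypothetical inline table-valued function; the name and signature are assumptions.
CREATE FUNCTION dbo.StripLeadingZeros_Sketch (@value VARCHAR(128))
RETURNS TABLE
AS
RETURN
(
    SELECT CASE
               -- Input contained at least one zero and nothing is left after trimming.
               WHEN @value LIKE '%0%' AND p.Trimmed = '' THEN '0'
               ELSE p.Trimmed
           END AS CleanValue
    FROM (SELECT SUBSTRING(@value,
                           PATINDEX('%[^0]%', @value + '.'),
                           LEN(@value)) AS Trimmed) AS p
);
GO

-- Usage: apply the function to each row (illustrative rows).
SELECT v.str_col, f.CleanValue
FROM (VALUES ('000123'), ('0000'), (''), ('000ABC123')) AS v(str_col)
CROSS APPLY dbo.StripLeadingZeros_Sketch(v.str_col) AS f;
-- '000123' -> '123', '0000' -> '0', '' -> '', '000ABC123' -> 'ABC123'
```

Because the body is a single SELECT, the optimizer can inline it into the calling query, which is the property the summary above relies on.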

I have the same problem with our third-party database.
With alphanumeric fields, many values are entered without the leading zeros, dang humans!
This makes joins impossible without cleaning up the missing leading zeros.

Conclusion:

Instead of removing the leading-zeros, you may want to consider just padding your trimmed-values with leading-zeros when you do your joins.
Better yet, clean up your data in the table by adding leading zeros, then rebuilding your indexes.
I think this would be WAY faster and less complex.

SELECT RIGHT('0000000000' + LTRIM(RTRIM(NULLIF(' 0A10  ', ''))), 10)--0000000A10
SELECT RIGHT('0000000000' + LTRIM(RTRIM(NULLIF('', ''))), 10)--NULL --When Blank.

Instead of a space, replace the 0s with a 'rare' whitespace character that shouldn't normally be in the column's text. A line feed is probably good enough for a column like this. Then you can LTRIM normally and replace the special character with 0s again.
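One caveat when sketching this: LTRIM without a second argument strips only space characters, so a line feed substituted for the zeros would not be trimmed. The variant below swaps the roles instead, which is a change from the suggestion as worded: embedded spaces are parked as CHAR(10), the zeros become spaces and are trimmed, and then both substitutions are reversed. It assumes the column never contains a line feed, and the sample rows are illustrative.

```sql
-- Assumes CHAR(10) never occurs in the data; sample rows are illustrative.
SELECT v.str_col,
       REPLACE(
           REPLACE(
               LTRIM(
                   REPLACE(
                       REPLACE(v.str_col, ' ', CHAR(10)),  -- park real spaces as line feeds
                       '0', ' ')),                         -- zeros become spaces, then LTRIM
               ' ', '0'),                                  -- remaining spaces were zeros
           CHAR(10), ' ') AS trimmed                       -- restore the real spaces
FROM (VALUES ('00A B0'), ('000123')) AS v(str_col);
-- '00A B0' -> 'A B0'
-- '000123' -> '123'
```

Like the original REPLACE/LTRIM technique, this still turns an all-zero value into an empty string.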


The following will return '0' if the string consists entirely of zeros:

CASE WHEN SUBSTRING(str_col, PATINDEX('%[^0]%', str_col+'.'), LEN(str_col)) = '' THEN '0' ELSE SUBSTRING(str_col, PATINDEX('%[^0]%', str_col+'.'), LEN(str_col)) END AS str_col

This makes a nice Function....

DROP FUNCTION [dbo].[FN_StripLeading]
GO
CREATE FUNCTION [dbo].[FN_StripLeading] (@string VarChar(128), @stripChar VarChar(1))
RETURNS VarChar(128)
AS
BEGIN
-- http://stackoverflow.com/questions/662383/better-techniques-for-trimming-leading-zeros-in-sql-server
    DECLARE @retVal VarChar(128),
            @pattern varChar(10)
    SELECT @pattern = '%[^'+@stripChar+']%'
    SELECT @retVal = CASE WHEN SUBSTRING(@string, PATINDEX(@pattern, @string+'.'), LEN(@string)) = '' THEN @stripChar ELSE SUBSTRING(@string, PATINDEX(@pattern, @string+'.'), LEN(@string)) END
    RETURN (@retVal)
END
GO
GRANT EXECUTE ON [dbo].[FN_StripLeading] TO PUBLIC

CAST(value AS INT) will work as long as the string is a whole number that fits in the INT range.


My version of this is an adaptation of Arvo's work, with a little more added on to ensure two other cases.

1) If we have all 0s, we should return the digit 0.

2) If we have a blank, we should still return a blank character.

CASE 
    WHEN PATINDEX('%[^0]%', str_col + '.') > LEN(str_col) THEN RIGHT(str_col, 1) 
    ELSE SUBSTRING(str_col, PATINDEX('%[^0]%', str_col + '.'), LEN(str_col))
 END
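A quick way to check both cases is to run the expression over a few literal values. The derived table and its rows are illustrative, and VALUES assumes SQL Server 2008 or later.

```sql
-- Illustrative sample rows; not from the original post.
SELECT v.str_col,
       CASE
           WHEN PATINDEX('%[^0]%', v.str_col + '.') > LEN(v.str_col) THEN RIGHT(v.str_col, 1)
           ELSE SUBSTRING(v.str_col, PATINDEX('%[^0]%', v.str_col + '.'), LEN(v.str_col))
       END AS trimmed
FROM (VALUES ('0000'), (''), ('000ABC123'), ('NoZerosHere')) AS v(str_col);
-- '0000'        -> '0'
-- ''            -> ''
-- '000ABC123'   -> 'ABC123'
-- 'NoZerosHere' -> 'NoZerosHere'
```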

replace(ltrim(replace(Fieldname.TableName, '0', ' ')), ' ', '0')

The suggestion from Thomas G worked for our needs.

The field in our case was already string and only the leading zeros needed to be trimmed. Mostly it's all numeric but sometimes there are letters so the previous INT conversion would crash.


If you are using Snowflake SQL, you might use this:

ltrim(str_col,'0')

The ltrim function removes all instances of the designated set of characters from the left side.

So ltrim(str_col,'0') on '00000008A' would return '8A'

And rtrim(str_col,'0.') on '$125.00' would return '$125'


SELECT CAST(CAST('000000000' AS INTEGER) AS VARCHAR)

This has a limit on the length of the string that can be converted to an INT.
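A hedged sketch of where that limit bites, and of widening the intermediate type to BIGINT. TRY_CAST is used so that a failed conversion yields NULL instead of an error. It assumes SQL Server 2012 or later, and the sample rows are illustrative.

```sql
-- TRY_CAST assumes SQL Server 2012 or later; sample rows are illustrative.
SELECT v.str_col,
       CAST(TRY_CAST(v.str_col AS INT)    AS VARCHAR(20)) AS via_int,
       CAST(TRY_CAST(v.str_col AS BIGINT) AS VARCHAR(20)) AS via_bigint
FROM (VALUES ('000000123'), ('00000012345678901'), ('000ABC')) AS v(str_col);
-- '000000123'         -> via_int = '123', via_bigint = '123'
-- '00000012345678901' -> via_int = NULL (outside the INT range), via_bigint = '12345678901'
-- '000ABC'            -> NULL in both columns (not numeric)
```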


If you do not want to convert to int, I prefer the logic below because it can handle NULLs: LTRIM(IFNULL(field, ''), '0')


Try this:

replace(ltrim(replace(@str, '0', ' ')), ' ', '0')

Reference URL: https://stackoverflow.com/questions/662383/better-techniques-for-trimming-leading-zeros-in-sql-server
