Add Chinese, Japanese, Korean Name support #734

LightWind1 · 2023-08-13T05:25:08Z

#376
I added three regular expressions to match Chinese, Japanese, and Korean words.
With this change, SQL like the following is tokenized correctly: 'select T2.名称 , T2.南北区域 from 民风彪悍十大城市 as T1 join 省份 as T2 on 民风彪悍十大城市.所属省份id == 省份.词条id group by T1.所属省份id order by count ( * ) asc limit 3'
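
For readers who want a concrete picture of script-specific matching, here is a minimal sketch using only the standard `re` module. The pattern and the helper name `cjk_names` are hypothetical and are not the expressions from this PR; the ranges are the Unicode blocks for CJK Unified Ideographs, Hiragana, Katakana, and Hangul Syllables.

```python
import re

# Hypothetical pattern for illustration only, NOT the regexes from this PR.
# Each range is a Unicode block covering one script.
CJK_NAME = re.compile(
    "["
    "\u4e00-\u9fff"  # CJK Unified Ideographs (Chinese, kanji, hanja)
    "\u3040-\u309f"  # Hiragana
    "\u30a0-\u30ff"  # Katakana
    "\uac00-\ud7a3"  # Hangul Syllables
    "]+"
)


def cjk_names(sql):
    """Return every run of CJK characters found in the SQL string."""
    return CJK_NAME.findall(sql)


print(cjk_names("select T2.名称 from 省份 as T2"))
# prints ['名称', '省份']
```

A pattern like this only covers the listed blocks, so extension blocks and mixed names such as `所属省份id` would need extra handling.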

andialbrecht · 2024-03-16T15:19:24Z

Hi @LightWind1, can you clarify what problem your change solves?
I've had a look at how the parser sees your statement, and to me everything looks as expected:

import sqlparse
sql = 'select T2.名称 , T2.南北区域 from 民风彪悍十大城市 as T1 join 省份 as T2 on 民风彪悍十大城市.所属省份id == 省份.词条id group by T1.所属省份id order by count ( * ) asc limit 3'
p = sqlparse.parse(sql)[0]
p._pprint_tree()
|- 0 DML 'select'
|- 1 Whitespace ' '
|- 2 IdentifierList 'T2.名称 ...'
|  |- 0 Identifier 'T2.名称'
|  |  |- 0 Name 'T2'
|  |  |- 1 Punctuation '.'
|  |  `- 2 Name '名称'
|  |- 1 Whitespace ' '
|  |- 2 Punctuation ','
.....and so on.....
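
One possible explanation for the tree above, offered as an assumption rather than a description of sqlparse's lexer: in Python 3, `\w` in a `str` pattern is Unicode-aware, so a generic word pattern already matches CJK characters. The sketch below uses only the standard `re` module; the pattern and the helper name `words` are hypothetical.

```python
import re

# Hypothetical identifier pattern, NOT taken from sqlparse.
# [^\W\d] is "a word character that is not a digit"; \w is Unicode-aware
# for str patterns in Python 3, so CJK characters count as word characters.
WORD = re.compile(r"[^\W\d]\w*")


def words(sql):
    """Return all identifier-like words in the SQL string."""
    return WORD.findall(sql)


print(words("select T2.名称 from 民风彪悍十大城市 as T1"))
# prints ['select', 'T2', '名称', 'from', '民风彪悍十大城市', 'as', 'T1']
```

If the lexer relies on a `\w`-style rule, that would explain why `名称` already shows up as a `Name` without script-specific expressions.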

Add Chinese, Japanese, Korean name support

3501bb1

andialbrecht self-assigned this Mar 5, 2024

andialbrecht added the Needs Feedback label Mar 16, 2024
