python

깨진 한글 인코딩 찾기

장곰부대 2018. 3. 7. 01:45

https://stackoverflow.com/questions/1728376/get-a-list-of-all-the-encodings-python-can-encode-to


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
import requests
import lxml.html
 
text= "±¸·ç±¸·ç+01(v7).zip"
textencodings = [] 
 
url = 'https://docs.python.org/3.6/library/codecs.html#standard-encodings'
html = requests.get(url).text
doc = lxml.html.fromstring(html)
standard_encodings_table = doc.xpath(
        '//table[preceding::h2[.//text()[contains(., "Standard Encodings")]]][//th/text()="Codec"]'
    )[0]
textencodings = standard_encodings_table.xpath('.//td[1]/text()')
    
for x in textencodings:
    for y in textencodings:
        try:
            print(text.encode(x).decode(y),x,y)
        except:
            pass
cs

그냥 무식하게 돌려서 찾아내기~


썩섺스