- Viewed 3k times · 1 · answered Mar 28, 2013 at 14:04
Looks like a bug in PyPDF2. In this section:
    if string.startswith(codecs.BOM_UTF16_BE):
        retval = TextStringObject(string.decode("utf-16"))
        retval.autodetect_utf16 = True

it assumes that any string starting with (0xFE, 0xFF) can be decoded as UTF-16. Your file contains a bytestring that begins that way but then contains invalid UTF-16.
The simplest fix is to comment out that if statement and unconditionally use the branch marked with the comment "# This is probably a big performance hit here".
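A less invasive alternative to commenting out the check is to keep the BOM test but fall back when decoding fails. The sketch below is not PyPDF2's code: decode_pdf_text is a hypothetical helper, and the latin-1 fallback is only a stand-in for the library's slower branch, chosen because it never raises.

```python
import codecs


def decode_pdf_text(raw: bytes) -> str:
    """Decode a byte string, trusting a UTF-16 BOM only if the data is valid."""
    if raw.startswith(codecs.BOM_UTF16_BE):
        try:
            # The "utf-16" codec consumes the BOM and uses it to pick byte order.
            return raw.decode("utf-16")
        except UnicodeDecodeError:
            # BOM-like prefix, but the rest is not valid UTF-16: fall through.
            pass
    # Assumption: latin-1 stands in for the library's slower per-byte branch.
    return raw.decode("latin-1")


# Valid UTF-16-BE text with a BOM.
good = codecs.BOM_UTF16_BE + "héllo".encode("utf-16-be")
# A high surrogate (0xD800) followed by an ordinary character is invalid UTF-16.
bad = codecs.BOM_UTF16_BE + b"\xd8\x00\x00\x41"

print(decode_pdf_text(good))
print(repr(decode_pdf_text(bad)))
```

With this shape, strings that really are UTF-16 keep the fast path, and only the malformed ones pay for the fallback.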
Content under CC-BY-SA license
Jan 12, 2018 · UnicodeDecodeError: 'utf-16-le' codec can't decode bytes in position 84-85: illegal UTF-16 surrogate. v-chojas commented Jan 12, 2018: When it does work, do you occasionally see …
RFC 2781: UTF-16, an encoding of ISO 10646 - RFC Editor
Aug 19, 2011 · Note: UTF-16 does cover all of Unicode, as the Unicode Consortium decided that 10FFFF is the top of the Unicode range, defined UTF-8 as at most 4 bytes long, and explicitly excluded range 0xD800 …
Endcoding Errors - Python Help - Discussions on Python.org
FAQ - UTF-8, UTF-16, UTF-32 & BOM - Unicode
'utf-16-le' codec can't decode bytes in position 0-1: illegal …
Python pyodbc utf-16-le error - Dremio
Issue 12892: UTF-16 and UTF-32 codecs should reject (lone
Surrogates and Supplementary Characters - Win32 apps
s = s.decode("utf16") - UnicodeDecodeError: 'utf-16-le' codec can't ...
'utf-16-le' codec can't decode bytes while reading EXCEL in …
How to Create a UTF-16 Surrogate Pair by Hand, with Python
illegal UTF-16 surrogate · Issue #438 · simonw/sqlite-utils
pyODBC + unixodbc + Db2 for iSeries = UnicodeDecodeError, …
"illegal UTF-16 surrogate" exception when listPath called #185