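Daddy, string str = File.ReadAllText("BD-6-B-8.nc1") doesn't specify an encoding, so it uses the system encoding by default. When I write the result with File.WriteAllText("8.txt",str), the Chinese text is garbled. With string str = File.ReadAllText("BD-6-B-8.nc1",Encoding.Default); where Encoding.Default means the system encoding, writing with File.WriteAllText("8.txt",str); does not garble the Chinese text. Why is that?
First of all, the statement "string str = File.ReadAllText("BD-6-B-8.nc1") doesn't specify an encoding, so it uses the system encoding by default" is simply wrong. Reading the API documentation carefully beats anything else: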
This method opens a file, reads all the text in the file, and returns it as a string. It then closes the file.
This method attempts to automatically detect the encoding of a file based on the presence of byte order marks. It automatically recognizes UTF-8, little-endian UTF-16, big-endian UTF-16, little-endian UTF-32, and big-endian UTF-32 text if the file starts with the appropriate byte order marks.
This method uses UTF-8 encoding without a Byte-Order Mark (BOM), so using the GetPreamble method will return an empty byte array.
Different computers can use different encodings as the default, and the default encoding can even change on a single computer. Therefore, data streamed from one computer to another or even retrieved at different times on the same computer might be translated incorrectly. In addition, the encoding returned by the Default property uses best-fit fallback to map unsupported characters to characters supported by the code page. For these two reasons, using the default encoding is generally not recommended. To ensure that encoded bytes are decoded properly, your application should use a Unicode encoding, such as UTF8Encoding or UnicodeEncoding, with a preamble. Another option is to use a higher-level protocol to ensure that the same format is used for encoding and decoding.
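The behaviour quoted above can be checked with a small sketch. It assumes the file holds GBK bytes with no BOM, which is only a guess about BD-6-B-8.nc1. The class and method names are made up for illustration, and the GBK bytes are hard-coded so that no code page provider is needed.

```csharp
using System;
using System.IO;
using System.Text;

static class ReadAllTextDemo
{
    // The bytes of "中文" in GBK (code page 936), hard-coded for the demo.
    static readonly byte[] GbkBytes = { 0xD6, 0xD0, 0xCE, 0xC4 };

    // Writes the GBK bytes to a temp file and reads them back with no encoding argument.
    public static string ReadGbkFileWithoutEncoding()
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllBytes(path, GbkBytes);
            // The file has no BOM, so ReadAllText decodes the bytes as UTF-8.
            return File.ReadAllText(path);
        }
        finally
        {
            File.Delete(path);
        }
    }

    // Writes text with the two-argument WriteAllText and returns the raw bytes on disk.
    public static byte[] WriteWithoutEncoding(string text)
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, text);
            return File.ReadAllBytes(path);
        }
        finally
        {
            File.Delete(path);
        }
    }

    static void Main()
    {
        // These GBK bytes are not valid UTF-8, so the decoder substitutes U+FFFD.
        string garbled = ReadGbkFileWithoutEncoding();
        Console.WriteLine(garbled.IndexOf('\uFFFD') >= 0);

        // "\u4E2D\u6587" is "中文"; each character takes 3 bytes in UTF-8 and no BOM is written.
        byte[] written = WriteWithoutEncoding("\u4E2D\u6587");
        Console.WriteLine(written.Length);
    }
}
```

Once the replacement characters are in the string, the original bytes are lost, so writing it back out with any encoding cannot recover the Chinese text.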
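Also, don't address people as dad or mom when asking a question; just ask properly.
When reading and writing files, the choice of encoding is critical to handling Chinese and other non-ASCII characters correctly. Your problem comes from the difference between the system default encoding and the encoding actually used. Here is a detailed analysis of your situation: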
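File.ReadAllText("BD-6-B-8.nc1") does not use the system default encoding when no encoding is specified. It looks for a byte order mark and, if none is found, decodes the file as UTF-8. If the Chinese characters in BD-6-B-8.nc1 were saved in some other encoding (for example GBK, the usual ANSI code page on Simplified Chinese Windows), the bytes are not valid UTF-8 and the text comes back garbled.
When File.WriteAllText("8.txt", str) is called, str is written as UTF-8 without a BOM. If str came from a ReadAllText call that had already decoded the file with the wrong encoding, 8.txt simply preserves the garbled characters.
When the encoding is given explicitly, as in File.ReadAllText("BD-6-B-8.nc1", Encoding.Default), the file is decoded with the system's ANSI code page (on .NET Framework; on .NET Core and .NET 5+, Encoding.Default is UTF-8). Since the output is then correct, that code page presumably matches the encoding the file was saved in, so str holds the right characters and writing it out as UTF-8 produces no garbling.
If you want Chinese characters to be handled correctly every time, use an explicit character encoding that matches the file. For a file saved as UTF-8, for example: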
string str = File.ReadAllText("BD-6-B-8.nc1", Encoding.UTF8);
File.WriteAllText("8.txt", str, Encoding.UTF8);
If the file was saved in GBK, you can do the following instead:
string str = File.ReadAllText("BD-6-B-8.nc1", Encoding.GetEncoding("GBK"));
File.WriteAllText("8.txt", str, Encoding.GetEncoding("GBK"));
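One caveat with the GBK variant: on .NET Core and .NET 5+, code page encodings such as GBK are not available until CodePagesEncodingProvider is registered. Below is a minimal sketch of converting a GBK file to UTF-8, assuming that runtime. GbkToUtf8 and ConvertFile are hypothetical names, and writing UTF-8 with a BOM instead of GBK is a choice made for this example, not something the thread asks for.

```csharp
using System;
using System.IO;
using System.Text;

static class GbkToUtf8
{
    // Hypothetical helper: decodes sourcePath as GBK and rewrites it as UTF-8 with a BOM.
    public static void ConvertFile(string sourcePath, string targetPath)
    {
        // Needed on .NET Core / .NET 5+, where code page 936 is not available by default.
        // On .NET Framework this line is unnecessary (the type ships in the
        // System.Text.Encoding.CodePages package there).
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        Encoding gbk = Encoding.GetEncoding(936); // 936 is the GBK code page
        string text = File.ReadAllText(sourcePath, gbk);

        // Emitting a BOM lets a later File.ReadAllText(targetPath) detect UTF-8 by itself.
        var utf8WithBom = new UTF8Encoding(encoderShouldEmitUTF8Identifier: true);
        File.WriteAllText(targetPath, text, utf8WithBom);
    }

    static void Main()
    {
        string source = Path.GetTempFileName();
        string target = Path.GetTempFileName();
        try
        {
            // The bytes of "中文" in GBK, standing in for the real input file.
            File.WriteAllBytes(source, new byte[] { 0xD6, 0xD0, 0xCE, 0xC4 });
            ConvertFile(source, target);

            // "\u4E2D\u6587" is "中文"; the comparison checks the round trip.
            Console.WriteLine(File.ReadAllText(target) == "\u4E2D\u6587");
        }
        finally
        {
            File.Delete(source);
            File.Delete(target);
        }
    }
}
```

If the program that consumes 8.txt expects GBK, pass the same gbk encoding to File.WriteAllText instead.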
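With Encoding.Default, garbled text appears whenever the system default encoding does not match the file's actual encoding. Specifying the encoding explicitly ensures that reading and writing handle the text the same way and avoids garbling. It is therefore strongly recommended to always use an explicit encoding when processing files that contain Chinese characters, to ensure compatibility and readability.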
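If the encoding of incoming files is not known in advance, one possible approach is to try strict UTF-8 first and fall back to a legacy code page only when the bytes are not valid UTF-8. This is only a sketch: DecodeUtf8OrFallback is a hypothetical helper, and the check is a heuristic, since some byte sequences are valid in both encodings. GetString also does not strip a leading BOM.

```csharp
using System;
using System.Text;

static class EncodingFallback
{
    // Hypothetical helper: try strict UTF-8 first, then a caller-supplied fallback encoding.
    public static string DecodeUtf8OrFallback(byte[] bytes, Encoding fallback)
    {
        // throwOnInvalidBytes: true raises an exception on invalid sequences
        // instead of silently replacing them with U+FFFD.
        var strictUtf8 = new UTF8Encoding(
            encoderShouldEmitUTF8Identifier: false,
            throwOnInvalidBytes: true);
        try
        {
            return strictUtf8.GetString(bytes);
        }
        catch (DecoderFallbackException)
        {
            return fallback.GetString(bytes);
        }
    }

    static void Main()
    {
        // Registration is required on .NET Core / .NET 5+ before code page 936 can be used.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding gbk = Encoding.GetEncoding(936);

        byte[] gbkBytes = { 0xD6, 0xD0, 0xCE, 0xC4 };              // "中文" in GBK
        byte[] utf8Bytes = Encoding.UTF8.GetBytes("\u4E2D\u6587"); // "中文" in UTF-8

        // Both inputs should decode to the same two characters.
        string fromGbk = DecodeUtf8OrFallback(gbkBytes, gbk);
        string fromUtf8 = DecodeUtf8OrFallback(utf8Bytes, gbk);
        Console.WriteLine(fromGbk == fromUtf8);
    }
}
```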