Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

cocoa touch - Guess encoding when creating an NSString from NSData

When reading an NSString from a file I can use initWithContentsOfFile:usedEncoding:error: and it will guess the encoding of the file.

When I create it from an NSData, though, my only option is initWithData:encoding:, where I have to pass the encoding explicitly. How can I reliably guess the encoding when I work with NSData instead of files?

1 Reply

iOS 8 and OS X 10.10 added an API on NSString for this:

Objective-C

+ (NSStringEncoding)stringEncodingForData:(NSData *)data
                          encodingOptions:(NSDictionary *)opts
                          convertedString:(NSString **)string
                      usedLossyConversion:(BOOL *)usedLossyConversion;

Swift

open class func stringEncoding(for data: Data,
                   encodingOptions opts: [StringEncodingDetectionOptionsKey : Any]? = nil, 
                 convertedString string: AutoreleasingUnsafeMutablePointer<NSString?>?, 
                    usedLossyConversion: UnsafeMutablePointer<ObjCBool>?) -> UInt

Now you can let the framework do the guessing, and in my experience it works really well!

From the header (at the time of writing, the documentation did not cover the method, but it was officially mentioned in WWDC Session 204, page 270), the possible items for the options dictionary are:

  1. an array of suggested string encodings (without specifying the 3rd option in this list, all string encodings are considered but the ones in the array will have a higher preference; moreover, the order of the encodings in the array is important: the first encoding has a higher preference than the second one in the array)
  2. an array of string encodings not to use (the string encodings in this list will not be considered at all)
  3. a boolean option indicating whether only the suggested string encodings are considered
  4. a boolean option indicating whether lossy conversion is allowed
  5. an option that gives a specific string to substitute for mystery bytes
  6. the current user's language
  7. a boolean option indicating whether the data is generated by Windows

If the values in the dictionary have wrong types (for example, the value of NSStringEncodingDetectionSuggestedEncodingsKey is not an array), an exception is thrown.

If the values in the dictionary are unknown (for example, the value in the array of suggested string encodings is not a valid encoding), the values will be ignored.
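
To make the options above concrete, here is a minimal sketch of passing a few of them from Swift. It assumes an Apple platform (the method may not be available in Foundation on other platforms), and the sample bytes and variable names are made up for illustration. The key names are the Swift spellings of the NSStringEncodingDetection… constants as I remember them, so check them against your SDK.

```swift
import Foundation

// Illustrative input: "café" encoded as ISO Latin 1 (0xE9 is "é").
// These bytes are not valid UTF-8.
let latin1Data = Data([0x63, 0x61, 0x66, 0xE9])

// Hints for the detector. Encodings are passed as raw NSStringEncoding values,
// and the order of the suggested encodings expresses preference.
let options: [StringEncodingDetectionOptionsKey: Any] = [
    .suggestedEncodingsKey: [String.Encoding.utf8.rawValue,
                             String.Encoding.isoLatin1.rawValue],
    .useOnlySuggestedEncodingsKey: true,  // consider nothing outside the list
    .allowLossyKey: false                 // do not accept lossy conversions
]

var converted: NSString?
var usedLossy: ObjCBool = false
let rawEncoding = NSString.stringEncoding(for: latin1Data,
                                          encodingOptions: options,
                                          convertedString: &converted,
                                          usedLossyConversion: &usedLossy)

// A raw value of 0 means that no encoding could be detected.
if rawEncoding != 0, let text = converted {
    print("Detected \(String.Encoding(rawValue: rawEncoding)): \(text)")
} else {
    print("No encoding detected")
}
```

Restricting the detector to a short list like this is mostly useful when you already know which encodings your input can realistically have. Without the "only suggested" flag, the list just biases the guess.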

Example (Swift):

// `data` is the Data value whose encoding should be guessed.
var convertedString: NSString?
let encoding = NSString.stringEncoding(for: data, encodingOptions: nil, convertedString: &convertedString, usedLossyConversion: nil)

If you just want the decoded string and don't care about the encoding, you can drop the let encoding = part (assigning the result to _ instead avoids an unused-result warning).
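
If you need this in more than one place, the call can be wrapped in a small helper. The sketch below is one possible way to do it, again assuming an Apple platform. `detectString(in:)` is a hypothetical name, not a Foundation API, and the treatment of a 0 return value as "nothing detected" is based on my reading of the header comment.

```swift
import Foundation

/// Hypothetical helper (not part of Foundation): returns the decoded string
/// together with the detected encoding, or nil when detection fails.
func detectString(in data: Data) -> (string: String, encoding: String.Encoding)? {
    var converted: NSString?
    let raw = NSString.stringEncoding(for: data,
                                      encodingOptions: nil,
                                      convertedString: &converted,
                                      usedLossyConversion: nil)
    // Assumption: a raw value of 0 signals that no encoding could be detected.
    guard raw != 0, let string = converted as String? else { return nil }
    return (string, String.Encoding(rawValue: raw))
}

// Usage: decode some bytes without knowing their encoding up front.
if let result = detectString(in: Data("Hello, world".utf8)) {
    print(result.string, result.encoding)
}
```

Returning an optional tuple keeps the failure case explicit, so callers cannot accidentally use an empty or partially converted string.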

