Yahoo Developer Network


October 10, 2022

Moving from Mantle to Swift for JSON Parsing

<p>We recently converted one of our internal libraries from all Objective-C to all Swift. Along the way, we refactored how we parse JSON, moving from the third-party <em>Mantle</em> library to the native JSON decoding built into the Swift language and standard library.</p>
<p>In this post, I'll talk about the motivation for converting, the similarities and differences between the two tools, and the challenges we faced, including:</p>
<ul>
<li> <p><a href="#NestedJSONTypes">Handling nested JSON objects</a></p></li>
<li> <p><a href="#JSONObjectsofUnknownType">Dealing with JSON objects of unknown types</a></p></li>
<li> <p><a href="#IncrementalConversion">Performing an incremental conversion</a></p></li>
<li> <p><a href="#Bridging">Continuing to support Objective-C users</a></p></li>
<li> <p><a href="#Failures">Dealing with failures</a> </p></li>
</ul>
<h2><span>Introduction</span></h2>
<p><a href="https://www.swift.org">Swift</a> is Apple's modern programming language for building applications on all of their platforms. Introduced in June 2014, it succeeds Objective-C, an object-oriented superset of the C language from the early 1980s. The design goals for Swift were similar to those of a new crop of modern languages, such as Rust and Go, that provide a safer way to build applications, where the compiler plays a larger role in enforcing correct usage of types, memory access, collections, nil pointers, and more.</p>
<p>At Yahoo, adoption of Swift started slowly, as teams judiciously waited for the language to mature. But in the last few years, Swift has become the primary language for new code across the company. This is important not only for the safety reasons mentioned, but also for a better developer experience. Many engineers who started developing for iOS after 2014 have been using primarily Swift, and it's important to offer employees modern languages and codebases to work in. 
In addition to new code, the mobile org has been converting existing code when possible, both in apps and SDKs.</p>
<p>One recent migration was the MultiplexStream SDK. MultiplexStream is an internal library that fetches, caches, and merges streams of content. There is a subspec of the library specialized to fetch streams of Yahoo news articles and convert the returned JSON to data models.</p>
<p>During a Swift conversion, we try to avoid any refactoring or re-architecting, and instead aim for a line-for-line port. Even a one-to-one translation can introduce new bugs, and adding a refactor at the same time is risky. But sometimes rewriting can be unavoidable.</p>
<h2>JSON Encoding and Decoding</h2>
<p>The Swift language and its standard library have evolved to add features that are practical for application developers. One addition is native JSON encoding and decoding support. Creating types that can be automatically encoded to and decoded from JSON is a huge productivity boost.</p>
<p>Previously, developers would either manually parse JSON or use a third-party library to help reduce the tedious work of unpacking values, checking types, and setting the values on native object properties.</p>
<h2>Mantle</h2>
<p>MultiplexStream relied on the third-party Mantle SDK to help with parsing JSON to native data model objects. And Mantle is great: it has worked well in a number of Yahoo apps for a long time.</p>
<p>However, Mantle relies heavily on the dynamic features of the Objective-C language and runtime, which are not always available in Swift, and can run counter to the static, safe, and strongly-typed philosophy of Swift. In Objective-C, objects can be dynamically cast and coerced from one type to another. In Swift, the compiler enforces strict type checking and type inference, ruling out such unchecked casts. In Objective-C, methods can be called on objects at runtime whether they actually respond to them or not. 
In Swift, the compiler ensures that types implement the methods being called. In Objective-C, collections, such as Arrays and Dictionaries, can hold any type of object. In Swift, collections are homogeneous and the compiler guarantees they will only hold values of a pre-declared type.</p>
<p>For example, in Objective-C, every object has a <span style="color: rgb(102,102,153);"><code>-(id)valueForKey:(NSString*)key</code></span> method that, given a string matching a property name of the object, returns the value for the property from the instance.</p>
<p>But two things can go wrong here:</p>
<ol>
<li>The string may not reference an actual property of the object. This crashes at runtime.</li>
<li>Notice the <code>id</code> return type. This is the generic "could be anything" placeholder. The caller must cast the <code>id</code> to what they expect it to be. But if you expect it to be a string, yet somehow it is a number instead, calling string methods on the number will crash at runtime.</li>
</ol>
<p>Similarly, every Objective-C object has a <span style="color: rgb(102,102,153);"><code>-(void)setValue:(id)value forKey:(NSString*)key</code></span> method that, again, takes a string property name and an object of any type. But use the wrong string or wrong value type and, again, boom.</p>
<p>Mantle uses these dynamic Objective-C features to support decoding from JSON payloads, essentially saying, "provide me with the string keys you expect to see in your JSON, and I'll call <code>setValue:forKey:</code> on your objects for each value in the JSON." Whether it is the type you are expecting is another story.</p>
<p>Back-end systems work hard to fulfill their API contracts, but it isn't unheard of in a JSON object to receive a string instead of a float. Or to omit keys you expected to be present.</p>
<p>Swift wanted to avoid these sorts of problems. 
Instead, the deserialization code is synthesized by the compiler at compile time, using language features to ensure safety.</p>
<h2 id="NestedJSONTypes">Nested JSON Types</h2>
<p>Our primary data model object, <code>Article</code>, represents a news article. Its API includes all the things you might expect, such as:</p>
<p>Public interface:</p>
<table><tr><td><pre>class Article {
    var id: String
    var headline: String
    var author: String
    var imageURL: String
}</pre></td></tr></table>
<p>The reality is that these values come from various objects deeply nested in the JSON object structure.</p>
<p>JSON:</p>
<table><tr><td><pre>{
  "id": "1234",
  "content": {
    "headline": "Apple Introduces Swift Language",
    "author": {
      "name": "John Appleseed",
      "imageURL": "..."
    },
    "image": {
      "url": "www..."
    }
  }
}</pre></td></tr></table>
<p>In Mantle, you would supply a dictionary that maps property names to JSON key paths:</p>
<table><tr><td><pre>{
  "id": "id",
  "headline": "content.headline",
  "author": "content.author.name",
  "imageURL": "content.image.url"
}</pre></td></tr></table>
<p>In Swift, you declare multiple types that match the JSON payload 1:1:</p>
<table><tr><td><pre>class Article: Codable {
    var id: String
    var content: Content
}

class Content: Codable {
    var headline: String
    var author: Author
    var image: Image
}

class Author: Codable {
    var name: String
    var imageURL: String
}

class Image: Codable {
    var url: String
}</pre></td></tr></table>
<p>We wanted to keep the <code>Article</code> interface the same, so we provide computed properties to surface the same API and handle the traversal of the object graph:</p>
<table><tr><td><pre>class Article: Codable {
    var id: String
    private var content: Content

    var headline: String { content.headline }
    var author: String { content.author.name }
    var imageURL: String { content.image.url }
}</pre></td></tr></table>
<p>This 
approach increases the number of types you create, but gives a clearer view of what the entities look like on the server. But for the client, the end result is the same: Values are easy to access on the object, abstracting away the underlying data structure.</p>
<h2 id="JSONObjectsofUnknownType">JSON Objects of Unknown Type</h2>
<p>In a perfect world, we know up front the keys and corresponding types of every value we might receive from the server. However, this is not always the case.</p>
<p>In Mantle, we can specify a property to be of type <code>NSDictionary</code> and call it a day. We could receive a dictionary of <code>[String:String]</code>, <code>[String:NSNumber]</code>, or even <code>[String: NSDictionary]</code>.</p>
<p>Using Swift’s JSON decoding, the types need to be specified up front. If we say we expect a Dictionary, we need to specify "a dictionary of what types?"</p>
<p>Others have faced this problem, and one of the solutions that has emerged in the Swift community is to create a type that can represent any type of JSON value.</p>
<p>Your first thought might be to write a Dictionary of <code>[String:Any]</code>. But for a Dictionary to be Codable, its keys and values must also be Codable. <code>Any</code> is not Codable: it could be a UIView, which clearly can't be decoded from JSON. So instead we want to say, “we expect any type that is itself Codable.” Unfortunately there is no AnyCodable type in Swift. But we can write our own!</p>
<p>There are a finite number of types the server can send as JSON values. What is good for representing finite choices in Swift? Enums. Let’s model those cases first:</p>
<table><tr><td><pre>enum AnyDecodable {
    case int
    case float
    case bool
    case string
    case array
    case dictionary
    case none
}</pre></td></tr></table>
<p>So we can say we expect a <code>[String: AnyDecodable]</code> Dictionary. The enum case will describe the type that was in the field. 
But what is the actual value?</p>
<p>Enums in Swift can have associated values! So now our enum becomes:</p>
<table><tr><td><pre>enum AnyDecodable {
    case int(Int)
    case float(Float)
    case bool(Bool)
    case string(String)
    case array([AnyDecodable])
    case dictionary([String: AnyDecodable])
    case none
}</pre></td></tr></table>
<p>We're almost done. Just because we have described what we would like to see doesn't mean the system can make it happen. We're outside the realm of automatic synthesis here. We need to implement the manual encode/decode functions so that when the JSONDecoder encounters a type we've said to be <code>AnyDecodable</code>, it can call the encode or decode method on the type, passing in what is essentially the untyped raw data:</p>
<table><tr><td><pre>extension AnyDecodable: Codable {
    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        // Try each supported JSON type in turn; the first successful decode wins.
        if let int = try? container.decode(Int.self) {
            self = .int(int)
        } else if let string = try? container.decode(String.self) {
            self = .string(string)
        } else if let bool = try? container.decode(Bool.self) {
            self = .bool(bool)
        } else if let float = try? container.decode(Float.self) {
            self = .float(float)
        } else if let array = try? container.decode([AnyDecodable].self) {
            self = .array(array)
        } else if let dict = try? 
container.decode([String: AnyDecodable].self) {
            self = .dictionary(dict)
        } else {
            // Anything else, including JSON null, is represented as .none.
            self = .none
        }
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        switch self {
        case .int(let int): try container.encode(int)
        case .float(let float): try container.encode(float)
        case .bool(let bool): try container.encode(bool)
        case .string(let string): try container.encode(string)
        case .array(let array): try container.encode(array)
        case .dictionary(let dictionary): try container.encode(dictionary)
        case .none: try container.encodeNil()
        }
    }
}</pre></td></tr></table>
<p>We've implemented functions that, at runtime, can deal with a value of unknown type, test to find out what type it actually is, and then wrap it in an instance of our AnyDecodable type, along with the actual value.</p>
<p>We can now create a Codable type such as:</p>
<table><tr><td><pre>struct Article: Codable {
    var headline: String
    var sportsMetadata: AnyDecodable
}</pre></td></tr></table>
<p>In our use case, as a general purpose SDK, we don't know much about <code>sportsMetadata</code>. It is a part of the payload defined between the Sports app and their editorial staff.</p>
<p>When the Sports app wants to use the <code>sportsMetadata</code> property, they must switch over it and unwrap the associated value. So if they expect it to be a String:</p>
<table><tr><td><pre>switch article.sportsMetadata {
case .string(let str):
    label.text = str
default:
    break
}</pre></td></tr></table>
<p>Or using "<a href="https://goshdarnifcaseletsyntax.com">if case let</a>" syntax:</p>
<table><tr><td><pre>if case let AnyDecodable.string(str) = article.sportsMetadata {
    label.text = str
}</pre></td></tr></table>
<h2 id="IncrementalConversion">Incremental Conversion</h2>
<p>During conversion it was important to migrate incrementally. 
Pull requests should be fairly small, tests should continue to run and pass, and build systems should continue to verify building on all supported platforms in various configurations.</p>
<p>We identified the tree structure of the SDK and began converting the leaf nodes first, usually converting a class or two at a time.</p>
<p>But for the data models, converting the leaf nodes from using Mantle to Codable was not possible. You cannot easily mix the two worlds: specifying a root object as Mantle means all of the leaves need to use Mantle also. Likewise for Codable objects.</p>
<p>Instead, we created a parallel set of Codable models with a <code>_Swift</code> suffix, and as we added them, we also added unit tests to verify our work in progress. Once we finished creating a parallel set of objects, we deleted the old objects and removed the <code>_Swift</code> suffix from the new ones. Because the public API remained the same, the old tests didn’t need to change.</p>
<h2 id="Bridging">Bridging</h2>
<p>Some Swift types cannot be represented in Objective-C:</p>
<table><tr><td><pre>@objcMembers class Article: NSObject {
    ...
    var readTime: Int?
}</pre></td></tr></table>
<p>Bridging the Int to Obj-C results in a value type of NSInteger. But optionality is expressed in Objective-C with nil pointers, and only NSObjects, as reference types, have pointers.</p>
<p>So the existing Objective-C API might look like this:</p>
<table><tr><td><pre>@property (nonatomic, nullable, strong) NSNumber *readTime; </pre></td></tr></table>
<p>Since we can't write <code>var readTime: Int?</code>, and <code>NSNumber</code> isn't Codable, we can instead write a computed property to keep the same API:</p>
<table><tr><td><pre>@objcMembers class Article: NSObject {
    private var _readTime: Int?
    public var readTime: NSNumber? 
{
        if let time = _readTime {
            return NSNumber(value: time)
        } else {
            return nil
        }
    }
}</pre></td></tr></table>
<p>Lastly, we need to let the compiler know to map our private <code>_readTime</code> variable to the <code>readTime</code> key in the JSON dictionary. We achieve this using CodingKeys:</p>
<table><tr><td><pre>@objcMembers class Article: NSObject {
    private var _readTime: Int?
    public var readTime: NSNumber? {
        if let time = _readTime {
            return NSNumber(value: time)
        } else {
            return nil
        }
    }

    enum CodingKeys: String, CodingKey {
        case _readTime = "readTime"
        ...
    }
}</pre></td></tr></table>
<h2 id="Failures">Failures</h2>
<p>Swift's relentless focus on safety means there is no room for error. An article struct defined as having a non-optional headline must have one. And if one out of 100 articles in a JSON response is missing a headline, the entire parsing operation will fail.</p>
<p>People may think (myself included), "just omit the one article that failed." But there are cases where the integrity of the data falls apart if it is incomplete. A bank account payload that states a balance of $100, yet the list of transactions sums to $99 because we skipped one that didn't have a location field, would be a bad experience.</p>
<p>The solution here is to mark fields that may or may not be present as optional. It can lead to messier code, with users constantly unwrapping values, but it better reflects the reality that fields can be missing.</p>
<p>If a type declares an article identifier to be an integer, and the server sends a String instead, the parsing operation will throw an error. Swift will not do implicit type conversion.</p>
<p>The good news is that these failures do not crash, but instead throw (and provide excellent error diagnostics about what went wrong).</p>
<h2>Conclusion</h2>
<p>A conversion like this really illustrates some of the fundamental differences between Objective-C and Swift. 
While some things may appear to be easier in Objective-C, such as dealing with unknown JSON types, the cost is in sharp edges that can cut in production. I do not mind paying a bit more at development time to save in the long run.</p><p>The unit tests around our model objects were a tremendous help. Because we kept the same API, once the conversion was complete, they verified everything worked as before. These tests used static JSON files of server responses and validated our objects contained correct values.</p><p>The Swift version of MultiplexStream shipped in the Yahoo News app in April 2022. So far, no one has noticed (which was the goal). But hopefully the next developer that goes in to work on MultiplexStream will.</p><h2>Resources</h2><p><a href="https://developer.apple.com/documentation/foundation/archives_and_serialization/encoding_and_decoding_custom_types">Apple Article on Encoding and Decoding Custom Types</a></p><p><a href="https://developer.apple.com/documentation/swift/migrating_your_objective-c_code_to_swift">Apple Migration Doc</a></p><p><a href="https://developer.apple.com/documentation/swift/imported_c_and_objective-c_apis/importing_objective-c_into_swift">Obj-C to Swift Interop</a></p><p><a href="https://developer.apple.com/documentation/swift/imported_c_and_objective-c_apis/importing_swift_into_objective-c">Swift to Obj-C Interop</a></p><h2>Author</h2><h4>Jason Howlin</h4><p>Senior Software Mobile Apps Engineer</p>
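<p>As a closing illustration of the behavior described under Failures, here is a minimal sketch of how one missing field fails a whole decode, and how marking the field optional avoids that. The <code>StrictArticle</code> and <code>LenientArticle</code> types, the <code>strictFailure(for:)</code> helper, and the sample payload are all hypothetical names invented for this example; they are not the MultiplexStream models.</p>
<table><tr><td><pre>import Foundation

// Hypothetical models for illustration only.
struct StrictArticle: Decodable {
    let id: String
    let headline: String      // non-optional: must be present in the JSON
}

struct LenientArticle: Decodable {
    let id: String
    let headline: String?     // optional: may be absent from the JSON
}

// Sample payload: the second article has no "headline" key.
let json = """
[
  { "id": "1", "headline": "Apple Introduces Swift Language" },
  { "id": "2" }
]
""".data(using: .utf8)!

// Hypothetical helper: describes why strict decoding failed, or returns nil on success.
func strictFailure(for data: Data) -> String? {
    do {
        _ = try JSONDecoder().decode([StrictArticle].self, from: data)
        return nil
    } catch DecodingError.keyNotFound(let key, _) {
        return "missing key: \(key.stringValue)"
    } catch DecodingError.typeMismatch(_, let context) {
        return "type mismatch at: \(context.codingPath.map { $0.stringValue })"
    } catch {
        return "other error: \(error)"
    }
}

// The strict model rejects the entire array because of one incomplete element.
let failure = strictFailure(for: json)

// The lenient model decodes both elements; the missing headline becomes nil.
let lenient = try JSONDecoder().decode([LenientArticle].self, from: json)

print(failure ?? "decoded")
print(lenient.count)</pre></td></tr></table>
<p>In this sketch, the strict decode throws rather than crashing, and the caught <code>DecodingError</code> names the missing key. Whether a field should be optional remains a judgment call about what the data means, as the bank account example in the Failures section suggests.</p>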
