StackOverflow Adventures: DocumentDb and inheritance
I stumbled upon a question on StackOverflow. Here's the gist of it: the OP wanted to store instances of different descendants of a class in DocumentDb and then read them back, but it wasn't working. So what's the problem?
Inheritance, Json, Json.NET
Now unfortunately the DocumentDb SDK is not open-source. But I'm guessing it uses Json.NET to create the Json document that is submitted to DocumentDb. How do I know that? Well, for one, Json.NET is one of the dependencies of the DocumentDb SDK. And let's be honest, why would anyone (Microsoft especially) use anything else?
So now that we have cleared that up, let's look at a simplified scenario. Forget about DocumentDb for a bit and just look at this structural example:
public class Vehicle
{
    public string Id { get; set; }
}

public class Car : Vehicle
{
    public string CarProperty { get; set; }
}

public class Truck : Vehicle
{
    public string TruckProperty { get; set; }
}
So I have a Vehicle class, with two descendants, Car and Truck. Let's create a list of vehicles and try to serialize it:
var vehicles = new List<Vehicle>
{
    new Car
    {
        Id = Guid.NewGuid().ToString(),
        CarProperty = Guid.NewGuid().ToString()
    },
    new Truck
    {
        Id = Guid.NewGuid().ToString(),
        TruckProperty = Guid.NewGuid().ToString()
    }
};
string s = JsonConvert.SerializeObject(vehicles);
This yields the following JSON:
[
  {
    "CarProperty": "b3268e04-e201-4dcf-a159-af28d5b62d4f",
    "Id": "f84fbb7a-548a-4081-a10d-cc43e42cbb4c"
  },
  {
    "TruckProperty": "e6b47210-f021-43ff-8c44-a8f09036d516",
    "Id": "5f37ca24-210e-43d6-b27d-d6710d70ddd3"
  }
]
Seems OK. So let's try to deserialize the content:
var vehicles2 = JsonConvert.DeserializeObject<List<Vehicle>>(s);
If you check the result in the debugger, you'll see that this is most probably not what you wanted. Both objects are of type Vehicle, but you wanted a Car and a Truck. Also, the properties specific to Car and Truck are lost during deserialization.
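To see the problem without a debugger, here is a minimal sketch of the same round trip as a small console program. The PlainRoundTrip class and the fixed Id values are mine, purely for illustration, and the sketch assumes the Newtonsoft.Json package is referenced.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;

public class Vehicle { public string Id { get; set; } }
public class Car : Vehicle { public string CarProperty { get; set; } }
public class Truck : Vehicle { public string TruckProperty { get; set; } }

// Hypothetical demo class, not part of the original sample.
public static class PlainRoundTrip
{
    public static List<Vehicle> Run()
    {
        var vehicles = new List<Vehicle>
        {
            new Car { Id = "1", CarProperty = "car" },
            new Truck { Id = "2", TruckProperty = "truck" }
        };

        // No type information is written, so the JSON only contains the property values.
        string json = JsonConvert.SerializeObject(vehicles);

        // The only type Json.NET knows about here is the declared element type, Vehicle.
        return JsonConvert.DeserializeObject<List<Vehicle>>(json);
    }

    public static void Main()
    {
        foreach (Vehicle v in Run())
        {
            // Print the runtime type of each deserialized element.
            Console.WriteLine($"{v.GetType().Name}: Id={v.Id}");
        }
    }
}
```

Both elements report Vehicle as their runtime type. Since Vehicle has no CarProperty or TruckProperty, Json.NET by default skips those values while reading the JSON.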
How can this be helped? Well, you can tell Json.NET to handle types as well. You have to use the TypeNameHandling setting, which is designed for this exact purpose.
Before serialization, set TypeNameHandling to Auto in the global Json.NET configuration:
var vehicles = new List<Vehicle>
{
    new Car
    {
        Id = Guid.NewGuid().ToString(),
        CarProperty = Guid.NewGuid().ToString()
    },
    new Truck
    {
        Id = Guid.NewGuid().ToString(),
        TruckProperty = Guid.NewGuid().ToString()
    }
};
JsonConvert.DefaultSettings =
    () => new JsonSerializerSettings { TypeNameHandling = TypeNameHandling.Auto };
string s = JsonConvert.SerializeObject(vehicles);
If you check the JSON again, you'll see that it now includes the type name of each object:
[
  {
    "$type": "DocumentDB.GetStarted.Program+Car, DocDBGetStarted",
    "CarProperty": "17ad43ed-5788-4c7c-bb51-74306242b37d",
    "Id": "a35997a6-98bd-4b34-9b42-24e3fa2e45c2"
  },
  {
    "$type": "DocumentDB.GetStarted.Program+Truck, DocDBGetStarted",
    "TruckProperty": "18765705-9e89-4b06-a3d0-3f827c6b8655",
    "Id": "5d19175a-3502-49d1-a6a2-6cc4094edc12"
  }
]
And this does the trick. If you go ahead and do the deserialization now, the correct types will be added to the list, with all the properties. Awesome! So with that in mind, let's try to upload Json documents with this setting on, and then query them.
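Here is a minimal sketch of the full round trip with type names turned on. TypedRoundTrip and the fixed values are hypothetical names of mine. I pass the settings to each call instead of setting them globally, only to keep the sketch free of global state.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;

public class Vehicle { public string Id { get; set; } }
public class Car : Vehicle { public string CarProperty { get; set; } }
public class Truck : Vehicle { public string TruckProperty { get; set; } }

// Hypothetical demo class, not part of the original sample.
public static class TypedRoundTrip
{
    public static List<Vehicle> Run()
    {
        var settings = new JsonSerializerSettings
        {
            // Write "$type" whenever the runtime type differs from the declared type.
            TypeNameHandling = TypeNameHandling.Auto
        };

        var vehicles = new List<Vehicle>
        {
            new Car { Id = "1", CarProperty = "car" },
            new Truck { Id = "2", TruckProperty = "truck" }
        };

        // The list elements are declared as Vehicle, so each one gets a "$type" entry.
        string json = JsonConvert.SerializeObject(vehicles, settings);

        // The same settings are needed when reading, otherwise "$type" is not honored.
        return JsonConvert.DeserializeObject<List<Vehicle>>(json, settings);
    }

    public static void Main()
    {
        foreach (Vehicle v in Run())
        {
            Console.WriteLine(v.GetType().Name);
        }
    }
}
```

This time the first element comes back as a Car and the second as a Truck, with their own properties filled in.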
Adding DocumentDb into the mix
So let's try the whole thing in DocumentDb. Before we go ahead, let's modify the Vehicle class a bit. Decorate the Id property with an attribute so that it is serialized as the lowercase id field that DocumentDb documents use:
public class Vehicle
{
    [JsonProperty(PropertyName = "id")]
    public string Id { get; set; }
}
And here's the source code (based on Microsoft's sample) that creates a database and a collection if they don't exist yet, adds two documents (one Car and one Truck), and then queries all the data back into a list. Creating the database and the collection is intentionally omitted because it's not interesting for us right now; you can check out the full source at the end of the post. Note that the global JSON configuration is set before the documents are created:
private async Task GetStartedDemo()
{
    string databaseName = "inheritanceTest";
    string collectionName = "vehicles";
    this.client = new DocumentClient(new Uri(EndpointUri), PrimaryKey);
    Uri collectionUri = UriFactory.CreateDocumentCollectionUri(databaseName, collectionName);

    // create database
    // create collection

    Vehicle v1 = new Car { CarProperty = Guid.NewGuid().ToString() };
    Vehicle v2 = new Truck { TruckProperty = Guid.NewGuid().ToString() };

    JsonConvert.DefaultSettings =
        () => new JsonSerializerSettings { TypeNameHandling = TypeNameHandling.All };

    await this.client.CreateDocumentAsync(collectionUri, v1);
    await this.client.CreateDocumentAsync(collectionUri, v2);

    IQueryable<Vehicle> vehicleQuery = this.client.CreateDocumentQuery<Vehicle>(collectionUri);
    var results = vehicleQuery.ToList();
}
And this really does work! Awesome... almost.
If you look closely, you can see that I used All instead of Auto. I still don't know for sure why that's needed. Probably the internal serialization mechanism that the DocumentDb SDK uses circumvents the Auto option.
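One plausible explanation, which I can't confirm without the SDK source, is documented Json.NET behavior. With Auto, a type name is written only when an object's runtime type differs from its declared type, and a root object passed in as a plain object has no declared type unless one is given explicitly. In the list example the elements were declared as Vehicle, so Auto worked. If the SDK serializes each document as a root object, Auto would never emit the type. The sketch below reproduces the difference with plain Json.NET; RootTypeDemo and its method names are hypothetical.

```csharp
using System;
using Newtonsoft.Json;

public class Vehicle { public string Id { get; set; } }
public class Car : Vehicle { public string CarProperty { get; set; } }

// Hypothetical demo class, not part of the original sample or the SDK.
public static class RootTypeDemo
{
    private static JsonSerializerSettings With(TypeNameHandling handling)
    {
        return new JsonSerializerSettings { TypeNameHandling = handling };
    }

    // Root object with no declared type: Auto has nothing to compare against.
    public static string AutoAsObject(Vehicle v)
    {
        return JsonConvert.SerializeObject(v, With(TypeNameHandling.Auto));
    }

    // Root object with a declared type of Vehicle: Auto can see that a Car differs from it.
    public static string AutoWithDeclaredType(Vehicle v)
    {
        return JsonConvert.SerializeObject(v, typeof(Vehicle), With(TypeNameHandling.Auto));
    }

    // All writes the type name for every object, root included.
    public static string AllAsObject(Vehicle v)
    {
        return JsonConvert.SerializeObject(v, With(TypeNameHandling.All));
    }

    public static void Main()
    {
        // The static type of the variable does not matter; the parameter is just object.
        Vehicle car = new Car { Id = "1", CarProperty = "car" };
        Console.WriteLine(AutoAsObject(car));
        Console.WriteLine(AutoWithDeclaredType(car));
        Console.WriteLine(AllAsObject(car));
    }
}
```

Only the first call leaves out the "$type" entry, which matches what I saw when uploading documents with Auto.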
And to be honest, I'm not really a fan of the global JSON configuration. Now all your data will contain the type, which might not be what you want. But I haven't found any way to tell the DocumentDb SDK to use custom serializer settings, and the SDK is not open source, so I guess we're stuck with this solution right now.
Of course, another option is to create the JSON yourself using Json.NET and a custom serializer object with custom serialization settings, and then use the REST API of DocumentDb to post the documents. But implementing that (and the same for the deserialization) might be more work than you can afford. Also, isn't that what the SDK is for? So I'd say good enough.
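For the do-it-yourself route, the Json.NET half might look like the sketch below. VehicleJson, ToJson and FromJson are hypothetical names of mine, and the REST call that would actually post the document is left out entirely. The point is only that a serializer created with JsonSerializer.Create carries its own settings and does not pick up JsonConvert.DefaultSettings.

```csharp
using System;
using System.IO;
using Newtonsoft.Json;

public class Vehicle
{
    [JsonProperty(PropertyName = "id")]
    public string Id { get; set; }
}
public class Car : Vehicle { public string CarProperty { get; set; } }
public class Truck : Vehicle { public string TruckProperty { get; set; } }

// Hypothetical helper, not part of the DocumentDb SDK.
public static class VehicleJson
{
    // JsonSerializer.Create does not use JsonConvert.DefaultSettings,
    // so the type name handling stays local to this helper.
    private static JsonSerializer CreateSerializer()
    {
        return JsonSerializer.Create(
            new JsonSerializerSettings { TypeNameHandling = TypeNameHandling.Auto });
    }

    public static string ToJson(Vehicle vehicle)
    {
        using (var writer = new StringWriter())
        {
            // Declaring the root type as Vehicle lets Auto write "$type" for subclasses.
            CreateSerializer().Serialize(writer, vehicle, typeof(Vehicle));
            return writer.ToString();
        }
    }

    public static Vehicle FromJson(string json)
    {
        using (var reader = new StringReader(json))
        {
            // "$type" in the JSON selects the concrete class to instantiate.
            return (Vehicle)CreateSerializer().Deserialize(reader, typeof(Vehicle));
        }
    }
}
```

A call such as VehicleJson.FromJson(VehicleJson.ToJson(new Truck { Id = "2", TruckProperty = "truck" })) should hand back a Truck, while other code that uses JsonConvert keeps its default, type-free output.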
Check out the sample source code on GitHub.