Proper storage is about creating a home for something so that minimal effort is required to find it and put it away.
—Geralin Thomas, Organizing Consultant
Winning with Records
Record types are a simple way of recording small groups of values. You define a set of names and corresponding types; then you can create, compare, and amend instances of these groupings with some extremely simple syntax. But behind this simplicity lies some powerful and well-thought-out functionality. Learn to wield record types effectively and you’ll be well on the way to becoming an expert F# developer. It’s also worth knowing when not to use record types and what the alternatives are in these circumstances. We’ll cover both explicitly declared named record types and also implicitly declared anonymous record types.
Record Type Basics
Declaring a record type
Instantiating record type instances
Note that at instantiation time, you don’t have to mention the name of the record type itself, just its fields. The exception to this is when two record types have field names in common, in which case you may have to prefix the first field name in the binding with the name of the record type you want, for example, { FileDescription.Path = ... .
Accessing record type fields using dot notation
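As a minimal sketch of the three steps just named (declaring, instantiating, and accessing fields), consider the following. The FileDescription type name and its Path field come from the note above; every other name here (the remaining fields and the FolderDescription type) is a hypothetical choice for illustration.

```fsharp
open System

// A named record type: a set of field labels and their types.
// Only Path is taken from the text; Name and LastModified are assumed.
type FileDescription =
    { Path : string
      Name : string
      LastModified : DateTime }

// A second, hypothetical record type that shares the Path label.
type FolderDescription =
    { Path : string
      FileCount : int }

// Prefixing the first field with the type name resolves the ambiguity
// caused by the shared Path label.
let readme =
    { FileDescription.Path = @"c:\docs"
      Name = "readme.txt"
      LastModified = DateTime(2024, 1, 1) }

// Fields are read using dot notation.
printfn "%s in %s" readme.Name readme.Path
```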
Record Types and Immutability
Declaring a record instance as mutable
Declaring record fields as mutable
“Amending” a record using copy and update
In a copy-and-update operation, all the fields of the new record are given the values from the original record, except those given new values in the with clause. Needless to say, the original record is unaffected. This is the idiomatic way to handle “changes” to record type instances.
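A short sketch of copy and update, using a hypothetical two-field record:

```fsharp
// Hypothetical record type for illustration.
type FileDescription =
    { Path : string
      Name : string }

let original = { Path = @"c:\docs"; Name = "draft.txt" }

// Every field is copied from 'original' except Name,
// which takes the value given in the 'with' clause.
let renamed = { original with Name = "final.txt" }

// The original instance is untouched.
printfn "%s -> %s" original.Name renamed.Name
```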
Default Constructors, Setters, and Getters
One downside to immutability by default: you may occasionally have problems with external code (particularly serialization and database code) failing to instantiate record types correctly, or throwing compilation errors about default constructors. In these cases, simply add the [<CLIMutable>] attribute to the record declaration. This causes the record to be compiled with a default constructor and getters and setters, which the external framework should find easier to cope with.
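The following sketch shows the attribute in place on a hypothetical Customer record. To stand in for what a reflection-based framework might do, it uses Activator.CreateInstance to build an instance through the parameterless constructor; real serializers and ORMs will differ in the details.

```fsharp
open System

// Hypothetical record that some external framework needs to construct.
[<CLIMutable>]
type Customer =
    { Id : int
      Name : string }

// From F# code, the type is still used like any other record.
let ada = { Id = 1; Name = "Ada" }

// Roughly what a reflection-based framework does: create an "empty"
// instance via the parameterless constructor, to be populated afterward
// through the generated setters.
let blank = Activator.CreateInstance(typeof<Customer>, true) :?> Customer

printfn "%d %s" ada.Id ada.Name
```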
Records vs. Classes
F# Object-Oriented class types vs. records
We’ll look properly at classes in Chapter 8, but it’s fairly easy to see what is going on here. The class we make is even immutable. So do we really need to bother with record types? In the next few sections, I’ll discuss some of the advantages (and a few disadvantages!) of using record types.
Structural Equality by Default
Representing latitude and longitude using a class
Some types are less equal than others
This is because classes in both F# and C# have what is called reference or referential equality by default, which means that to be considered equal, two values need to represent the same physical object in memory. Sometimes, as in the LatLon example, this is very much not what you want.
The conventional way around this in C# (and you can do the same for classes in F#) is to write custom code that decides whether two instances are equal in some meaningful sense. The trouble is in practice this is quite an endeavor, requiring you to override Object.Equals, implement System.IEquatable, override Object.GetHashCode, and (admittedly optionally) override the equality and inequality operators. Who has time for all that? (I will show how to do it in Chapter 8, just in case you do have time!)
Default structural (content) equality with record types
Do all the fields of your record implement the right equality?
Because they use different instances of the Surveyor class, the instances waterloo and waterloo2 aren’t considered equal, even though from a content point of view, the surveyors have the same name. If we had created one Surveyor instance in advance and used that same instance when creating each of the LatLon instances, waterloo and waterloo2 would have been equal again! The general solution to this would be either to use a record for the Surveyor type or override the Surveyor equality-checking logic. Although worth bearing in mind, this issue rarely comes up in practice.
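Here is a minimal sketch of both behaviors: content equality for a plain record and the pitfall when a field is a class instance. The type and value names echo the discussion above, but the exact definitions (including the SurveyedLatLon name and the surveyor's name) are assumptions.

```fsharp
// A record: equality is decided field by field.
type LatLon =
    { Latitude : float
      Longitude : float }

let plainA = { Latitude = 51.5031; Longitude = -0.1132 }
let plainB = { Latitude = 51.5031; Longitude = -0.1132 }
let sameContent = (plainA = plainB)            // true

// A class: reference equality by default.
type Surveyor(name : string) =
    member this.Name = name

// A record with a field whose type only has reference equality.
type SurveyedLatLon =
    { Latitude : float
      Longitude : float
      Surveyor : Surveyor }

let waterloo = { Latitude = 51.5031; Longitude = -0.1132; Surveyor = Surveyor("Holmes") }
let waterloo2 = { Latitude = 51.5031; Longitude = -0.1132; Surveyor = Surveyor("Holmes") }
let separateSurveyors = (waterloo = waterloo2) // false: two Surveyor objects

// Sharing one Surveyor instance makes the records equal again.
let holmes = Surveyor("Holmes")
let waterloo3 = { Latitude = 51.5031; Longitude = -0.1132; Surveyor = holmes }
let waterloo4 = { Latitude = 51.5031; Longitude = -0.1132; Surveyor = holmes }
let sharedSurveyor = (waterloo3 = waterloo4)   // true
```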
Forcing reference equality for record types
Once again, I can’t ever recall having to use the ReferenceEquality attribute in real code. If you do use it, remember you won’t be able to sort instances using default sorting because the attribute disables greater than/less than comparison. While we are on the subject, you can also add the NoEquality attribute to disable “equals” and “greater/less than” operations on a record type, or you can even disable “greater/less than” operations while allowing “equals” operations using the NoComparison attribute. I have seen the NoEquality attribute used precisely once in real code. Stylistically, I would say that – given what records are for – use of ReferenceEquality, NoEquality, and NoComparison attributes in general “line of business” code is probably a code smell, though they no doubt have their place in highly technical realms.
Be aware that the ReferenceEquality, NoEquality, and NoComparison attributes are all F# specific. Other languages are under no obligation to respect them (and probably won’t).
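For completeness, a small sketch of ReferenceEquality on a hypothetical LatLon record:

```fsharp
// With this attribute, two instances are equal only if they are
// the same object, regardless of their field values.
[<ReferenceEquality>]
type LatLon =
    { Latitude : float
      Longitude : float }

let a = { Latitude = 51.5031; Longitude = -0.1132 }
let b = { Latitude = 51.5031; Longitude = -0.1132 }
let c = a

let sameContent = (a = b)    // false: same values, different objects
let sameObject = (a = c)     // true: both names refer to one object
```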
Records as Structs
Structures are value types, which means that they are stored directly on the stack or, when they are used as fields or array elements, inline in the parent type.
Marking a record type as a struct
Scenarios vary widely in regard to creating, accessing, copying, and releasing instances, so you should experiment diligently in your use case, rather than blindly assuming that using the Struct attribute will solve any performance woes.
Struct records must be mutable instances to mutate fields
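A sketch of a struct record, assuming a hypothetical LatLon type with mutable fields. The point to notice is that the binding itself must also be declared mutable before a field can be assigned.

```fsharp
// A record compiled as a value type.
[<Struct>]
type LatLon =
    { mutable Latitude : float
      mutable Longitude : float }

// For a struct record, mutable fields are not enough:
// the instance must be bound with 'let mutable' as well.
let mutable position = { Latitude = 51.5031; Longitude = -0.1132 }
position.Latitude <- 51.5

printfn "%f, %f" position.Latitude position.Longitude
```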
Mapping from Instantiation Values to Members
You can get all the values back that you originally provided and in their original form.
You can’t get anything else back other than what you provided (unless you define members on the record type, which is possible but rare).
You can’t create an instance without providing all the necessary values.
Nothing can change the values you originally provided – unless you declare fields as mutable, which generally is unwise.
These may seem like small points, but they contribute greatly to the motivational transparency and semantic focus of your code. As an example, consider the third point: You can’t create an instance without providing all the necessary values. Contrast that with the coding pattern that any experienced OO developer has seen, where you need to both construct an object instance and set some properties in order for the object to become usable. (Any place you use object-initializer syntax to get to a usable state is an example.) The fact that, in order to create a record, you have to provide values for all its fields has an interesting consequence: if you add a field, you’ll have to make code changes everywhere that record is instantiated. This is true even if you make the field an option type – there is no concept in record instantiation of default values for fields, even ones that are option types. At first, this can seem annoying, but it is actually a very good thing. All sorts of subtle bugs can creep in if it’s possible to add a property to a type without making an explicit decision about what that property should contain, everywhere the type is used. Those compiler errors are telling you something!
Records Everywhere?
If the case for record types is so compelling, why don’t we use them everywhere? Why does F# even bother to offer OO-style class types? Are these just a concession to C# programmers, to be avoided by the cool kids?
When to Consider Not Using Record Types
Scenario | Consider instead |
---|---|
External and internal representations of data need to differ | Class types |
Need to participate in an inheritance hierarchy – either to inherit from or be inherited from in a traditional OO sense | Class types |
Need to represent a standard set of functions, with several realizations that share function names and signatures, but have different implementations | F# interfaces and/or abstract types, inherited from by class types |
Unlike code that uses interfaces, you don’t have to upcast to the interface type whenever you want to use the interface. (I give a few more details of this in Chapter 8.)
It can make it easier to use partial application when using the “pretend interface.”
It’s sometimes claimed to be more concise.
Use interface types to represent a set of operations. This is preferred to other options, such as tuples of functions or records of functions… Interfaces are first-class concepts in .NET....
In my experience, use of records-as-interfaces leads to unfriendly, incomprehensible code. When editing, one rapidly gets into the situation where everything has to compile before anything will compile. In concrete terms, your screen fills with red squiggly lines, and it’s very hard to work out what to do about it! With true interfaces, by contrast, the errors resulting from incomplete or slightly incorrect code are more contained, and it’s much easier to work out if an error results, for example, from a wrongly implemented method or from a completely missing one. Interfaces play more nicely with Intellisense as well. As for the supposed advantage of partial application – well, I’d much rather maintainers (including my future self) have some idea of what is going on than save a few characters by not repeating a couple of function parameters.
I’m not saying, by the way, that records shouldn’t implement interfaces, which they can do in exactly the same way as I show with classes in Chapter 8. If you find that useful, it’s fine.
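To make the comparison concrete, here is a sketch of the same pair of operations expressed first as a record of functions and then as a true interface. The logger example and all its names are hypothetical.

```fsharp
// The "pretend interface": a record whose fields are functions.
type LoggerRecord =
    { Info : string -> unit
      Error : string -> unit }

// A true interface describing the same operations.
type ILogger =
    abstract Info : string -> unit
    abstract Error : string -> unit

// One realization of each.
let recordLogger =
    { Info = printfn "INFO %s"
      Error = printfn "ERROR %s" }

let interfaceLogger =
    { new ILogger with
        member this.Info message = printfn "INFO %s" message
        member this.Error message = printfn "ERROR %s" message }

recordLogger.Info "started"
interfaceLogger.Info "started"
```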
One notable exception to what I’ve said previously is when working with “Fable Remoting.” Fable, in case you haven’t come across it, is an F#-to-JavaScript compiler that allows you to write both the back and front end of a web application in F#. This architecture requires the ability to make function calls from the front end, in the browser, to the back end - a .NET program running on the server. Without going into detail here, it turns out that describing such an interface in terms of a record-of-functions works very well in that specialized case.
Pushing Records to the Limit
Now that you’re familiar with how and when to use basic record types, it’s time to look at some of the more exotic features and usages that are available. Don’t take this section as encouragement to use all the techniques it describes. Some (not all) of these tricks really are rarities, and when it’s truly necessary to use them, you’ll know.
Generic Records
A generic record type
Note that we don’t have to specify the type to use at construction time. The simple fact that we say { Latitude = 51.5031f... versus { Latitude = 51.5031... (note the “f,” which specifies a single-precision constant) is enough for the compiler to create a record that has single-precision instead of double-precision fields. Also notice that, since waterloo and waterloo2 are different types, we can’t directly compare them using the equals operator.
Pinning down the generic parameter type of a record type
In this case, as shown in the final lines of Listing 7-16, it’s an error to try and bind a field using a value of a different type (note the missing “f” in the Longitude binding).
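A minimal sketch of a generic record along the lines described, with the type parameter first inferred and then pinned down by an annotation. The waterloo3 name is an addition for illustration.

```fsharp
// One type parameter shared by both fields.
type LatLon<'T> =
    { Latitude : 'T
      Longitude : 'T }

// Inferred as LatLon<float> (double precision).
let waterloo = { Latitude = 51.5031; Longitude = -0.1132 }

// Inferred as LatLon<float32> because of the 'f' suffix.
let waterloo2 = { Latitude = 51.5031f; Longitude = -0.1132f }

// Pinned down explicitly; binding a field to a double-precision
// literal here would be a compile-time error.
let waterloo3 : LatLon<float32> =
    { Latitude = 51.5031f; Longitude = -0.1132f }
```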
Recursive Records
A recursive record type
Each UiControl instance can have a parent that is itself a UiControl instance. It’s important that the recursive field (in this case, Parent) is an option type. Otherwise, we are implying either that the hierarchy goes upward infinitely (making it impossible to instantiate) or that it is circular.
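A sketch of such a type, using the UiControl and Parent names from the text; the Name field and the sample values are assumptions.

```fsharp
// The Parent field refers back to the type being defined.
type UiControl =
    { Name : string
      Parent : UiControl option }

// The top of the hierarchy has no parent.
let form = { Name = "MainForm"; Parent = None }

// A child points at its parent.
let button = { Name = "OkButton"; Parent = Some form }

match button.Parent with
| Some parent -> printfn "%s is inside %s" button.Name parent.Name
| None -> printfn "%s has no parent" button.Name
```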
Instantiating a circular set of recursive records
Records with Methods
Anyone with an Object-Oriented programming background will be wondering whether it’s possible for records to have methods. And the answer is yes, but it may not always be a great idea.
Instance Methods
Adding an instance method to a record type
Note that the distance calculation I do here is extremely naive. In reality, you’d want to use the haversine formula, but that’s rather too much code for a book listing.
Instance methods like this work fine with record types and are quite a nice solution where you want structural (content) equality for instances and also to have instance methods to give you fluent syntax like abilene.DistanceFrom(coleman).
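The following is a sketch of a record with an instance method, not the book's exact listing. The distance calculation is deliberately naive (a straight-line distance in degrees), and the coordinates are rough illustrative values.

```fsharp
type LatLon =
    { Latitude : float
      Longitude : float }
    // Naive straight-line distance in degrees; not the haversine formula.
    member this.DistanceFrom(other : LatLon) =
        let dLat = this.Latitude - other.Latitude
        let dLon = this.Longitude - other.Longitude
        sqrt (dLat * dLat + dLon * dLon)

// Approximate, illustrative coordinates.
let abilene = { Latitude = 32.45; Longitude = -99.73 }
let coleman = { Latitude = 31.83; Longitude = -99.43 }

// Fluent call syntax on a type that keeps structural equality.
let distance = abilene.DistanceFrom(coleman)
printfn "%f" distance
```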
Static Methods
Adding a static method to a record type
This is quite a nice way of effectively adding constructors to record types. It might be especially useful if you want to perform validation during construction.
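One way validation during construction might look is sketched below. The TryCreate name and the choice to return an option are assumptions; you could equally raise an exception or return a Result.

```fsharp
type LatLon =
    { Latitude : float
      Longitude : float }
    // A constructor-like static method that rejects out-of-range values.
    static member TryCreate(latitude, longitude) =
        if abs latitude <= 90.0 && abs longitude <= 180.0 then
            Some { Latitude = latitude; Longitude = longitude }
        else
            None

let valid = LatLon.TryCreate(51.5031, -0.1132)    // Some { ... }
let invalid = LatLon.TryCreate(95.0, 0.0)         // None
```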
Method Overrides
Overriding a method on a record
In Listing 7-21, I’ve used the “%O” format specifier, which causes the input’s ToString() method to be called.
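A sketch of a ToString override on a record, with a hypothetical output format:

```fsharp
type LatLon =
    { Latitude : float
      Longitude : float }
    // Replace the default record formatting with a custom one.
    override this.ToString() =
        sprintf "%f, %f" this.Latitude this.Longitude

let waterloo = { Latitude = 51.5031; Longitude = -0.1132 }

// "%O" calls the value's ToString() method.
printfn "%O" waterloo
let text = sprintf "%O" waterloo
```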
Records with Methods – A Good Idea?
I don’t think there is anything inherently wrong with adding methods to record types. You should just beware of crossing the line into territory where it would be better to use a class type. If you are using record methods to cover up the fact that the internal and external representations of some data do in fact need to be different, you’ve probably crossed the line!
There is an alternative way of associating behavior (functions or methods) with types (sets of data): group them in an F# module, usually with the same name as the type and placed just after the type’s definition. We looked at this back in Chapter 2, for example, in Listing 2-9, where we defined a MilesYards type, for representing British railroad distances, and a MilesYards module containing functions to work with the type. In my opinion, the modules approach is generally better than gluing the functions to the record in the form of methods.
Anonymous Records
Although the declaration syntax of F# records is about as lightweight as it could be, there are sometimes situations where even that overhead seems too much. For times like this, F# offers anonymous records. Anonymous records bring most of the benefits of “named” records – in terms of strong typing, type inference, structural (content) equality, and so forth – without the overhead of having to declare them explicitly.
Creating anonymous records
Type safety and anonymous records
Using anonymous records to clarify intermediate values
This could certainly be achieved by having the lambda in the Array.map operation return a tuple. But using an anonymous record makes very clear the roles that the three created values play: the original name, the display name with the article moved to the end, and the uppercased sort name. When we come to sort the results in the last line, it’s very clear that we are using SortName to sort on. Anything which consumed these results could also use these fields appropriately and unambiguously.
Another advantage of code like this is that when you use a debugger to pause program execution and view values, it’s much clearer which value is which. The field names in anonymous records are shown in the debugger.
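A sketch of the kind of pipeline described above. The artist names and the moveArticle helper are hypothetical, and the helper only handles a leading "The".

```fsharp
open System

let artists = [| "The Bangles"; "Bananarama"; "Theo Travis" |]

// Hypothetical helper: "The Bangles" becomes "Bangles, The".
let moveArticle (name : string) =
    if name.StartsWith("The ", StringComparison.Ordinal) then
        name.Substring(4) + ", The"
    else
        name

let sorted =
    artists
    |> Array.map (fun name ->
        let displayName = moveArticle name
        // Each role gets a label instead of a tuple position.
        {| Name = name
           DisplayName = displayName
           SortName = displayName.ToUpperInvariant() |})
    |> Array.sortBy (fun artist -> artist.SortName)

for artist in sorted do
    printfn "%s" artist.DisplayName
```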
Anonymous and Named Record Terminology
At this point, I need to make a brief point about terminology. The documentation for anonymous records contains the magnificent heading “Anonymous Records are Nominal,” which appears at first sight to be a contradiction in terms. What this is saying is that anonymous records have a “secret” name and so technically are “nominal,” even though we don’t give them a name in our code. For simplicity, in this section, I’m using named records to mean those declared up front with the type Name = { <field declarations> } syntax and instantiated later and anonymous records to mean those instantiated without prior declaration using the {| <field values> |} syntax.
Anonymous records seem to be almost too good to be true. Surely, there are some showstopping limitations. In practice, there are very few and indeed some things which you might expect would not work are in fact fine. Here are some points which might be worrying you about anonymous records.
Anonymous Records and Comparison
Equality and comparison of anonymous record instances
The same stipulation applies here as it does to named records: that each of the fields of the record itself has structural equality.
Anonymous records with the same names and types of fields are the same type
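A brief sketch of equality and comparison on anonymous records, using illustrative values:

```fsharp
// Same field names and types, so these all share one anonymous type.
let a = {| Latitude = 51.5031; Longitude = -0.1132 |}
let b = {| Latitude = 51.5031; Longitude = -0.1132 |}
let c = {| Latitude = 52.0; Longitude = -0.1132 |}

let equalContent = (a = b)     // true: compared by content
let lessThan = (a < c)         // true: a has the smaller Latitude
```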
“Copy and Update” on Anonymous Records
Copy-and-update operations on anonymous records
Creating a new anonymous record with an additional field, based on a named record
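A sketch of the copy-and-update variations mentioned in these captions. The Name and Altitude fields are hypothetical additions.

```fsharp
let waterloo = {| Latitude = 51.5031; Longitude = -0.1132 |}

// Same anonymous type, one field changed.
let moved = {| waterloo with Latitude = 51.6 |}

// Adding a field produces an instance of a new anonymous type.
let named = {| waterloo with Name = "Waterloo" |}

// The source of the copy can also be a named record.
type LatLon = { Latitude : float; Longitude : float }
let record = { Latitude = 51.5031; Longitude = -0.1132 }
let withAltitude = {| record with Altitude = 12.0 |}

printfn "%s %f" named.Name withAltitude.Altitude
```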
Serialization and Deserialization of Anonymous Records
Serializing and deserializing anonymous records
Using anonymous records to deserialize JSON API results
If we are not interested in a property of the JSON, that’s fine – we just don’t mention it in the anonymous record we specify in the type parameter.
If we happen to specify fields in the anonymous record that aren’t in the JSON, we will get nulls or zeros. (If you edit one of the field names and rerun the code, you’ll see what I mean.) You will have to watch out for mistakes like this at runtime because there is no way for the compiler to spot them.
If there happens to be a clash between a property name in the JSON and a reserved word in F#, you will need to put the field name in double back quotes. We have done this with the word abstract in Listing 7-30.
Anonymous Records in Type Hints
Anonymous records in type hints
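A sketch of an anonymous record used in a parameter type hint; the toGeoString function is hypothetical.

```fsharp
// The parameter's type is spelled out as an anonymous record type.
let toGeoString (position : {| Latitude : float; Longitude : float |}) =
    sprintf "%.4f,%.4f" position.Latitude position.Longitude

let text = toGeoString {| Latitude = 51.5031; Longitude = -0.1132 |}
printfn "%s" text
```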
Struct Anonymous Records
Struct anonymous records
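A sketch of the struct form, in which the struct keyword appears both where the value is created and in any type hint. The field names and the length function are assumptions.

```fsharp
// The 'struct' keyword makes the anonymous record a value type.
let point = struct {| X = 3.0f; Y = 4.0f |}

// The same keyword is needed when naming the type in a hint.
let length (p : struct {| X : float32; Y : float32 |}) =
    sqrt (p.X * p.X + p.Y * p.Y)

printfn "%f" (length point)
```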
Anonymous Records and C#
From C#’s point of view, F# anonymous records look like C#’s anonymous types. If a C# caller requires an anonymous type, feel free to give it an anonymous record as your return value. More commonly, if an F# caller is calling a C# API that requires an anonymous type, you can give it an anonymous record instance.
Pattern Matching on Anonymous Records
Finally, we come to something which anonymous records don’t support! You can’t pattern match on them. If we attempt to adapt the code from Listing 6-11 in the previous chapter, we find that it doesn’t compile when anonymous records are used (Listing 7-33).
You cannot pattern match on anonymous records
Adding Methods to Anonymous Records
You cannot directly add methods to anonymous records. There are workarounds for this, but I can’t think of a reason you would do such a thing, given how much it obfuscates your code. I rarely even add methods to named records!
Mutation and Anonymous Records
You can’t declare an anonymous record with a mutable field, though again there is an open language suggestion to address this. Again, I rarely if ever have mutable fields in a named record. Having one in an anonymous record seems even more inadvisable. You can declare entire anonymous record instances as mutable, but I am hard-pressed to think of a situation where you would want to do so.
Record Layout
Use Pascal case for both record type names and for the individual field labels. All the listings in this chapter follow that approach.
Where a record type definition or instantiation doesn’t fit comfortably into a single line, break it into multiple lines, left-aligning the field labels. If you put fields on separate lines, omit the separating semicolons. Don’t mix single and multiline styles (Listing 7-34).
Use the field names in the same order in the record type definition as in any instantiations and with operations.
Good and bad record type layout
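A sketch of these layout guidelines applied to hypothetical types. The discouraged style is shown only as a comment.

```fsharp
// Good: a short definition on a single line, with semicolons.
type Point = { X : float; Y : float }

// Good: a longer definition on multiple lines,
// labels left-aligned and no semicolons.
type LatLon =
    { Latitude : float
      Longitude : float
      Altitude : float }

// Good: instantiation lists fields in the same order as the definition.
let waterloo =
    { Latitude = 51.5031
      Longitude = -0.1132
      Altitude = 12.0 }

// Discouraged: mixing single-line and multiline styles.
// let waterloo2 = { Latitude = 51.5031; Longitude = -0.1132
//                   Altitude = 12.0 }
```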
Recommendations
Prefer records to class types unless you need the internal and external representations of data to differ, or the type needs to have “moving parts” internally.
Think long and hard before making record fields or (worse still!) whole records mutable; instead, get comfortable using copy-and-update record expressions (i.e., the with keyword).
Make sure you understand the importance of “structural” (content) equality in record types, but make sure you also know when it would be violated (that is, when a field doesn’t itself have content equality).
Sometimes, it’s useful to add instance methods, static methods, or overrides to record types, but don’t get carried away: having to do this a lot might indicate that a class type would be a better fit.
Consider putting record types on the stack with [<Struct>] if this gives you performance benefits across the whole life cycle of the instance.
Lay your record type definitions and instantiations out carefully and consistently.
Consider anonymous records where the scope of the instances you create is narrow; a few lines or at most one module or source file. Pipelines that use tuples to pass values between their stages are a particularly attractive target. If the type is more pervasive, it’s probably better to declare a named record up front.
Obviously, don’t use anonymous records if one of their shortcomings is going to force you into strange workarounds. For instance, if you need to pattern match on records, anonymous records are currently a nonstarter. Likewise, you won’t get far in adding methods to an anonymous record, and the workarounds to this aren’t, in my opinion, particularly useful.
Although you can use anonymous records in type hints, I’m not convinced that you should do so. It leads to some pretty strange function headers, and in general, these look much simpler if done in terms of named record types declared separately.
Don’t discount the cognitive benefits of declaring a named record up front. When you name a record type, you are making a focused statement about what kind of thing you want to create and work with. If you are clear about that, a lot of the code that instantiates and processes instances of the record type will naturally “fall out” of that initial statement of intent.
Anonymous records are usually the best solution when interacting with C# code that produces or consumes anonymous objects.
Summary
Effective use of records is core to writing great F# code. It’s certainly my go-to data structure when I want to store small groups of labeled values. I only switch to classes (Chapter 8) when I find that I’m adorning my record types to the extent they might as well be classes – which is rarely. And any day I find that I’m using the with keyword with record types is a good day! I often use anonymous records to clarify code where, in earlier versions of F#, I might have used tuples.
All that said – classes have their place, even in F# code, so in the next chapter, we’ll talk about them in considerable detail.
Exercises
You need to store several million items, each consisting of X, Y, and Z positions (single precision) and a DateTime instance. For performance reasons, you want to store them on the stack.
How might you model this using an F# record?
How can you prove, in the simple case, that instantiating a million records works faster when the items are placed on the stack than when they are allowed to go on the heap?
You have an idea for a novel cache that stores expensive-to-compute items when they are first requested and periodically evicts the 10% of items that were least accessed over a configurable time period. Is a record a suitable basis for implementing this? Why or why not?
Don’t bother to actually code this – it’s just a decision-making exercise.
What’s the simplest way to fix the problem?
How would you alter the solution to Exercise 7-1 so that you return an array of struct anonymous records instead of struct named records? What effect does doing this have on performance?
Exercise Solutions
On my system, the instantiation took around 40ms with the [<Struct>] attribute and around 50ms without it. In reality, you’d need to check the whole life cycle of the items (instantiation, access, and release) in the context of the real system and volumes you were working on.
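One way the measurement might be set up is sketched below. The Position type name, the field names, and the values used to fill the array are assumptions; remove the [<Struct>] attribute and rerun to compare. Timings will vary by machine and runtime, so treat any single run with caution.

```fsharp
open System
open System.Diagnostics

// X, Y, and Z in single precision plus a DateTime, as a struct record.
[<Struct>]
type Position =
    { X : float32
      Y : float32
      Z : float32
      Time : DateTime }

let stopwatch = Stopwatch.StartNew()

// Instantiate a million records.
let items =
    Array.init 1_000_000 (fun i ->
        { X = float32 i
          Y = float32 i
          Z = float32 i
          Time = DateTime.MinValue })

stopwatch.Stop()
printfn "%d ms" stopwatch.ElapsedMilliseconds
```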
This sounds like something with a number of moving parts, including storage for the cached items, a timer for periodic eviction, and members allowing values to be retrieved independently of how they are stored internally. There is also, presumably, some kind of locking going on for thread safety. This clearly fulfills the criteria of “internal storage differs from external representation” and “has moving parts,” which means that one or more class types are almost certainly a more suitable approach than a record type.
In this simple case I couldn’t see any consistent difference between this version and a named record version.