I’ve noticed in a lot of RSES Thursday seminars and generally in the media that people often ask why is a particular model showing something but not something else. How is it that it is showing this phenomenon but not that other phenomenon. Or they point out how something can’t be because their data are not showing the same thing.

This goes in general for any model… whether it’s a climate model, a model of ocean currents or a combination thereof, a geodynamic model… anything.

In my work I deal more with inversions (which is a somewhat opposite process to modeling) but in order to test my inversions I also had to do some modeling. In this post I will concentrate on what modeling is and how it works and then you can extrapolate this to any scientific model you are looking into.

What does it mean to model something? Some of you have probably had various earth science courses where you were forced to derive a particular set of equations that you were told describe a certain process.

Shallow water equations? Navier-Stokes equations? Equations of conservation of mass/momentum/energy? Water salinity equations? Some of those most certainly sound familiar (I hope). They are partial differential equations describing how a certain medium (air, water, solid or not solid earth) behaves in equilibrium and when there are any forces applied. You have probably discovered for yourself, or you were told, that these equations are difficult to solve… and they are. Usually in academic courses they are solved for specific extreme cases – when there are no forces acting on a medium, assuming no turbulence, no external or internal heat sources of any kind, and they are solved either for a specific location in space and we are observing how this changes through time (Eulerian approach) or we pretend that as observers of some fluid we follow a fluid parcel around through space and time (Lagrangian approach).

To solve these equations analytically (meaning using the mathematical rules for solving them) would take a lot of time and effort. To do so for many locations in space and/or for a long time period is a known torture technique in Guantanamo prison (not really). We use computers to solve these equations. However, as I always like to point out computers are (for now) utter idiots, dumb machines that only do exactly what you tell them to do. Computer CANNOT solve partial differential equations exactly (the way you would do on a piece of paper using mathematical rules). These equations are solved numerically, using iterations and approximations. One of the most common ways to solve them is to use a finite-element technique. This means that instead of treating a derivative as infinitely small change of some property (which it is by definition) you treat it as a small finite quantity. Summing a lot of small finite quantities along a given contour approximates an integral for example.

An example of how this works: I’m a seismologist and I use seismic waves and their corresponding raypaths to infer on the structure of deep earth. Given a source – the location of the earthquake – and a receiver – a seismic station that recorded it – I can easily get the great circle along which seismic waves propagated for this particular combination of starting and ending points (figure 1). I use a little piece of software that does this for me and within a matter of seconds prints out a number of points through which this wave passed. These points are characterised by their angular distance from the source, their location (latitude and longitude) and radius within the earth. That’s all I need to know where the front of the wave is as it propagates through time. Now, I need travel times through a particular part of the interior, in my case the inner core. I don’t care about the total travel time, or travel time through the mantle, I just need the travel time through the core.

All I have is those angular distances, radii and assumed velocities of propagation within the inner core. How do I get travel times? And better yet how do I get an integral of a travel time along a contour that is part of the seismic ray within the core? Time is distance over velocity. My point A is at an angular distance alpha1 from the source and the point B next to it is at an angular distance alpha2 from the source. The difference of these two angles gives me an angular distance between points A and B. I have the radii of these points as well. I can now use the angle between A and B and the average radius of the two to get a circular sector, which is the distance between points A and B. Given some velocity assumption for this part of the Earth I can now easily get the travel time between points A and B. Doing this for all the points within the inner core, following them along the propagation path (remember this is given to me via radii, location and angular distances from the source) and summing each segment gives me the integral of total travel time through the core. This has finite-element written all over it.

ray_tracing002
Figure 1 – Model of ray propagation through the Earth. The ray propagates from the source towards receiver – a station on the surface of the Earth. In order to model the propagation we are tracing the ray along a finite number of points. r is the radius from the centre of the Earth to discrete points A and B. Notice that this radius changes along the ray path, but in this case if points are very close their radii are almost equal and that’s how they are shown here for simplicity. Angular distances are also shown (see text for explanation). Also I love my hand drawings. I’m sorry if you don’t.

And this is a very trivial example of how you solve a partial differential equation – you divide elements of it into small finite parts and you sum them, or you subtract them…. You do what is required in your specific problem. In my case, I have about a 100 points that stretch through the inner core so the above procedure is on average performed a 100 times per ray. I have about 500 rays in my current dataset, so to get travel times through the core for all my seismic rays this simple procedure described above needs to be repeated about 50 000 times. It takes about half an hour on my computer. And this is modeling. It is modeling the propagation of a seismic ray through the inner core.

This was a simple example and is easy to check – if I do this for the entire ray path, not just the inner core, and if the travel time thus obtained matches the travel time recorded at a seismic station – my model is correct.  This means the assumption on the propagation velocity within the medium (inner core or otherwise) was good too! It is also possible to use data to infer on the deviations of real Earth from some velocity model. An example of that is shown in figure 2. Ray tracing is used in this kind of procedure too.

Now imagine modeling something more complicated. Tsunami for example is described by shallow water equations which are a set of three fluid equations under certain conditions. There are also boundaries (sea floor and sea surface) that need to be taken into account. To model a tsunami propagating from somewhere in the ocean (where an earthquake occurred) towards a coastline is no small deal. All these equations need to be solved simultaneously over a large portion of space and time with the initial conditions included (this includes knowledge of the earthquake source, what kind of sea floor displacement this earthquake caused, what is the resulting amplitude of the tsunami wave etc etc). The run time of one such model can easily take anywhere between 4 and 48 hours.

victoria_with_outline
Figure 2 – The result of an inversion of seismic P-wave velocity deviation from the AK135 reference model from approximately 20,000 tele-seismic arrivals. Red regions represent slower velocitys and blue faster. Regions in good agreement with the reference model are made transparent. (Rawlinson et al, “The structure of the upper mantle beneath the Delamerian and Lachlan orogens from simultaneous inversion of multiple teleseismic datasets”, Gondwana Research, 2011; Hawkins and Sambridge, “Geophysical imaging using trans-dimensional trees”, GJI, 2015). Courtesy of Rhys Hawkins

Modeling anything in geodynamics is even more difficult because you can’t really see what is happening in the mantle for example – so you have to rely on known physics/chemistry and see whether your models in general predict something (anything) that you can observe (a surface manifestation maybe?) but at the same time not violating the properties of the mantle (buoyancy, continental drift, accepted temperatures within the mantle….). Check out an example of some mantle modeling in figure 3.

Figure 3 – To model convection of viscous fluids, like the mantle, you must solve the equations governing the conservation of mass, momentum and energy. Modern modelling frameworks typically use the finite difference method or the finite element method. The image is a temperature isosurface (surface connecting points of equal temperature) taken from a 3D box model. The isosurface provides an outline of a mantle plume that has risen form the core mantle boundary, passed through an upper mantle phase transition and begun spread out across the base of the lithosphere. Courtesy of Tim Jones

This can tell you something: if your model is not showing a particular phenomenon that you’re expecting to see, it means that you either don’t have the equations that govern this process included in your model, or if you think you have equations that do govern this process/phenomenon than those equations simply do not predict it.

This is a simple way of looking at it. You can also be looking at wrong scales… Is your phenomenon some sort of a blob of high temperature cruising through Pacific? Is this blob 50m in diameter and you are modeling the behaviour of Pacific every 200m? You will not see it – in order to see something that is 50m in diameter, you need to have the points of your model spaced at least every 25m.

Is this a phenomenon that occurs every 10 days but you are running your model in steps of one year in time? Or is it a phenomenon that occurs every 1000 years but you ran your model in such a way that it shows only the prediction for 700 years in the future?

Do you see how complicated it can get? Not only do you have to solve complicated equations but you have to worry about boundary conditions, time steps, spatial steps (the resolution)… And the more spatial and/or time points you have the longer it takes for a computer program to finish and it consumes more power. This is when people doing the modeling say it is ‘expensive’ to run something – they don’t (necessarily) mean that it is financially expensive – it is computationally expensive – it takes computer power and time to run a model.

Next time when you wonder about a particular model and why it does or doesn’t show something think about all of that – ask yourself (or better – the presenter!) what set of equations are used, what approximations and assumptions are controlling them (there always are approximations and assumptions, turbulence is one of many examples that is nigh on impossible to describe/model), what are the boundary conditions, what is the resolution of the model, for which period of time had someone run it… Then also think about what was the primary purpose of creating this model? What was it meant to show? Is it not showing something because it’s not there or is it because the authors of the model are not even trying to see that particular thing so maybe they deliberately disregarded it to save on computation costs? Or maybe the model is showing something that no one expected? If the model is confirmed through actual data it is probably a good and to an extent correct model. If the data from different sources are consistently showing something that is not in the model (but we made sure that it should be there!) then models need changing….

And remember, as my good friend Rhys pointed me to the quotes by George Box:

“Remember that all models are wrong; the practical question is how wrong do they have to be to not be useful”.

“Essentially, all models are wrong, but some are useful.”