Frametimes, FPS, median, percentiles, x% low?
If you have ever looked at game benchmarks on the internet, you might have come across all these terms. But what exactly is hidden behind them is probably not clear to everyone and that's why we want to provide some clarity.
Frametimes: How is the performance of games measured?
In order to understand what the various metrics mean, it's important to know what data is being used in the first place.
In-game benchmark tools, OCAT, FRAPS, PresentMon, CapFrameX: all these tools ultimately give the user FPS values, but these are not the values that are being measured.
What is measured are the time intervals between the individual images: the frametimes.
The measured frametimes (in milliseconds) are used to calculate the FPS. The formula is simple:
FPS = 1000 / frametime
Frametime = 1000 / FPS
As you can see, the FPS are nothing but the inverted values of the respective frametimes (scaled by 1000, since frametimes are given in milliseconds), so high frametimes mean low FPS while low frametimes mean high FPS.
All performance metrics are calculated based on the raw data of every measured frametime converted into an FPS value.
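As a minimal sketch of this conversion: Python is used here purely for illustration, and the function names and sample values are made up, not taken from any of the tools mentioned.

```python
def frametime_to_fps(frametime_ms):
    """Convert a single frametime in milliseconds into an FPS value."""
    return 1000.0 / frametime_ms


def fps_to_frametime(fps):
    """Convert an FPS value back into a frametime in milliseconds."""
    return 1000.0 / fps


# Hypothetical raw data: four measured frametimes in milliseconds.
frametimes_ms = [10.0, 20.0, 12.5, 40.0]
fps_values = [frametime_to_fps(ft) for ft in frametimes_ms]
print(fps_values)  # [100.0, 50.0, 80.0, 25.0]
```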
Percentiles
This term appears in many tests, and the notation always varies a little, from "99th Percentile" to "P99" or, more rarely, "99%".
But what they express is always the same: the Xth percentile is the value below which X% of all values lie.
For game benchmarks this means: If my bench has 1000 different values and I sort these 1000 values from the smallest (1) to the largest (1000), the 99th percentile is the value at the 990th place, the 95th percentile is the value at the 950th place and so on.
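The "sort and pick a place" idea can be sketched as follows. This is only the simple place-based variant described above, with a hypothetical function name; real tools may compute the position differently.

```python
def percentile_by_place(values, p):
    """Return the value at the p% place of the sorted list (places count from 1)."""
    ordered = sorted(values)
    # 1000 values and p = 99 give place 990; fractional places are rounded down here.
    place = max(1, int(len(ordered) * p / 100))
    return ordered[place - 1]


# 1000 different values, as in the example above.
values = list(range(1, 1001))
print(percentile_by_place(values, 99))  # 990
print(percentile_by_place(values, 95))  # 950
```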
So percentiles are not about times, but only about frames. Often you read explanations that a P99 of 45fps means that you are above 45fps 99% of the time. In practice, this is usually very close to the truth, but still not correct, because it only says that 99% of the rendered frames were above 45fps.
An extreme example: Let's assume we make a 20 s recording with a P99 of 45 fps. If the remaining 1% of the frames contains four huge outliers of 500 ms each, we were below 45 fps for at least 10% of the time (four times half a second without any moving image).
CapFrameX also offers a view to make this fact visible: if you look at the threshold tab on the Analysis page, you can see the total time spent below certain thresholds.
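The gap between "share of frames" and "share of time" can be reproduced with made-up numbers. The recording below is an assumption chosen to match the 20 s example (1800 frames at 10 ms plus four 500 ms outliers). It only illustrates the arithmetic and is not how any particular tool computes its threshold view.

```python
THRESHOLD_FPS = 45.0
threshold_ms = 1000.0 / THRESHOLD_FPS  # frames slower than this are below 45 fps

# Hypothetical 20 s recording: 1800 smooth frames and four 500 ms outliers.
frametimes_ms = [10.0] * 1800 + [500.0] * 4

slow_frames = [ft for ft in frametimes_ms if ft > threshold_ms]

# Share of frames below the threshold versus share of time below it.
frame_share = len(slow_frames) / len(frametimes_ms)
time_share = sum(slow_frames) / sum(frametimes_ms)

print(f"frames below 45 fps: {frame_share:.2%}")  # 0.22%
print(f"time below 45 fps:   {time_share:.2%}")   # 10.00%
```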
A special form of the percentile, which is rarely seen, is the median. This is the 50th percentile and thus the value that is exactly in the middle of the list sorted from small to large.
This makes the median insensitive to outliers that are very large but few in number, since it depends only on the position in the middle of the sorted frames, while the normal average is also influenced by how large each individual value is.
If a game generally runs smoothly, this median is almost equal to the average of all values (average FPS).
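A small sketch with invented frametimes shows the effect. It uses Python's standard `statistics` module, and the sample data is an assumption for illustration only.

```python
import statistics

# Hypothetical frametimes in ms: nine smooth frames and one large outlier.
frametimes_ms = [10.0] * 9 + [500.0]

median_ms = statistics.median(frametimes_ms)
average_ms = statistics.mean(frametimes_ms)

print(median_ms)   # 10.0, the outlier does not move the middle of the list
print(average_ms)  # 59.0, the single outlier pulls the average up
```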
Why use percentiles?
Percentiles are used in benchmarks to deliberately set a boundary above which values are ignored. This is especially useful as a replacement for the "min FPS" values that were often used in the past and that are increasingly being replaced by the 99th percentile.
The min FPS always indicated the lowest FPS value (i.e. the highest frametime value) that was measured over the complete benchmark.
Since the very worst frame is influenced by all sorts of things and can therefore vary greatly, it is not suitable for comparing the performance of different hardware.
If you take the 99th percentile instead and ignore the highest 1% of the frametimes, you get a value that is much more stable and can be used to compare different hardware.
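To illustrate the stability argument, here is a sketch with two invented runs that differ by a single outlier frame. The helper names are hypothetical and the percentile uses the simple place-based approach, which is an assumption and not necessarily what a given tool does.

```python
def min_fps(frametimes_ms):
    """FPS of the single worst (longest) frame."""
    return 1000.0 / max(frametimes_ms)


def p99_fps(frametimes_ms):
    """99th percentile of the frametimes, converted to FPS (place-based)."""
    ordered = sorted(frametimes_ms)
    place = max(1, int(len(ordered) * 99 / 100))
    return 1000.0 / ordered[place - 1]


# Two hypothetical runs of the same scene; run B contains one random spike.
run_a = [10.0] * 1000
run_b = [10.0] * 999 + [200.0]

print(min_fps(run_a), min_fps(run_b))  # 100.0 5.0
print(p99_fps(run_a), p99_fps(run_b))  # 100.0 100.0
```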
Moreover, it is also not very helpful for the home user to see a "min FPS" value that is based on a single outlier that occurred at a point where it is not representative of the gaming experience. With the P99, on the other hand, users get a value that is much closer to the minimum they actively perceive during gaming.
And because upper and lower outliers are excluded, the span between P1 and P99 shows quite well, even without a graph, how even the frametimes were.
If you want to ignore fewer outliers, because it turned out that the remaining frames are quite stable, you can switch to P99.8 or even P99.9. In that case you should have more than 1000 frames in your benchmark, so that at least 2-3 frames are dropped in the end (for P99.9 that means roughly 2000-3000 frames).
Indication of percentiles and misleading information
Percentiles are not about good or bad values, but about small or large values, so the percentile to be used also changes depending on whether you are talking about frametimes or FPS.
With FPS low values are bad, with frametimes low values are good.
So if you want the bad values, you have to look for either high frametime values or low FPS values. For example, the 99th percentile of the frametimes (99% of the values are lower) corresponds to the 1st percentile of the FPS (99% of the values are higher).
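This correspondence can be checked by counting. The sketch below uses 100 invented frametimes and treats "lower" and "higher" as "at or below" and "at or above", which is an assumption about how the boundary value itself is counted.

```python
# Hypothetical data: 100 different frametimes from 5 ms to 104 ms.
frametimes_ms = [float(ms) for ms in range(5, 105)]
fps_values = [1000.0 / ft for ft in frametimes_ms]

# 99th percentile of the frametimes: the value at the 99th place of 100.
p99_frametime = sorted(frametimes_ms)[98]
as_fps = 1000.0 / p99_frametime

# 99% of the frametimes are at or below it ...
at_or_below = sum(1 for ft in frametimes_ms if ft <= p99_frametime)
# ... and 99% of the FPS values are at or above the converted value.
at_or_above = sum(1 for fps in fps_values if fps >= as_fps)

print(p99_frametime, at_or_below, at_or_above)  # 103.0 99 99
```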
Most sites still use the term 99th percentile, even though they specify the values themselves in FPS. For some reason this seems to have become common practice, and you would only cause confusion if you broke with these terms (CapFrameX does it anyway ;) ).
Some write "frametimes in FPS" behind it to make it a bit clearer what they are doing, namely taking the 99th percentile of the frametimes and then converting and outputting this value in FPS, which, strictly speaking, makes it the 1st percentile of the FPS.
An aggravating factor is that some in-game benchmarks use the term "min FPS" but refer to the 5th percentile, whereas elsewhere the 1st percentile or the actual min FPS is meant.
Therefore, for any kind of comparison, the following applies: Discuss which tools you use and especially which metrics you use. Otherwise, person A will wonder why their alleged min FPS, which the integrated benchmark has given them, are so much better than the min FPS that person B has determined with CapFrameX, OCAT etc.
x% low
You find these metrics (1% low, 0.1% low) very often, especially internationally: almost all tech YouTubers use them, and MSI Afterburner also outputs them. Furthermore, they are always used in connection with FPS and not with frametimes, so the "low" clearly refers to low and therefore bad FPS.
Basically, it works similarly to the percentiles, but for the 1% low at 1000 values it is not the 10th lowest value that is taken; instead, the average of the 10 lowest values is calculated. This way, every outlier, no matter how small, is included in this value, and the result is never higher than the 1st percentile (and usually lower).
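The difference between the two metrics at 1000 values can be sketched like this. The FPS values are invented, and the 1st percentile uses the simple place-based approach as an assumption.

```python
# Hypothetical data: 1000 different FPS values from 1 to 1000.
fps_values = [float(v) for v in range(1, 1001)]
ordered = sorted(fps_values)

count = len(ordered) // 100  # 1% of 1000 frames = 10 frames

# 1st percentile: the 10th lowest value on its own.
first_percentile = ordered[count - 1]
# 1% low: the average of the 10 lowest values.
one_percent_low = sum(ordered[:count]) / count

print(first_percentile)  # 10.0
print(one_percent_low)   # 5.5
```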
The benefits may be debatable; there are advantages and disadvantages to both metrics. Some would like to have outliers of 1-2 frames in their evaluation, others do not want that and therefore take the percentiles.
It also depends on whether these outliers occur reproducibly or not. If a 0.1% low value is taken from only 2-3 frames and the benchmark scene has one or two frames randomly spiking in some runs without any connection to the hardware or settings used, you'll get a different result every time and therefore can't use the 0.1% low to draw any conclusions about the performance of the tested hardware or settings.
Additional info: percentiles and x% low metrics for odd values
For those who would like to know a little bit more: What happens if you don't have exactly 1000 frames? With 1070 FPS values, for example, the 1st percentile would sit at position 10.7. Some tools always round down to the next lower position, in this case 10, so that the values come out a little too bad rather than a little too good. Some do exactly the opposite, and some round up or down depending on the decimal place.
Others use mathematical formulas (too complicated for this explanation) to include the surrounding frametimes in the calculation of the final value and thus obtain a value that could have been at this theoretical position.
CapFrameX works in this way, so you will rarely see a value for the percentiles that actually appears as a number in the raw data.
Here, differences of usually up to 0.5 fps between the tools can occur, even if you gave all of them the same raw data, simply because the method of calculation is not 100% identical.
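One common way to obtain a value "between" two places is linear interpolation between the neighbouring sorted values. The sketch below is only an assumption for illustration: the source does not state which formula CapFrameX or any other tool uses, and the function name is hypothetical.

```python
def percentile_interpolated(values, p):
    """Percentile with linear interpolation between the two neighbouring places."""
    ordered = sorted(values)
    # Fractional place counted from 1, e.g. 10.7 for p = 1 and 1070 values.
    place = len(ordered) * p / 100
    place = min(max(place, 1.0), float(len(ordered)))
    lower = int(place)      # place of the lower neighbour
    frac = place - lower    # how far we are towards the upper neighbour
    if lower >= len(ordered):
        return ordered[-1]
    return ordered[lower - 1] + frac * (ordered[lower] - ordered[lower - 1])


# Hypothetical data: 1070 different FPS values from 1 to 1070.
fps_values = [float(v) for v in range(1, 1071)]
print(round(percentile_interpolated(fps_values, 1), 2))  # 10.7
```

With this data the result lies between the 10th and 11th values and does not appear in the raw data itself.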
With the x% low values, this exact calculation of a single value is not that relevant due to the subsequent averaging, so CapFrameX, for example, simply rounds to the nearest whole number of frames here. Thus, for the 1% low with 1070 frames the average of the worst 11 frames is output, while with 1040 frames it would be the average of the worst 10 frames.
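The rounding of the frame count can be sketched as follows. The function names are hypothetical, and rounding exact halves upwards is an assumption, since the text only says that the count is rounded up or down.

```python
def worst_frame_count(total_frames, percent):
    """Number of worst frames to average, rounded to the nearest whole frame."""
    exact = total_frames * percent / 100  # e.g. 10.7 for 1070 frames and 1%
    # Assumption: exact halves are rounded up; at least one frame is always used.
    return max(1, int(exact + 0.5))


def percent_low(fps_values, percent):
    """Average of the lowest FPS values that make up the given percentage."""
    count = worst_frame_count(len(fps_values), percent)
    worst = sorted(fps_values)[:count]
    return sum(worst) / count


print(worst_frame_count(1070, 1))  # 11
print(worst_frame_count(1040, 1))  # 10
```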