Frametimes, FPS, median, percentiles, x% low?
If you have ever looked at game benchmarks on the internet, you have probably come across all of these terms. What exactly lies behind them is not clear to everyone, so we want to provide some clarity.
Frametimes: How is the performance of games measured?
In order to understand what the various metrics mean, it's important to know what data is being used in the first place.
In-game benchmark tools, OCAT, FRAPS, PresentMon, CapFrameX: all these tools ultimately give the user FPS values, but these are not the values that are being measured.
What is measured are the time intervals between the individual images: the frametimes, usually given in milliseconds.
The measured frametimes are used to calculate the FPS. The formula is simple:
FPS = 1000 / frametime (in ms)
Frametime (in ms) = 1000 / FPS
As you can see, the FPS are nothing but the inverted values of the respective frametimes, so high frametimes mean low FPS while low frametimes mean high FPS.
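The conversion above can be sketched in a few lines of Python. The function names and the sample values are hypothetical and only serve as an illustration; frametimes are assumed to be in milliseconds.

```python
def frametime_to_fps(frametime_ms: float) -> float:
    """Convert one frametime in milliseconds to the equivalent FPS value."""
    return 1000.0 / frametime_ms


def fps_to_frametime(fps: float) -> float:
    """Convert an FPS value back to a frametime in milliseconds."""
    return 1000.0 / fps


# Hypothetical raw data: every measured frametime becomes one FPS value.
frametimes_ms = [10.0, 20.0, 40.0]
fps_values = [frametime_to_fps(ft) for ft in frametimes_ms]
print(fps_values)  # [100.0, 50.0, 25.0]
```

Note how the longest frametime (40 ms) produces the lowest FPS value (25).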
All performance metrics are calculated based on the raw data of every measured frametime converted into an FPS value.
Percentiles
The term percentile appears in many tests, and the notation always varies a little: "Xth Percentile", "X% Percentile", "PX" or, more rarely, just "X%".
But what they express is always the same, namely the value below which X% of all values lie.
For game benchmarks this means: if my bench has 1000 different values and I sort these 1000 values from the smallest (1) to the largest (1000), the 99th percentile is the value at the 990th place, the 95th percentile is the value at the 950th place and so on.
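The "sort and pick a place" idea can be sketched as follows. This is a minimal illustration, not the method of any particular tool: the function name is hypothetical, and rounding the place down when it is not a whole number is an assumption.

```python
def percentile_by_place(values, p):
    """Return the value at the p% place of the ascending sorted list."""
    ordered = sorted(values)
    # 1-based place in the sorted list, e.g. 990 for p=99 and 1000 values.
    # Assumption: fractional places are rounded down.
    place = max(1, int(len(ordered) * p / 100))
    return ordered[place - 1]


# Hypothetical bench with 1000 different values, here simply 1..1000.
bench_values = list(range(1, 1001))
p99 = percentile_by_place(bench_values, 99)  # value at the 990th place
p95 = percentile_by_place(bench_values, 95)  # value at the 950th place
p1 = percentile_by_place(bench_values, 1)    # value at the 10th place
```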
So percentiles are not about time, but only about frames. You often read explanations that a P1 of 45 fps means that you are above 45 fps 99% of the time. In practice, this is usually very close to the truth, but still not correct, because it only says that 99% of the rendered frames were above 45 fps.
An extreme example: let's assume we make a 20 s recording with a P1 of 45 fps. The ignored 1% of the frames contains four huge outliers of 500 ms each, so at least 10% of the time we were below 45 fps (four times half a second without any moving image).
CapFrameX also offers a view that makes this visible: the threshold tab on the Analysis page shows the total time spent below certain thresholds.
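The difference between counting frames and counting time can be checked with a small sketch. The recording below is made up for illustration (1000 frames of 18 ms plus four outliers of 500 ms, 20 s in total), and the function names are hypothetical.

```python
def frame_share_below(frametimes_ms, fps_threshold):
    """Share of frames that were slower than the FPS threshold."""
    limit_ms = 1000.0 / fps_threshold
    slow_frames = sum(1 for ft in frametimes_ms if ft > limit_ms)
    return slow_frames / len(frametimes_ms)


def time_share_below(frametimes_ms, fps_threshold):
    """Share of the total recording time spent in frames below the threshold."""
    limit_ms = 1000.0 / fps_threshold
    slow_time = sum(ft for ft in frametimes_ms if ft > limit_ms)
    return slow_time / sum(frametimes_ms)


# Hypothetical 20 s recording: 1000 * 18 ms + 4 * 500 ms = 20000 ms.
recording_ms = [18] * 1000 + [500] * 4

frames_below = frame_share_below(recording_ms, 45)  # well under 1% of frames
time_below = time_share_below(recording_ms, 45)     # 2000 ms of 20000 ms
```

Fewer than 1% of the frames are below 45 fps here, yet they account for 10% of the recording time.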
A special form of the percentile, which is rarely seen, is the median. This is the 50th percentile and thus the value that is exactly in the middle of the list sorted from small to large.
This makes the median insensitive to outliers that are very large but few in number, since it always sits in the middle of the total number of frames, while the normal average is also influenced by the value of every individual frame.
If a game generally runs smoothly, the median is almost equal to the average of all values (average FPS).
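A short sketch with Python's standard `statistics` module shows the effect. The frametime values are invented for illustration: 99 frames of 16 ms and a single 500 ms outlier.

```python
from statistics import mean, median

# Hypothetical frametimes in ms: a smooth run with one huge outlier.
frametimes_ms = [16] * 99 + [500]

median_ms = median(frametimes_ms)  # middle of the sorted list, ignores the spike
average_ms = mean(frametimes_ms)   # pulled upwards by the single outlier

print(median_ms, average_ms)
```

Without the outlier, both values would be 16 ms; with it, only the average moves.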
Why use percentiles?
Percentiles are used in benchmarks to deliberately set a boundary above which values are ignored. This is especially useful for the "min FPS" values that were often used in the past, which are increasingly replaced by the 1st percentile.
The min FPS always indicated the lowest FPS value (i.e. the highest frametime value) that was measured over the complete benchmark.
Since the very worst frame is influenced by all sorts of things and can therefore vary greatly, it is not suitable for comparing the performance of different hardware.
The same is true for min FPS values that state the lowest FPS value averaged over one complete second. With such a value you cannot see any outliers or micro-stutters.
If you take the 1st percentile instead and ignore the lowest 1% of the FPS values, you get a value that is much more stable and can be used to compare different hardware.
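To illustrate why the 1st percentile is more stable than the min FPS, here is a sketch with two made-up runs that differ only in their single worst frame. The helper and the data are hypothetical, and rounding the place down is an assumption.

```python
def percentile_by_place(values, p):
    """Value at the p% place of the ascending sorted list (place rounded down)."""
    ordered = sorted(values)
    place = max(1, int(len(ordered) * p / 100))
    return ordered[place - 1]


# Two hypothetical runs of 1000 FPS values; only the worst frame differs.
run_a = [12.0] + [50.0] * 9 + [60.0] * 990
run_b = [31.0] + [50.0] * 9 + [60.0] * 990

min_a, min_b = min(run_a), min(run_b)  # 12.0 vs 31.0: varies strongly
p1_a = percentile_by_place(run_a, 1)   # 50.0
p1_b = percentile_by_place(run_b, 1)   # 50.0: identical in both runs
```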
Moreover, it is not very helpful for the private user to see a "min FPS" value that is based on a single outlier that occurred at a point where it is not representative of the gaming experience. With the P1, on the other hand, the user gets a value that is much closer to the minimum they actively perceive during gaming.
And with the exclusion of upper and lower outliers, one can see quite well over the span between P1 and P99, even without a graph, how smooth the scene was.
If you want to ignore fewer outliers because it turned out that the remaining frames are quite stable, you can switch to P0.2 or even P0.1. In that case you should have more than 1000 frames in your benchmark, so that at least 2-3 frames are dropped in the end.
Indication of percentiles and misleading information
We've already explained above that percentiles are not about time but about the number of frames. Furthermore, they don't form a "rating" by themselves. The values are not sorted by good or bad but always strictly from small to large, and the respective percentile always states that this percentage of values is smaller than the specified value.
As a result, the percentile to be used changes depending on whether one is talking about frametimes or FPS.
With FPS, low values are bad; with frametimes, low values are good.
So if you want the bad values, you have to look for either high frametime values or low FPS values. For example, the 99th percentile of frametimes (99% of the values are lower) corresponds to the 1st percentile of FPS (99% of the values are higher).
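The mirrored ordering can be made visible with a tiny sketch: the worst frames are the same whether you look for the highest frametimes or the lowest FPS values. The sample values and variable names are hypothetical.

```python
# Hypothetical frametimes in ms and their FPS equivalents.
frametimes_ms = [16.0, 20.0, 12.5, 40.0, 25.0, 10.0]
fps_values = [1000.0 / ft for ft in frametimes_ms]

indices = range(len(frametimes_ms))

# The two worst frames, searched at the HIGH end of the frametimes...
worst_by_frametime = sorted(indices, key=lambda i: frametimes_ms[i], reverse=True)[:2]
# ...and at the LOW end of the FPS values.
worst_by_fps = sorted(indices, key=lambda i: fps_values[i])[:2]

print(worst_by_frametime, worst_by_fps)  # [3, 4] [3, 4]
```

Both searches find the same frames (40 ms = 25 fps and 25 ms = 40 fps), only from opposite ends of the sorted list.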
Many sites still use the 99th percentile even though they specify the values themselves in FPS, and thus actually mean the 1st percentile. This seems to have become common practice because most sites started out using frametime values, for which P99 was the right description, and later switched to FPS without adjusting the percentile (we advocate for more sites using the correct description).
Some write "frametimes in FPS" behind it to make it a bit clearer what they are doing, namely taking the 99th percentile of the frametimes and then converting and outputting this value in FPS. Strictly speaking, that makes it the 1st percentile of FPS.
To make matters worse, some in-game benchmarks use the term "min FPS" but refer to the 5th percentile, whereas elsewhere the 1st percentile or the actual min FPS is meant.
Therefore, for any kind of comparison, the following applies: Discuss which tools you use and especially which metrics you use. Otherwise, person A will wonder why their alleged min FPS, which the integrated benchmark has given them, are so much better than the min FPS that person B has determined with CapFrameX, OCAT etc.
x% low
You find x% low values very often, especially internationally: many tech YouTubers use them, and MSI Afterburner also outputs these metrics. Furthermore, they are always used in connection with FPS and not with frametimes, so the "low" clearly refers to low and therefore bad FPS.
Unlike percentiles, x% low values have no clear definition of how they are to be calculated, so here it is even more important to be clear about the tool or method you're using when comparing these values with others.
In some cases, with 1% low at 1000 values, the average of the 10 lowest values is calculated. This way, every outlier, no matter how small, is included in this value, and the result is never higher than the 1st percentile and usually lower.
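The averaging approach can be sketched as follows. This is only an illustration under assumptions: the function names and data are hypothetical, the average is taken over FPS values, and the number of frames is rounded down.

```python
def percent_low_average(fps_values, x):
    """Average of the lowest x% of the FPS values."""
    ordered = sorted(fps_values)
    # Number of frames to average, e.g. 10 for x=1 and 1000 values.
    count = max(1, int(len(ordered) * x / 100))
    lowest = ordered[:count]
    return sum(lowest) / len(lowest)


def percentile_by_place(values, p):
    """Value at the p% place of the ascending sorted list (place rounded down)."""
    ordered = sorted(values)
    place = max(1, int(len(ordered) * p / 100))
    return ordered[place - 1]


# Hypothetical run of 1000 FPS values with one strong outlier.
run = [12.0] + [50.0] * 9 + [60.0] * 990

low_1 = percent_low_average(run, 1)  # (12 + 9 * 50) / 10 = 46.2
p1 = percentile_by_place(run, 1)     # 50.0, the outlier is ignored
```

The single 12 fps frame pulls the 1% low average down, while the 1st percentile is unaffected by it.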
The benefits of this approach are debatable; there are advantages and disadvantages to both metrics. Some would like to have outliers of 1-2 frames in their evaluation, others do not want that and therefore take the percentiles.
It also depends on whether these outliers occur reproducibly or not. If a 0.1% low value is taken from only 2-3 frames and the benchmark scene has one or two frames randomly spiking in some runs without any connection to the hardware or settings used, you'll get a different result every time and therefore can't use the 0.1% low to draw any conclusions about the performance of the tested hardware or settings.
MSI Afterburner, for example, uses a different approach in which the x% low value is more similar to the percentile value, but instead of counting frames, it counts time. CapFrameX also uses this approach as of version 1.5.3.
Take the 1000-value benchmark from above and assume it took 20 s in total. The frametimes are sorted from highest to lowest, so just like with the percentiles but reversed.
For the 1% low, the frametime values are added up one by one, starting at the first (highest) value in that list, until their sum reaches or exceeds 1% of the total benchmark time, in this case 200 ms. The FPS value of the frame at which the sum reached or exceeded those 200 ms is the 1% low. Unlike with the standard 1st percentile, with this value you can really say that you were above X FPS 99% of the time.
Update 8/10/22: With version 1.7.0, CapFrameX will offer both of the x% low approaches mentioned above. They will be called "x% low average" for the average of the lowest x% of values and "x% low integral" for the value that you were below for x% of the total benchmark time.
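The time-based procedure described above can be sketched like this. It is a minimal illustration of the description in this article, not the actual code of MSI Afterburner or CapFrameX; the function name and the sample data are hypothetical.

```python
def percent_low_integral(frametimes_ms, x):
    """FPS of the frame at which the slowest frames add up to x% of total time."""
    ordered = sorted(frametimes_ms, reverse=True)  # highest frametime first
    budget_ms = sum(ordered) * x / 100             # e.g. 200 ms of 20 s for x=1
    elapsed_ms = 0.0
    for frametime in ordered:
        elapsed_ms += frametime
        if elapsed_ms >= budget_ms:
            return 1000.0 / frametime
    # Fallback for floating-point rounding when x is close to 100.
    return 1000.0 / ordered[-1]


# Perfectly even 20 s run: 1000 frames of 20 ms, the budget is 200 ms.
even_run = [20] * 1000
# 20 s run with four 500 ms outliers: the first outlier already exceeds 200 ms.
spiky_run = [18] * 1000 + [500] * 4

low_even = percent_low_integral(even_run, 1)    # 50.0
low_spiky = percent_low_integral(spiky_run, 1)  # 2.0
```

In the spiky run the value drops to 2 fps, because the outliers alone already cover more than 1% of the benchmark time.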
Additional info: percentiles for odd values
For those who would like to know a little more: what happens if you don't have exactly 1000 frames but, for example, 1070 FPS values, so that the 1st percentile falls on position 10.7? Some tools always round down to the next lower position, in this case 10, so that the result is a little too bad rather than a little too good. Some do exactly the opposite, and some round up or down depending on the decimal part.
Others use mathematical formulas (too complicated for this explanation) to include the surrounding frametimes in the calculation of the final value and thus obtain a value that could have been at this theoretical position.
CapFrameX works this way, so you will rarely see a percentile value that actually appears as a number in the raw data.
Because of this, results can differ between tools, usually by up to 0.5 fps, even if you give all of them the same raw data, simply because the calculation methods are not 100% identical.
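One common way to include the neighbouring values is linear interpolation between the two closest places. The sketch below shows that technique only as an example; it is an assumption and not necessarily the formula CapFrameX or any other tool uses. The function name and data are hypothetical.

```python
def percentile_interpolated(values, p):
    """Percentile via linear interpolation between the two nearest values."""
    ordered = sorted(values)
    # Fractional 0-based position in the sorted list (assumed convention).
    position = (len(ordered) - 1) * p / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Hypothetical FPS values: the 50th percentile falls between 20 and 30.
sample = [10.0, 20.0, 30.0, 40.0]
p50 = percentile_interpolated(sample, 50)  # 25.0, not present in the raw data
```

The result of 25.0 does not occur in the raw data, which matches the behaviour described above.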