Difference between revisions of "I.MX Linux temperature"

From Boundary Devices Wiki


Revision as of 15:22, 26 May 2021

We have received many requests about temperature monitoring on our NXP i.MX-based Nitrogen platforms.

This blog post aims to answer the most common questions and help you get the most out of your Nitrogen device.

Temperature Monitoring principle

All the i.MX processors come with an integrated IP capable of providing the internal temperature of the SoC.

That temperature monitoring IP is not the same across all i.MX CPUs. The older i.MX 6 & 7 rely on the TEMPMON IP, whereas the newer i.MX 8M (Quad, Mini, Nano & Plus) use the TMU.

How do you read the temperature?

The good news is that no matter which platform you use, the way to read the temperature is always the same:

# cat /sys/devices/virtual/thermal/thermal_zone0/temp
37000

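The value is expressed in millidegrees Celsius, so the above means my platform temperature is currently 37 degrees Celsius. The unit is described in the kernel documentation:

  • https://www.kernel.org/doc/html/v5.4/driver-api/thermal/sysfs-api.html#thermal-zone-attributes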
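
The millidegree conversion can be wrapped in a small shell helper. This is only a sketch: the function names millideg_to_c and read_temp_c are made up for illustration, and it assumes awk is available on your image.

```shell
#!/bin/sh
# Convert a millidegree Celsius reading (e.g. 37000) to degrees Celsius.
millideg_to_c() {
    awk -v m="$1" 'BEGIN { printf "%.1f\n", m / 1000 }'
}

# Read a thermal zone "temp" file and print it in degrees Celsius.
# The path argument is optional and defaults to the file used above.
read_temp_c() {
    temp_file="${1:-/sys/devices/virtual/thermal/thermal_zone0/temp}"
    millideg_to_c "$(cat "$temp_file")"
}
```

With the reading shown above, `millideg_to_c 37000` prints 37.0.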

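Behind the scenes, as the IP differs between CPUs, different kernel drivers can be in use:

  • imx_thermal.c (https://github.com/boundarydevices/linux-imx6/blob/boundary-imx_5.4.x_2.3.0/drivers/thermal/imx_thermal.c) for i.MX 6 and i.MX 7 processors (TEMPMON)
  • imx8mm_thermal.c (https://github.com/boundarydevices/linux-imx6/blob/boundary-imx_5.4.x_2.3.0/drivers/thermal/imx8mm_thermal.c) for i.MX 8M Mini / Nano / Plus processors (TMU)
  • qoriq_thermal.c (https://github.com/boundarydevices/linux-imx6/blob/boundary-imx_5.4.x_2.3.0/drivers/thermal/qoriq_thermal.c) for i.MX 8M Quad processors (TMU)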
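
Whichever driver is in use, the zones it registers can be inspected the same way. The sketch below assumes the standard /sys/class/thermal layout, where each thermal_zoneN directory exposes a type and a temp attribute; the function name list_thermal_zones is hypothetical.

```shell
#!/bin/sh
# Print "<zone>: <type> <temp>" for every thermal zone under a base directory.
# The base directory is optional and defaults to /sys/class/thermal.
list_thermal_zones() {
    base="${1:-/sys/class/thermal}"
    for zone in "$base"/thermal_zone*; do
        # Skip entries without a readable type (also covers an unmatched glob).
        [ -r "$zone/type" ] || continue
        printf '%s: %s %s\n' "$(basename "$zone")" \
            "$(cat "$zone/type")" "$(cat "$zone/temp")"
    done
}
```

Some SoCs expose more than one zone, so this is a quick way to check what your platform provides before hardcoding thermal_zone0.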

How accurate is the measurement?

Some of you have tried to compare a case temperature measurement against the one read by the IP, which isn't a fair comparison.

First, the internal temperature monitor measures die temp (also called junction temp), with an accuracy of +5C.

Second, the difference between junction and case temperature is proportional to the power your platform dissipates, so the gap widens as you draw more power.

In other words, the die temperature is what should be used to avoid damaging your platform.
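
To put numbers on that relationship, it is commonly modeled as Tj = Tc + theta_jc x P, where theta_jc is the junction-to-case thermal resistance in C/W. The sketch below assumes that model; the function name estimate_junction_c is hypothetical and the 2.0 C/W figure in the example is made up, so take the real value from your SoC datasheet.

```shell
#!/bin/sh
# Estimate junction temperature from case temperature and power draw:
#   Tj = Tc + theta_jc * P
# Arguments: case temperature (C), theta_jc (C/W), power (W).
estimate_junction_c() {
    case_c="$1"; theta_jc="$2"; power_w="$3"
    awk -v tc="$case_c" -v th="$theta_jc" -v p="$power_w" \
        'BEGIN { printf "%.1f\n", tc + th * p }'
}
```

With a 50C case and the made-up 2.0 C/W, `estimate_junction_c 50 2.0 1` prints 52.0 while `estimate_junction_c 50 2.0 3` prints 56.0: the same case reading hides a larger die temperature as power goes up.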

Is there a safety mechanism?

The first question often is: "how do I make sure not to damage my platform in case it gets hot?"

The safety mechanism for that defines 2 thresholds:

  • Passive trip point: when reached, the CPU/GPU/VPU throttles to try to reduce the temperature
  • Active trip point: when reached, the board reboots immediately to avoid damaging the SoC

Those thresholds are usually 10C apart, as shown by the default values on our 8MP SOM:

# cat /sys/devices/virtual/thermal/thermal_zone0/trip_point_*temp
85000
95000
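
To see which threshold is which, each trip point also has a trip_point_N_type attribute in the standard thermal sysfs interface. A minimal sketch that pairs each type with its temperature (the function name list_trip_points is hypothetical):

```shell
#!/bin/sh
# Print "<type> <temp>" for every trip point of a thermal zone.
# The zone directory is optional and defaults to the zone used above.
list_trip_points() {
    zone="${1:-/sys/devices/virtual/thermal/thermal_zone0}"
    for type_file in "$zone"/trip_point_*_type; do
        # Skip when the glob matches nothing or the file is unreadable.
        [ -r "$type_file" ] || continue
        # trip_point_N_type -> trip_point_N_temp
        temp_file="${type_file%_type}_temp"
        printf '%s %s\n' "$(cat "$type_file")" "$(cat "$temp_file")"
    done
}
```

The type strings are reported by the kernel as declared in the device tree, so they may not match the names used in this post word for word.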

How to modify the thresholds?

Depending on the exact version of the CPU you are using, you might need to change the default thresholds.

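This is not necessary for i.MX 6 & 7, as the driver checks the temperature grade and adapts the thresholds automatically:

  • imx_init_temp_grade() (https://github.com/boundarydevices/linux-imx6/blob/boundary-imx_5.4.x_2.3.0/drivers/thermal/imx_thermal.c#L566)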

For the other processors, like an industrial version of the i.MX 8M Mini CPU that can go from -40C to 105C, you need to update the trip points as follows:

# echo 105000 > /sys/devices/virtual/thermal/thermal_zone0/trip_point_1_temp
# echo 95000 > /sys/devices/virtual/thermal/thermal_zone0/trip_point_0_temp
# cat /sys/devices/virtual/thermal/thermal_zone0/trip_point_*temp
95000
105000

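Note that thresholds set this way do not persist across reboots, so you have 2 solutions:

  1. Modify the device tree directly to have your own values
       • i.MX 8M Quad: imx8mq.dtsi (https://github.com/boundarydevices/linux-imx6/blob/boundary-imx_5.4.x_2.3.0/arch/arm64/boot/dts/freescale/imx8mq.dtsi#L224)
       • i.MX 8M Mini: imx8mm.dtsi (https://github.com/boundarydevices/linux-imx6/blob/boundary-imx_5.4.x_2.3.0/arch/arm64/boot/dts/freescale/imx8mm.dtsi#L402)
       • i.MX 8M Nano: imx8mn.dtsi (https://github.com/boundarydevices/linux-imx6/blob/boundary-imx_5.4.x_2.3.0/arch/arm64/boot/dts/freescale/imx8mn.dtsi#L303)
       • i.MX 8M Plus: imx8mp.dtsi (https://github.com/boundarydevices/linux-imx6/blob/boundary-imx_5.4.x_2.3.0/arch/arm64/boot/dts/freescale/imx8mp.dtsi#L467)
  2. Create an init script for your OS to do that at bootup
       • Create a simple systemd service example (https://www.suse.com/support/kb/doc/?id=000019672)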
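
For the second option, the script body could look like the sketch below. It is not an official Boundary Devices script: the function name apply_trip_points is hypothetical, it assumes trip_point_0 is the lower threshold and trip_point_1 the upper one (as in the commands above), and the values to pass depend on your CPU grade.

```shell
#!/bin/sh
# Apply trip points at boot; call this from an init script or systemd service.
# Arguments: zone directory, lower threshold, upper threshold (millidegrees).
apply_trip_points() {
    zone="$1"; passive="$2"; upper="$3"
    # Refuse inverted thresholds: throttling must start below the upper limit.
    if [ "$passive" -ge "$upper" ]; then
        echo "lower threshold ($passive) must be below upper ($upper)" >&2
        return 1
    fi
    # Same write order as the manual commands: upper trip point first.
    echo "$upper" > "$zone/trip_point_1_temp" || return 1
    echo "$passive" > "$zone/trip_point_0_temp" || return 1
}

# Example call for an industrial-grade part (values from the text above):
# apply_trip_points /sys/devices/virtual/thermal/thermal_zone0 95000 105000
```

Checking the ordering before writing avoids ending up with a reboot threshold below the throttling one because of a typo in the service file.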