HAL_SPI_Receive returns TIMEOUT in ThreadX and TouchGFX

fernandogamax · ‎2024-02-08

Hello.

I have created a project with STM32H7, and in the project, I have included TouchGFX and ThreadX.

I have an external memory controlled by SPI5. Everything works correctly. However, on some occasions, the function:

status = HAL_SPI_Receive(hspiDFW25, bufferRX, bytes_to_read, 1000);

returns TIMEOUT.

If I give higher priority to the task where the HAL_SPI_Receive function is called than to the TouchGFX task, the error occurs less frequently. However, it still persists.

Any ideas?

fernandogamax · ‎2024-02-15

Hello, I have run new tests.

I have managed to make it work using HAL_SPI_Receive_IT and HAL_SPI_Receive_DMA.

But I can't get it to work with HAL_SPI_Receive.

The solution has been the priority of the interrupts. Configuring them as follows:

I would like to know if this saturation is normal or if I am configuring something wrong?

Bob S · ‎2024-02-08

Are there still higher priority tasks? Maybe interrupts are taking too much time?

I suggest changing to HAL_SPI_Receive_DMA() if you have the required DMA channel available. Or perhaps the _IT() version, though if interrupts are already causing issues, adding an interrupt to handle each byte/word will only make things worse.

fernandogamax · ‎2024-02-08

I have tried giving the task maximum priority, but that does not solve it. I have observed that the problem occurs when I update the screen in TouchGFX. I am changing the screen every 2 seconds. If I remove the screen update, it never fails. Why can TouchGFX cause the SPI to return a timeout?

fernandogamax · ‎2024-02-09

Hello. I've been doing more tests.

I have created a task, and in the task I have placed the following code:

while(1)
{
    status = HAL_SPI_Transmit(&hspi1, (uint8_t*)buffer_spi, 128, 1000);
    if (status != HAL_OK)
    {
        consola_LIB_send_string("ERROR SPI");
    }

    tx_thread_sleep(1);
}

With this code everything works correctly.

I change the code and put this other one.

while(1)
{
    status = HAL_SPI_Receive(&hspi1, (uint8_t*)buffer_spi, 128, 1000);
    if (status != HAL_OK)
    {
        consola_LIB_send_string("ERROR SPI");
    }

    tx_thread_sleep(1);
}

This continually skips timeout, even if I put more time on it, it always skips the timeout.

What difference exists between HAL_SPI_Receive and HAL_SPI_Transmit, so that only HAL_SPI_Receive times out?

When the timeout occurs, this is the status of the tasks:

fernandogamax · ‎2024-02-09

Bob S · ‎2024-02-09

As I said:

> Maybe interrupts are taking too much time?

Which is why I suggested using HAL_SPI_Receive_DMA() and HAL_SPI_Transmit_DMA(). With the polled HAL_SPI_Receive(), that task is needlessly taking CPU time that other tasks could be using to do something (presumably) productive.

Though why that interrupt would affect receive and not transmit is a mystery. Some thoughts:

- In the most recent code you posted, how big is "buffer_spi"?
- Are you SURE the error you are getting is HAL_TIMEOUT? Your code above just checks for not HAL_OK.
- I don't know about the H7 HAL code, but in the F4/F7 HAL_SPI_Receive() there are only 2 lines that can generate a HAL_TIMEOUT error: one for 8-bit data and one for everything else. Set breakpoints on those 2 lines and see what the SPIHandle member variables look like. Has it received ANY data? If so, how much?
- You say "continually skips timeout, even if I put more time on it" - are you saying that even if you change the timeout from 1000 to, say, 10000 (10 seconds), you still get the timeout error (and you have verified it is timeout and not some other error)?
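To make the checks above concrete, here is a minimal sketch of a logging helper that reports which status came back and how much data had arrived, rather than a generic "ERROR SPI". The function names `spi_status_name` and `spi_items_received` are hypothetical. The numeric values mirror HAL_StatusTypeDef in the STM32 HAL, and the two arguments of the second helper are meant to be the handle's RxXferSize and RxXferCount members read after the failed call.

```c
#include <stdint.h>

/* Hypothetical helper: readable name for a HAL status value.
   The numbers mirror HAL_StatusTypeDef (HAL_OK = 0 ... HAL_TIMEOUT = 3). */
const char *spi_status_name(int status)
{
    switch (status) {
    case 0:  return "HAL_OK";
    case 1:  return "HAL_ERROR";
    case 2:  return "HAL_BUSY";
    case 3:  return "HAL_TIMEOUT";
    default: return "UNKNOWN";
    }
}

/* Hypothetical helper: items received before the call gave up.
   Intended inputs: hspi->RxXferSize (requested) and
   hspi->RxXferCount (still outstanding). */
uint16_t spi_items_received(uint16_t xfer_size, uint16_t xfer_remaining)
{
    if (xfer_remaining > xfer_size) {
        return 0u;   /* inconsistent handle state, report nothing */
    }
    return (uint16_t)(xfer_size - xfer_remaining);
}
```

In the test loop, the string from `spi_status_name(status)` could be passed to consola_LIB_send_string() in place of the fixed message.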

[EDIT] It could possibly be a memory corruption issue, but look at the previous items before chasing this.

fernandogamax · ‎2024-02-09

Hello, thanks for your answers.

> Maybe interrupts are taking too much time?

The interrupts I use don't take up much time.

> In the most recent code you posted, how big is "buffer_spi"?

uint8_t buffer_spi[128];

> Are you SURE the error you are getting is HAL_TIMEOUT? Your code above just checks for not HAL_OK.

I'm sure it's HAL_TIMEOUT; I set a breakpoint for when the status is different from HAL_OK.

> I don't know about the H7 HAL code, but in the F4/F7 HAL_SPI_Receive() there are only 2 lines that can generate a HAL_TIMEOUT error: one for 8-bit data and one for everything else. Set breakpoints on those 2 lines and see what the SPIHandle member variables look like. Has it received ANY data? If so, how much?

It can be seen that it has received some data.

> You say "continually skips timeout, even if I put more time on it" - are you saying that even if you change the timeout from 1000 to, say, 10000 (10 seconds), you still get the timeout error (and you have verified it is timeout and not some other error)?

yes, it always skips timeout.

SPI configuration

threadX configuration

Bob S · ‎2024-02-12

> The interrupts I use don't take up much time.

That appears to be false since disabling an interrupt makes that code work. Or an issue with the tick timer, see below.

> yes, it always skips timeout

I'm sorry, I am having trouble with the way you phrase this. To me "skipping" timeout means it DID NOT get a timeout. I presume you mean that, yes, even if you set the timeout value to 10000 you still get a timeout.

Have you verified that the HAL tick timer is indeed running at 1000 interrupts/sec? I don't know how ThreadX/Azure handles things, but with FreeRTOS the normal SysTick timer is used by FreeRTOS, and the HAL SysTick-related functions (HAL_GetTick(), HAL_Delay(), etc.) must use one of the regular timers (TIM7 typically). If whatever timer is used by HAL_GetTick() is counting much faster than 1000 Hz, that would explain your problem. So verify that calling HAL_Delay(1000) really does delay for 1 second.
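One way to check the tick rate without a stopwatch is to compare the HAL tick against an independent clock. The sketch below is only the arithmetic, under stated assumptions: the hypothetical function `measured_tick_hz` takes two HAL_GetTick() samples and two samples of a free-running cycle counter (for example DWT->CYCCNT on a Cortex-M7) taken at the same moments, plus the core clock in Hz. Because a 32-bit cycle counter wraps after roughly 10 seconds at 400 MHz, the sampling window is assumed to be shorter than one wrap.

```c
#include <stdint.h>

/* Hypothetical helper: tick frequency implied by two paired samples.
   tick0/tick1: HAL_GetTick() at the start and end of the window.
   cyc0/cyc1:   cycle counter at the same two moments.
   core_hz:     frequency the cycle counter runs at. */
uint32_t measured_tick_hz(uint32_t tick0, uint32_t tick1,
                          uint32_t cyc0, uint32_t cyc1,
                          uint32_t core_hz)
{
    uint32_t ticks  = tick1 - tick0;  /* unsigned math survives one wraparound */
    uint32_t cycles = cyc1 - cyc0;

    if (cycles == 0u) {
        return 0u;                    /* no elapsed time, nothing to report */
    }
    /* 64-bit intermediate so ticks * core_hz cannot overflow */
    return (uint32_t)(((uint64_t)ticks * core_hz) / cycles);
}
```

A result close to 1000 means a timeout argument of 1000 really is about one second. A much larger result would mean the timeout expires early.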

**IF** HAL_Delay() really does delay for 1 second, then it appears your system is pretty close to saturated and cannot handle the overhead of an interrupt-per-byte on the SPI. In that case, changing the SPI to use DMA might be your only solution (as I've mentioned before).

[EDIT] See this post for Azure SysTick vs. HAL : https://community.st.com/t5/stm32-mcus-embedded-software/stm32u5a5-azure-rtos-and-1ms-ticks-how-to-set/m-p/638541/highlight/true#M45282

fernandogamax · ‎2024-02-12

> I'm sorry, I am having trouble with the way you phrase this. To me "skipping" timeout means it DID NOT get a timeout. I presume you mean that, yes, even if you set the timeout value to 10000 you still get a timeout.

yes.

> Have you verified that the HAL tick timer is indeed running at 1000 interrupts/sec? I don't know how ThreadX/Azure handles things, but with FreeRTOS the normal SysTick timer is used by FreeRTOS, and the HAL SysTick-related functions (HAL_GetTick(), HAL_Delay(), etc.) must use one of the regular timers (TIM7 typically). If whatever timer is used by HAL_GetTick() is counting much faster than 1000 Hz, that would explain your problem. So verify that calling HAL_Delay(1000) really does delay for 1 second.

HAL_Delay(1000) = 1 second.

If the system is saturated, how can I see what causes that saturation? I am using TouchGFX, USBX, and FileX.

Bob S · ‎2024-02-12

> If the system is saturated, how can I see what causes that saturation?

Use a code profiler, which means either adding code to all of your interrupt functions to document entry/exit, or using whatever features Azure/ThreadX provides. If you have access to the SWO signal, you might be able to use that for some level of profiling. Or you could tackle this one interrupt at a time by setting a GPIO pin at the beginning of the IRQ handler, and clearing it at the end. Of course, the time that GPIO is high will also include the time spent in any higher-priority interrupt handler.
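As a software counterpart to the GPIO approach, the entry/exit bookkeeping could look roughly like the sketch below. Everything here is hypothetical (`isr_stats_t`, `isr_enter`, `isr_exit`, `isr_load_percent`); the `now` argument is assumed to be a free-running cycle counter value read at the top and bottom of the handler. The same caveat applies as with the GPIO pin: the measured time includes any higher-priority interrupt that preempts the handler.

```c
#include <stdint.h>

/* Hypothetical per-interrupt statistics. */
typedef struct {
    uint32_t entered_at;    /* cycle count at the most recent entry */
    uint32_t max_cycles;    /* longest single run seen so far */
    uint64_t total_cycles;  /* sum of all runs */
    uint32_t calls;         /* number of completed runs */
} isr_stats_t;

/* Call first thing in the handler. */
void isr_enter(isr_stats_t *s, uint32_t now)
{
    s->entered_at = now;
}

/* Call last thing in the handler. */
void isr_exit(isr_stats_t *s, uint32_t now)
{
    uint32_t spent = now - s->entered_at;  /* wrap-safe unsigned difference */

    if (spent > s->max_cycles) {
        s->max_cycles = spent;
    }
    s->total_cycles += spent;
    s->calls++;
}

/* Share of a measurement window spent in this handler, in percent. */
uint32_t isr_load_percent(const isr_stats_t *s, uint64_t window_cycles)
{
    if (window_cycles == 0u) {
        return 0u;
    }
    return (uint32_t)((s->total_cycles * 100u) / window_cycles);
}
```

One statistics object per handler, dumped periodically from a low-priority thread, would show which interrupt dominates while the screen is being updated.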

You might be able to change the interrupt priorities to make this work, I'm not sure. Are all of your interrupts currently at the same priority? If so, that is not the best. There are certainly SOME interrupts that are more crucial than other interrupts.

Is your code running from external memory? I know very little about TouchGFX and the F4xx FMC, but I would expect code in internal FLASH to run faster than external memory.

Or - as I've suggested several times now - change your code. Use DMA, which only needs 1 interrupt at the end of the transfer. See if that fixes (or at least avoids) your problem.
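A minimal sketch of that DMA approach, assuming a ThreadX semaphore is used to block the calling thread until the transfer completes. The names `spi_rx_done` and `spi_read_dma` are hypothetical, and the `#else` branch contains stand-ins that exist only so the sketch compiles off-target; they are not the real HAL or ThreadX implementations. On the target it is also assumed that the SPI RX DMA stream is already configured, that the semaphore was created with tx_semaphore_create() and an initial count of 0, and that the buffer lives in RAM the DMA controller can reach (on the H7 this may also require cache maintenance if the D-cache is enabled).

```c
#include <stdint.h>

#ifdef USE_HAL_DRIVER              /* target build: real HAL and ThreadX */
#include "stm32h7xx_hal.h"
#include "tx_api.h"
#else                              /* host build: stand-ins, NOT the real APIs */
typedef struct { int unused; } SPI_HandleTypeDef;
typedef enum { HAL_OK = 0, HAL_ERROR = 1, HAL_BUSY = 2, HAL_TIMEOUT = 3 } HAL_StatusTypeDef;
typedef struct { unsigned long count; } TX_SEMAPHORE;
typedef unsigned int  UINT;
typedef unsigned long ULONG;
#define TX_SUCCESS 0u
void HAL_SPI_RxCpltCallback(SPI_HandleTypeDef *hspi);
static HAL_StatusTypeDef HAL_SPI_Receive_DMA(SPI_HandleTypeDef *hspi, uint8_t *pData, uint16_t Size)
{
    (void)pData; (void)Size;
    HAL_SPI_RxCpltCallback(hspi);  /* stand-in: transfer "completes" at once */
    return HAL_OK;
}
static UINT tx_semaphore_put(TX_SEMAPHORE *s) { s->count++; return TX_SUCCESS; }
static UINT tx_semaphore_get(TX_SEMAPHORE *s, ULONG wait)
{
    (void)wait;
    if (s->count == 0u) { return 1u; }  /* stand-in: fail instead of blocking */
    s->count--;
    return TX_SUCCESS;
}
#endif

/* Hypothetical semaphore; on target, create it once at startup with
   tx_semaphore_create() and an initial count of 0. */
static TX_SEMAPHORE spi_rx_done;

/* The HAL calls this from the DMA-complete interrupt, once per transfer. */
void HAL_SPI_RxCpltCallback(SPI_HandleTypeDef *hspi)
{
    (void)hspi;
    tx_semaphore_put(&spi_rx_done);
}

/* Hypothetical helper: start a DMA read, then sleep until it finishes. */
HAL_StatusTypeDef spi_read_dma(SPI_HandleTypeDef *hspi, uint8_t *buf,
                               uint16_t len, ULONG wait_ticks)
{
    HAL_StatusTypeDef st = HAL_SPI_Receive_DMA(hspi, buf, len);
    if (st != HAL_OK) {
        return st;                 /* transfer could not be started */
    }
    if (tx_semaphore_get(&spi_rx_done, wait_ticks) != TX_SUCCESS) {
        return HAL_TIMEOUT;        /* completion interrupt never arrived */
    }
    return HAL_OK;
}
```

While the thread is blocked in tx_semaphore_get(), other threads such as TouchGFX can run, and the CPU handles one interrupt per transfer instead of one per byte. A production version would likely also need to abort the transfer when the wait times out.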