To be honest, on Arduino we only test with the SPI version. With modern hardware and the TFT_eSPI library, extremely high speeds can be obtained. On mbed we have an example that uses the STM32F4 LTDC frame buffer hardware (it acts as an ARM peripheral with DMA and memory mapping). The results, as you would expect, are exceptional, and the "STM32F429 DISC1" boards are pretty cheap and have an inbuilt hardware debugger. We also test with Adafruit GFX, as a lot of users use that library.
Using FreeRTOS on a Mega 2560? That seems quite a small board to run a multithreaded OS on. Personally, I only use FreeRTOS on much larger boards (STM32F4, ESP32, RPI Pico). Something to bear in mind: some parts of Arduino are not thread-safe. I've not tried using the Serial object from another thread, but you'd certainly never want to use it from more than one thread, and there could well be assumptions about which thread it runs on, especially on a MEGA 2560. I would just use a regular task manager task instead and call Serial.available() to check if anything is there; it does not block.
Something like:
    taskManager.scheduleFixedRate(10, [] {
        while(Serial.available()) {
            // read a byte from serial here and process..
        }
    });
Even within TcMenu, the assumption is that the core rendering code always runs on the main thread. Task manager integrates pretty well with FreeRTOS, so run a task manager loop on the main thread, and then use CircularBuffers to hand work off to other threads.
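To show the general idea of handing work over through a circular buffer, here is a minimal sketch of a single-producer, single-consumer ring. SpscRing is a hypothetical stand-in written for this post, not the library's own circular buffer class. It assumes a board whose toolchain provides <atomic> (the larger boards mentioned above), and that exactly one thread calls put() and exactly one calls get().

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical illustration only, not the library's class.
// Holds at most N - 1 items; one slot stays empty to tell full from empty.
template <typename T, std::size_t N>
class SpscRing {
public:
    // Producer thread only. Returns false if the ring is full.
    bool put(const T& item) {
        std::size_t h = head.load(std::memory_order_relaxed);
        std::size_t next = (h + 1) % N;
        if (next == tail.load(std::memory_order_acquire)) return false;
        slots[h] = item;
        head.store(next, std::memory_order_release);  // publish the new item
        return true;
    }

    // Consumer thread only. Returns false if the ring is empty.
    bool get(T& out) {
        std::size_t t = tail.load(std::memory_order_relaxed);
        if (t == head.load(std::memory_order_acquire)) return false;
        out = slots[t];
        tail.store((t + 1) % N, std::memory_order_release);  // free the slot
        return true;
    }

private:
    T slots[N];
    std::atomic<std::size_t> head{0};
    std::atomic<std::size_t> tail{0};
};
```

The main thread would call put() from a task manager task, and the worker thread would poll get() in its own loop. Neither call blocks, so the main thread can keep rendering even if the worker falls behind; it just sees put() return false and can retry on the next run.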