Tomcat Server Threading Model
Introduction#
Apache Tomcat uses a multi-threaded model to handle concurrent client requests. An Acceptor thread accepts incoming connections and hands them off to worker threads drawn from a thread pool. Reusing a pooled thread instead of creating a new one for every request translates into performance gains.
Tomcat has different threading modes:
- BIO (Blocking I/O): One thread per request (legacy model).
- NIO (Non-Blocking I/O): Event-driven model that serves many connections with fewer threads (default in modern Tomcat).
- NIO2 (Asynchronous I/O): Fully asynchronous I/O for high scalability.
- APR (Apache Portable Runtime): Uses native OS libraries for slightly better performance on Unix-like systems.
 
By configuring attributes like maxThreads, minSpareThreads, and acceptCount in server.xml, developers can optimize the threading model of Tomcat for high-performance applications.
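As a rough illustration (not taken from any particular deployment), an NIO connector in conf/server.xml with these attributes could look like the snippet below; the values are examples only, not recommendations:

```xml
<!-- Illustrative NIO connector tuning in conf/server.xml (example values) -->
<Connector port="8080"
           protocol="org.apache.coyote.http11.Http11NioProtocol"
           connectionTimeout="20000"
           maxThreads="200"
           minSpareThreads="10"
           acceptCount="100" />
```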
Tomcat Threading Model#
Tomcat is a well-known open-source servlet container, and Spring Boot web applications use it as the default embedded server. It processes incoming HTTP requests with an efficient thread pool, often referred to as the Tomcat executor. When a request arrives, a worker thread is taken from the pool to process it and is returned to the pool once processing is finished. This improves performance and reduces resource overhead while promoting better resource utilization.
Implementation:#
- Controller Class (Handles HTTP Requests)
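The original code listing is not reproduced here, so the following is only a minimal sketch of what such a controller could look like; the names StudentController, StudentService, Student, and the /students/{id} endpoint are assumptions for illustration:

```java
package com.example.demo.controller;

import com.example.demo.model.Student;
import com.example.demo.service.StudentService;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

import java.util.concurrent.CompletableFuture;

// Hypothetical controller: a Tomcat worker thread invokes this handler for each HTTP request.
@RestController
public class StudentController {

    private final StudentService studentService;

    public StudentController(StudentService studentService) {
        this.studentService = studentService;
    }

    // Returning a CompletableFuture lets Spring MVC complete the response asynchronously.
    @GetMapping("/students/{id}")
    public CompletableFuture<Student> getStudent(@PathVariable Long id) {
        return studentService.getStudentById(id);
    }
}
```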
 
- Service Class (Handles Business Logic with Asynchronous Calls)
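Again as a sketch under the same assumptions, the service can run its work on a dedicated executor and return a CompletableFuture; the taskExecutor bean is defined in the thread configuration shown further below, and the 2-second delay simply simulates slow business logic:

```java
package com.example.demo.service;

import com.example.demo.model.Student;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.stereotype.Service;

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

// Hypothetical service: business logic runs on a dedicated executor, not on the Tomcat worker thread.
@Service
public class StudentService {

    private final Executor taskExecutor;

    public StudentService(@Qualifier("taskExecutor") Executor taskExecutor) {
        this.taskExecutor = taskExecutor;
    }

    public CompletableFuture<Student> getStudentById(Long id) {
        return CompletableFuture.supplyAsync(() -> {
            simulateSlowCall();                       // e.g. a slow database or remote call
            return new Student(id, "Student-" + id);  // assumed model class shown below
        }, taskExecutor);
    }

    private void simulateSlowCall() {
        try {
            Thread.sleep(2000); // ~2 seconds of simulated work
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```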
 
CompletableFuture is used for asynchronous, non-blocking programming in Java, improving performance by running tasks in parallel. It frees up threads, supports chaining (thenApply, thenCompose) and exception handling (exceptionally, handle), and is ideal for running independent tasks in parallel, making multiple API/database calls, and handling long-running operations efficiently.
For further information about CompletableFuture, you can check out this website https://www.codingshuttle.com/blogs/a-comprehensive-guide-to-java-completable-future/
- Data Class (Model for Student Information)
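A simple model class could look like this (again a sketch; the fields are assumed):

```java
package com.example.demo.model;

// Hypothetical data class holding student information.
public class Student {

    private final Long id;
    private final String name;

    public Student(Long id, String name) {
        this.id = id;
        this.name = name;
    }

    public Long getId() {
        return id;
    }

    public String getName() {
        return name;
    }
}
```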
 
- Thread Configuration (Manages Thread Pool). Here I am using TaskScheduler and ThreadPoolTaskExecutor.
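A sketch of such a configuration class, assuming a bean named taskExecutor that the service above injects; the pool sizes are illustrative values, not recommendations:

```java
package com.example.demo.config;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.TaskScheduler;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;
import org.springframework.scheduling.concurrent.ThreadPoolTaskScheduler;

// Hypothetical thread pool configuration; pool sizes are example values only.
@Configuration
public class ThreadConfig {

    // Executor used by the service layer for asynchronous business logic.
    @Bean(name = "taskExecutor")
    public ThreadPoolTaskExecutor taskExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(10);
        executor.setMaxPoolSize(50);
        executor.setQueueCapacity(100);
        executor.setThreadNamePrefix("async-");
        executor.initialize();
        return executor;
    }

    // Scheduler for timed or background tasks.
    @Bean
    public TaskScheduler taskScheduler() {
        ThreadPoolTaskScheduler scheduler = new ThreadPoolTaskScheduler();
        scheduler.setPoolSize(5);
        scheduler.setThreadNamePrefix("sched-");
        scheduler.initialize();
        return scheduler;
    }
}
```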
Tomcat Blocking Flow (Synchronous Processing)#
A blocking flow occurs when application logic is slow, for example, when waiting for a database call or processing a long-running task.
Here, the assigned worker thread is blocked, and thus:#
- The thread will not accept or handle any new requests until the job at hand is complete.
- If many threads become blocked this way, new requests end up queued or rejected outright.
- This can cause performance bottlenecks and result in slower response times.
- Asynchronous processing (for example, Spring's @Async, reactive programming, or optimized database queries) helps keep threads free and improves scalability, avoiding situations like the above.
 

Implementation:#
To demonstrate this, we call the slow service methods and wait for each result before moving on, i.e. synchronous (blocking) request processing, as shown in the sketch below.
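A minimal sketch of such a blocking endpoint, reusing the hypothetical StudentService from above; every .get() call makes the Tomcat worker thread wait for that future to finish:

```java
package com.example.demo.controller;

import com.example.demo.model.Student;
import com.example.demo.service.StudentService;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

// Hypothetical blocking endpoint: the Tomcat worker thread is held for the whole duration.
@RestController
public class BlockingStudentController {

    private final StudentService studentService;

    public BlockingStudentController(StudentService studentService) {
        this.studentService = studentService;
    }

    @GetMapping("/students/blocking")
    public String getStudentsBlocking() throws Exception {
        long start = System.currentTimeMillis();

        // Each get() blocks until that future completes, so the calls run one after another.
        Student s1 = studentService.getStudentById(1L).get(); // ~2s
        Student s2 = studentService.getStudentById(2L).get(); // ~2s
        Student s3 = studentService.getStudentById(3L).get(); // ~2s

        long elapsed = System.currentTimeMillis() - start;
        return "Fetched " + s1.getName() + ", " + s2.getName() + ", " + s3.getName()
                + " in ~" + elapsed + " ms"; // roughly 6 seconds in total
    }
}
```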
Output:#
After hitting the API, the calls complete one after another and the response takes about 6 seconds.


- .get() waits for each method to complete before moving to the next one.
- The methods execute sequentially, not in parallel.
- Total execution time = 2s + 2s + 2s = 6 seconds.
 
Tomcat Async (Non-blocking) Flow#
In an asynchronous (non-blocking) flow, when a request involves an async operation (like @Async or DeferredResult), the worker thread delegates the task to a callback thread from a separate executor (thread pool).
This is how it works:#
- A worker thread first handles the request.
 - Asynchronous operations are sent to a callback thread for execution.
 - The Tomcat worker thread now becomes free to keep servicing other requests.
 - When the async operation is complete, the response is forwarded to the client.
 
This design improves scalability: a long-running task does not tie up a Tomcat worker thread, so Tomcat can continue processing other concurrent requests.

Implementation:#
We reuse the same service class, but this time the controller handles requests asynchronously, as shown in the sketch below.
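A sketch of the non-blocking version under the same assumptions; the three futures are started together, combined with CompletableFuture.allOf(), and the Tomcat worker thread returns immediately while the response is completed later:

```java
package com.example.demo.controller;

import com.example.demo.model.Student;
import com.example.demo.service.StudentService;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

import java.util.concurrent.CompletableFuture;

// Hypothetical non-blocking endpoint: Tomcat's worker thread is released while the futures run.
@RestController
public class AsyncStudentController {

    private final StudentService studentService;

    public AsyncStudentController(StudentService studentService) {
        this.studentService = studentService;
    }

    @GetMapping("/students/async")
    public CompletableFuture<String> getStudentsAsync() {
        long start = System.currentTimeMillis();

        // All three futures start immediately and run in parallel on the task executor.
        CompletableFuture<Student> f1 = studentService.getStudentById(1L);
        CompletableFuture<Student> f2 = studentService.getStudentById(2L);
        CompletableFuture<Student> f3 = studentService.getStudentById(3L);

        // allOf() completes when every future is done; join() inside the callback does not block Tomcat.
        return CompletableFuture.allOf(f1, f2, f3)
                .thenApply(v -> "Fetched " + f1.join().getName() + ", " + f2.join().getName()
                        + ", " + f3.join().getName()
                        + " in ~" + (System.currentTimeMillis() - start) + " ms"); // roughly 2 seconds
    }
}
```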
Output:#
After hitting the API, all the calls run in parallel and the response takes about 2 seconds.


**CompletableFuture.allOf()**: instead of blocking on each .get(), we should:
- Start all async tasks simultaneously.
 - Wait for all of them to complete before proceeding.
 - Total Time Taken: ~2 seconds instead of ~6
 
Default Config for Tomcat#
Tomcat comes with the following default threading configurations:
- Maximum Threads: 200 (limits concurrent request handling).
- Minimum Idle Threads: 10 (ensures a minimum number of ready worker threads).
- Queue Size: Unbounded by default (can cause memory issues if too many requests pile up).
 
Customization in Spring Boot (application.properties)#
server.tomcat.threads.max=200
server.tomcat.threads.min-spare=10
server.tomcat.accept-count=100
If all worker threads are busy and the accept-count queue is also full, new connection attempts are rejected.
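The same settings can also be applied programmatically; the following is only a sketch, with example values mirroring the properties above:

```java
package com.example.demo.config;

import org.apache.coyote.ProtocolHandler;
import org.apache.coyote.http11.Http11NioProtocol;
import org.springframework.boot.web.embedded.tomcat.TomcatServletWebServerFactory;
import org.springframework.boot.web.server.WebServerFactoryCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Illustrative programmatic alternative to the application.properties settings above.
@Configuration
public class TomcatTuningConfig {

    @Bean
    public WebServerFactoryCustomizer<TomcatServletWebServerFactory> tomcatCustomizer() {
        return factory -> factory.addConnectorCustomizers(connector -> {
            ProtocolHandler handler = connector.getProtocolHandler();
            if (handler instanceof Http11NioProtocol protocol) { // default NIO connector
                protocol.setMaxThreads(200);
                protocol.setMinSpareThreads(10);
                protocol.setAcceptCount(100);
            }
        });
    }
}
```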
Spring Boot Thread Safety#
Spring beans are singletons by default, which means a single shared instance of each Controller, Service, and Repository is used by the multiple threads handling concurrent user requests. Such beans are only safe to share when they are stateless.
Thread Safety Best Practices#
- Design components to be stateless: don't keep user-dependent data in common beans.
 - User state should be maintained in parameters: no shared mutable fields.
 
It is safe for repositories to have static configurations such as database URL, username, and password because these values are not user-specific at all.
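To make this concrete, here is a small illustrative example (not from the original article) contrasting a stateful bean with a stateless one:

```java
import org.springframework.stereotype.Service;

// NOT thread-safe: lastStudentId is shared mutable state, so concurrent requests overwrite each other.
@Service
class StatefulStudentTracker {

    private Long lastStudentId; // one shared instance serves all requests

    public void track(Long studentId) {
        this.lastStudentId = studentId; // race condition under concurrent access
    }

    public Long getLastStudentId() {
        return lastStudentId; // may return another user's value
    }
}

// Thread-safe: no mutable fields; request-specific state lives only in parameters and local variables.
@Service
class StatelessStudentFormatter {

    public String describe(Long studentId) {
        String label = "student-" + studentId; // local variable, confined to the current thread
        return label;
    }
}
```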
Conclusion#
This article presented an overview of the multi-threading model of the Tomcat server and how worker threads handle incoming requests via a thread pool. It highlighted the difference between blocking and non-blocking processing, how asynchronous operations affect overall performance, and the thread configuration settings available for performance tuning.