Spring Batch Partition for Scaling & Parallel Processing

For scaling and parallel processing, Spring Batch provides several solutions: multi-threaded Step, parallel Steps, remote chunking of a Step, and partitioning a Step. In this tutorial, JavaSampleApproach will introduce partitioning a Step clearly with a sample project.

Related articles:
Spring Batch Job with Parallel Steps
How to use Spring Batch Late Binding – Step Scope & Job Scope


Spring Batch provides a solution for partitioning a Step execution remotely, or with a simple configuration for local processing.

How does it work?


The Job on the left-hand side is executed sequentially; the Master step is the partitioning step, which delegates to a number of Slave steps. Slave steps can be remote services or local threads.

To configure a partitioning step, Spring Batch provides the PartitionHandler component and the Partitioner interface.

1. PartitionHandler

The PartitionHandler component knows about the kind of remote services (RMI remoting, EJB remoting, … or local threads) and the grid size. The PartitionHandler can send StepExecution requests to the remote Steps in various formats, such as a DTO.

How to configure?

The gridSize defines the number of step executions, so we should consider the size of the TaskExecutor’s thread pool.
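As a sketch of a local configuration (the tutorial itself wires this up in batchjob.xml later; bean and step names here are illustrative), a TaskExecutorPartitionHandler runs the slave step on local threads:

```java
// Sketch: a master step partitioned onto local threads.
// Assumes Spring Batch's StepBuilderFactory is injected as stepBuilderFactory.
@Bean
public Step masterStep(Step slaveStep, Partitioner partitioner) {
    return stepBuilderFactory.get("masterStep")
            .partitioner("slaveStep", partitioner)
            .partitionHandler(partitionHandler(slaveStep))
            .build();
}

@Bean
public PartitionHandler partitionHandler(Step slaveStep) {
    TaskExecutorPartitionHandler handler = new TaskExecutorPartitionHandler();
    handler.setGridSize(5);                                 // number of step executions
    handler.setTaskExecutor(new SimpleAsyncTaskExecutor()); // local threads
    handler.setStep(slaveStep);
    return handler;
}
```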

2. Partitioner

The Partitioner interface is used to build execution contexts as input parameters for step executions.

The returned Map contains a unique name for each step execution, associated with that execution’s ExecutionContext value.
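A minimal Partitioner sketch (the class name, keys, and values are illustrative, not from the original listing):

```java
// Builds one ExecutionContext per step execution, stored under a unique name.
public class SamplePartitioner implements Partitioner {

    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        Map<String, ExecutionContext> result = new HashMap<>();
        for (int i = 0; i < gridSize; i++) {
            ExecutionContext context = new ExecutionContext();
            context.putInt("partitionNumber", i);  // input parameter for one slave step
            result.put("partition" + i, context);  // unique name per step execution
        }
        return result;
    }
}
```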

3. How to bind the input Map to Steps

The StepScope feature of Spring Batch lets us late-bind data from the PartitionHandler to each step at runtime.
See more: How to use Spring Batch Late Binding – Step Scope & Job Scope


In this tutorial, we create a Batch Job that has a single partitioned step with 5 slave steps for inserting data from 5 csv files into a MySQL database.
Spring batch Partition - overview


– Java 1.8
– Maven 3.3.9
– Spring Tool Suite – Version 3.8.1.RELEASE
– Spring Boot: 1.5.1.RELEASE
– MySQL Database 1.4

Steps to do

– Create Spring Boot project
– Create a simple model
– Create DAO class
– Create Batch Job Step

– Create Batch Job Partitioner
– Configure Partitioned Batch Job

– Create JobLaunchController
– Create 5 csv files
– Run & Check results

1. Create Spring Boot project

Create a Spring Boot project with needed dependencies:
– spring-boot-starter-batch
– spring-boot-starter-web
– mysql-connector-java

2. Create a simple model
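The original listing was not preserved, so here is a sketch of what the model might look like (field names are assumptions):

```java
// Sketch of the Customer model; field names are assumed.
public class Customer {
    private int id;
    private String firstName;
    private String lastName;

    // getters and setters omitted for brevity
}
```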

3. Create DAO class
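A sketch of the DAO, assuming a JdbcTemplate-based insert (the table and column names are assumptions matching the SQL shown later):

```java
// Sketch: inserts one Customer via JdbcTemplate.
@Repository
public class CustomerDao {

    @Autowired
    private JdbcTemplate jdbcTemplate;

    public void save(Customer customer) {
        jdbcTemplate.update(
            "INSERT INTO customer (id, first_name, last_name) VALUES (?, ?, ?)",
            customer.getId(), customer.getFirstName(), customer.getLastName());
    }
}
```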

4. Create Batch Job Step

– Create Reader.java:
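A sketch of a step-scoped reader that late-binds the file name placed into the ExecutionContext by the Partitioner (the `filename` key and the column names are assumptions):

```java
// Sketch: FlatFileItemReader bound to one partition's csv file at runtime.
@Bean
@StepScope
public FlatFileItemReader<Customer> reader(
        @Value("#{stepExecutionContext[filename]}") String filename) {
    FlatFileItemReader<Customer> reader = new FlatFileItemReader<>();
    reader.setResource(new ClassPathResource(filename));
    reader.setLineMapper(new DefaultLineMapper<Customer>() {{
        setLineTokenizer(new DelimitedLineTokenizer() {{
            setNames(new String[] {"id", "firstName", "lastName"}); // assumed columns
        }});
        setFieldSetMapper(new BeanWrapperFieldSetMapper<Customer>() {{
            setTargetType(Customer.class);
        }});
    }});
    return reader;
}
```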

– Create Writer.java:
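A sketch of a writer that delegates to the DAO from step 3 (`CustomerDao#save` is an assumed method name):

```java
// Sketch: writes each chunk of Customers through the DAO.
public class Writer implements ItemWriter<Customer> {

    @Autowired
    private CustomerDao customerDao;

    @Override
    public void write(List<? extends Customer> customers) throws Exception {
        for (Customer customer : customers) {
            customerDao.save(customer);
        }
    }
}
```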

– Create Processor.java:
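A sketch of a simple pass-through processor; any validation or transformation would go here:

```java
// Sketch: pass-through Processor.
public class Processor implements ItemProcessor<Customer, Customer> {

    @Override
    public Customer process(Customer customer) throws Exception {
        // e.g. normalize or validate fields before writing
        return customer;
    }
}
```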

5. Create Batch Job Partitioner
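The tutorial's partitioner could look like the following sketch: each of the 5 execution contexts carries the name of one csv file (the class name and `filename` key are assumptions):

```java
// Sketch: one partition per csv file.
public class CustomerPartitioner implements Partitioner {

    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        Map<String, ExecutionContext> result = new HashMap<>();
        for (int i = 1; i <= gridSize; i++) {
            ExecutionContext context = new ExecutionContext();
            // each slave step reads one of the 5 csv files
            context.putString("filename", "customer-data-" + i + ".csv");
            result.put("partition" + i, context);
        }
        return result;
    }
}
```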

6. Configure Partitioned Batch Job

– Create a batchjob.xml configuration file:
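A sketch of the partitioned job in Spring Batch XML (bean ids such as `slaveStep`, `partitioner`, and `taskExecutor` are illustrative):

```xml
<!-- Sketch: master step delegating to 5 partitions of the slave step -->
<batch:job id="partitionJob">
    <batch:step id="masterStep">
        <batch:partition step="slaveStep" partitioner="partitioner">
            <batch:handler grid-size="5" task-executor="taskExecutor"/>
        </batch:partition>
    </batch:step>
</batch:job>
```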

– Open application.properties file, configure DataSource info:
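A sketch of the DataSource settings (URL, schema name, and credentials are placeholders):

```properties
# DataSource settings -- replace with your own values
spring.datasource.url=jdbc:mysql://localhost:3306/testdb
spring.datasource.username=root
spring.datasource.password=12345
```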

– In the main class, enable batch job processing:
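A sketch of the main class (the class name is illustrative); @EnableBatchProcessing turns on Spring Batch's infrastructure:

```java
// Sketch: Spring Boot entry point with batch processing enabled.
@SpringBootApplication
@EnableBatchProcessing
public class SpringBatchPartitionApplication {

    public static void main(String[] args) {
        SpringApplication.run(SpringBatchPartitionApplication.class, args);
    }
}
```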

7. Create JobLaunchController
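A sketch of a controller that launches the job on request (it maps the /runjob path used later in the tutorial; adding a timestamp parameter makes each run's JobParameters unique so the job can be relaunched):

```java
// Sketch: REST endpoint that triggers the batch job.
@RestController
public class JobLaunchController {

    @Autowired
    private JobLauncher jobLauncher;

    @Autowired
    private Job job;

    @RequestMapping("/runjob")
    public String runJob() throws Exception {
        JobParameters params = new JobParametersBuilder()
                .addLong("time", System.currentTimeMillis()) // make each run unique
                .toJobParameters();
        jobLauncher.run(job, params);
        return "Job launched!";
    }
}
```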

8. Create 5 csv files

Create 5 csv files {customer-data-1.csv, customer-data-2.csv, customer-data-3.csv, customer-data-4.csv, customer-data-5.csv} with Customer info:
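The original sample data was not preserved; a hypothetical file could look like this (columns matching the assumed Customer model):

```
1,Jack,Smith
2,Adam,Johnson
3,Katherin,Carter
```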

9. Run & Check results

– Build & Run the project with Spring Boot App mode.
– Create the database table with SQL:
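The original statement was not preserved; a sketch, assuming column names matching the Customer model:

```sql
-- Sketch of the customer table (column names are assumptions)
CREATE TABLE customer (
    id INT PRIMARY KEY,
    first_name VARCHAR(255),
    last_name VARCHAR(255)
);
```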

– Then make a launch request: localhost:8080/runjob
– Result:

Spring batch Partition - result



By grokonez | March 3, 2017.

Last updated on June 4, 2017.


13 thoughts on “Spring Batch Partition for Scaling & Parallel Processing”

  1. If we have 100 customer-data-*.csv files and if we use grid-size = 100 then there will be 100 threads which is not logical. Hence what is the way to solve this issue i.e. process many files with limited number of threads.

    1. Hi Arpit Garg,

      The number of threads depends on your infrastructure, so you can configure it accordingly. The heart of the tutorial is how to use PartitionHandler and to understand the Master-Slave steps.
      Your problem can be solved in a few ways:
      – Use 10 threads to process 100 files, each thread processes 10 files.
      – Use 5 threads to process 100 files, each thread handles 20 files.
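A minimal, framework-free sketch of the first suggestion (class and partition names are illustrative): distribute the 100 file names round-robin into 10 buckets; each slave step would then receive one bucket through its ExecutionContext.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FileGrouping {

    // Distribute files round-robin into gridSize buckets, one bucket per thread.
    static Map<String, List<String>> partition(List<String> files, int gridSize) {
        Map<String, List<String>> buckets = new LinkedHashMap<>();
        for (int i = 0; i < gridSize; i++) {
            buckets.put("partition" + i, new ArrayList<>());
        }
        for (int i = 0; i < files.size(); i++) {
            buckets.get("partition" + (i % gridSize)).add(files.get(i));
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<String> files = new ArrayList<>();
        for (int i = 1; i <= 100; i++) {
            files.add("customer-data-" + i + ".csv");
        }
        Map<String, List<String>> buckets = partition(files, 10);
        System.out.println(buckets.size());                   // 10 buckets
        System.out.println(buckets.get("partition0").size()); // 10 files each
    }
}
```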

      1. I have to implement below scenario:
        Use 10 threads to process 100 files, each thread processes 10 files

        Can you suggest any solution for that??

  2. How to verify if “partitioner” execution(all slaves completed execution) completed before going to next step ?

    1. Hi,

      To monitor the executions of all the partitions, I suggest logging timestamps in the Partitioner and in each step.
      Then you can verify the order of the execution flows.

  3. Hi Team,
    I have implemented for similar way with Batch process in the spring boot using annotation it works as rest service for me.
    I am using SimpleAsyncTaskExecutor and reader and processor and writer to read from db and validate in process and inserting into database.
    when one job is launched if i give one more job to run it then its batch is failing.
    i am getting all parameter dynamicllay from webservice.

    Thanks in advance.

    below is the exception.

    2018-07-12 19:21:23.015 ERROR 24304 — [cTaskExecutor-2] o.s.batch.core.step.AbstractStep : Encountered an error executing step step in job itemDataQualtiyReport

    org.springframework.batch.item.ItemStreamException: Failed to initialize the reader
    at org.springframework.batch.item.support.AbstractItemCountingItemStreamItemReader.open(AbstractItemCountingItemStreamItemReader.java:149)
    at org.springframework.batch.item.database.JdbcPagingItemReader.open(JdbcPagingItemReader.java:260)
    at org.springframework.batch.item.database.JdbcPagingItemReader$$FastClassBySpringCGLIB$$42c8e250.invoke()
    at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
    at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:747)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
    at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185)
    at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:689)
    at com.unilog.cimm.reports.batch.reader.Cimm2ItemReader$$EnhancerBySpringCGLIB$$6a6d34be.open()
    at org.springframework.batch.item.support.CompositeItemStream.open(CompositeItemStream.java:103)
    at org.springframework.batch.core.step.item.ChunkMonitor.open(ChunkMonitor.java:114)
    at org.springframework.batch.item.support.CompositeItemStream.open(CompositeItemStream.java:103)
    at org.springframework.batch.core.step.tasklet.TaskletStep.open(TaskletStep.java:310)
    at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:197)
    at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:148)
    at org.springframework.batch.core.job.flow.JobFlowExecutor.executeStep(JobFlowExecutor.java:66)
    at org.springframework.batch.core.job.flow.support.state.StepState.handle(StepState.java:67)
    at org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:169)
    at org.springframework.batch.core.job.flow.support.SimpleFlow.start(SimpleFlow.java:144)
    at org.springframework.batch.core.job.flow.FlowJob.doExecute(FlowJob.java:136)
    at org.springframework.batch.core.job.AbstractJob.execute(AbstractJob.java:308)
    at org.springframework.batch.core.launch.support.SimpleJobLauncher$1.run(SimpleJobLauncher.java:141)
    at java.lang.Thread.run(Thread.java:748)
    Caused by: java.lang.IllegalStateException: Cannot open an already opened ItemReader, call close first
    at org.springframework.util.Assert.state(Assert.java:73)
    at org.springframework.batch.item.database.AbstractPagingItemReader.doOpen(AbstractPagingItemReader.java:133)
    at org.springframework.batch.item.support.AbstractItemCountingItemStreamItemReader.open(AbstractItemCountingItemStreamItemReader.java:146)
    … 23 common frames omitted

  4. Hi,

    I am using Spring Batch for reading a csv file and inserting into the db. A single file is getting inserted automatically.
    But I need to read two csv files and insert them into the database simultaneously.

    Thanks in advance.



  5. Hi ,
    I have one question regarding number of instance created for Reader,Processor,Writer.
    Is it created for each thread ?.
    Or there are basically same single instance of Reader,Processor,Writer shared by each thread with different input parameters.
    Please respond.

  6. Can you implement partitioning for reading the records from a single csv file? The scenario is that I need to implement a partitioner that reads 20 records at a time from a single csv; this file has 100 records.
    The file should always be the same.
