Java – how to use Hibernate to insert data as fast as possible
I read a file, create objects from its contents, and store them in a PostgreSQL database. The file contains 100,000 documents. I read them all from that one file, split it into documents, and finally store each one in the database.
public void readFile() {
    StringBuilder wholeDocument = new StringBuilder();
    try {
        bufferedReader = new BufferedReader(new FileReader(files));
        String line;
        int count = 0;
        while ((line = bufferedReader.readLine()) != null) {
            if (line.contains("<page>")) {
                wholeDocument.append(line);
                while ((line = bufferedReader.readLine()) != null) {
                    wholeDocument = wholeDocument.append("\n" + line);
                    if (line.contains("</page>")) {
                        System.out.println(count++);
                        addBodyToDatabase(wholeDocument.toString());
                        wholeDocument.setLength(0);
                        break;
                    }
                }
            }
        }
        wikiParser.commit();
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        try {
            bufferedReader.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

public void addBodyToDatabase(String wholeContent) {
    Page page = new Page(new Timestamp(System.currentTimeMillis()), wholeContent);
    database.addPageToDatabase(page);
}

public static int counter = 1;

public void addPageToDatabase(Page page) {
    session.save(page);
    if (counter % 3000 == 0) {
        commit();
    }
    counter++;
}
Solution
First, you should apply a fork/join approach here.
The main task parses the file and sends batches of at most 100 items to an ExecutorService. The ExecutorService should have as many worker threads as there are available database connections. If you have 4 CPU cores, let's say the database can handle 8 concurrent connections without too much context switching.
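The batching step described above could be sketched roughly as follows. This is a minimal illustration using only the JDK, not the asker's actual code: the class `BatchDispatchSketch`, the `dispatch` helper, and the `batchWriter` callback are hypothetical names, and the writer here only counts pages where a real one would open a session and save each entity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

final class BatchDispatchSketch {

    static final int BATCH_SIZE = 100;    // "batches of at most 100 items"
    static final int WORKER_THREADS = 8;  // assumed equal to the connection pool size

    /** Groups items into batches and submits each batch to the worker pool. */
    static <T> List<Future<?>> dispatch(Iterable<T> items, int batchSize,
                                        ExecutorService workers,
                                        Consumer<List<T>> batchWriter) {
        List<Future<?>> futures = new ArrayList<>();
        List<T> batch = new ArrayList<>(batchSize);
        for (T item : items) {
            batch.add(item);
            if (batch.size() == batchSize) {
                futures.add(submit(workers, batch, batchWriter));
                batch = new ArrayList<>(batchSize); // start a fresh batch
            }
        }
        if (!batch.isEmpty()) {
            futures.add(submit(workers, batch, batchWriter)); // trailing partial batch
        }
        return futures;
    }

    private static <T> Future<?> submit(ExecutorService workers, List<T> batch,
                                        Consumer<List<T>> batchWriter) {
        return workers.submit(() -> batchWriter.accept(batch));
    }

    /** Runs the sketch with fake pages; returns {batches submitted, pages written}. */
    static int[] demo(int pageCount) {
        List<String> pages = new ArrayList<>();
        for (int i = 0; i < pageCount; i++) {
            pages.add("<page>" + i + "</page>");
        }
        AtomicInteger written = new AtomicInteger();
        ExecutorService workers = Executors.newFixedThreadPool(WORKER_THREADS);
        try {
            // Stand-in writer: a real one would persist the batch in one transaction.
            List<Future<?>> futures =
                    dispatch(pages, BATCH_SIZE, workers, batch -> written.addAndGet(batch.size()));
            for (Future<?> future : futures) {
                future.get(); // wait for the batch and surface any worker failure
            }
            return new int[] {futures.size(), written.get()};
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("batch insert failed", e);
        } finally {
            workers.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] result = demo(250);
        System.out.println(result[0] + " batches, " + result[1] + " pages written");
        // prints "3 batches, 250 pages written"
    }
}
```

Each worker would normally use its own session and transaction per batch, since a Hibernate Session is not meant to be shared across threads.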
Then, you should configure a connection-pooling DataSource with its minimum size equal to its maximum size, both set to 8. Try HikariCP or ViburDBCP for connection pooling.
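As a rough sketch of such a fixed-size pool, the settings could be collected in a `Properties` object. The keys `jdbcUrl`, `minimumIdle`, and `maximumPoolSize` are HikariCP property names, and such a `Properties` object can be handed to HikariCP's `HikariConfig(Properties)` constructor. The class name, the helper, and the JDBC URL below are made up for illustration.

```java
import java.util.Properties;

final class PoolSettingsSketch {

    /** Builds settings for a pool whose minimum and maximum size are identical. */
    static Properties fixedSizePool(String jdbcUrl, int connections) {
        Properties props = new Properties();
        props.setProperty("jdbcUrl", jdbcUrl);
        props.setProperty("minimumIdle", String.valueOf(connections));
        props.setProperty("maximumPoolSize", String.valueOf(connections));
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical database URL; 8 connections follows the "4 cores" estimate above.
        Properties props = fixedSizePool("jdbc:postgresql://localhost:5432/wiki", 8);
        props.list(System.out);
    }
}
```

Keeping the minimum and maximum equal means the pool never has to open connections in the middle of the import.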
Then you need to configure JDBC batching. If you use MySQL, the IDENTITY generator will disable batching. If you are using a database that supports sequences, make sure you also use the enhanced identifier generators (they are the default option in Hibernate 5.x).
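A minimal sketch of the batching settings, again as plain `Properties`: `hibernate.jdbc.batch_size`, `hibernate.order_inserts`, and `hibernate.order_updates` are Hibernate configuration keys, while the class name, the helper, and the batch size of 100 (chosen here only to match the batches sent to the workers) are assumptions to tune for your own setup.

```java
import java.util.Properties;

final class BatchingSettingsSketch {

    /** Builds Hibernate settings that enable JDBC batching for inserts and updates. */
    static Properties jdbcBatching(int batchSize) {
        Properties props = new Properties();
        // Number of statements Hibernate groups into one JDBC batch.
        props.setProperty("hibernate.jdbc.batch_size", String.valueOf(batchSize));
        // Ordering statements by entity type lets more of them share a batch.
        props.setProperty("hibernate.order_inserts", "true");
        props.setProperty("hibernate.order_updates", "true");
        return props;
    }

    public static void main(String[] args) {
        // These could be merged into the Hibernate configuration or persistence-unit properties.
        jdbcBatching(100).list(System.out);
    }
}
```

On PostgreSQL the entity identifier would then typically be mapped with a sequence-based generator rather than IDENTITY, so that batching stays enabled.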
This way, the entity insertion process is parallelized and decoupled from the main parsing thread. The main thread should wait for the ExecutorService to finish processing all tasks before shutting down.
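One way the final wait might look, following the shutdown pattern from the `ExecutorService` documentation. The class `AwaitWorkersSketch`, the helper `shutdownAndAwait`, and the one-minute timeout are illustrative choices, and the tasks here only increment a counter in place of real inserts.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class AwaitWorkersSketch {

    /**
     * Stops accepting new tasks, then blocks until the queued tasks finish
     * or the timeout expires. Returns true only if everything completed.
     */
    static boolean shutdownAndAwait(ExecutorService workers, long timeout, TimeUnit unit) {
        workers.shutdown();
        try {
            if (workers.awaitTermination(timeout, unit)) {
                return true;
            }
            workers.shutdownNow(); // timed out: cancel whatever is still running
            return false;
        } catch (InterruptedException e) {
            workers.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    /** Submits dummy tasks and returns how many completed, or -1 on timeout. */
    static int demo(int tasks) {
        AtomicInteger done = new AtomicInteger();
        ExecutorService workers = Executors.newFixedThreadPool(8);
        for (int i = 0; i < tasks; i++) {
            workers.execute(done::incrementAndGet); // stand-in for one batch insert
        }
        return shutdownAndAwait(workers, 1, TimeUnit.MINUTES) ? done.get() : -1;
    }

    public static void main(String[] args) {
        System.out.println("completed tasks: " + demo(1000));
    }
}
```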