【Posted】: 2019-04-24 19:32:07
【Question】:
For two days I have been trying to store an ArrayList of about 6 million entries in my Postgres database using Spring-Data-JPA. The whole thing works, but it is very slow: one run takes about 27 minutes. I have experimented with the batch size, but without much success. I also noticed that the larger the table gets, the longer each save takes. Is there any way to speed this up? I previously did the same thing with SQLite, where the same volume took only about 15 seconds.
My entity
@Data
@Entity
@Table(name = "commodity_prices")
public class CommodityPrice {

    @Id
    @Column(name = "id")
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    private long id;

    @Column(name = "station_id")
    private int station_id;

    @Column(name = "commodity_id")
    private int commodity_id;

    @Column(name = "supply")
    private long supply;

    @Column(name = "buy_price")
    private int buy_price;

    @Column(name = "sell_price")
    private int sell_price;

    @Column(name = "demand")
    private long demand;

    @Column(name = "collected_at")
    private long collected_at;

    // Required by JPA; Lombok's @Data does not generate a no-arg constructor
    // once an explicit constructor is declared
    protected CommodityPrice() {
    }

    public CommodityPrice( int station_id, int commodity_id, long supply, int buy_price, int sell_price, long demand,
            long collected_at ) {
        this.station_id = station_id;
        this.commodity_id = commodity_id;
        this.supply = supply;
        this.buy_price = buy_price;
        this.sell_price = sell_price;
        this.demand = demand;
        this.collected_at = collected_at;
    }
}
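One frequent cause of slow bulk inserts with `GenerationType.SEQUENCE` is the id fetch: depending on the Hibernate version and generator settings, the unnamed default generator can round-trip to the database sequence once per row, which serializes otherwise-batched inserts. Declaring an explicit generator with a larger `allocationSize` lets Hibernate hand out ids from a cached block. A sketch only (the sequence name `commodity_prices_seq` is an assumption, not from the original; the database sequence's `INCREMENT BY` must match `allocationSize`):

```java
@Id
@Column(name = "id")
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "commodity_prices_seq")
@SequenceGenerator(name = "commodity_prices_seq",
        sequenceName = "commodity_prices_seq", // assumed name; must exist with INCREMENT BY 1000
        allocationSize = 1000)                 // one sequence round-trip per 1000 ids
private long id;
```

With this mapping, inserting 1000 rows costs one `nextval` call instead of 1000.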
My insert class
@Slf4j
@Component
public class CommodityPriceHandler {

    @Autowired
    private CommodityPriceRepository commodityPriceRepository;

    @Autowired
    private EntityManager entityManager;

    public void insertIntoDB() {
        int lineCount = 0;
        List<CommodityPrice> commodityPrices = new ArrayList<>();
        StopWatch stopWatch = new StopWatch();
        stopWatch.start();
        // try-with-resources so the file handle is closed even on failure
        try (Reader reader = new FileReader( DOWNLOAD_SAVE_PATH + FILE_NAME_COMMODITY_PRICES )) {
            Iterable<CSVRecord> records = CSVFormat.EXCEL.withFirstRecordAsHeader().parse( reader );
            for ( CSVRecord record : records ) {
                int station_id = Integer.parseInt( record.get( "station_id" ) );
                int commodity_id = Integer.parseInt( record.get( "commodity_id" ) );
                long supply = Long.parseLong( record.get( "supply" ) );
                int buy_price = Integer.parseInt( record.get( "buy_price" ) );
                int sell_price = Integer.parseInt( record.get( "sell_price" ) );
                long demand = Long.parseLong( record.get( "demand" ) );
                long collected_at = Long.parseLong( record.get( "collected_at" ) );

                CommodityPrice commodityPrice = new CommodityPrice( station_id, commodity_id, supply,
                        buy_price, sell_price, demand, collected_at );
                commodityPrices.add( commodityPrice );

                // Save and clear every 1000 entities so the persistence context stays small
                if ( commodityPrices.size() == 1000 ) {
                    commodityPriceRepository.saveAll( commodityPrices );
                    commodityPriceRepository.flush();
                    entityManager.clear();
                    commodityPrices.clear();
                    System.out.println( lineCount );
                }
                lineCount++;
            }
        }
        catch ( IOException e ) {
            log.error( e.getLocalizedMessage() );
        }
        // Save the remaining partial batch
        commodityPriceRepository.saveAll( commodityPrices );
        stopWatch.stop();
        log.info( "Successfully inserted " + lineCount + " lines in " + stopWatch.getTotalTimeSeconds() + " seconds." );
    }
}
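The manual "collect 1000, save, clear" pattern in the loop is easy to get subtly wrong (the trailing partial batch must be flushed exactly once). The chunking logic can be factored into a small JPA-independent helper; this is a plain-Java sketch with names of my own choosing, not from the original code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Collects items and hands them to a flush action in fixed-size batches.
public class BatchBuffer<T> {
    private final int batchSize;
    private final Consumer<List<T>> flushAction;
    private final List<T> buffer = new ArrayList<>();

    public BatchBuffer(int batchSize, Consumer<List<T>> flushAction) {
        this.batchSize = batchSize;
        this.flushAction = flushAction;
    }

    // Add one item; flushes automatically when the buffer reaches batchSize.
    public void add(T item) {
        buffer.add(item);
        if (buffer.size() == batchSize) {
            flush();
        }
    }

    // Flush whatever is buffered; call once after the loop for the final partial batch.
    public void flush() {
        if (!buffer.isEmpty()) {
            flushAction.accept(new ArrayList<>(buffer));
            buffer.clear();
        }
    }
}
```

In the insert method above, the flush action would be the `saveAll` + `flush` + `entityManager.clear()` sequence, and the loop body shrinks to a single `add` call.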
My application.properties
# HIBERNATE
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.PostgreSQLDialect
spring.jpa.properties.hibernate.jdbc.lob.non_contextual_creation=true
spring.jpa.hibernate.ddl-auto=update
spring.jpa.properties.hibernate.jdbc.batch_size=1000
spring.jpa.properties.hibernate.order_inserts=true
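On the driver side, the PostgreSQL JDBC driver sends batched statements one at a time unless `reWriteBatchedInserts` is enabled, which rewrites a JDBC batch into multi-row `INSERT` statements. A sketch of the relevant settings (host, port, and database name are placeholders, not from the original):

```properties
# Rewrite JDBC batches into multi-row INSERTs on the PostgreSQL driver side
spring.datasource.url=jdbc:postgresql://localhost:5432/mydb?reWriteBatchedInserts=true
# Batch entities with a version column as well (harmless here, useful in general)
spring.jpa.properties.hibernate.jdbc.batch_versioned_data=true
```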
【Comments】:
-
If you have already set the Hibernate batch size in application.properties, you don't need the commodityPrices.size() == 1000 check. When you call saveAll(), the inserts are batched automatically according to the configured batch size.
Tags: java spring postgresql spring-data-jpa bulkinsert