【Posted】: 2015-10-23 12:24:31
【Question】:
I have a CSV file that I can't parse correctly. I'm using the opencsv library. Here is what my data looks like and what I'm trying to achieve.
RPT_PE,CLASS,RPT_MKT,PROV_CTRCT,CENTER_NM,GK_TY,MBR_NM,MBR_PID
"20150801","NULL","33612","00083249P PCP602","JOE SMITH ARNP","NULL","FRANK, LUCAS E","50004655200"
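The problem is that the member name ("FRANK, LUCAS E") is split into two columns, when it should be a single column. Again, I'm using opencsv with a comma as the separator. Is there any way to ignore commas inside double quotes?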
    public void loadCSV(String csvFile, String tableName,
            boolean truncateBeforeLoad) throws Exception {
        CSVReader csvReader = null;
        if (null == this.connection) {
            throw new Exception("Not a valid connection.");
        }
        try {
            // Only the separator is passed here; quoteChar is not, so the
            // reader falls back to its default quote character.
            csvReader = new CSVReader(new FileReader(csvFile), this.seprator);
        } catch (Exception e) {
            e.printStackTrace();
            throw new Exception("Error occurred while executing file. "
                    + e.getMessage());
        }
        String[] headerRow = csvReader.readNext();
        if (null == headerRow) {
            throw new FileNotFoundException(
                    "No columns defined in given CSV file. "
                            + "Please check the CSV file format.");
        }
        // One "?" placeholder per header column, without the trailing comma
        String questionmarks = StringUtils.repeat("?,", headerRow.length);
        questionmarks = questionmarks.substring(0, questionmarks.length() - 1);
        String query = SQL_INSERT.replaceFirst(TABLE_REGEX, tableName);
        System.out.println("Base Query: " + query);
        String headerRowMod = Arrays.toString(headerRow).replaceAll(", ]", "]");
        String[] strArray = headerRowMod.split(",");
        query = query
                .replaceFirst(KEYS_REGEX, StringUtils.join(strArray, ","));
        System.out.println("Add Headers: " + query);
        query = query.replaceFirst(VALUES_REGEX, questionmarks);
        System.out.println("Add questionmarks: " + query);
        String[] nextLine;
        Connection con = null;
        PreparedStatement ps = null;
        try {
            con = this.connection;
            con.setAutoCommit(false);
            ps = con.prepareStatement(query);
            if (truncateBeforeLoad) {
                // delete data from table before loading csv
                con.createStatement().execute("DELETE FROM " + tableName);
            }
            final int batchSize = 1000;
            int count = 0;
            Date date = null;
            while ((nextLine = csvReader.readNext()) != null) {
                System.out.println("Next Line: " + Arrays.toString(nextLine));
                if (null != nextLine) {
                    int index = 1;
                    for (String string : nextLine) {
                        // Bind values that parse as dates as SQL dates,
                        // everything else as strings
                        date = DateUtil.convertToDate(string);
                        if (null != date) {
                            ps.setDate(index++, new java.sql.Date(date
                                    .getTime()));
                        } else {
                            ps.setString(index++, string);
                        }
                    }
                    ps.addBatch();
                }
                if (++count % batchSize == 0) {
                    ps.executeBatch();
                }
            }
            ps.executeBatch(); // insert remaining records
            con.commit();
        } catch (SQLException | IOException e) {
            con.rollback();
            e.printStackTrace();
            throw new Exception(
                    "Error occurred while loading data from file to database. "
                            + e.getMessage());
        } finally {
            if (null != ps) {
                ps.close();
            }
            if (null != con) {
                con.close();
            }
            csvReader.close();
        }
    }

    public char getSeprator() {
        return seprator;
    }

    public void setSeprator(char seprator) {
        this.seprator = seprator;
    }

    public char getQuoteChar() {
        return quoteChar;
    }

    public void setQuoteChar(char quoteChar) {
        this.quoteChar = quoteChar;
    }
}
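【Comments】:
-
According to the CSVReader documentation, this case should be handled. Please post the relevant part of your code.
-
See my code sample.
-
I wrote a simple program and it seems to work for me. Instead of an extra column, I got the full name with the comma in it, which is what's expected. You could try it; it may offer some clues.
Tags: java csv delimiter opencsv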
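For anyone who wants to see what "ignoring commas inside double quotes" amounts to, here is a minimal sketch in plain Java. It is not opencsv's implementation and does not use opencsv at all; the class name QuotedCsvSplit and the method splitLine are made up for illustration. It walks the line character by character and only treats the separator as a field boundary while it is outside a quoted section.

```java
import java.util.ArrayList;
import java.util.List;

public class QuotedCsvSplit {

    // Hypothetical helper (not part of opencsv): splits one CSV line,
    // treating separators that appear inside quotes as ordinary data.
    public static List<String> splitLine(String line, char separator, char quoteChar) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == quoteChar) {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == quoteChar) {
                    // A doubled quote inside a quoted field is a literal quote
                    current.append(quoteChar);
                    i++;
                } else {
                    // Opening or closing quote: toggle state, do not emit it
                    inQuotes = !inQuotes;
                }
            } else if (c == separator && !inQuotes) {
                // Separator outside quotes ends the current field
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString()); // last field has no trailing separator
        return fields;
    }

    public static void main(String[] args) {
        // The data row from the question
        String row = "\"20150801\",\"NULL\",\"33612\",\"00083249P PCP602\","
                + "\"JOE SMITH ARNP\",\"NULL\",\"FRANK, LUCAS E\",\"50004655200\"";
        List<String> fields = splitLine(row, ',', '"');
        System.out.println(fields.size());  // prints 8
        System.out.println(fields.get(6));  // prints FRANK, LUCAS E
    }
}
```

If the same row comes back as nine fields from your own loader, it may be worth checking what quote character the reader is actually configured with, and whether the quotes in the file are plain ASCII double quotes.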