【Question title】: Parsing a .txt file and writing to Excel in Python
【Posted】: 2020-12-08 10:28:13
【Question description】:

I have a .txt file like the sample shown below

 Name:      XYZ

Date of Service:    12/27/2018

Speaker ID:         10101
Visit Start:        06/07/2018
Visit End:          06/18/2018
Recipient:      
CHIEF COMPLAINT:
Liver discomfort.

I want to parse the .txt file and write it to Excel like this

Name  Date of Service  Speaker ID ....
XYZ   12/27/2018       10101

I wrote the following code

import xlwt

# List of input text files to convert
textfiles = ['Input.txt']

for textfile in textfiles:
    # Split every line of the file on ':'
    with open(textfile, 'r') as f:
        row_list = []
        for row in f:
            row_list.append(row.split(':'))
    # zip(*...) transposes the rows (and truncates to the shortest row)
    column_list = zip(*row_list)
    workbook = xlwt.Workbook()
    worksheet = workbook.add_sheet('Sheet1')
    i = 0
    for column in column_list:
        for item in range(len(column)):
            worksheet.write(item, i, column[item])
        i += 1
    workbook.save(textfile + '.xls')

But the result I get is written into a single column in Excel

Name
XYZ
Speaker ID
10101
...

CHIEF COMPLAINT
Liver discomfort

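I also tried this with Pandas and it gave me the same output. Can someone help me write the headers in one row, with the corresponding data in the columns below them?

Thanks, Mira.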

Full input file

Patient Name:       XYZ

Date of Service:    12/27/2018

Speaker ID:     10101
Visit Start:        06/07/2018
Visit End:      06/18/2018
Recipient:      
CHIEF COMPLAINT:
Chest discomfort.

HISTORY OF PRESENT ILLNESS:
The patient is a 64-year-old Caucasian female with a past medical history which is remarkable for severe COPD as well as severe coronary artery disease.  She has a complex cardiac history including a coronary anomaly and coronary fistula with coronary bypass surgery with a left internal mammary artery graft to left anterior descending coronary artery in 2003.  Subsequently in 2007, for symptoms of a substernal chest heaviness, she was evaluated and wound up having a right coronary stent placed.  Her most recent angiography in 2012 demonstrated a patent left internal mammary artery to the LAD and patent other stents including the right coronary stent noted above.  She had sternal wires removed in 2012 because of sharp substernal chest discomfort.  She is a former smoker who quit many years ago prior to her coronary bypass graft surgery, but unfortunately has gone on to still have significant COPD.  Over the past several days to few weeks, she has had increasing amounts of shortness of breath, increasing amounts of substernal heaviness, mild to moderate in nature, coming on with exertion and going away with rest, not dissimilar from presentations in the past with coronary disease.  It should also note that she brought up the fact that she has "passed many stress tests with flying colors" and then went on to have coronary blockages.  She has been hospitalized now with an apparent exacerbation of COPD and this has not been completely cleared.  She denies any recent fevers or chills.  There has been no nausea, no vomiting, no abdominal pain, no focal neurologic complaints, no abnormal bleeding.

REVIEW OF SYSTEMS:
Complete review of systems in detail is as noted above, pertinent positives and negatives are as noted above, all other systems were reviewed and were negative.

PAST MEDICAL HISTORY:
Also remarkable for lifelong asthma with COPD, hypothyroidism, hypertension, and gastroesophageal reflux disease.

ALLERGIES:
Include,
1.  SULFA.
2.  PENICILLIN.
3.  CEPHALOSPORINS.

SOCIAL HISTORY:
As noted above.

FAMILY HISTORY:
Positive for "lots of heart disease."

PHYSICAL EXAMINATION:
VITAL SIGNS:  Remarkable most recently for a blood pressure of 128/60, a respiratory rate of 16, a pulse of 81, and a temperature of 36.7 degrees Centigrade.  Her saturation currently is 96% on room air.
HEENT:  Negative.
NECK:  Her jugular venous pressure is not elevated.  Her carotids are 2+ bilateral.  No bruits are heard.
LUNGS:  Have diffuse wheezing throughout both end-expiratory as well as inspiratory.  A few scattered rhonchi are noted bilaterally.  Mild kyphosis of her spine is noted.
CARDIOVASCULAR:  Cardiac auscultation demonstrates regular rate and rhythm, normal S1, normal S2.  No murmurs, gallops, or rubs are appreciated.
GASTROINTESTINAL:  Soft, nontender, with normal active bowel sounds.
EXTREMITIES:  Demonstrate no cyanosis, clubbing or edema.
NEUROLOGIC:  She is nonfocal and is able to move all extremities.
PSYCHIATRIC:  She is awake, alert and oriented x3 and cooperative.
SKIN:  Warm and dry.

DIAGNOSTIC DATA:
Her electrocardiogram shows sinus rhythm, incomplete right bundle-branch block, nonspecific anterior T-wave changes, and this is largely unchanged compared to previous tracings.

Her last echocardiogram, which was in 2006, demonstrated normal LV systolic function and her right atrium and right ventricle appeared at that time to be normal.

Her cardiac catheterization in 01/2012 showed the aforementioned patent LIMA bypass to the LAD.  There was a patent LAD stent.  There was a patent stent to the right posterior descending coronary artery.  There was an anomalous left coronary artery arising from the right coronary cusp, normal LV systolic function and mild mitral regurgitation.

Her chest x-ray directly reviewed by me, as was her previous EKG, showed clearcut changes associated with COPD, hyperinflation, no infiltrates.

IMPRESSION:
The patient had symptoms of chest tightness and heaviness when she presented with her catheterization for 2012 and at that time, all of her major coronary arteries were patent.  Her course here has been predominantly pulmonary.  Her laboratories include a BNP which is only 40.  Her one cardiac troponin was 0.02.  Her symptoms are unlikely to be coronary.  She had a nuclear stress test, which by her report a year ago this summer, appeared to be "normal."  We will check her echocardiogram for any interval change in LV systolic function, in particular for regional wall motion abnormalities.  If there are no regional wall motion abnormalities, I think she will be able to be discharged from a cardiac perspective, although her lungs are clearly still with significant finding of wheezing.  With regard to her coronary status, she is on atorvastatin 20 per day and this should be continued.  Blood pressure on steroid taper will need to be watched closely.

PLAN:
No Symptoms

According to the file, PLAN will be the last header.

【Discussion】:

    Tags: python excel pandas parsing header


    【Solution 1】:

    Try this:

    import pandas as pd

    # Read the whole file into a single string
    text = open('yourtextfile.txt').read()
    # OR, for a quick test, use an inline string instead:
    text = '''Name:      XYZ
    Date of Service:    12/27/2018
    Speaker ID:         10101
    Visit Start:        06/07/2018
    Visit End:          06/18/2018
    Recipient:
    CHIEF COMPLAINT:
    Liver discomfort.'''

    # Strip each line and drop the empty ones
    rows = [x.strip() for x in text.split('\n') if x.strip()]
    # Merge everything from the CHIEF COMPLAINT line onwards into one row
    for n in range(len(rows)):
        if 'CHIEF COMPLAINT' in rows[n]:
            rows[n] = ''.join(rows[n:])
            del rows[n+1:]
            break
    print(rows)
    # Output (wrapped here for readability):
    # ['Name:      XYZ',
    #  'Date of Service:    12/27/2018',
    #  'Speaker ID:         10101',
    #  'Visit Start:        06/07/2018',
    #  'Visit End:          06/18/2018',
    #  'Recipient:',
    #  'CHIEF COMPLAINT:Liver discomfort.']

    # Build a one-row DataFrame: the text before the colon is the column name
    df = pd.DataFrame([{row.split(':')[0].strip(): row.split(':')[1].strip() for row in rows}])
    print(df)
    #   Name Date of Service Speaker ID Visit Start   Visit End Recipient    CHIEF COMPLAINT
    # 0  XYZ      12/27/2018      10101  06/07/2018  06/18/2018            Liver discomfort.
    df.to_csv('output.csv', index=False)
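
A note on the dictionary comprehension above: `row.split(':')[1]` assumes every row contains a colon. A row such as `Liver discomfort.` splits into a one-element list, so indexing `[1]` raises `IndexError`. The sketch below shows one possible way to avoid that with `str.partition`; the helper name `split_key_value` is made up for illustration and is not part of the original answer.

```python
def split_key_value(row):
    """Split a row on its first colon; the value is '' when there is no colon."""
    key, _, value = row.partition(":")
    return key.strip(), value.strip()


# A row without a colon makes split(':')[1] fail:
try:
    "Liver discomfort.".split(":")[1]
except IndexError as exc:
    error_name = type(exc).__name__  # "IndexError"

# partition() always returns three parts, so it never raises here
rows = ["Name:      XYZ", "CHIEF COMPLAINT:", "Liver discomfort."]
pairs = [split_key_value(r) for r in rows]
print(error_name)
print(pairs)
```

With `partition`, a colon-free row ends up as a key with an empty value rather than an exception, so it still has to be merged into the preceding header if it is really a continuation line.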
    

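    【Comments】:

    • Thanks for the reply. If I set text to a string as you showed, it works. But my input is a .txt file, and I need to read the file and iterate over it. When I use 'text = ("C:/Users/Gayathri/Prathap/Test/1.txt")', I get the same result in the output Excel file as before.
    • Replace that line with text = open('yourtextfile.txt').read(). If this helps, please mark the answer as accepted.
    • One more question: you check for CHIEF COMPLAINT in rows[n], but I have many more lines after that one. I only provided a sample file. With the full file I get IndexError: list index out of range when the last line of code is executed: "df = pd.DataFrame([{row.split(':')[0].strip(): row.split(':')[1].strip() for row in rows}])". I tried taking the maximum length over the whole file and assigning it, but I still get the same error. Can you help?
    • Based on the sample input, I assumed that CHIEF COMPLAINT: is the last key in the file and that Liver discomfort. is its value, with no other lines after it. All the other lines have the key and value on the same line; if they are separated by a newline, the logic in the code needs to change. If you have more lines, can you share a full text file?
    • Hi Idar, thanks for all your replies. As you suggested, I have edited the post to add the input file.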
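
For the full input file, where a heading such as CHIEF COMPLAINT: sits on its own line and its value follows on later lines, the logic could be changed roughly as sketched below. This is only a sketch under assumptions, not the answerer's method: it treats a line as a new key when it starts with a short run of letters and spaces followed by a colon (the pattern `KEY_RE` and the function `parse_report` are hypothetical names), and it appends every other non-empty line to the most recent key. A sentence in the body that happens to start with a few words and a colon would be misread as a heading.

```python
import re

# Assumed heuristic: a key is 1-41 letters/spaces at the start of a line,
# directly followed by a colon (e.g. "Patient Name:", "VITAL SIGNS:").
KEY_RE = re.compile(r"^([A-Za-z][A-Za-z ]{0,40}):\s*(.*)$")


def parse_report(text):
    """Return a dict mapping each heading to its (possibly multi-line) value."""
    record = {}
    current = None
    for raw in text.splitlines():
        line = " ".join(raw.split())  # trim and collapse runs of whitespace
        if not line:
            continue  # blank lines do not end a section
        match = KEY_RE.match(line)
        if match:
            current = match.group(1).strip()
            record[current] = match.group(2).strip()
        elif current is not None:
            # Continuation line: append to the most recent heading's value
            record[current] = (record[current] + " " + line).strip()
    return record


sample = """Patient Name:       XYZ

Date of Service:    12/27/2018
Recipient:
CHIEF COMPLAINT:
Chest discomfort.

ALLERGIES:
Include,
1.  SULFA.
2.  PENICILLIN.

PLAN:
No Symptoms
"""

record = parse_report(sample)
for key, value in record.items():
    print(f"{key!r}: {value!r}")

# With pandas installed, the dict can become one header row plus one data row:
#   import pandas as pd
#   pd.DataFrame([record]).to_excel("output.xlsx", index=False)
```

Each key becomes a column header and each value the cell below it, which is the layout asked for in the question. Sub-headings such as VITAL SIGNS: would become their own columns under this heuristic, leaving PHYSICAL EXAMINATION empty. Writing .xlsx with `to_excel` typically needs an Excel engine such as openpyxl; `to_csv` works without one.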