【Question title】: Convert Text to Excel
【Posted】: 2020-12-11 21:09:04
【Question description】:

I have the following text file as input:

Patient Name:       XXX,A

Date of Service:    12/12/2018

Speaker ID:     10531
Visit Start:    06/07/2018
Visit End:      06/18/2018
Recipient:      
REQUESTING PHYSICIAN:
Mr.XXX

REASON FOR CONSULTATION:
Acute asthma.

HISTORY OF PRESENT ILLNESS:
The patient is a 64-year-old female who is well known to our practice.  She has not been feeling well over the last 3 weeks and has been complaining of increasing shortness of breath, cough, wheezing, and chest tightness.  She was prescribed systemic steroids and Zithromax.  Her respiratory symptoms persisted; and subsequently, she went to Capital Health Emergency Room.  She presented to the office again yesterday with increasing shortness of breath, chest tightness, wheezing, and cough productive of thick sputum.  She also noted some low-grade temperature.

PAST MEDICAL HISTORY:
Remarkable for bronchial asthma, peptic ulcer disease, hyperlipidemia, coronary artery disease with anomalous coronary artery, status post tonsillectomy, appendectomy, sinus surgery, and status post rotator cuff surgery.

HOME MEDICATIONS:
Include;
1.  Armodafinil.
2.  Atorvastatin.
3.  Bisoprolol.
4.  Symbicort.
5.  Prolia.
6.  Nexium.
7.  Gabapentin.
8.  Synthroid.
9.  Linzess_____.
10.  Montelukast.
11.  Domperidone.
12.  Tramadol.

ALLERGIES:
1.  CEPHALOSPORIN.
2.  PENICILLIN.
3.  SULFA.

SOCIAL HISTORY:
She is a lifelong nonsmoker.

PHYSICAL EXAMINATION:
GENERAL:  Shows a pleasant 64-year-old female.
VITAL SIGNS:  Blood pressure 108/56, pulse of 70, respiratory rate is 26, and pulse oximetry is 94% on room air.  She is afebrile.
HEENT:  Conjunctivae are pink.  Oral cavity is clear.
CHEST:  Shows increased AP diameter and decreased breath sounds with diffuse inspiratory and expiratory wheeze and prolonged expiratory phase.
CARDIOVASCULAR:  Regular rate and rhythm.
ABDOMEN:  Soft.
EXTREMITIES:  Does not show any edema.

LABORATORY DATA:
Her INR is 1.1.  Chemistry; sodium 139, potassium 3.3, chloride 106, CO2 of 25, BUN is 10, creatinine 0.74, and glucose is 110.  BNP is 40.  White count on admission 16,800; hemoglobin 12.5; and neutrophils 88%.  Two sets of blood cultures are negative.  CT scan of the chest is obtained, which is consistent with tree-in-bud opacities of the lung involving bilateral lower lobes with patchy infiltrate involving the right upper lobe.  There is mild bilateral bronchial wall thickening.

IMPRESSION:
1.  Acute asthma.
2.  Community acquired pneumonia.
3.  Probable allergic bronchopulmonary aspergillosis.
   

I want to convert the text file to an Excel file like this:

 Patient Name   Date of Service Speaker ID  Visit Start Visit End   Recipient ..... IMPRESSION:
                        
     XYZ        2/27/2018      10101       06-07-2018   06/18/2018   NA    .......   1.  Acute asthma.
                                                                                     2.  Community 
                                                                                         acquired 
                                                                                         pneumonia.
                                                                                     3.  Probable 
                                                                                          allergic     

I wrote the following code:

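from collections import OrderedDict
from csv import DictWriter

with open('1.txt') as infile:
    registrations = []          # one dict per record
    fields = OrderedDict()      # every key seen, in order, used as the CSV header
    d = {}
    for line in infile:
        line = line.strip()
        if line:
            key, value = [s.strip() for s in line.split(':', 1)]
            d[key] = value
            fields[key] = None
        else:
            # a blank line ends the current record
            if d:
                registrations.append(d)
                d = {}
    else:
        if d:    # handle EOF
            registrations.append(d)

    # newline='' is what the csv module expects for files it writes to
    with open('registrations.csv', 'w', newline='') as outfile:
        writer = DictWriter(outfile, fieldnames=fields)
        writer.writeheader()
        writer.writerows(registrations)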

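I got an error:

ValueError: not enough values to unpack (expected 2, got 1)

I'm not sure what the error is telling me. I searched the site but couldn't find a solution. I tried editing the file to remove the spaces and ran the code above, and it worked. But in the real scenario there will be hundreds of thousands of files, so manually editing each one to remove all the spaces is not feasible.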

【Question discussion】:

  • What is OrderedDict()?
  • OrderedDict() will hold all the header values.

Tags: python excel parsing text split


【Solution 1】:

Your specific error probably comes from

key, value = [s.strip() for s in line.split(':', 1)]

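Some of your lines contain no colon, so line.split(':', 1) returns a list with only one element, and a single element cannot be unpacked into the pair key, value.

For example: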

line = 'this is some text with a : colon'
key, value = [s.strip() for s in line.split(':', 1)]
print(key)
print(value)

Returns:

this is some text with a

colon

But you will get your error with

line = 'this is some text without a colon'
key, value = [s.strip() for s in line.split(':', 1)]
print(key)
print(value)
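
One way to keep colon-less lines from raising the error is str.partition, which always returns three parts, combined with treating a line that has no colon as a continuation of the most recent field. The sketch below is only an illustration under assumptions that are not in the question: the helper name parse_record is made up, each file is assumed to hold a single patient record, and any line containing a colon is assumed to start a new field (so sub-headings such as VITAL SIGNS: would become their own columns).

```python
import csv
import io


def parse_record(lines):
    """Hypothetical helper: build one dict from 'Key: value' lines.

    Assumption: a line without a colon continues the most recent key.
    """
    record = {}
    current_key = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines only separate sections in the sample
        key, sep, value = line.partition(':')
        if sep:
            # A colon was found: start a new field.
            current_key = key.strip()
            record[current_key] = value.strip()
        elif current_key is not None:
            # No colon: append the line to the current field.
            previous = record[current_key]
            record[current_key] = f"{previous}\n{line}" if previous else line
        # Colon-less lines before the first key are skipped.
    return record


sample = """\
Patient Name:       XXX,A

Speaker ID:     10531
Recipient:
REASON FOR CONSULTATION:
Acute asthma.

IMPRESSION:
1.  Acute asthma.
2.  Community acquired pneumonia.
"""

record = parse_record(sample.splitlines())

# Write the single record as CSV; an in-memory buffer stands in for a file.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(record))
writer.writeheader()
writer.writerow(record)
csv_text = out.getvalue()
print(csv_text)
```

Body text that happens to contain a colon would be mistaken for a new field under this heuristic, so real files may need a stricter test for what counts as a heading.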

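【Discussion】:

  • Yes, I accept. Is there a way to programmatically remove all the whitespace and put everything on the same line, instead of removing it by hand in Notepad? For example ``` Patient Name: XXX:A Date of Service: 12/12/2018 Speaker ID: 10531 Visit Start: 06/07/2018 Visit End: 06/18/2018 Recipient: REQUESTING PHYSICIAN: Dr. XYZ. ........ ``` so that every line in the input has a colon. Or if you could suggest some other approach, that would help. Thanks, Meera
  • @Meera I think you may be asking something like this: stackoverflow.com/questions/16566268/… But if not, you could try posting a new question so that we don't complicate this thread.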
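
For the whitespace question raised in the comments, a minimal sketch of collapsing every run of spaces, tabs and line breaks into a single space is shown here. The function name collapse_whitespace is made up for illustration, and this is not necessarily the approach the linked question describes.

```python
def collapse_whitespace(text):
    """Replace each run of spaces, tabs and newlines with one space."""
    # str.split() with no arguments splits on any whitespace and drops empties.
    return ' '.join(text.split())


raw = "Patient Name:       XXX,A\n\nDate of Service:    12/12/2018\n"
flat = collapse_whitespace(raw)
print(flat)
```

Note that flattening a file this way also removes the line breaks that mark where one field ends and the next begins, so splitting the result on the first colon would no longer separate the fields.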