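[Posted]: 2014-11-27 09:23:41
[Question]:
I wrote a script that compares CSV files, but I still have a problem. I need every output row to have the form
5 values - space - 5 values
The problem is that some rows contain only 4 values, so I need to add a space column in place of the missing value.
Input:
File 1: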
1,1,1,1
3,3,3,3,3
File 2:
2,2,2,2
4,4,4,4,4
The result currently looks like this:
1,1,1,1, ,2,2,2,2
3,3,3,3,3, ,4,4,4,4,4
I need the result to look like this:
1,1,1,1, , , 2,2,2,2,*space*
3,3,3,3,3, ,4,4,4,4,4
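One way to guarantee that each side always has five fields is to pad short rows before they are joined. The following is a minimal sketch, assuming a comma separator and a single space as the filler value; the helper name `pad_fields` is hypothetical and not part of the script.

```shell
#!/bin/bash
# pad_fields (hypothetical helper): read comma-separated lines on stdin and
# pad every line that has fewer than N fields with single-space fields.
# N defaults to 5.
pad_fields() {
    local width="${1:-5}"
    awk -F',' -v OFS=',' -v width="$width" '
        {
            # Assigning to a field beyond NF makes awk extend the record
            # and rebuild $0 using OFS.
            for (i = NF + 1; i <= width; i++) $i = " "
            print
        }'
}

# Example: the first line has only 4 values, the second already has 5.
printf '1,1,1,1\n3,3,3,3,3\n' | pad_fields 5
```

With the sample input above, `1,1,1,1` becomes `1,1,1,1, ` (the fifth field is a single space), while `3,3,3,3,3` passes through unchanged.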
Here is my code:
#! /bin/bash
#------------------------------------------------------------------------------
#
# Description: Joins the files vertically based on the file extensions.
#
# Usage : ./joinfile directory1 directory2
#
#------------------------------------------------------------------------------
#---- Variables ---------------------------------------------------------------
resultfile="resultfile.csv"
#---- Main --------------------------------------------------------------------
# Checking if two arguments are provided, if not, display usage info, and exit.
if [ "$#" -ne 2 ]
then
echo "Usage: $0 directory1 directory2"
exit 1
fi
# Checking if any of the arguments provided is not a directory.
if [ ! -d "$1" -o ! -d "$2" ]
then
if [ ! -d "$1" ]
then
echo "Error: $1 is not a valid directory"
fi
if [ ! -d "$2" ]
then
echo "Error: $2 is not a valid directory"
fi
exit 1
fi
# Removing the end slash from the arguments, if user had provided.
dir1=$(echo "$1" | sed 's/\/$//')
dir2=$(echo "$2" | sed 's/\/$//')
# Creating an array of files having ^ in their filenames.
filearr=( $(ls "$dir1"/*^* "$dir2"/*^*) )
# Getting filearr length.
filearrlen=${#filearr[@]}
# Creating an array of extensions.
for (( i=0; i<"$filearrlen"; i++ ))
do
extarr+=(${filearr[i]##*^})
done
# Removing duplicates and the last extension from an extarr.
OLDIFS="$IFS"
IFS=$'\n'
newextarr=($(for i in "${extarr[@]}"; do echo "$i" | sed 's/\.[^.]*$//'; done | sort -du))
IFS="$OLDIFS"
# Getting newextarr length.
newextarrlen=${#newextarr[@]}
# Removing the previous outfile, if exists.
if [ -e "$resultfile" ]
then
rm "$resultfile"
fi
# Joining the files vertically based on the extensions.
for (( i=0; i<"$newextarrlen"; i++ ))
do
ext="${newextarr[i]}"
echo "Handling ==> $ext"
# Getting files with similar extensions.
joinfiles=($(for j in "${filearr[@]}"; do echo "$j" | grep "\^$ext"; done))
# Getting joinfiles array length.
joinfileslen=${#joinfiles[@]}
# Making a list of files to be pasted.
for (( k=0; k<"$joinfileslen"; k++))
do
pastefiles+="${joinfiles[k]} "
dos2unix "${joinfiles[k]}" 2>/dev/null
cat "${joinfiles[k]}" | grep "^[ \t]*([0-9]* [0-9]*)," | sed 's/^[ \t]*//g' | sort -t, -k1 | cut -d',' -f1- >.ext_${k}_tags.csv
done
# Executing paste command.
echo "==> ${ext}" >> "$resultfile"
awk 'BEGIN{ FS = "," }
{
if(FNR == NR){ a[$1] = $0 } else{ b[$1] = $0 }
for(i in a) {
if (i in b)
{ c[i]=a[i]", ,"b[i]; if (a[i] == b[i] ) { c[i]="True,"c[i]; } else { c[i]="False,"c[i]; }
} else { c[i]="False,"a[i]", ,"i",MISSING-MISSING-MISSING";}
}
for(i in b) {
if (!(i in a)) { c[i]="False,"i",MISSING-MISSING-MISSING, ,"b[i]; }
}
}
END{
for (i in c){ print c[i]; }
}
' ".ext_0_tags.csv" ".ext_1_tags.csv"|sort -t, -k1 >> "$resultfile"
rm -f ".ext_0_tags.csv" ".ext_1_tags.csv"
done
#---- End ---------------------------------------------------------------------
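The comparison step can also be sketched in isolation with the padding applied to both sides. This is a minimal sketch under stated assumptions, not the script's exact logic: the function name `compare_csv` is hypothetical, rows are keyed on the first field, each side is padded to five fields with a single space, and the result is only assembled once in `END`. The membership test is written as `!(k in a)` because unary `!` binds more tightly than `in` in awk.

```shell
#!/bin/bash
# compare_csv (hypothetical helper): join two CSV files on field 1,
# pad each side to 5 fields, and flag whether the padded rows are identical.
compare_csv() {
    awk -F',' -v OFS=',' -v width=5 '
        # Pad the current record to "width" fields with a single space
        # and return the rebuilt record.
        function pad(    i) {
            for (i = NF + 1; i <= width; i++) $i = " "
            return $0
        }
        NR == FNR { a[$1] = pad(); next }   # first file (assumed non-empty)
                  { b[$1] = pad() }         # second file
        END {
            for (k in a) {
                if (k in b) {
                    flag = (a[k] == b[k]) ? "True" : "False"
                    print flag, a[k], " ", b[k]
                } else {
                    print "False", a[k], " ", k, "MISSING-MISSING-MISSING"
                }
            }
            for (k in b)
                if (!(k in a))
                    print "False", k, "MISSING-MISSING-MISSING", " ", b[k]
        }' "$1" "$2" | sort -t, -k2,2   # order by the key (second output field)
}
```

Called as `compare_csv file1.csv file2.csv` (the file names are placeholders), every matched row then has the shape flag, 5 values, space, 5 values.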
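[Comments]:
-
Please post only the relevant part of your code, as a minimal example.
-
@TomFenech, OK, I think it is clearer now
-
You need to edit your question and remove everything that is not relevant to the problem.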