【Question title】: Shell script to find the first and last occurrence of a string
【Posted】: 2016-08-03 19:00:17
【Question】:

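I have written a shell script that does the following on a 50-node Hadoop cluster:

  • Lists all log files related to my application on each server
  • Prints the last-modified timestamp, hostname, and file name
  • Sorts the log files from all 50 nodes by last-modified timestamp

The current output format is: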

2016-07-11-01:06 server1 MY_APPLICATION-worker-6701.log.6.gz
2016-07-12-05:23 server1 MY_APPLICATION-worker-6701.log.7.gz
2016-07-13-08:38 server2 MY_APPLICATION-worker-6701.log
2016-07-13-10:38 server3 MY_APPLICATION-worker-6701.log.out
2016-07-13-10:38 server2 MY_APPLICATION-worker-6701.log.err
2016-07-13-10:38 server5 MY_APPLICATION-worker-6701.log
2016-07-15-10:22 server4 MY_APPLICATION-worker-6703.log.out
2016-07-15-10:22 server3 MY_APPLICATION-worker-6703.log.err
2016-07-15-10:22 server2 MY_APPLICATION-worker-6703.log

totallogs=""
for server in $(cat all-hadoop-cluster-servers.txt); do
    logs1="$(ssh user_id@$server 'ls /var/log/hadoop/storm/ -ltr --time-style="+%Y-%m-%d-%H:%M" | grep MY_APPLICATION | awk  -v host=$HOSTNAME "{print \$6, host, \$7}"' )"
    if [ -z "${logs1}"  ]; then
        continue
    else
        logs1+="\n"
        totallogs+=$logs1
    fi  
done
for el in "${totallogs[@]}"
do
    printf "$el"
done | sort

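How can I also find the first and the last occurrence of "unique-ID" in each log file and append them to the output above?

The expected output format is:

time_stamp hostname filename first-unique-ID last-unique-ID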

2016-07-11-01:06 server1 MY_APPLICATION-worker-6701.log.6.gz    1467005065878   1467105065877
2016-07-12-05:23 server1 MY_APPLICATION-worker-6701.log.7.gz    1467105065878   1467205065860
2016-07-13-08:38 server2 MY_APPLICATION-worker-6701.log         1467205065861   1467305065852
2016-07-13-10:38 server3 MY_APPLICATION-worker-6701.log.out     
2016-07-13-10:38 server2 MY_APPLICATION-worker-6701.log.err     
2016-07-13-10:38 server5 MY_APPLICATION-worker-6701.log         1467305065853   1467405065844
2016-07-15-10:22 server4 MY_APPLICATION-worker-6703.log.out     
2016-07-15-10:22 server3 MY_APPLICATION-worker-6703.log.err     
2016-07-15-10:22 server2 MY_APPLICATION-worker-6703.log         1467405065845   1467505065853

Sample log file:

DEBUG | 2008-09-06 10:51:44,848 | unique-ID >>>>>> 1467205065861
DEBUG | 2008-09-06 10:51:44,817 | DefaultBeanDefinitionDocumentReader.java | 86 | Loading bean definitions
DEBUG | 2008-09-06 10:51:44,848 | AbstractBeanDefinitionReader.java | 185 | Loaded 5 bean definitions from location pattern [samContext.xml]
INFO | 2008-09-06 10:51:44,848 | XmlBeanDefinitionReader.java | 323 | Loading XML bean definitions from class path resource [tmfContext.xml]
DEBUG | 2008-09-06 10:51:44,848 | DefaultDocumentLoader.java | 72 | Using JAXP provider [com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl]
DEBUG | 2008-09-06 10:51:44,848 | BeansDtdResolver.java | 72 | Found beans DTD [http://www.springframework.org/dtd/spring-beans.dtd] in classpath: spring-beans.dtd
DEBUG | 2008-09-06 10:51:44,848 | unique-ID >>>>>> 1467205065862
DEBUG | 2008-09-06 10:51:44,864 | DefaultBeanDefinitionDocumentReader.java | 86 | Loading bean definitions
DEBUG | 2008-09-06 10:51:45,458 | AbstractAutowireCapableBeanFactory.java | 411 | Finished creating instance of bean 'MS-SQL'
DEBUG | 2008-09-06 10:51:45,458 | DefaultSingletonBeanRegistry.java | 213 | Creating shared instance of singleton bean 'MySQL'
DEBUG | 2008-09-06 10:51:45,458 | AbstractAutowireCapableBeanFactory.java | 383 | Creating instance of bean 'MySQL'
DEBUG | 2008-09-06 10:51:45,458 | AbstractAutowireCapableBeanFactory.java | 459 | Eagerly caching bean 'MySQL' to allow for resolving potential circular references
DEBUG | 2008-09-06 10:51:45,458 | AbstractAutowireCapableBeanFactory.java | 411 | Finished creating instance of bean 'MySQL'
DEBUG | 2008-09-06 10:51:45,458 | DefaultSingletonBeanRegistry.java | 213 | Creating shared instance of singleton bean 'Oracle'
DEBUG | 2008-09-06 10:51:45,458 | AbstractAutowireCapableBeanFactory.java | 383 | Creating instance of bean 'Oracle'
DEBUG | 2008-09-06 10:51:45,458 | AbstractAutowireCapableBeanFactory.java | 459 | Eagerly caching bean 'Oracle' to allow for resolving potential circular references
DEBUG | 2008-09-06 10:51:45,473 | AbstractAutowireCapableBeanFactory.java | 411 | Finished creating instance of bean 'Oracle'
DEBUG | 2008-09-06 10:51:45,473 | DefaultSingletonBeanRegistry.java | 213 | Creating shared instance of singleton bean 'PostgreSQL'
DEBUG | 2008-09-06 10:51:45,473 | AbstractAutowireCapableBeanFactory.java | 383 | Creating instance of bean 'PostgreSQL'
DEBUG | 2008-09-06 10:51:45,473 | AbstractAutowireCapableBeanFactory.java | 459 | Eagerly caching bean 'PostgreSQL' to allow for resolving potential circular references
DEBUG | 2008-09-06 10:51:45,473 | AbstractAutowireCapableBeanFactory.java | 411 | Finished creating instance of bean 'PostgreSQL'
INFO | 2008-09-06 10:51:45,473 | SQLErrorCodesFactory.java | 128 | SQLErrorCodes loaded: [DB2, Derby, H2, HSQL, Informix, MS-SQL, MySQL, Oracle, PostgreSQL, Sybase]
DEBUG | 2008-09-06 10:52:44,817 | DefaultBeanDefinitionDocumentReader.java | 86 | Loading bean definitions
DEBUG | 2008-09-06 10:52:44,848 | unique-ID >>>>>> 1467205065864

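【Comments】:

  • You should show us a representative sample of your input along with the corresponding desired output. Also note that you should never use for to read lines (use while read -r instead).
  • @anubhava, this is a mock log file. The text "unique-ID >>>>>>" appears in the log file every few statements. The value next to "unique-ID >>>>>>" is the unique ID mentioned in the expected output.
  • Yes, but in order to build a solution we need an expected output that can be generated from the given sample input. E.g. 1467305065852 in the output does not even exist in the sample input.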
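
The `while read -r` advice from the first comment could look like the minimal sketch below. The file name `all-hadoop-cluster-servers.txt` comes from the question; `collect_logs` and `read_servers` are hypothetical names, and `collect_logs` is only a local stand-in for the `ssh` call so the loop can be tried without a cluster. If a real `ssh` goes inside the loop, it would probably need `-n` or `</dev/null`, otherwise it can consume the rest of the server list from stdin.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for the remote call in the question; swap in ssh here.
collect_logs() {
    printf 'would query %s\n' "$1"
}

# Read the server list line by line instead of word-splitting $(cat ...).
read_servers() {
    local server
    # The "|| [ -n ... ]" part keeps a last line that lacks a trailing newline.
    while IFS= read -r server || [ -n "$server" ]; do
        [ -z "$server" ] && continue          # skip blank lines
        collect_logs "$server" </dev/null     # leave the loop's stdin untouched
    done < "$1"
}
```

It would be called as `read_servers all-hadoop-cluster-servers.txt | sort`.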

Tags: shell hadoop


【Solution 1】:
grep 'unique-ID' sample_log_file | sed -n '1p;$p'
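
If only the ID values are wanted rather than the whole matching lines, a single awk pass could extract them. This is a minimal sketch, assuming the `unique-ID >>>>>> <value>` layout from the sample log in the question; `first_last_id` is a hypothetical helper name.

```shell
#!/usr/bin/env bash

# Print the first and last unique-ID value of a log file on one line.
# Prints nothing when the file contains no unique-ID lines.
first_last_id() {
    awk -F'>>>>>> *' '
        /unique-ID/ {
            if (first == "") first = $2   # keep only the first match
            last = $2                     # overwritten on every match
        }
        END { if (first != "") print first, last }
    ' "$1"
}
```

Run against the sample log from the question, `first_last_id sample_log_file` would print `1467205065861 1467205065864`.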

【Comments】:

    【Solution 2】:

    Since you are already using awk, you can change your awk program

    "{print \$6, host, \$7}"

    to

    "{ first=last=\"\"; path=\"/var/log/hadoop/storm/\"\$7; while ((getline var < path) > 0) if (split(var, arr, \">>>>>>\") > 1) { if (!first) first=arr[2]; last=arr[2] } close(path); print \$6, host, \$7, \"\t\", first, last }"

    so that it does the job.
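
Because awk's `getline` reads a file's raw bytes, the rotated `.gz` logs would not yield any IDs with this approach. A hedged sketch of a per-file helper that could run on each node instead is shown below; `id_range` is a hypothetical name, and the sketch assumes `gzip` is installed and that the logs follow the `unique-ID >>>>>> <value>` layout from the sample in the question.

```shell
#!/usr/bin/env bash

# Print "<first-id> <last-id>" for one log file, decompressing *.gz on the fly.
# Prints nothing when the file contains no unique-ID lines.
id_range() {
    local file=$1
    case $file in
        *.gz) gzip -dc -- "$file" ;;   # rotated, compressed log
        *)    cat -- "$file" ;;        # plain log
    esac | awk -F'>>>>>> *' '
        /unique-ID/ { if (first == "") first = $2; last = $2 }
        END { if (first != "") print first, last }
    '
}
```

Files without any `unique-ID` line, such as the `.out` and `.err` logs in the expected output, simply produce an empty result.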

    【Comments】:
