【Question title】: How to get TV show episode and season number from title
【Posted】: 2021-12-24 03:06:51
【Question】:

I am trying to get the TV show season and episode number from a title.

I tried the following code, but it also treats a title like xxxxe 3 as episode 3:

$episode = $title | Select-String -Pattern "E(\d+)", "E (\d+)", "Episode (\d+)" | % {$_.Matches.Groups[1].Value}
$season = $title | Select-String -Pattern "S(\d+)", "S (\d+)", "Season (\d+)" | % {$_.Matches.Groups[1].Value} 

How can I make sure that I only pick up the season and episode number from these formats?

  • xxx S01E01
  • xxxe 1 S01E01
  • xxx S01 E01
  • xxx 01x01
  • xxx Season 01 Episode 01

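If the title does not contain a season or episode number in one of the formats above, I want nothing to be returned, for example when the show is named "xxxxxE 1".

【Comments】:

  • Adding new requirements after a question has been answered is considered bad practice. You had already received two answers when you started adding new filename examples that were not (and still are not) mentioned in the question, so nobody can build a regex that handles them. What are you hoping for here, a universal regex that parses out whatever you throw at it?

Tags: powershell powershell-4.0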


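【Solution 1】:

Assuming the following:

  1. The season and episode will always be 2 (or more) digits.
  2. The season and episode will always be at the end of the file name.

I recommend a regex pattern anchored to the end of the name. Working back from there, the pattern allows 0 or more characters after the period (the file extension), 1 literal period (for the extension), 0 or more characters between the episode and the period, and 0 or more characters between the season and the episode.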

$examples = @'
xxx S01E02.avi
xxxe 1 S02E03.mp3
xxx S03 E04.mov
xxx 04x05.png
xxx Season 05 Episode 06.wav
'@ -split '\r?\n'   # split on LF or CRLF, whichever line ending the here-string contains

$examples | ForEach-Object {
    if($_ -match '.+(\d{2,}).*(\d{2,}).*\..*$'){
        "Season: {0}  Episode: {1}" -f $matches.1,$matches.2
    }
}

This will output:

Season: 01  Episode: 02
Season: 02  Episode: 03
Season: 03  Episode: 04
Season: 04  Episode: 05
Season: 05  Episode: 06
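
As a reading aid, the same pattern can be spelled out in .NET free-spacing mode, (?x), with one element per line. This is only a sketch: the named groups season and episode are added here for illustration and are not part of the pattern in the answer, and the two sample titles are taken from the list above.

```powershell
# Same structure as '.+(\d{2,}).*(\d{2,}).*\..*$', written in free-spacing mode.
# Assumption: named groups (season, episode) replace the numbered groups for readability.
$pattern = @'
(?x)                  # ignore whitespace and allow comments inside the pattern
.+                    # show name: 1 or more characters (greedy)
(?<season>\d{2,})     # season: 2 or more digits
.*                    # 0 or more characters between season and episode
(?<episode>\d{2,})    # episode: 2 or more digits
.*                    # 0 or more characters between the episode and the period
\.                    # 1 literal period before the extension
.*$                   # the extension, up to the end of the name
'@

foreach ($name in 'xxx S03 E04.mov', 'xxx 04x05.png') {
    if ($name -match $pattern) {
        "Season: {0}  Episode: {1}" -f $Matches['season'], $Matches['episode']
    }
}
```

Because the leading .+ is greedy, the engine settles on the last two runs of two or more digits before the period, which is why assumption 2 above matters.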

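You did not show how $title is populated, so I assumed it is just a string. If you want to apply this to file objects instead, you have a couple of options.

We can leave the regex pattern unchanged and use the Name property.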

$videolist = Get-Childitem -Path path\to\movies -Filter *.whatever

foreach($video in $videolist){
    if($video.Name -match '.+(\d{2,}).*(\d{2,}).*\..*$'){
        "Season: {0}  Episode: {1}" -f $matches.1,$matches.2
    }
}

Or we can use the BaseName property and adjust the regex slightly, dropping the part that matches the extension.

$videolist = Get-Childitem -Path path\to\movies -Filter *.whatever

foreach($video in $videolist){
    if($video.BaseName -match '.+(\d{2,}).*(\d{2,}).*$'){
        "Season: {0}  Episode: {1}" -f $matches.1,$matches.2
    }
}

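【Comments】:

  • I modified your code, but it did not work. Could you check it and tell me what I am doing wrong? $examples = 'xxx S01 E01' if($examples -match '.+(\d{2,}).*(\d{2,}).* \..*$'){ $season = $matches.1 $episode = $matches.2 }
  • Your example has no dot at the end, so you need to remove the literal dot from the regex: '.+(\d{2,}).*(\d{2,}).*$'
  • Thanks, that works, but for S1E05 it does not pick up the season number, and for a title like xxxxxxxx S10 E06 xxxxx 01 05 it picks the last numbers. Is there any way I can supply the formats to match? S01 E01, S01E01, S1E05, 01x05, Season 1 Episode 5
【Solution 2】:

You can construct a regex string that parses out the season and episode numbers like this: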

$examples = 'xxx S01E01','xxxe 1 S01E03','xxx S06 E01','xxx 01x01','xxx Season 01 Episode 02'

foreach ($title in $examples) {
    if ($title -match '(?:(?:S(?:eason)?)?\s*(\d+)[\sx]*)(?:(?:E(?:pisode)?)?\s*(\d+))') {
        $season  = [int]$matches[1]
        $episode = [int]$matches[2]

        # just to display the output:
        [PsCustomObject]@{
            Title   = $title
            Season  = $season
            Episode = $episode
        }
    }
}

Output:

Title                    Season Episode
-----                    ------ -------
xxx S01E01                    1       1
xxxe 1 S01E03                 1       3
xxx S06 E01                   6       1
xxx 01x01                     1       1
xxx Season 01 Episode 02      1       2

Regex details:

(?:                # Match the regular expression below
   (?:             # Match the regular expression below
      S            # Match the character “S” literally
      (?:          # Match the regular expression below
         eason     # Match the characters “eason” literally
      )?           # Between zero and one times, as many times as possible, giving back as needed (greedy)
   )?              # Between zero and one times, as many times as possible, giving back as needed (greedy)
   \s              # Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
      *            # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
   (               # Match the regular expression below and capture its match into backreference number 1
      \d           # Match a single digit 0..9
         +         # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
   )
   [\sx]           # Match a single character present in the list below
                   # A whitespace character (spaces, tabs, line breaks, etc.)
                   # The character “x”
      *            # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
)
(?:                # Match the regular expression below
   (?:             # Match the regular expression below
      E            # Match the character “E” literally
      (?:          # Match the regular expression below
         pisode    # Match the characters “pisode” literally
      )?           # Between zero and one times, as many times as possible, giving back as needed (greedy)
   )?              # Between zero and one times, as many times as possible, giving back as needed (greedy)
   \s              # Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
      *            # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
   (               # Match the regular expression below and capture its match into backreference number 2
      \d           # Match a single digit 0..9
         +         # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
   )
)

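I changed some of your examples to show more clearly that the correct numbers are found.

【Comments】:

  • Thanks for the code, but it does the same thing: for the title xxxE 01 it returns Season 0, Episode 1.
  • If the title contains something like xxxxE01 or xxxE 01, it still treats that as the episode number. I need to make sure it only registers a season or episode number when it is in one of the following formats (see my post).
  • @maj You should explain what the e attached to xxx means, because that example also has an explicit season and episode code after it.
  • xxxe 01 is the name of the show, not an episode number. The season and episode numbers will be in one of these forms: S01 E01, S01E01, 01X01, Season 01 Episode 01
  • For example, I have a TV show named E2 Part 1 with nothing else in the title, so the script should return nothing, because it does not match any of these formats: S01 E01, S01E01, 01X01, Season 01 Episode 01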
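
One possible way to meet the requirement in these last comments, offered as a sketch under assumptions rather than as part of either answer: in the Solution 2 pattern every marker (S, Season, E, Episode, x) is optional, so bare digits such as 01 can be split into 0 and 1. The version below makes the markers mandatory and requires that the code is not glued to a preceding or following letter or digit. The function name Get-SeasonEpisode and the pattern itself are hypothetical; only the formats listed in the question are covered.

```powershell
# Hypothetical helper: returns an object only when the title contains an explicit
# season/episode code (S01E01, S01 E01, S1E05, 01x01, Season 01 Episode 01).
function Get-SeasonEpisode {
    param(
        [Parameter(Mandatory = $true)]
        [string]$Title
    )

    # Assumption: .NET allows the same group name in both alternatives,
    # so $Matches['season'] / $Matches['episode'] work for either format.
    $pattern = @'
(?x)
(?<![A-Za-z0-9])                        # not directly after a letter or digit
(?:
    S(?:eason)?  \s* (?<season>\d+)  \s*   # S01, S1 or Season 01
    E(?:pisode)? \s* (?<episode>\d+)       # E01 or Episode 01
  |
    (?<season>\d+) x (?<episode>\d+)       # 01x01
)
(?![A-Za-z0-9])                         # not directly before a letter or digit
'@

    # -match is case-insensitive, so s01e01 and 01X01 are accepted as well.
    if ($Title -match $pattern) {
        [PSCustomObject]@{
            Title   = $Title
            Season  = [int]$Matches['season']
            Episode = [int]$Matches['episode']
        }
    }
    # No match: nothing is written to the pipeline.
}

'xxx S01E01', 'xxxe 1 S01E03', 'xxx 01x01', 'xxxE 01', 'E2 Part 1' |
    ForEach-Object { Get-SeasonEpisode -Title $_ }
```

With these sample titles, only the first three produce an object; xxxE 01 and E2 Part 1 return nothing. The sketch is not universal: a resolution such as 1920x1080 in a title would still be read as a season and episode pair, so further formats or exclusions would need their own rules.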